This repository has been archived by the owner on Jan 25, 2023. It is now read-only.
My team recently automated the process of rolling out updates. Essentially it was accomplished with two pieces:
A Lambda function which is triggered by a Lifecycle Hook when an auto-scaling group wants to terminate a node. This Lambda function gets the instance which is being terminated and uses SSM to drain the Nomad client of jobs. When the client is completely drained, it completes the lifecycle hook to fully terminate the instance.
A script which orchestrates the deployment. It uses the autoscaling API (instead of EC2) so that termination triggers the lifecycle hook, safely draining the Nomad client before the instance goes away. In our case we chose to scale out first so that we always have N clients available, rather than N-1 if you scale in first.
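For concreteness, here is a minimal sketch of what a Lambda like the one described above could look like. This is not the author's actual code: the event shape assumes the lifecycle hook notifies the Lambda through an EventBridge/CloudWatch Events rule, and the drain command passed to SSM is a placeholder that depends on your Nomad version and client configuration.

```python
def parse_lifecycle_event(event):
    """Pull the fields we need out of an ASG lifecycle-hook event
    delivered via EventBridge/CloudWatch Events."""
    detail = event["detail"]
    return {
        "instance_id": detail["EC2InstanceId"],
        "hook_name": detail["LifecycleHookName"],
        "asg_name": detail["AutoScalingGroupName"],
        "token": detail["LifecycleActionToken"],
    }


def handler(event, context):
    # boto3 is provided by the Lambda runtime; imported here so the
    # parsing helper above can be unit-tested without AWS dependencies.
    import boto3

    info = parse_lifecycle_event(event)

    # Ask SSM to run a drain command on the terminating instance.
    # The exact command is a placeholder assumption.
    ssm = boto3.client("ssm")
    ssm.send_command(
        InstanceIds=[info["instance_id"]],
        DocumentName="AWS-RunShellScript",
        Parameters={"commands": ["nomad node-drain -self -yes"]},
    )

    # In practice you would poll until allocations have migrated off
    # the node; once drained, let the ASG finish terminating it.
    boto3.client("autoscaling").complete_lifecycle_action(
        LifecycleHookName=info["hook_name"],
        AutoScalingGroupName=info["asg_name"],
        LifecycleActionToken=info["token"],
        LifecycleActionResult="CONTINUE",
        InstanceId=info["instance_id"],
    )
```

If the drain takes longer than the hook's heartbeat timeout, you would also need to call `record_lifecycle_action_heartbeat` periodically to keep the hook alive.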
I just wanted to open this issue to see if anyone else has feedback on this approach and whether or not you'd like this to be added to that module!
A lambda function which is triggered by a Lifecycle Hook when an auto-scaling group wants to terminate a node.
This seems like a good approach. I've hit some issues in the past with Terraform + lifecycle hooks not firing when you'd expect them to, but it's possible those have been resolved.
This lambda function gets the instance which is being terminated and uses SSM to drain the nomad client of jobs
Why SSM? Why not just call `nomad node-drain -address=<IP_OF_TERMINATING_INSTANCE>` directly from the Lambda function?
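To illustrate that alternative: rather than shelling out over SSM, the Lambda could talk to Nomad's HTTP API directly. The sketch below targets the documented `/v1/node/:node_id/drain` endpoint in recent Nomad versions; the address and node ID are placeholders, and note the tradeoff that this path needs network access from the Lambda to Nomad plus a way to map the EC2 instance to its Nomad node ID, which may be why SSM was chosen.

```python
import json
import urllib.request


def build_drain_request(nomad_addr, node_id, deadline_ns=5 * 60 * 10**9):
    """Build (but do not send) the PUT request that enables draining
    on a node via Nomad's HTTP API."""
    payload = json.dumps({
        "DrainSpec": {
            "Deadline": deadline_ns,   # nanoseconds to wait for migrations
            "IgnoreSystemJobs": False,
        }
    }).encode()
    return urllib.request.Request(
        f"{nomad_addr}/v1/node/{node_id}/drain",
        data=payload,
        method="PUT",
        headers={"Content-Type": "application/json"},
    )


def drain_node(nomad_addr, node_id):
    req = build_drain_request(nomad_addr, node_id)
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)
```

Separating request construction from sending keeps the HTTP details testable without a live Nomad cluster.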
A script which orchestrates the deployment. It takes advantage of the autoscaling API (instead of EC2) so that it triggers the lifecycle hook to safely drain the nomad client before termination. In our case we chose to scale-out first so that we always have N clients available, rather than N-1 if you scale-in.
Can you describe a bit more what the script is doing?
whether or not you'd like this to be added to that module!
@Xopherus the automated roll out process sounds very useful for folks running Nomad in AWS. Are you still open to sharing the script which orchestrates the deployment? Specifically, I am curious if you have tied this back to the Nomad Terraform module somehow, or if it is completely separate.
@sarkis yeah, definitely. I think what I have is probably a separate tool - internally we're adapting the overall approach to work with k8s, and will probably do the same with Consul.