This repository has been archived by the owner on Jan 25, 2023. It is now read-only.

Roll out updates #35

Open
Xopherus opened this issue Oct 15, 2018 · 3 comments

Comments

@Xopherus

My team recently automated the process of rolling out updates. It comes down to two pieces:

  1. A lambda function which is triggered by a Lifecycle Hook when an auto-scaling group wants to terminate a node. This lambda function gets the instance which is being terminated and uses SSM to drain the nomad client of jobs. When it's completely drained, it completes the lifecycle hook to fully terminate the instance.

  2. A script which orchestrates the deployment. It takes advantage of the autoscaling API (instead of EC2) so that it triggers the lifecycle hook to safely drain the nomad client before termination. In our case we chose to scale-out first so that we always have N clients available, rather than N-1 if you scale-in.
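For readers who want something concrete, here is a minimal sketch of the first piece (the Lambda function). It is not the code from this thread. It assumes the lifecycle hook reaches Lambda as an EventBridge event whose `detail` carries `EC2InstanceId`, `LifecycleHookName`, `AutoScalingGroupName` and `LifecycleActionToken`, that the SSM agent runs on the instance, and that `nomad node drain -enable -self -yes` is an acceptable drain command there. The function name `drain_and_continue` and the polling parameters are hypothetical.

```python
import time

# Assumed drain command; without -detach it blocks until the drain finishes.
DRAIN_COMMAND = "nomad node drain -enable -self -yes"
TERMINAL_FAILURES = {"Failed", "Cancelled", "TimedOut"}


def drain_and_continue(detail, ssm, autoscaling, sleep=time.sleep,
                       poll_seconds=10, max_polls=60):
    """Drain the terminating Nomad client via SSM, then release the hook."""
    instance_id = detail["EC2InstanceId"]

    # Run the drain command on the instance through SSM Run Command.
    command = ssm.send_command(
        InstanceIds=[instance_id],
        DocumentName="AWS-RunShellScript",
        Parameters={"commands": [DRAIN_COMMAND]},
    )
    command_id = command["Command"]["CommandId"]

    # Poll until the command reaches a terminal state or we give up.
    # A real handler may also need to retry if the invocation is not
    # registered yet right after send_command.
    status = "Pending"
    for _ in range(max_polls):
        sleep(poll_seconds)
        status = ssm.get_command_invocation(
            CommandId=command_id, InstanceId=instance_id
        )["Status"]
        if status == "Success" or status in TERMINAL_FAILURES:
            break

    if status != "Success":
        # Leave the hook open; it falls back to its configured default result.
        raise RuntimeError(f"drain of {instance_id} ended with status {status}")

    # Tell the Auto Scaling group it may finish terminating the instance.
    autoscaling.complete_lifecycle_action(
        LifecycleHookName=detail["LifecycleHookName"],
        AutoScalingGroupName=detail["AutoScalingGroupName"],
        LifecycleActionToken=detail["LifecycleActionToken"],
        LifecycleActionResult="CONTINUE",
        InstanceId=instance_id,
    )
    return status


def handler(event, context):
    # boto3 is imported lazily so the logic above can be tested without AWS.
    import boto3
    return drain_and_continue(
        event["detail"], boto3.client("ssm"), boto3.client("autoscaling")
    )
```

Passing the clients in as arguments keeps the drain logic testable with stubs. Polling inside one invocation only works while the drain fits within the Lambda timeout, so longer drains would need a different design.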

I just wanted to open this issue to see if anyone else has feedback on this approach and whether or not you'd like this to be added to that module!

@brikis98
Collaborator

> A lambda function which is triggered by a Lifecycle Hook when an auto-scaling group wants to terminate a node.

This seems like a good approach. I've hit some issues in the past with Terraform + lifecycle hooks not firing when you'd expect them to, but it's possible those have been resolved.

> This lambda function gets the instance which is being terminated and uses SSM to drain the nomad client of jobs

Why SSM? Why not just call `nomad node-drain -address=<IP_OF_TERMINATING_INSTANCE>` directly from the Lambda function?
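To make the alternative in this question concrete, here is a rough sketch of draining a node from Lambda without SSM. It swaps the CLI call for Nomad's HTTP API (`GET /v1/nodes` to find the node, `POST /v1/node/:node_id/drain` to start the drain), since a Lambda function would not normally have the `nomad` binary available. The helper names, the 300-second deadline and the `base_url` argument are hypothetical, and the sketch assumes the Lambda function can reach the Nomad API over the network and that no ACL token is required.

```python
import json
import urllib.request


def find_node_id(nodes, ip):
    """Return the ID of the node whose Address matches the instance IP."""
    for node in nodes:
        if node.get("Address") == ip:
            return node["ID"]
    raise LookupError(f"no Nomad node with address {ip}")


def build_drain_request(base_url, node_id, deadline_seconds=300):
    """Build the POST request that enables drain mode on one node."""
    body = {
        "DrainSpec": {
            # Nomad expects the deadline in nanoseconds.
            "Deadline": deadline_seconds * 1_000_000_000,
            "IgnoreSystemJobs": False,
        }
    }
    return urllib.request.Request(
        f"{base_url}/v1/node/{node_id}/drain",
        data=json.dumps(body).encode(),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def drain_by_ip(base_url, ip, urlopen=urllib.request.urlopen):
    """Look up the node by IP and ask Nomad to start draining it."""
    with urlopen(f"{base_url}/v1/nodes") as resp:
        nodes = json.load(resp)
    node_id = find_node_id(nodes, ip)
    with urlopen(build_drain_request(base_url, node_id)) as resp:
        return json.load(resp)
```

This only starts the drain. The caller would still have to poll the node until its allocations have stopped before completing the lifecycle hook.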

> A script which orchestrates the deployment. It takes advantage of the autoscaling API (instead of EC2) so that it triggers the lifecycle hook to safely drain the nomad client before termination. In our case we chose to scale-out first so that we always have N clients available, rather than N-1 if you scale-in.

Can you describe a bit more what the script is doing?
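The script itself is never posted in this thread. As one possible reading of the description quoted above, the sketch below replaces instances one at a time through the Auto Scaling API: it scales out by one, waits for the extra instance to be in service, then terminates one old instance through the group so the lifecycle hook fires. It assumes the launch configuration or template already points at the new version and that the group's maximum size allows N+1 instances. The names `roll_group` and `wait_for` and the polling values are hypothetical.

```python
import time


def describe_group(autoscaling, group_name):
    """Fetch the current description of a single Auto Scaling group."""
    response = autoscaling.describe_auto_scaling_groups(
        AutoScalingGroupNames=[group_name]
    )
    return response["AutoScalingGroups"][0]


def wait_for(predicate, sleep, poll_seconds, max_polls):
    """Poll until predicate() is true, or raise after max_polls attempts."""
    for _ in range(max_polls):
        if predicate():
            return
        sleep(poll_seconds)
    raise TimeoutError("condition not met in time")


def roll_group(autoscaling, group_name, sleep=time.sleep,
               poll_seconds=15, max_polls=120):
    """Replace every current instance, scaling out before each scale-in."""
    group = describe_group(autoscaling, group_name)
    target = group["DesiredCapacity"]
    old_ids = sorted(i["InstanceId"] for i in group["Instances"])

    def in_service_count():
        instances = describe_group(autoscaling, group_name)["Instances"]
        return sum(1 for i in instances if i["LifecycleState"] == "InService")

    def still_in_group(instance_id):
        instances = describe_group(autoscaling, group_name)["Instances"]
        return any(i["InstanceId"] == instance_id for i in instances)

    for old_id in old_ids:
        # Scale out first so N clients stay available during the swap.
        autoscaling.set_desired_capacity(
            AutoScalingGroupName=group_name,
            DesiredCapacity=target + 1,
            HonorCooldown=False,
        )
        wait_for(lambda: in_service_count() >= target + 1,
                 sleep, poll_seconds, max_polls)

        # Terminate through the Auto Scaling API (not EC2) so the
        # terminating lifecycle hook fires and the client gets drained.
        autoscaling.terminate_instance_in_auto_scaling_group(
            InstanceId=old_id, ShouldDecrementDesiredCapacity=True
        )
        wait_for(lambda: not still_in_group(old_id),
                 sleep, poll_seconds, max_polls)
    return old_ids
```

Waiting for the old instance to leave the group before moving on means the loop implicitly waits for the drain, because the instance stays listed while the lifecycle hook is pending.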

> whether or not you'd like this to be added to that module!

Yes please!

@sarkis

sarkis commented Feb 17, 2020

@Xopherus the automated roll out process sounds very useful for folks running Nomad in AWS. Are you still open to sharing the script which orchestrates the deployment? Specifically, I am curious if you have tied this back to the Nomad Terraform module somehow, or if it is completely separate.

@Xopherus
Author

@sarkis yeah, definitely. I think what I have is probably a separate tool. Internally we're adapting the overall approach to work with k8s, and will probably do the same with Consul.


3 participants