Roll out updates #35

Xopherus · 2018-10-15T18:50:28Z

My team recently automated the process of rolling out updates. It comes down to two pieces:

1. A lambda function which is triggered by a Lifecycle Hook when an auto-scaling group wants to terminate a node. This lambda function gets the instance which is being terminated and uses SSM to drain the nomad client of jobs. When it's completely drained, it completes the lifecycle hook to fully terminate the instance.
2. A script which orchestrates the deployment. It takes advantage of the autoscaling API (instead of EC2) so that it triggers the lifecycle hook to safely drain the nomad client before termination. In our case we chose to scale-out first so that we always have N clients available, rather than N-1 if you scale-in.
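
To make the first piece concrete, here is a minimal sketch of what such a Lambda could look like. It is not the code my team runs, and it rests on several assumptions: the hook reaches the function as an EventBridge event carrying the usual `detail` fields, the drain command is `nomad node drain -enable -self -yes`, and the function simply polls SSM until the command finishes. The names `drain_then_continue`, `DRAIN_COMMAND` and `FINISHED` are made up for illustration.

```python
import time

# Assumed drain command; the exact command is not given in this thread.
DRAIN_COMMAND = "nomad node drain -enable -self -yes"
# SSM invocation statuses after which polling can stop.
FINISHED = {"Success", "Failed", "Cancelled", "TimedOut"}


def drain_then_continue(detail, ssm, autoscaling, poll_seconds=10, max_polls=80):
    """Drain the terminating instance via SSM, then release the lifecycle hook."""
    instance_id = detail["EC2InstanceId"]
    sent = ssm.send_command(
        InstanceIds=[instance_id],
        DocumentName="AWS-RunShellScript",
        Parameters={"commands": [DRAIN_COMMAND]},
    )
    command_id = sent["Command"]["CommandId"]

    status = "Pending"
    for _ in range(max_polls):
        # Sleep first: the invocation may not be queryable right after send_command.
        time.sleep(poll_seconds)
        status = ssm.get_command_invocation(
            CommandId=command_id, InstanceId=instance_id
        )["Status"]
        if status in FINISHED:
            break

    if status != "Success":
        # Leave the hook pending so its own timeout and default result apply.
        raise RuntimeError("drain of %s ended with status %s" % (instance_id, status))

    autoscaling.complete_lifecycle_action(
        LifecycleHookName=detail["LifecycleHookName"],
        AutoScalingGroupName=detail["AutoScalingGroupName"],
        LifecycleActionToken=detail["LifecycleActionToken"],
        LifecycleActionResult="CONTINUE",
        InstanceId=instance_id,
    )
    return instance_id


def handler(event, context):
    # boto3 is imported here so the helper above can be exercised with stub clients.
    import boto3

    return drain_then_continue(
        event["detail"], boto3.client("ssm"), boto3.client("autoscaling")
    )
```

With the defaults the function polls for roughly 13 minutes, which only fits if the Lambda timeout is set near its maximum; a longer drain would need a different waiting strategy, such as recording heartbeats on the hook.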

I just wanted to open this issue to see if anyone else has feedback on this approach and whether or not you'd like this to be added to that module!

brikis98 · 2018-10-16T10:50:35Z

> A lambda function which is triggered by a Lifecycle Hook when an auto-scaling group wants to terminate a node.

This seems like a good approach. I've hit some issues in the past with Terraform + lifecycle hooks not firing when you'd expect them to, but it's possible those have been resolved.

> This lambda function gets the instance which is being terminated and uses SSM to drain the nomad client of jobs

Why SSM? Why not just call nomad node-drain -address=<IP_OF_TERMINATING_INSTANCE> directly from the Lambda function?

> A script which orchestrates the deployment. It takes advantage of the autoscaling API (instead of EC2) so that it triggers the lifecycle hook to safely drain the nomad client before termination. In our case we chose to scale-out first so that we always have N clients available, rather than N-1 if you scale-in.

Can you describe a bit more what the script is doing?
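
For anyone following along, one possible reading of the description above is sketched below. It is a guess at the shape of such a script, not the author's actual code: for each existing instance it raises the desired capacity by one, waits for the replacement to be in service, then terminates the old instance through the autoscaling API so the lifecycle hook fires. The helper names (`describe_group`, `wait_until`, `roll_group`) are hypothetical, and the sketch assumes the group's `MaxSize` leaves room for one extra instance.

```python
import time


def describe_group(autoscaling, group_name):
    """Return the description of a single Auto Scaling group."""
    response = autoscaling.describe_auto_scaling_groups(
        AutoScalingGroupNames=[group_name]
    )
    return response["AutoScalingGroups"][0]


def wait_until(predicate, poll_seconds, max_polls):
    """Poll until predicate() is true, or give up after max_polls attempts."""
    for _ in range(max_polls):
        if predicate():
            return
        time.sleep(poll_seconds)
    raise TimeoutError("condition not met after %d polls" % max_polls)


def roll_group(autoscaling, group_name, poll_seconds=15, max_polls=80):
    """Replace every current instance, scaling out before each termination."""
    group = describe_group(autoscaling, group_name)
    desired = group["DesiredCapacity"]
    old_ids = [i["InstanceId"] for i in group["Instances"]]

    def in_service_count():
        instances = describe_group(autoscaling, group_name)["Instances"]
        return sum(1 for i in instances if i["LifecycleState"] == "InService")

    def is_gone(instance_id):
        instances = describe_group(autoscaling, group_name)["Instances"]
        return instance_id not in [i["InstanceId"] for i in instances]

    for old_id in old_ids:
        # Scale out first so N clients stay available during the swap.
        autoscaling.set_desired_capacity(
            AutoScalingGroupName=group_name,
            DesiredCapacity=desired + 1,
            HonorCooldown=False,
        )
        wait_until(lambda: in_service_count() >= desired + 1, poll_seconds, max_polls)

        # Terminating via the autoscaling API (not EC2) triggers the lifecycle hook.
        autoscaling.terminate_instance_in_auto_scaling_group(
            InstanceId=old_id, ShouldDecrementDesiredCapacity=True
        )
        # The instance stays listed while the hook is waiting on the drain.
        wait_until(lambda: is_gone(old_id), poll_seconds, max_polls)

    return old_ids
```

It would be called as `roll_group(boto3.client("autoscaling"), "my-nomad-clients")`, where the group name is a placeholder. Checking "in service" at the Auto Scaling level says nothing about whether the new Nomad client has actually joined the cluster, so a real script would probably want a Nomad-side readiness check as well.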

> whether or not you'd like this to be added to that module!

Yes please!

sarkis · 2020-02-17T16:09:04Z

@Xopherus the automated roll out process sounds very useful for folks running Nomad in AWS. Are you still open to sharing the script which orchestrates the deployment? Specifically, I am curious if you have tied this back to the Nomad Terraform module somehow, or if it is completely separate.

Xopherus · 2020-03-13T15:00:50Z

@sarkis yeah, definitely. I think what I have is probably a separate tool - internally we're adapting the overall approach to work with k8s, and will probably do the same with Consul.

brikis98 added the enhancement and help wanted labels on Oct 16, 2018
