keep scaling when nodes are draining #672
I was thinking about something like this: #679
Hi @janory!
I think the major challenge here is that the nodes might be draining for reasons outside the control of the autoscaler. Maybe you've run nomad node drain on the node yourself, for example. If we do decide to ignore this check, then we need to adjust our expectations of what plugins return as the node count. For example, if there are 5 instances in an ASG but 2 are draining, maybe the policy calculation should only count the 3 nodes that are actually available?
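To make the counting idea concrete, here is a rough Go sketch (the Node type and field names below are simplified stand-ins, not the autoscaler's actual types):

package main

import "fmt"

// Node is a simplified stand-in for the node data a target plugin sees;
// the real autoscaler types are different.
type Node struct {
    ID       string
    Draining bool
}

// readyCount returns how many nodes would be counted for the policy
// calculation if draining nodes were excluded.
func readyCount(nodes []Node) int {
    count := 0
    for _, n := range nodes {
        if !n.Draining {
            count++
        }
    }
    return count
}

func main() {
    // 5 instances in the ASG, 2 of them draining: the policy would see 3.
    nodes := []Node{
        {ID: "i-1"}, {ID: "i-2"}, {ID: "i-3"},
        {ID: "i-4", Draining: true}, {ID: "i-5", Draining: true},
    }
    fmt.Println("countable nodes:", readyCount(nodes))
}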
Hi @tgross, we've run into this issue/constraint as well. After moving to AWS spot instances, which can receive interruption notices at any moment (and nearly continuously if you've got a large enough mixed cluster), our autoscaler would stop scaling for up to half an hour at a time (because some node in the cluster was always draining, initializing, or otherwise not ready), and we'd totally blow our SLA. For us, the bigger sin than not scaling exactly is not scaling quickly. We don't mind underestimating capacity, so we've customized the aws_asg and nomad_apm plugins to ignore nodes that aren't ready. It might be nice to be able to provide a configuration option for this behavior.
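Purely as an illustration of one way such a customization could look (not the actual change; all names below are made up), a per-node metric could be averaged over ready nodes only, so a draining or initializing node never drags the value down and suppresses a scale-out:

package main

import "fmt"

// nodeMetric pairs a node's readiness with some per-node utilization value.
type nodeMetric struct {
    ready bool
    value float64 // e.g. running allocations on the node
}

// averageOverReady averages the metric across ready nodes only. With fewer
// nodes in the denominator the result is higher, which biases the policy
// toward scaling out, i.e. it deliberately underestimates spare capacity.
func averageOverReady(metrics []nodeMetric) float64 {
    var sum float64
    var n int
    for _, m := range metrics {
        if !m.ready {
            continue
        }
        sum += m.value
        n++
    }
    if n == 0 {
        return 0
    }
    return sum / float64(n)
}

func main() {
    metrics := []nodeMetric{
        {ready: true, value: 4}, {ready: true, value: 4},
        {ready: true, value: 3}, {ready: false, value: 1},
    }
    fmt.Printf("average over ready nodes: %.2f\n", averageOverReady(metrics))
}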
Thank you for the extra input @douglaje. I've experimented with bypassing these checks, but I'm still unsure about their impact. The biggest blocker here is that a policy is not allowed to be evaluated in parallel, meaning that only a single scaling action is allowed to happen at a time. But if you have multiple policies targeting the same set of nodes, or if a scaling action takes so long that the evaluation times out, then this constraint can be bypassed as well. I've opened #811 to start some discussion around this. As I mentioned, I'm still unsure about it, so I'm at least marking these new configuration options as experimental, and we will probably not document them for now. If you would be willing to try them, we could perhaps consider merging it.

For reference, this is the policy file I used for testing. I split scaling up and scaling down into two different policies so the actions could, in theory, happen at the same time. Another important detail about the AWS ASG target plugin is that ASG events also affect its cooldown, so you need different values there as well.

scaling "cluster_up" {
  enabled = true
  min     = 1
  max     = 4

  policy {
    cooldown            = "3s"
    evaluation_interval = "10s"

    check "up" {
      source = "prometheus"
      query  = "sum(nomad_client_allocations_running)/count(nomad_client_allocations_running)"

      strategy "threshold" {
        lower_bound = 3.9
        delta       = 1
      }
    }

    target "aws-asg" {
      dry-run             = "false"
      aws_asg_name        = "hashistack-nomad_client"
      node_class          = "hashistack"
      node_drain_deadline = "10m"

      # EXPERIMENTAL.
      node_filter_ignore_drain = true
      ignore_asg_events        = true
    }
  }
}
scaling "cluster_down" {
enabled = true
min = 1
max = 4
policy {
cooldown = "10s"
evaluation_interval = "10s"
check "down" {
source = "prometheus"
query = "sum(nomad_client_allocations_running)/count(nomad_client_allocations_running)"
strategy "threshold" {
upper_bound = 3.1
delta = -1
}
}
target "aws-asg" {
dry-run = "false"
aws_asg_name = "hashistack-nomad_client"
node_class = "hashistack"
node_drain_deadline = "10m"
# EXPERIMENTAL.
node_filter_ignore_drain = true
ignore_asg_events = true
}
}
} |
Hi! 👋
We recently started to use the Nomad Autoscaler agent and we really like it. 🚀
We are using the Autoscaler with the Nomad APM, aws-asg target, and target-value strategy plugins.
We have multiple long-running (1-45 minute) batch jobs on our nodes, and when a scale-in action happens the drain event won't finish until the last batch job completes on the node.
This leads to constant warning messages, because the Autoscaler implicitly checks the ASG target's status on each tick (handleTick -> generateEvaluation -> Status -> IsPoolReady -> FilterNodes -> if node.Drain).
Based on the comment here, and on what we are experiencing, the Autoscaler stops any further scaling actions until all draining activity has completed.
This is an issue for us, because in the worst-case scenario the long-running batch jobs will prevent us from scaling for 45 minutes.
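The behavior we see matches a check along these lines (a rough sketch, not the actual autoscaler code; the node type here is a simplified stand-in): a single draining node is enough to mark the whole pool as not ready, which pauses every further evaluation.

package main

import (
    "errors"
    "fmt"
)

// node mimics only the fields the readiness check cares about; the real
// autoscaler structures are richer.
type node struct {
    ID    string
    Drain bool
}

// isPoolReady is a simplified version of the check: one draining node is
// enough to report the whole pool as not ready.
func isPoolReady(nodes []node) (bool, error) {
    for _, n := range nodes {
        if n.Drain {
            return false, errors.New("node " + n.ID + " is draining")
        }
    }
    return true, nil
}

func main() {
    pool := []node{{ID: "a"}, {ID: "b", Drain: true}}
    ready, err := isPoolReady(pool)
    fmt.Println("ready:", ready, "err:", err)
}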
Would it be possible to add a config option for the idFn function to filter out draining nodes and keep scaling?
We would also like to better understand the risks of scaling a cluster that has draining nodes, and why such a cluster is considered unstable.