We have some services that consume a limited external resource. There is a metric on how much of that resource is remaining. I would like to be able to write a scaling policy that both scales on service CPU and prevents scaling up beyond the remaining capacity. However, my understanding is that the scale-up decision is an "or" of all checks, using the maximum value any check suggests.
I think the only way to accomplish this currently is to set the max count. That is fragile, since any change to the external resource's capacity also has to be reflected in the service's max count, and it gets more complicated when multiple services all consume the same resource.
Is there some way to accomplish this that I'm missing?
jaltavilla changed the title from "Feature Request: Scale up using an and of checks" to "Question: Scale up only if two checks are true" on Dec 1, 2023.
Since you have a metric for your resource, you may be able to adjust the query instead. For example, with Prometheus you could use the clamp_max function to limit the value of another query result:
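Something along these lines, as a rough sketch (service_cpu_scaling_value and external_resource_remaining are placeholder metric names, assuming the two are expressed in comparable units and the remaining-capacity query returns a single series):

```
# Scale on the CPU-driven value, but never let the check suggest more
# than the external resource has remaining.
clamp_max(
  service_cpu_scaling_value,
  scalar(external_resource_remaining)
)
```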
We are using the target strategy, and sadly I couldn't quite figure out how to make it work there. I suspect it would be easier with the threshold strategy. My understanding of the target strategy is that current_resource_metric_value can't return an arbitrary limit: it needs to stay in the range [0, target] while the resource is exhausted, so that it never suggests scaling up but still scales down only when actual_query_you_want_to_scale indicates it could. Similarly, it has to stay in the range [0, some number larger than target] while the resource isn't exhausted.
We already clamp_max our query to 200 to limit scaling up on spikes. So we need a query that stays within [0, target] or [0, 200] depending on whether the resource is exhausted. That reduces to needing an expression that returns 0 or 1: query = clamp_max(actual_query_you_want_to_scale, target * one_if_exhausted + 200 * one_if_not_exhausted).
But I couldn't figure out how to build an expression that returned 0 or 1 with the Datadog functions available, so I ended up auditing our services and fixing their max counts.
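For reference, if this metric lived in Prometheus rather than Datadog, a bool comparison would yield exactly 0 or 1, and the whole thing might be sketched roughly like this (assuming a target of 100, the 200 cap mentioned above, and a placeholder metric name external_resource_remaining):

```
# (external_resource_remaining > bool 0) is 1 while capacity remains, 0 once exhausted.
# The clamp ceiling is then 200 while capacity remains and 100 (the target) once exhausted,
# so the check can never suggest scaling up past the point of exhaustion.
clamp_max(
  actual_query_you_want_to_scale,
  100 + 100 * scalar(external_resource_remaining > bool 0)
)
```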