
Autoscaling #749

Open
ryanemerson opened this issue Jan 18, 2021 · 25 comments
Labels
discussion (Product architecture and enhancements discussion), enhancement (New feature or request)

Comments

@ryanemerson
Contributor

#691 deprecates the Cache Service in favour of providing the DataGrid service as the default and removing this configuration option.

Currently only the cache-service provides memory-based autoscaling; however, it relies on assumptions about the cache's storage and replication type to determine when pods should be scaled. This approach is not possible with the DataGrid service, as users are able to use arbitrary cache configurations. Instead, we should introduce "container level" autoscaling, where the number of pods increases when the memory usage of the entire container exceeds a configured upper bound (percentage) and decreases when it falls below a configured lower bound.

@ryanemerson added the enhancement and discussion labels on Jan 18, 2021
@dmvolod
Member

dmvolod commented Jan 18, 2021

Can we use HPA for the DataGrid case?
https://kubernetes.io/docs/tasks/run-application/horizontal-pod-autoscale/

@rigazilla
Collaborator

Last time I tried to use the default HPA memory metrics I didn't get good results, mainly because of Java memory management.
Maybe we can tune the GC or investigate whether HPA can be integrated with custom memory metrics.

@dmvolod
Member

dmvolod commented Jan 25, 2021

Yeah, HPA should support custom metrics, but this needs to be validated:
https://github.com/kubernetes/community/blob/master/contributors/design-proposals/instrumentation/custom-metrics-api.md

@rigazilla
Collaborator

A good starting point for designing the HPA/operator integration could be CPU autoscaling (#274): the standard CPU load metric should work quite well for controlling pod scaling.
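For illustration, a minimal sketch of the CPU metric stanza such an HPA could use (the 80% threshold is an arbitrary example, not a recommendation):

```yaml
# Sketch: metrics stanza of an autoscaling/v2beta2 HorizontalPodAutoscaler
# driven by the standard CPU resource metric.
metrics:
- type: Resource
  resource:
    name: cpu
    target:
      type: Utilization
      averageUtilization: 80   # example threshold
```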

@ryanemerson
Contributor Author

ryanemerson commented Nov 11, 2021

There's also Kubernetes Event-driven Autoscaling (KEDA), which has integrations with PostgreSQL and Redis.

@ryanemerson changed the title from Container Autoscaling to Autoscaling on Feb 24, 2022
@ryanemerson
Contributor Author

ryanemerson commented Feb 24, 2022

A problem with a generic container-wide approach is that it does not take into account the requirements of different cache types. Replicated and Distributed caches have very different autoscaling requirements, so any autoscaling must be configured based upon the use-cases of the Infinispan cluster.

Cache Scaling Semantics

Here we define how different cache types affect scaling.

Replicated Cache

| | Vertical Scaling | Horizontal Scaling |
|---|---|---|
| CPU | Allows increased read and write performance | Allows increased read performance, but results in slower writes, as each additional pod needs to be included in every write operation |
| Memory | Increases capacity for all pods | Doesn't make sense, as all pods store all entries, so increasing the number of pods does not increase the total memory available |

Distributed Cache

| | Vertical Scaling | Horizontal Scaling |
|---|---|---|
| CPU | Allows increased read and write performance | Does not improve CPU performance, as entries are always read from the entry's primary or backup owners |
| Memory | Increases the memory capacity of the cluster | Increases the memory capacity of the cluster |

Proposal

Implement automatic Horizontal Scaling and require users/admins to manually perform Vertical scaling by updating the Infinispan spec.container fields.

Automatically scaling an existing cluster vertically is tricky, as it can lead to the cluster becoming unavailable due to a lack of resources. Furthermore, K8s does not provide a mechanism for vertical scaling out of the box.

Correct autoscaling behaviour is tightly coupled to an application's Infinispan requirements and cannot be implemented in a way that is applicable to all users. This proposal is concerned with how we can expose autoscaling configuration to the user so that they can define behaviour suitable for their use-case. A big part of this effort will be creating documentation that details what type of scaling is appropriate for different workloads.

Implementation

Based upon the HorizontalPodAutoscaler.

We extend the Infinispan CRD to define the scale subresource.
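A hedged sketch of what declaring the scale subresource on the Infinispan CRD could look like (the JSON paths, in particular the label selector path, are assumptions):

```yaml
# CustomResourceDefinition excerpt (apiextensions.k8s.io/v1); the paths below are assumed.
spec:
  versions:
    - name: v1
      served: true
      storage: true
      subresources:
        status: {}
        scale:
          specReplicasPath: .spec.replicas
          statusReplicasPath: .status.replicas
          labelSelectorPath: .status.selector   # hypothetical; used by the HPA to locate the pods
```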

The HorizontalPodAutoscaler controller will then increase/decrease the Infinispan CR spec.replicas field based upon the behaviour defined in the HorizontalPodAutoscaler CR.

Utilising the autoscaling/v2beta2 API allows fine-grained control of the scale up/down behaviour. For example, a stabilizationWindowSeconds can be used to prevent excessive scaling, where repeated rebalancing would adversely affect performance.

Below is an example HorizontalPodAutoscaler definition with a custom scaleUp definition.

apiVersion: autoscaling/v2beta2
kind: HorizontalPodAutoscaler
metadata:
  name: hpa-infinispan
  namespace: default
spec:
  scaleTargetRef:
    apiVersion: infinispan.org/v1 # assumed Infinispan CRD group/version
    kind: Infinispan 
    name: example-infinispan
  minReplicas: 1 
  maxReplicas: 10 
  metrics: 
  - type: Resource
    resource:
      name: memory 
      target:
        type: AverageValue 
        averageValue: 500Mi
  behavior: 
    scaleUp:
      stabilizationWindowSeconds: 180
      policies:
      - type: Pods
        value: 1
        periodSeconds: 120
      selectPolicy: Max

User Configuration

The user can define scaling in one of three ways:

  1. Infinispan CR spec
    • User defines the high-level scaling behaviour required
    • Operator automatically creates the HorizontalPodAutoscaler
    • Requires a new Infinispan CR version
spec:
  autoscale:
    minReplicas: 1
    maxReplicas: 10
    resource:
      - name: cpu
        type: AverageValue | Utilization | Value
        # One of the below fields must be defined depending on the configured type
        averageValue: 500m
        value: 500m
        averageUtilisation: 50%
      - name: memory
        type: AverageValue | Utilization | Value
        # One of the below fields must be defined depending on the configured type
        averageValue: 500Mi
        value: 500Mi
        averageUtilisation: 50%
  2. kubectl

    • `kubectl autoscale infinispan example-infinispan --cpu-percent=50 --min=1 --max=10`
  3. Manually create HorizontalPodAutoscaler

    • Allows for more advanced configurations where the operator defaults are not appropriate

@rigazilla
Collaborator

@ryanemerson, overall I like the approach.
I still have some concerns about the metrics, though:
while I consider the default CPU metric good enough,
for memory I would suggest, as a first step, verifying whether my previous comment is still true. I mean, without a good metric it's hard to control a system.
In that case we could try to tune the GC, as described here, or a more complex solution could be to instrument Infinispan with an ad-hoc metric.

@ryanemerson
Contributor Author

ryanemerson commented Feb 28, 2022

for memory I would suggest, as a first step, verifying whether my previous comment is still true. I mean, without a good metric it's hard to control a system.

Can you elaborate on the issues you encountered?

I'm guessing it was the JVM not releasing committed memory once it's unused?

In that case we could try to tune the GC, as described here, or a more complex solution could be to instrument Infinispan with an ad-hoc metric.

I think this is an area where we would benefit from using Shenandoah:

https://stackoverflow.com/questions/61506136/kubernetes-pod-memory-java-gc-logs/61512521#61512521
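As a hedged sketch (the spec.container.extraJvmOpts field name is an assumption, the flag values are examples, and Shenandoah availability depends on the JDK build), the GC could be nudged to uncommit unused heap via the Infinispan CR:

```yaml
spec:
  container:
    memory: 1Gi
    # Assumed field name; the flags are standard OpenJDK options aimed at
    # returning unused heap to the OS more promptly.
    extraJvmOpts: "-XX:+UseShenandoahGC -XX:ShenandoahGCHeuristics=compact -XX:ShenandoahUncommitDelay=10000"
```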

@rigazilla
Collaborator

rigazilla commented Feb 28, 2022

Can you elaborate on the issues you encountered?

I'm guessing it was the JVM not releasing committed memory once it's unused?

  1. Yep that is probably the main one
  2. IIRC, another problematic aspect is how the pods are scaled up: Kubernetes doesn't start new pods one by one; instead it applies a multiplier factor (https://kubernetes.io/docs/tasks/run-application/horizontal-pod-autoscale/#algorithm-details). This doesn't fit very well with Java applications, which have a considerable initial memory footprint. (BTW, I consider this "multiplicative algorithm" very aggressive; maybe I'm missing something.)
  3. (minor) There's a "minimum number of nodes" in the ispn metrics below which the cluster starts to lose data. I'm not sure this can be handled via the standard autoscaler.

@ryanemerson
Contributor Author

2. IIRC, another problematic aspect is how the pods are scaled up: Kubernetes doesn't start new pods one by one; instead it applies a multiplier factor (https://kubernetes.io/docs/tasks/run-application/horizontal-pod-autoscale/#algorithm-details). This doesn't fit very well with Java applications, which have a considerable initial memory footprint. (BTW, I consider this "multiplicative algorithm" very aggressive; maybe I'm missing something.)

We can control this with the autoscaling/v2beta2 API, as it lets us control the scale up/down behaviour.

3. (minor) There's a "minimum number of nodes" in the ispn metrics below which the cluster starts to lose data. I'm not sure this can be handled via the standard autoscaler.

We could make the scale up/down behaviour be dictated by Infinispan itself using a custom metric that indicates when more/less memory is required, with the metric taking into account a lower bound to ensure that the cluster maintains at least the minimum number of pods.

Exposing a custom metric is more involved than using basic memory usage and would require an enhancement on the server side. We could start with a basic memory-based approach and then enhance the autoscale feature in the future as required.

Here is a quick guide on how to use custom metrics with HPA.
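As a rough illustration of that end state, an HPA metric entry driven by a hypothetical per-pod Infinispan metric could look like the following (the metric name is invented and would need a custom metrics adapter, e.g. prometheus-adapter, to serve it):

```yaml
metrics:
- type: Pods
  pods:
    metric:
      name: infinispan_memory_pressure   # hypothetical server-side metric exposed per pod
    target:
      type: AverageValue
      averageValue: "0.8"   # scale up when the per-pod average exceeds this value
```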

@rigazilla
Collaborator

We could start with a basic memory-based approach and then enhance the autoscale feature in the future as required.

Sounds good, though I would suggest verifying early how far we can go with basic metrics; IMO the choice between a basic and an ad-hoc metric could have a broad impact (possibly even on feature design?).

@rigazilla
Collaborator

2. kubectl
   
   * `kubectl autoscale infinispan example-infinispan --cpu-percent=50 --min=1 --max=10`

3. Manually create `HorizontalPodAutoscaler`
   
   * Allows for more advanced configurations where the operator defaults are not appropriate

Just realized that these two options could require some attention; I mean, both the operator and the scaler would act on the statefulSet.replicas field.

@ryanemerson
Contributor Author

Just realized that these two options could require some attention; I mean, both the operator and the scaler would act on the statefulSet.replicas field.

My understanding is that implementing the scale subresource is all that's required so that HPA modifies the Infinispan spec.replicas field.

@ZeidH
Contributor

ZeidH commented Jul 24, 2024

Hi,
We had a similar issue with the ActiveMQ Artemis Operator.
It didn't implement the scale subresource that you mention is required. Here's a reference to their enhancement:
[#124] support scale subresource for scale to zero

Our use case is that we want to use KEDA Operator's ScaledObjects to control the replicas.

Is there any way we can do the same with the Infinispan Operator? Like @rigazilla mentioned, the operator currently conflicts with the scaler if we attach the ScaledObject to the StatefulSet.
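For context, a hedged sketch of the kind of ScaledObject we would like to point at the Infinispan CR once the scale subresource is exposed (the trigger type, Prometheus address, query and threshold are placeholders):

```yaml
apiVersion: keda.sh/v1alpha1
kind: ScaledObject
metadata:
  name: example-infinispan-scaler
spec:
  scaleTargetRef:
    apiVersion: infinispan.org/v1        # assumed Infinispan CRD group/version
    kind: Infinispan
    name: example-infinispan
  minReplicaCount: 3
  maxReplicaCount: 10
  triggers:
    - type: prometheus
      metadata:
        serverAddress: http://prometheus.monitoring.svc:9090                           # placeholder
        query: sum(container_memory_working_set_bytes{pod=~"example-infinispan-.*"})   # placeholder
        threshold: "3221225472"            # ~3Gi per replica, placeholder
```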

@ryanemerson
Contributor Author

Is there any way we can do the same with the Infinispan Operator?

Not at present. The Infinispan CRD needs enhancing so that we expose the scale subresource for this to be possible.

@ZeidH
Contributor

ZeidH commented Jul 29, 2024

I've added the fields to the Infinispan CRD and it seems to work with KEDA. I'll continue testing it for our use case this week to see if I encounter any bugs.

Are there any use cases that I should be aware of with this change, @ryanemerson?
#2133

@rigazilla
Collaborator

Hi @ZeidH
sometimes the operator needs to explicitly set .spec.Replicas, e.g. during an upgrade. Does this create contention with KEDA management?

@ZeidH
Contributor

ZeidH commented Jul 31, 2024

Hi @ZeidH sometimes the operator needs to explicitly set .spec.Replicas, e.g. during an upgrade. Does this create contention with KEDA management?

Good observation, I think it will. There are ways to pause KEDA ScaledObjects by adding the autoscaling.keda.sh/paused: "true" annotation to the ScaledObject.

So we could extend upgrades.go to add that annotation to all ScaledObjects that have a scaleTargetRef pointing at an Infinispan CR. But I'm not sure if we'd want to create that dependency in this project.
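For illustration, the annotation mentioned above sits directly on the ScaledObject metadata:

```yaml
metadata:
  annotations:
    autoscaling.keda.sh/paused: "true"
```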

@rigazilla
Collaborator

But I'm not sure if we'd want to create that dependency in this project.

Yeah, maybe we could remove the scale subresource during upgrade.

@ryanemerson
Contributor Author

ryanemerson commented Aug 1, 2024

Infinispan server upgrades are no longer automatic. Once the initial Infinispan CR has been created, the user must manually update the spec.version field to trigger an upgrade. So I think we can work around this issue by documenting that, if KEDA autoscaling is used, the autoscaling.keda.sh/paused: "true" annotation must be applied to the ScaledObject before an upgrade is triggered.

@rigazilla
Collaborator

If we want a KEDA-specific solution, then that's fine.

@ZeidH
Contributor

ZeidH commented Aug 9, 2024

For us, documenting it is sufficient. I've updated the PR and included how I tested it; just waiting on this: #2134. Let me know if there's anything else I can do.

@ryanemerson
Contributor Author

Thanks @ZeidH, apologies for the delay. We've had to focus our efforts elsewhere recently, but hopefully I'll get a chance to look into the CI failures on #2134 shortly.

@ZeidH
Contributor

ZeidH commented Aug 12, 2024

No worries @ryanemerson, I'm in no rush on this, so take your time!
