annapendleton changed the title from "ai-on-gke benchmark locust" to "ai-on-gke benchmark locust load inferencer hits 90%+ cpu usage with master at 200+ users" on Aug 6, 2024
At 200+ users, the master hits a 90%+ CPU usage error and becomes unresponsive.
Removing the custom metrics calls in these lines fixes the issue:
https://github.com/GoogleCloudPlatform/ai-on-gke/blob/main/benchmarks/benchmark/tools/locust-load-inference/locust-docker/locust-tasks/tasks.py#L174
ai-on-gke/benchmarks/benchmark/tools/locust-load-inference/locust-docker/locust-tasks/tasks.py, line 179 in 54531da
Observation:
The custom metrics calls add load on the master that does not scale. The master is single-threaded, and these calls appear to put too much CPU pressure on it to scale up the number of users for larger load tests.
Feature request:
Adjust the custom metrics implementation so that it doesn't add excessive load on the master at 200+ users.
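
One possible direction, sketched under the assumption that the load comes from each worker notifying the master once per request: aggregate samples on the worker and forward a single summary per batch. `MetricBatcher`, its `send` callback, and the metric names are hypothetical and are not taken from `tasks.py`.

```python
import threading
from typing import Callable, Dict


class MetricBatcher:
    """Hypothetical worker-side helper: sums samples locally, sends one summary per batch."""

    def __init__(self, send: Callable[[Dict[str, float]], None], max_samples: int = 100):
        self._send = send                # stand-in for whatever forwards data to the master
        self._max_samples = max_samples  # flush after this many recorded samples
        self._lock = threading.Lock()
        self._count = 0
        self._sums: Dict[str, float] = {}

    def record(self, **values: float) -> None:
        """Add one request's metric values to the local running totals."""
        with self._lock:
            self._count += 1
            for name, value in values.items():
                self._sums[name] = self._sums.get(name, 0.0) + value
            should_flush = self._count >= self._max_samples
        if should_flush:
            self.flush()

    def flush(self) -> None:
        """Send the accumulated totals as one message and reset the batch."""
        with self._lock:
            if self._count == 0:
                return
            summary = dict(self._sums, samples=float(self._count))
            self._count = 0
            self._sums = {}
        self._send(summary)  # called outside the lock so a slow send does not block record()
```

With `max_samples=100`, the master would handle roughly one message per 100 requests from each worker instead of one per request. In a Locust worker, `send` could wrap a call to `environment.runner.send_message`, with the message type and the master-side listener left to the implementation; whether this is enough to relieve the master at 200+ users would need to be measured.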