Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

change celery pool support from prefork to threads #1437

Open
wants to merge 6 commits into
base: main
Choose a base branch
from
Open
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
3 changes: 2 additions & 1 deletion Makefile
Original file line number Diff line number Diff line change
Expand Up @@ -50,7 +50,8 @@ run-celery: ## Run celery, TODO remove purge for staging/prod
-A run_celery.notify_celery worker \
--pidfile="/tmp/celery.pid" \
--loglevel=INFO \
--concurrency=4
--pool=threads
--concurrency=10


.PHONY: dead-code
Expand Down
23 changes: 23 additions & 0 deletions docs/adrs/0010-adr-celery-pool-support-best-practice.md
Original file line number Diff line number Diff line change
@@ -0,0 +1,23 @@
# Make best use of celery worker pools

Status: N/A
Date: N/A

### Context
Our API application started with initial celery pool support of 'prefork' (the default) and concurrency of 4. We continuously encountered instability, which we initially attributed to a resource leak. As a result of this we added the configuration `worker-max-tasks-per-child=500` which is a best practice. When we ran a load test of 25000 simulated messages, however, we continued to see stability issues, amounting to a crash of the app after 4 hours requiring a restage. Based on running `cf app notify-api-production` and observing that `cpu entitlement` was off the charts at 10000% to 12000% for the works, and after doing some further reading, we came to the conclusion that perhaps `prefork` pool support is not the best type of pool support for the API application.

The problem with `prefork` is that each process has a tendency to hang onto the CPU allocated to it, even if it is not being used. Our application is not computationally intensive and largely consists of downloading strings from S3, parsing the strings, and sending them out as SMS messages. Based on the determination that our app is likely I/O bound, we elected to do an experiment where we changed pool support to `threads` and increased concurrency to `10`. The expectation is that memory usage will decrease and CPU usage will decrease and the app will not become unavailable.

### Decision

### Consequences

### Author
@kenkehl

### Stakeholders
@ccostino
@stvnrlly

### Next Steps
- Run an after-hours load test with production configured to --pool=threads and --concurrency=10 (concurrency can be cautiously increased once we know it works)
2 changes: 1 addition & 1 deletion manifest.yml
Original file line number Diff line number Diff line change
Expand Up @@ -26,7 +26,7 @@ applications:
- type: worker
instances: ((worker_instances))
memory: ((worker_memory))
command: newrelic-admin run-program celery -A run_celery.notify_celery worker --loglevel=INFO --concurrency=4
command: newrelic-admin run-program celery -A run_celery.notify_celery worker --loglevel=INFO --pool=threads --concurrency=10
- type: scheduler
instances: 1
memory: ((scheduler_memory))
Expand Down
Loading