Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[Workaround] Add max requests to gunicorn #443

Draft
wants to merge 1 commit into
base: master
Choose a base branch
from

Conversation

nuwang
Copy link
Member

@nuwang nuwang commented Sep 13, 2023

From @mvdbeek on: https://matrix.to/#/!rfLDbcWEWZapZrujix:gitter.im/$f-E-Kl-Cumt4tPAajzrka83yHLSrmVM7IbDkTQjwDQA?via=gitter.im&via=matrix.org

we're using https://github.com/galaxyproject/usegalaxy-playbook/blob/5be97742d688a5b65762bd13064a9484f77f6671/env/main/group_vars/galaxyjobservers.yml#L44C11-L44C11 for that reason. I spent a good amount of time looking at this and could confirm a memory leak in the asyncio stack when there are (temporarily) a lot of concurrent connections. The memory allocated for these connections is never released and there's not much we can do, except for switching to hypercorn and trio instead of gunicorn + uvicorn + asyncio
that said 23.1 reduced steady-state memory consumption by about 2 fold. it doesn't change anything about the growing memory consumption with concurrent requests, but I thought I'd mention this
image
the big drop in memory consumption is when we added the relevant patches to 23.1 and added --max-requests
the individual web handler processes now rarely ever go beyond 1.2GB RSS

@nuwang nuwang requested a review from mvdbeek September 13, 2023 19:47
@mvdbeek
Copy link
Member

mvdbeek commented Sep 14, 2023

You may want to check how far apart you need to place the restarts and I'd assume this works better with more than 1 web worker, since there is a period where no worker is available while a new worker is being forked.

I'd also only do this if you're actually seeing a leaky pattern, I would assume you'll only see this at the usegalaxy.* scale.

@nuwang
Copy link
Member Author

nuwang commented Sep 14, 2023

Thanks for the review.

You may want to check how far apart you need to place the restarts and I'd assume this works better with more than 1 web worker, since there is a period where no worker is available while a new worker is being forked.

It's just one worker by default to support smaller installations. We should probably introduce HorizontalPodAutoscaling at some future date.

I'd also only do this if you're actually seeing a leaky pattern, I would assume you'll only see this at the usegalaxy.* scale.

My assumption here was that these params would do no harm, and since I spotted your fix, that we might as well future proof things now. Sound ok?

@mvdbeek
Copy link
Member

mvdbeek commented Sep 14, 2023

If we remember to take this hack out in the future that's fine, sure.

@nuwang nuwang marked this pull request as draft September 14, 2023 12:55
@nuwang
Copy link
Member Author

nuwang commented Sep 14, 2023

In that case, I'll just leave this in draft status for future reference. If someone needs it, it can always be merged, and in the meantime, we can hope that things just get fixed upstream.

@nuwang nuwang added the on-hold Waiting for more information label Feb 14, 2024
@nuwang nuwang changed the title Add max requests to gunicorn [Workaround] Add max requests to gunicorn Feb 14, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
on-hold Waiting for more information
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants