Reduce CPU usage when idle #775
base: master
Conversation
Code Metrics Report

```
===============================================================================
 Language            Files        Lines         Code     Comments       Blanks
===============================================================================
 C Header                2           35           28            0            7
 Dockerfile              1           34           25            0            9
 Happy                   1          442          369            0           73
 JSON                   12          105          104            0            1
 Python                 48         2136         1816           64          256
 TOML                   20          602          541            2           59
 YAML                    2           21           19            2            0
-------------------------------------------------------------------------------
 Jupyter Notebooks       4            0            0            0            0
 |- Markdown             2           77           32           31           14
 |- Python               2          196          169            1           26
 (Total)                           273          201           32           40
-------------------------------------------------------------------------------
 Markdown               32         2323            0         1772          551
 |- BASH                 5          101           98            0            3
 |- JSON                 1           12           12            0            0
 |- Python               5           92           82            0           10
 |- Rust                 7          444          398           22           24
 |- TOML                 2           75           63            0           12
 (Total)                          3047          653         1794          600
-------------------------------------------------------------------------------
 Rust                  204        64134        58061         1283         4790
 |- Markdown           105          983           13          917           53
 (Total)                         65117        58074         2200         4843
===============================================================================
 Total                 327        69832        60963         3123         5746
===============================================================================
```
@EricLBuehler I switched to using `tokio::select!`.

```rust
self.scheduler.free_finished_sequence_groups();
if self.scheduler.waiting_len() == 0 {
    tokio::task::yield_now().await;
```
I like the idea of the `tokio::select!` addition! I'm just a bit confused about how this function would implement what we want (that is, to not sit in a loop). If I understand correctly, it yields to tokio's runtime, which means we wait for the other arm of the `tokio::select!` (the request receive arm) to match?

Perhaps you could add a comment here explaining what the logic/flow is?
You understand my proposed flow correctly. My understanding of why we no longer sit in the loop is mostly due to the `yield_now`; it basically marks the task as `Pending` for one iteration of Tokio's runtime. On the next iteration, `yield_now` will be marked `Ready` and the loop can continue. Yielding doesn't take much time, but it is enough to lower CPU usage dramatically.
@EricLBuehler Thoughts on this? Happy to close this and rethink if this change doesn't make sense to you.
@scottwey I think after this change, and if you could do some testing of it, this should be good to merge!
Finally, if you could drop some rough metrics on CPU usage before vs. after, that would be great too.
```rust
self.scheduler.free_finished_sequence_groups();
// if there are no more pending requests in the scheduler, yield the current task;
// this will mark the task as `Pending` for one runtime tick, and the loop will resume on the next tick
if self.scheduler.waiting_len() == 0 {
```
```diff
- if self.scheduler.waiting_len() == 0 {
+ if self.scheduler.waiting_len() == 0 && self.scheduler.running_len() == 0 {
```
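The suggested condition checks both queues. A toy sketch (a hypothetical `Scheduler`, not the project's actual type) of why both must be empty before yielding:

```rust
// Toy model illustrating the suggested idle check: the engine should only
// yield when there are neither waiting nor running sequence groups,
// otherwise in-flight generations would needlessly lose a scheduling tick.
struct Scheduler {
    waiting: Vec<u32>, // sequence groups queued but not yet scheduled
    running: Vec<u32>, // sequence groups currently generating tokens
}

impl Scheduler {
    fn waiting_len(&self) -> usize {
        self.waiting.len()
    }

    fn running_len(&self) -> usize {
        self.running.len()
    }

    // True only when the engine is fully idle and it is safe to yield.
    fn is_idle(&self) -> bool {
        self.waiting_len() == 0 && self.running_len() == 0
    }
}

fn main() {
    let busy = Scheduler { waiting: vec![], running: vec![1] };
    let idle = Scheduler { waiting: vec![], running: vec![] };
    println!("busy: {}, idle: {}", busy.is_idle(), idle.is_idle());
}
```

Checking only `waiting_len()` would treat an engine with active generations as idle and yield mid-generation.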
Currently, the tight loop in `Engine` causes very high single-core CPU usage when idle. This is also not great because it is long-running blocking code running inside an async task, which ties up an async worker entirely. On systems with a lower core count, this will probably impact performance fairly negatively.

From my testing, this change dramatically drops CPU usage with minimal impact on performance, although I have not had a chance to benchmark properly.