Reduce CPU usage when idle #775
base: master
Conversation
Code Metrics Report

```
===============================================================================
 Language            Files        Lines         Code     Comments       Blanks
===============================================================================
 C Header                2           35           28            0            7
 Dockerfile              1           34           25            0            9
 Happy                   1          442          369            0           73
 JSON                   12          105          104            0            1
 Python                 48         2136         1816           64          256
 TOML                   20          602          541            2           59
 YAML                    2           21           19            2            0
-------------------------------------------------------------------------------
 Jupyter Notebooks       4            0            0            0            0
 |- Markdown             2           77           32           31           14
 |- Python               2          196          169            1           26
 (Total)                           273          201           32           40
-------------------------------------------------------------------------------
 Markdown               32         2323            0         1772          551
 |- BASH                 5          101           98            0            3
 |- JSON                 1           12           12            0            0
 |- Python               5           92           82            0           10
 |- Rust                 7          444          398           22           24
 |- TOML                 2           75           63            0           12
 (Total)                          3047          653         1794          600
-------------------------------------------------------------------------------
 Rust                  204        64134        58061         1283         4790
 |- Markdown           105          983           13          917           53
 (Total)                         65117        58074         2200         4843
===============================================================================
 Total                 327        69832        60963         3123         5746
===============================================================================
```
@EricLBuehler I switched to using `tokio::select!`.

```rust
self.scheduler.free_finished_sequence_groups();
if self.scheduler.waiting_len() == 0 {
    tokio::task::yield_now().await;
```
I like the idea of the `tokio::select!` addition! I'm just a bit confused about how this function would implement what we want (that is, to not sit in a loop). If I understand correctly, it yields to tokio's runtime, which means we wait for the other arm of the `tokio::select!` (the request receive arm) to match?

Perhaps you could add a comment here explaining what the logic/flow is?
You understand my proposed flow correctly. My understanding of why we no longer sit in the loop is mostly due to the `yield_now`; it basically marks the task as `Pending` for one iteration of Tokio's runtime. On the next iteration, `yield_now` will be marked `Ready` and the loop can continue. Yielding doesn't take much time, but it is enough to lower CPU usage dramatically.
@EricLBuehler Thoughts on this? Happy to close this and rethink if this change doesn't make sense to you.
@scottwey I think after this change, and if you could do some testing of it, this should be good to merge!
Finally, if you could drop some rough metrics on CPU usage before vs. after, that would be great too.
```rust
self.scheduler.free_finished_sequence_groups();
// if there are no more pending requests in the scheduler, yield the current task;
// this will mark the task as `Pending` for one runtime tick, and the loop will resume on the next tick
if self.scheduler.waiting_len() == 0 {
```
```diff
- if self.scheduler.waiting_len() == 0 {
+ if self.scheduler.waiting_len() == 0 && self.scheduler.running_len() == 0 {
```
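The suggested condition checks both queues. A toy sketch (a hypothetical `Scheduler`, not the project's actual type) of why both must be empty before yielding:

```rust
// Toy model illustrating the suggested idle check: the engine should only
// yield when there are neither waiting nor running sequence groups,
// otherwise in-flight generations would needlessly lose a scheduling tick.
struct Scheduler {
    waiting: Vec<u32>, // sequence groups queued but not yet scheduled
    running: Vec<u32>, // sequence groups currently generating tokens
}

impl Scheduler {
    fn waiting_len(&self) -> usize {
        self.waiting.len()
    }

    fn running_len(&self) -> usize {
        self.running.len()
    }

    // True only when the engine is fully idle and it is safe to yield.
    fn is_idle(&self) -> bool {
        self.waiting_len() == 0 && self.running_len() == 0
    }
}

fn main() {
    let busy = Scheduler { waiting: vec![], running: vec![1] };
    let idle = Scheduler { waiting: vec![], running: vec![] };
    println!("busy: {}, idle: {}", busy.is_idle(), idle.is_idle());
}
```

Checking only `waiting_len()` would treat an engine with active generations as idle and yield mid-generation.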
Currently, the tight loop in `Engine` causes very high single-core CPU usage when idle. This is also not great because it is long-running blocking code running inside an async task, which ties up an async worker entirely. On systems with a lower core count, this will probably impact performance fairly negatively.

From my testing, this change dramatically drops CPU usage with minimal impact on performance, although I have not had a chance to benchmark properly.