scx_rustland: reduce scheduler overhead #56
Conversation
The commit messages are titled scx_userland. Can you please update them?
```diff
@@ -489,7 +505,7 @@ void BPF_STRUCT_OPS(rustland_update_idle, s32 cpu, bool idle)
 	 * Moreover, kick the CPU to make it immediately ready to accept
 	 * dispatched tasks.
 	 */
-	if (__sync_fetch_and_add(&nr_queued, 0)) {
+	if (nr_queued || nr_scheduled) {
```
It may be worth explaining how this is interlocked with the userland component, and why a situation where the userland component is not invoked even though there are tasks to process can't happen. nr_queued is basically used as a boolean here, indicating "something happened since userland last looked at it, so better call them at least once". And then, while the userland is running, it updates its internal state through nr_scheduled. As long as update_idle is called after each userland run, this should be safe, right?
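For reference, a minimal sketch of the interlock being discussed, written as a plain C model rather than the actual BPF program; `set_usersched_needed()` and `kick_cpu()` are hypothetical stand-ins for the real helpers, not the project's API:

```c
#include <stdbool.h>

/* Counters shared between the BPF part and the user-space scheduler
 * (globals in the real BPF program). */
volatile unsigned long nr_queued;    /* bumped in .enqueue(), cleared by user space     */
volatile unsigned long nr_scheduled; /* tasks user space still holds, set by user space */

/* Hypothetical stand-ins for the real BPF-side helpers. */
static void set_usersched_needed(void) { }
static void kick_cpu(int cpu) { (void)cpu; }

/*
 * Simplified model of rustland_update_idle(): activate the user-space
 * scheduler only if it may have pending work, i.e. a task was queued
 * since its last run (nr_queued) or it still holds tasks to dispatch
 * (nr_scheduled). As long as this hook runs after each user-space pass,
 * no wakeup can be lost.
 */
void update_idle(int cpu, bool idle)
{
	if (!idle)
		return;
	if (nr_queued || nr_scheduled) {
		set_usersched_needed();
		/* Kick the now-idle CPU so it is immediately ready to
		 * accept dispatched tasks. */
		kick_cpu(cpu);
	}
}
```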
@htejun comment updated, adding more details. Let me know if it seems clear enough. Thanks!
Edit: ...and changed scx_userland -> scx_rustland in the commit messages, thanks for noticing!
Force-pushed from 249b557 to e79b5fa
Considering the CPU where the user-space scheduler is running as busy doesn't provide any benefit, since the user-space scheduler constantly dispatches an amount of tasks equal to the number of idle CPUs and then yields (therefore its own CPU should be considered idle).

This also reduces the overall user-space scheduler CPU utilization, especially when the system is mostly idle, without introducing any measurable performance regression.

Measuring the average CPU utilization of a (mostly) idle system over a time period of 60 sec:
- without this patch: 5.41% avg cpu util
- with this patch: 2.26% avg cpu util

Signed-off-by: Andrea Righi <[email protected]>
We want to activate the user-space scheduler only when there are pending tasks that require scheduling actions. To do so we keep track of the queued tasks via nr_queued, which is incremented in .enqueue() when a task is sent to the user-space scheduler and decremented in .dispatch() when a task is dispatched.

However, we may introduce an imbalance if the same pid is sent to the scheduler multiple times (because the scheduler stores all tasks by their unique pid). When this happens, nr_queued is never decremented back to 0, leading the user-space scheduler to spin constantly, even if there is no activity to do.

To prevent this from happening, split nr_queued into nr_queued and nr_scheduled. The former is updated by the BPF component every time a task is sent to the scheduler, and it's up to the user-space scheduler to reset the counter once the queue is fully drained. The latter is maintained by the user-space scheduler and represents the number of tasks still being processed by the scheduler and waiting to be dispatched.

The sum nr_queued + nr_scheduled is called nr_waiting and we can rely on this metric to determine whether the user-space scheduler has pending work to do or not.

This change makes scx_rustland more reliable and strongly reduces the CPU usage of the user-space scheduler by eliminating a lot of unnecessary activations.

Signed-off-by: Andrea Righi <[email protected]>
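A rough model of the bookkeeping described in this commit message (the real user-space side is implemented in Rust; the helper names, the `pending_tasks` pool and the per-pass flow below are illustrative assumptions only):

```c
#include <stddef.h>

/* Counters shared with the BPF component (globals in the real BPF program). */
static volatile unsigned long nr_queued;    /* incremented by .enqueue()             */
static volatile unsigned long nr_scheduled; /* published by the user-space scheduler */

/* Illustrative internal state: tasks accepted but not yet dispatched.
 * The real scheduler keeps them indexed by pid. */
static size_t pending_tasks;

/* Hypothetical helpers standing in for the real queue/dispatch logic. */
static void drain_queued_tasks(void)
{
	/* Pull everything the BPF side queued into the internal task pool. */
	pending_tasks += nr_queued;
}

static void dispatch_to_idle_cpus(size_t idle_cpus)
{
	size_t n = pending_tasks < idle_cpus ? pending_tasks : idle_cpus;

	pending_tasks -= n;
}

/* One pass of the user-space scheduler. */
static void usersched_run_once(size_t idle_cpus)
{
	/* Consume what the BPF side queued so far, then clear nr_queued:
	 * the counter only means "tasks arrived since the last pass". */
	drain_queued_tasks();
	nr_queued = 0;

	/* Dispatch at most one task per idle CPU, then publish how many
	 * tasks are still waiting inside the scheduler. */
	dispatch_to_idle_cpus(idle_cpus);
	nr_scheduled = pending_tasks;

	/* The BPF side re-activates the scheduler only while
	 * nr_waiting = nr_queued + nr_scheduled is non-zero. */
}
```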
Force-pushed from e79b5fa to e90bc92
These changes make scx_rustland more reliable and significantly reduce the scheduler overhead.