Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

refactor(rust): Add tracking of async task wait time statistics #19373

Merged
merged 1 commit into from
Oct 22, 2024

Conversation

orlp
Copy link
Collaborator

@orlp orlp commented Oct 22, 2024

A profiler is useless in an async engine if something is a blocking bottleneck - all threads will just show spending a lot of time in condvar waiting inside the executor loop.

With this PR if you set POLARS_TRACK_WAIT_STATS=1 we track the time between right before we park a thread, and when that thread finds a new task, and adds the time spent blocked to that tasks metadata. After a new streaming query we print these stats, which can look like this:

Time spent waiting for async tasks:
crates/polars-stream/src/nodes/parquet_source/mod.rs:230 - 971.347764ms
crates/polars-stream/src/nodes/parquet_source/init.rs:179 - 493.140887ms
crates/polars-stream/src/nodes/in_memory_source.rs:75 - 16.806251ms
crates/polars-stream/src/nodes/in_memory_sink.rs:58 - 4.832292ms
crates/polars-stream/src/nodes/parquet_source/metadata_fetch.rs:123 - 0ns
crates/polars-stream/src/nodes/group_by.rs:47 - 0ns
crates/polars-stream/src/nodes/parquet_source/mod.rs:259 - 0ns

In this case you know that waiting for the parquet source was the bottleneck for the execution engine.

@github-actions github-actions bot added internal An internal refactor or improvement rust Related to Rust Polars labels Oct 22, 2024
Copy link

codecov bot commented Oct 22, 2024

Codecov Report

Attention: Patch coverage is 0% with 54 lines in your changes missing coverage. Please review.

Project coverage is 80.17%. Comparing base (eb596c9) to head (76fa967).
Report is 2 commits behind head on main.

Files with missing lines Patch % Lines
crates/polars-stream/src/async_executor/mod.rs 0.00% 36 Missing ⚠️
crates/polars-stream/src/execute.rs 0.00% 9 Missing ⚠️
crates/polars-stream/src/skeleton.rs 0.00% 9 Missing ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##             main   #19373      +/-   ##
==========================================
- Coverage   80.19%   80.17%   -0.03%     
==========================================
  Files        1523     1523              
  Lines      209853   209911      +58     
  Branches     2434     2434              
==========================================
+ Hits       168300   168303       +3     
- Misses      40998    41053      +55     
  Partials      555      555              

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

@ritchie46 ritchie46 merged commit c230071 into pola-rs:main Oct 22, 2024
20 of 21 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
internal An internal refactor or improvement rust Related to Rust Polars
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants