refactor(rust): Add tracking of async task wait time statistics #19373

orlp · 2024-10-22T07:56:23Z

A profiler is useless in an async engine if something is a blocking bottleneck - all threads will just show spending a lot of time in condvar waiting inside the executor loop.

With this PR if you set POLARS_TRACK_WAIT_STATS=1 we track the time between right before we park a thread, and when that thread finds a new task, and adds the time spent blocked to that tasks metadata. After a new streaming query we print these stats, which can look like this:

Time spent waiting for async tasks:
crates/polars-stream/src/nodes/parquet_source/mod.rs:230 - 971.347764ms
crates/polars-stream/src/nodes/parquet_source/init.rs:179 - 493.140887ms
crates/polars-stream/src/nodes/in_memory_source.rs:75 - 16.806251ms
crates/polars-stream/src/nodes/in_memory_sink.rs:58 - 4.832292ms
crates/polars-stream/src/nodes/parquet_source/metadata_fetch.rs:123 - 0ns
crates/polars-stream/src/nodes/group_by.rs:47 - 0ns
crates/polars-stream/src/nodes/parquet_source/mod.rs:259 - 0ns

In this case you know that waiting for the parquet source was the bottleneck for the execution engine.

codecov · 2024-10-22T08:22:27Z

Codecov Report

Attention: Patch coverage is 0% with 54 lines in your changes missing coverage. Please review.

Project coverage is 80.17%. Comparing base (eb596c9) to head (76fa967).
Report is 2 commits behind head on main.

Files with missing lines	Patch %	Lines
crates/polars-stream/src/async_executor/mod.rs	0.00%	36 Missing ⚠️
crates/polars-stream/src/execute.rs	0.00%	9 Missing ⚠️
crates/polars-stream/src/skeleton.rs	0.00%	9 Missing ⚠️

Additional details and impacted files

@@            Coverage Diff             @@
##             main   #19373      +/-   ##
==========================================
- Coverage   80.19%   80.17%   -0.03%     
==========================================
  Files        1523     1523              
  Lines      209853   209911      +58     
  Branches     2434     2434              
==========================================
+ Hits       168300   168303       +3     
- Misses      40998    41053      +55     
  Partials      555      555

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

refactor(rust): Add tracking of async task wait time statistics

76fa967

orlp requested review from ritchie46 and c-peters as code owners October 22, 2024 07:56

github-actions bot added internal An internal refactor or improvement rust Related to Rust Polars labels Oct 22, 2024

ritchie46 merged commit c230071 into pola-rs:main Oct 22, 2024
20 of 21 checks passed

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

refactor(rust): Add tracking of async task wait time statistics #19373

refactor(rust): Add tracking of async task wait time statistics #19373

orlp commented Oct 22, 2024

codecov bot commented Oct 22, 2024 •

edited

Loading

refactor(rust): Add tracking of async task wait time statistics #19373

refactor(rust): Add tracking of async task wait time statistics #19373

Conversation

orlp commented Oct 22, 2024

codecov bot commented Oct 22, 2024 • edited Loading

Codecov Report

codecov bot commented Oct 22, 2024 •

edited

Loading