Support serializing packed tables directly for the normal shuffle path #10

firestarman · 2024-05-28T02:27:51Z

This PR aims to accelerate the normal shuffle path by partitioning and slicing tables on the GPU.

The sliced table is already serializable, so it can be written directly to the shuffle output stream, along with lightweight metadata (a TableMeta) used to rebuild the table on the shuffle read side.
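To illustrate the idea of writing metadata plus an already-serializable buffer, here is a minimal host-only sketch in Python. It is not the wire format used by this PR: the length-prefixed layout, the JSON-encoded metadata standing in for a TableMeta, and the name `write_packed_table` are all assumptions made for illustration.

```python
import io
import json
import struct


def write_packed_table(out, meta: dict, data: bytes) -> int:
    """Write one frame as [meta_len][meta][data_len][data].

    Hypothetical layout: `meta` stands in for a TableMeta and `data`
    for the contiguous buffer of one sliced table.
    Returns the number of bytes written.
    """
    meta_bytes = json.dumps(meta, sort_keys=True).encode("utf-8")
    out.write(struct.pack(">I", len(meta_bytes)))  # 4-byte metadata length
    out.write(meta_bytes)
    out.write(struct.pack(">Q", len(data)))        # 8-byte payload length
    out.write(data)                                # payload written as-is
    return 4 + len(meta_bytes) + 8 + len(data)


# Usage: one small "table" written to an in-memory stream.
stream = io.BytesIO()
n = write_packed_table(stream, {"num_rows": 3, "num_columns": 1}, b"\x01\x02\x03")
```

The point of the sketch is that the payload is copied to the stream without any per-column or per-row encoding; only the small metadata header is produced at write time.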

On the shuffle read side, the newly introduced PackedTableIterator reads the tables from the shuffle input stream and rebuilds them on the GPU by leveraging the existing utilities (MetaUtils, GpuCompressedColumnVector). The existing GpuCoalesceBatches node then concatenates the batches for the downstream operators, similar to what Rapids Shuffle does.
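The read-side flow can be sketched in the same host-only style: an iterator pulls framed tables off the stream, and a second step groups small tables into larger batches. This is only an analogy for PackedTableIterator followed by GpuCoalesceBatches; the frame layout, the JSON metadata, and the names `iter_packed_tables`, `coalesce`, and `target_bytes` are assumptions, and nothing here touches the GPU.

```python
import io
import json
import struct


def _read_exact(stream, n):
    """Read exactly n bytes or fail; a short read means a truncated frame."""
    buf = stream.read(n)
    if len(buf) != n:
        raise EOFError("truncated frame")
    return buf


def iter_packed_tables(stream):
    """Yield (meta, data) pairs from frames of [meta_len][meta][data_len][data]."""
    while True:
        header = stream.read(4)
        if not header:  # clean end of stream
            return
        if len(header) != 4:
            raise EOFError("truncated frame")
        (meta_len,) = struct.unpack(">I", header)
        meta = json.loads(_read_exact(stream, meta_len))
        (data_len,) = struct.unpack(">Q", _read_exact(stream, 8))
        yield meta, _read_exact(stream, data_len)


def coalesce(tables, target_bytes):
    """Group consecutive tables until the batch reaches target_bytes."""
    batch, size = [], 0
    for meta, data in tables:
        batch.append((meta, data))
        size += len(data)
        if size >= target_bytes:
            yield batch
            batch, size = [], 0
    if batch:  # flush the remainder
        yield batch


def _frame(meta, data):
    """Build one frame by hand so the example is self-contained."""
    m = json.dumps(meta).encode("utf-8")
    return struct.pack(">I", len(m)) + m + struct.pack(">Q", len(data)) + data


# Usage: three tables of 2, 3, and 5 bytes, coalesced with a 5-byte target.
payload = b"".join(_frame({"num_rows": n}, bytes(n)) for n in (2, 3, 5))
batches = list(coalesce(iter_packed_tables(io.BytesIO(payload)), target_bytes=5))
```

With these inputs the first two small tables end up in one batch and the third forms its own, which mirrors why small shuffle blocks are concatenated before being handed to downstream operators.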

It led to some performance regression in NDS runs, so this feature is disabled by default. However, we got about a 2x speedup for a customer query, so we can still add this feature and enable it explicitly for suitable cases.

--------- Signed-off-by: Firestarman <[email protected]>

firestarman added 8 commits May 14, 2024 14:42

Support serializing packed tables directly for shuffle write

3a984f2

--------- Signed-off-by: Firestarman <[email protected]>

Disable GPU serde for the AQE tests

baadb4b

Signed-off-by: Firestarman <[email protected]>

Disable by default

11e933d

Signed-off-by: Firestarman <[email protected]>

Fix a build error

6e8bb5c

Signed-off-by: Firestarman <[email protected]>

Address comments

d6082ae

Signed-off-by: Firestarman <[email protected]>

Merge branch 'branch-24.06' of github.com:NVIDIA/spark-rapids into shuffle-gpu-serde

0419224

Signed-off-by: Firestarman <[email protected]>

Support buffering small tables for Shuffle read

99820e1

Signed-off-by: Firestarman <[email protected]>

Merge branch 'shuffle-gpu-serde' into 0527-base

c203bc3

nvliyuan merged commit f168a27 into nvliyuan:0527-base-local May 28, 2024
3 of 44 checks passed

firestarman deleted the gpu-serde branch June 4, 2024 03:10
