To support incremental scan ranges deployment. #50196

dirtysalt · 2024-08-23T08:09:37Z

Enhancement

In practice, we have found that the FE can take a long time to collect all the files that need to be scanned before it sends them down to the BE for execution. This is fine for native tables, but not ideal for Hive, Iceberg, Hudi, and Delta Lake tables.

Ideally, the FE should start execution on the BE after it has scanned only some of the files, and keep scanning the rest in the meantime. The FE and the BE then work in parallel, which shortens the overall execution time.

Arch diagram:

Two new session variables:

- enable_connector_incremental_scan_ranges=true (whether to enable incremental scan ranges deployment)
- connector_incremental_scan_ranges_size=50 (if enabled, how many scan ranges are delivered in each round)
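To make the round size concrete, here is a minimal sketch of splitting a list of scan ranges into rounds of at most 50. It is not StarRocks code: the helper `rounds` and the file names are hypothetical, and letting the last round be smaller than the configured size is an assumption.

```python
from itertools import islice
from typing import Iterable, Iterator, List


def rounds(scan_ranges: Iterable[str], size: int) -> Iterator[List[str]]:
    """Yield successive batches of at most `size` scan ranges (hypothetical helper)."""
    if size <= 0:
        raise ValueError("size must be positive")
    it = iter(scan_ranges)
    while True:
        batch = list(islice(it, size))
        if not batch:
            return
        yield batch


# Hypothetical input: 120 files, delivered 50 per round.
files = [f"part-{i:05d}.parquet" for i in range(120)]
batch_sizes = [len(b) for b in rounds(files, 50)]
print(batch_sizes)  # [50, 50, 20]
```

Because `rounds` consumes its input lazily, the first round is available as soon as 50 entries have been listed, without waiting for the full file list.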

To achieve this goal, the following needs to be done:

- The FE should support incremental acquisition of the files to scan.
- The FE should support incremental deployment of files to the BE.
  - [Enhancement] support incremental scan ranges deployment at FE side #50189
- The BE should support incremental execution of files.
  - [Enhancement] support incremental scan ranges deployment at BE side #50254
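The three pieces above can be sketched as a producer and a consumer that overlap in time. This is only an illustration of the idea, not the implementation in #50189 or #50254: `list_files`, `fe_deploy`, `be_execute`, and the in-process queue standing in for FE-to-BE delivery are all hypothetical.

```python
import queue
import threading
from typing import Iterator, List, Tuple

BATCH_SIZE = 50  # mirrors connector_incremental_scan_ranges_size
_DONE = None     # sentinel: the FE has no more scan ranges to send


def list_files(total: int) -> Iterator[str]:
    """Hypothetical stand-in for the FE lazily listing files from a catalog."""
    for i in range(total):
        yield f"file-{i}"


def fe_deploy(total: int, channel: "queue.Queue") -> None:
    """FE side: send a batch as soon as it is full, then keep listing."""
    batch: List[str] = []
    for f in list_files(total):
        batch.append(f)
        if len(batch) == BATCH_SIZE:
            channel.put(batch)
            batch = []
    if batch:
        channel.put(batch)  # final, possibly smaller, round
    channel.put(_DONE)


def be_execute(channel: "queue.Queue", scanned: List[str], rounds: List[int]) -> None:
    """BE side: process each batch when it arrives instead of waiting for all files."""
    while True:
        batch = channel.get()
        if batch is _DONE:
            return
        rounds.append(len(batch))
        scanned.extend(batch)


def run(total: int) -> Tuple[List[str], List[int]]:
    channel: "queue.Queue" = queue.Queue()
    scanned: List[str] = []
    rounds: List[int] = []
    be = threading.Thread(target=be_execute, args=(channel, scanned, rounds))
    be.start()                  # the BE is ready before listing finishes
    fe_deploy(total, channel)   # the FE lists and deploys incrementally
    be.join()
    return scanned, rounds


scanned, rounds = run(130)
print(len(scanned), rounds)  # 130 [50, 50, 30]
```

In this sketch the BE thread can begin work on the first 50 files while the FE is still listing the remaining 80, which is the overlap the issue is after.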

dirtysalt added the type/enhancement Make an enhancement to StarRocks label Aug 23, 2024

This was referenced Aug 23, 2024

[Enhancement] support incremental scan ranges deployment at FE side #50189

Merged

[Refactor] support incremental scan ranges deployment #50198

Closed

[Enhancement] support incremental scan ranges deployment at BE side #50254

Merged

dirtysalt closed this as completed in #50189 Sep 4, 2024
