Aggregator support prefetch #9679

Open

guo-shaoge wants to merge 41 commits into base: master from hashagg_prefetch

Conversation

guo-shaoge (Contributor) commented on Nov 28, 2024

What problem does this PR solve?

Issue Number: close #9680

Problem Summary:

What is changed and how it works?

  1. Support prefetch for HashTable. This PR mainly works for key32/key64/key128/key256 and does not affect key_string/key_serialized. A minimal sketch of the prefetch pattern follows.
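For readers unfamiliar with the technique, this is classic software prefetching: while handling row i, compute the hash of a row a fixed distance ahead and prefetch the bucket it maps to, so the likely cache miss overlaps with useful work. A minimal C++ sketch of the idea (PREFETCH_DISTANCE, table.hash/table.prefetch and the emplace step are illustrative assumptions, not the PR's exact code):

    #include <cstddef>
    #include <vector>

    // Minimal sketch of hash-table prefetching in an aggregation loop.
    // All names here are illustrative; the PR's real implementation differs in detail.
    static constexpr size_t PREFETCH_DISTANCE = 16;

    template <typename HashTable, typename Keys>
    void emplaceAllWithPrefetch(HashTable & table, const Keys & keys)
    {
        const size_t rows = keys.size();
        std::vector<size_t> hashvals(rows);
        for (size_t i = 0; i < rows; ++i)
            hashvals[i] = table.hash(keys[i]);

        for (size_t i = 0; i < rows; ++i)
        {
            // Prefetch the bucket of a row we will touch soon, so the memory
            // access overlaps with the work done on the current row.
            if (i + PREFETCH_DISTANCE < rows)
                table.prefetch(hashvals[i + PREFETCH_DISTANCE]);

            // Emplace keys[i] using the precomputed hashvals[i] and update
            // its aggregation state (omitted here).
        }
    }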

Benchmark

Workloads: TPCH-50G
Queries:

 # --q0 Q1-1: key_int64; distinct rate: 10M/300M; HashMap
  "select /*+ mpp_1phase_agg() */ sum(l_quantity) as csum, l_partkey from lineitem group by l_partkey having csum < 200;"
  # --q1 Q1-2: key_int64; 75M/300M; HashMap
  "select /*+ mpp_1phase_agg() */ sum(l_quantity) as csum, l_orderkey from lineitem group by l_orderkey having csum < 1;"
  # --q2 Q1-3: key_int64; 7/300M; HashMap
  "select /*+ mpp_1phase_agg() */ sum(l_quantity), l_linenumber from lineitem group by l_linenumber;"

  # --q3 Q2-1: one_key_strbinpadding_phmap; 2/300M; StringHashMap
  "select /*+ mpp_1phase_agg() */ sum(l_discount), l_linestatus from lineitem group by l_linestatus;"
  # --q4 Q2-2: one_key_strbinpadding; 104M/300M; StringHashMap
  "select /*+ mpp_1phase_agg() */ sum(l_discount) as csum from lineitem group by l_comment having csum > 800;"

  # --q5 Q3-1: key_serialized as group by method; 33/300M; HashMap with StringRef key
  "select /*+ mpp_1phase_agg() */ sum(l_discount), l_returnflag from lineitem group by l_returnflag, l_discount;"
  # --q6 Q3-2: key_serialized as group by method; 77M/300M; HashMap with StringRef key
  "select /*+ mpp_1phase_agg() */ sum(l_discount) as csum, l_returnflag from lineitem group by l_returnflag, l_discount, l_extendedprice having csum > 100;"


  # --q7 Q4-1: two_keys_num64_strbinpadding: 21/300M; HashMap with StringRef key
  "select /*+ mpp_1phase_agg() */ sum(l_discount) from lineitem group by l_returnflag, L_LINENUMBER;"
  # --q8 Q4-2: two_keys_num64_strbinpadding; 29.9M/300M; HashMap with StringRef key
  "select /*+ mpp_1phase_agg() */ sum(l_discount) as csum, l_partkey from lineitem group by l_returnflag, l_partkey having csum > 100;"

  # --q9 Q5-1: keys_128; 77/300M; HashMap with UInt128 key
  "select /*+ mpp_1phase_agg() */ sum(l_linenumber), l_discount from lineitem group by l_linenumber, l_discount;"
  # --q10 Q5-2: keys_128; 5M/300M; HashMap with UInt128 key
  "select /*+ mpp_1phase_agg() */ sum(l_linenumber) as csum, l_discount from lineitem group by l_suppkey, l_discount having csum > 800;"

  # --q11 Q6-1: key_string; 4/300M; StringHashMap
   "select /*+ mpp_1phase_agg() */ sum(l_quantity), l_shipinstruct from lineitem group by l_shipinstruct;"

  # --q12 Q7-1: keys_256; 290M/300M; HashMap;
  "select /*+ mpp_1phase_agg() */ sum(l_suppkey) as csum from lineitem group by l_suppkey, l_tax, l_discount, l_partkey having csum = 10;"

  # --q13 clickbench Q4
  "select count(distinct userid) from clickbench.hits;"

ENV:

  1. Internal benchbot env
  2. 1 TiDB; 2 TiKV; 1 PD; 2 TiFlash (16C64G, amd64)

Results (from a Google spreadsheet):

  idx  query          nightly-b16a5f93  this PR - opt1  this PR - opt2  rate-1   rate-2
  0    Q1-1           2.71              2.05            1.8             24.35%   33.58%
  1    Q1-2           2.15              2.07            1.62             3.72%   24.65%
  2    Q1-3           1.33              1.4             1.31            -5.26%    1.50%
  9    Q5-1           1.37              1.35            1.37             1.46%    0.00%
  10   Q5-2           4.08              3.86            3.59             5.39%   12.01%
  12   Q7-1           13.3              12.1            11.2             9.02%   15.79%
  13   clickbench Q4  0.436             0.383           0.391           12.16%   10.32%

NOTE:

  1. Query durations are in seconds.
  2. Queries like Q2-x/Q3-x are not listed because they use key_string/key_serialized, while this PR only changes key_int64/key_int128/key_int256..., so those queries should not be affected.
  3. opt-1 means only prefetching the HashTable.
  4. opt-2 (the final improvement of this PR) means all of the following; a rough sketch of how these pieces could fit together is shown after this list:
    1. prefetch the HashTable
    2. the prefetch threshold is 2MB of table size (measured in bytes) instead of 8192 buckets
    3. prefetch when writing results back into columns (insertResultInto)
    4. prefetch the aggregation data of keys that already exist in the HashTable
    5. mini-batch (step is 256 for now) when emplacing into the HashTable
  5. ClickBench Q4 is SELECT COUNT(DISTINCT UserID) FROM hits; from https://github.com/ClickHouse/ClickBench/blob/main/mysql/queries.sql
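As a rough illustration of how the opt-2 pieces could fit together (a hedged sketch only: AGG_PREFETCH_THRESHOLD, MINI_BATCH, getBufferSizeInBytes and the emplace step are assumed names, not necessarily the PR's identifiers):

    #include <algorithm>
    #include <cstddef>
    #include <vector>

    static constexpr size_t AGG_PREFETCH_THRESHOLD = 2 * 1024 * 1024; // ~2MB: table likely no longer cache-resident
    static constexpr size_t MINI_BATCH = 256;                         // emplace step size

    template <typename HashTable, typename Keys>
    void executeBatchSketch(HashTable & table, const Keys & keys)
    {
        const size_t rows = keys.size();
        // Prefetching only pays off once the table is big enough to miss in cache,
        // hence a byte-size threshold instead of a bucket-count threshold.
        const bool enable_prefetch = table.getBufferSizeInBytes() > AGG_PREFETCH_THRESHOLD;

        std::vector<size_t> hashvals(MINI_BATCH);
        for (size_t begin = 0; begin < rows; begin += MINI_BATCH)
        {
            const size_t end = std::min(begin + MINI_BATCH, rows);

            // Pass 1: hash the mini-batch and (optionally) prefetch its buckets.
            for (size_t i = begin; i < end; ++i)
            {
                hashvals[i - begin] = table.hash(keys[i]);
                if (enable_prefetch)
                    table.prefetch(hashvals[i - begin]);
            }

            // Pass 2: emplace each key with its precomputed hash; for keys that
            // already exist, the pointed-to aggregation state can also be
            // prefetched before the update (omitted here).
            for (size_t i = begin; i < end; ++i)
            {
                // emplace(keys[i], hashvals[i - begin]) and update the state ...
            }
        }
    }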

Check List

Tests

  • Unit test
  • Integration test
  • Manual test (add detailed scripts or steps below)
  • No code

Side effects

  • Performance regression: Consumes more CPU
  • Performance regression: Consumes more Memory
  • Breaking backward compatibility

Documentation

  • Affects user behaviors
  • Contains syntax changes
  • Contains variable changes
  • Contains experimental features
  • Changes MySQL compatibility

Release note

None

ti-chi-bot bot added the do-not-merge/needs-linked-issue and release-note-none (Denotes a PR that doesn't merit a release note.) labels on Nov 28, 2024
ti-chi-bot bot (Contributor) commented on Nov 28, 2024

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by:
Once this PR has been reviewed and has the lgtm label, please ask for approval from guo-shaoge, ensuring that each of them provides their approval before proceeding. For more information see the Code Review Process.

The full list of commands accepted by this bot can be found here.

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

ti-chi-bot bot added the size/XXL (Denotes a PR that changes 1000+ lines, ignoring generated files.) label and removed the do-not-merge/needs-linked-issue label on Nov 28, 2024
@guo-shaoge changed the title from "Aggregator support prefetch" to "Aggregator support prefetch and new hasher" on Dec 2, 2024
guo-shaoge (Contributor Author) commented:

/retest

Review comment on this snippet:

    private:
        ALWAYS_INLINE inline StringRef getKey(size_t row) const
        {
            auto last_offset = row == 0 ? 0 : offsets[row - 1];

Contributor commented:

I think just using auto last_offset = offsets[row - 1] is ok.
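The suggestion relies on the offsets array being padded so that offsets[-1] is a valid read that yields 0 (as in ClickHouse-style padded offset arrays). Under that assumption the accessor could look roughly like the sketch below; the chars field and the exact length handling are illustrative, not the PR's code:

    // Sketch only; assumes `offsets` is a padded, signed-indexable offsets array
    // where offsets[-1] == 0, and `chars` is the flat character buffer of the column.
    ALWAYS_INLINE inline StringRef getKey(size_t row) const
    {
        // offsets[-1] reads the zero padding in front of the array, so row 0 needs no special case.
        const auto last_offset = offsets[static_cast<ssize_t>(row) - 1];
        // Length may need adjustment if the column stores a trailing 0 byte per value.
        return StringRef(chars.data() + last_offset, offsets[row] - last_offset);
    }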

dbms/src/Interpreters/Aggregator.cpp (outdated, resolved)
dbms/src/Interpreters/Aggregator.cpp (resolved)
guo-shaoge (Contributor Author) commented:

/retest

@@ -851,6 +854,11 @@ class HashTable

iterator end() { return iterator(this, buf ? buf + grower.bufSize() : buf); }

void ALWAYS_INLINE prefetch(size_t hashval) const
Contributor commented:

Suggested change:

-    void ALWAYS_INLINE prefetch(size_t hashval) const
+    void ALWAYS_INLINE prefetchRead(size_t hashval) const

guo-shaoge (Contributor Author) replied:

I think prefetch is simple and clear
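For context, a prefetch method on a ClickHouse-style open-addressing hash table typically just maps the precomputed hash to its bucket and issues a hardware prefetch. A sketch under that assumption (grower.place is the usual upstream bucket-placement helper; this is not necessarily the PR's exact body):

    // Sketch: prefetch the cell that `hashval` maps to, so a following
    // find/emplace with the same hash is less likely to stall on a cache miss.
    void ALWAYS_INLINE prefetch(size_t hashval) const
    {
        const size_t place = grower.place(hashval); // bucket index derived from the hash
        __builtin_prefetch(&buf[place]);            // read prefetch of the bucket's cell
    }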

dbms/src/Interpreters/Aggregator.cpp.orig (outdated, resolved)
dbms/src/Interpreters/Aggregator.cpp (resolved)
@guo-shaoge force-pushed the hashagg_prefetch branch 2 times, most recently from 8294bf5 to e33075a on December 22, 2024 08:04
@@ -39,6 +39,7 @@ extern const char exception_before_mpp_root_task_run[];
extern const char exception_during_mpp_non_root_task_run[];
extern const char exception_during_mpp_root_task_run[];
extern const char exception_during_query_run[];
extern const char force_agg_prefetch[];
guo-shaoge (Contributor Author) commented:

TODO: figure out why force_agg_prefetch makes ComputeServerRunner.testErrorMessage hang.

guo-shaoge (Contributor Author) replied:

From @windtalker: setCancelTest() makes the table scan produce an infinite stream of output data, so this case will run forever if it doesn't report an error.

@guo-shaoge guo-shaoge requested a review from gengliqi December 24, 2024 13:21
template <bool collect_hit_rate, bool only_lookup, typename Method>
ALWAYS_INLINE void Aggregator::executeImplBatch(
template <typename Method>
ALWAYS_INLINE inline size_t getCurrentHashAndSetUpPrefetchHash(
guo-shaoge (Contributor Author) commented:

Suggested change:

-    ALWAYS_INLINE inline size_t getCurrentHashAndSetUpPrefetchHash(
+    ALWAYS_INLINE inline size_t getCurrentRowHashAndDoPrefetch(

Labels
release-note-none (Denotes a PR that doesn't merit a release note.), size/XXL (Denotes a PR that changes 1000+ lines, ignoring generated files.)
Projects: None yet

Development

Successfully merging this pull request may close these issues: Aggregator support prefetch (#9680)

3 participants