[fix](Nereids) set correct sort key for aggregate #45369

Merged
merged 10 commits into from
Dec 18, 2024

Conversation

englefly
Contributor

@englefly englefly commented Dec 12, 2024

What problem does this PR solve?

In the previous PR #44042, we added support for more patterns in the PushTopnToAgg rule.
The new pattern is:
topn
+-->agg(global)
    +-->shuffle
        +-->agg(local)

To support this new pattern, the order keys and the group-by keys must contain the same columns, although they may appear in a different order. When the order keys cover only some of the group keys, the missing group keys are appended to the order keys.
That is,
topn(orderkey=[B, A])->agg(groupkey=[A, B, C])
=>
topn(orderkey=[B, A, C])->agg(groupkey=[A, B, C])
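
The key completion described above can be sketched as follows. This is a minimal illustration, not the Doris implementation: the function name `complete_order_keys` is hypothetical, keys are modeled as plain strings, sort direction is ignored, and returning `None` when an order key is not a group key is an assumption about when the rewrite would not apply.

```python
from typing import List, Optional


def complete_order_keys(order_keys: List[str],
                        group_keys: List[str]) -> Optional[List[str]]:
    """Extend order_keys with the group keys it does not mention yet.

    Hypothetical helper for illustration only. Returns None when an
    order key is not a group key (assumed to mean "do not rewrite").
    """
    group_set = set(group_keys)
    if any(key not in group_set for key in order_keys):
        return None

    seen = set(order_keys)
    completed = list(order_keys)  # keep the original order keys first
    for key in group_keys:
        if key not in seen:
            completed.append(key)  # append missing group keys in group order
            seen.add(key)
    return completed


# The example from the description: [B, A] with group keys [A, B, C].
print(complete_order_keys(["B", "A"], ["A", "B", "C"]))  # ['B', 'A', 'C']
```

Keeping the original order keys as a prefix means the requested ordering is unchanged; the appended keys only break ties among rows that already compare equal.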

Issue Number: close #xxx

Related PR: #xxx

Problem Summary:

Release note

None

Check List (For Author)

  • Test

    • Regression test
    • Unit Test
    • Manual test (add detailed scripts or steps below)
    • No need to test or manual test. Explain why:
      • This is a refactor/code format and no logic has been changed.
      • Previous test can cover this change.
      • No code files have been changed.
      • Other reason
  • Behavior changed:

    • No.
    • Yes.
  • Does this need documentation?

    • No.
    • Yes.

Check List (For Reviewer who merge this PR)

  • Confirm the release note
  • Confirm test cases
  • Confirm document
  • Add branch pick label

@hello-stephen
Contributor

Thank you for your contribution to Apache Doris.
Don't know what should be done next? See How to process your PR.

Please clearly describe your PR:

  1. What problem was fixed (it's best to include specific error reporting information). How it was fixed.
  2. Which behaviors were modified. What was the previous behavior, what is it now, why was it modified, and what possible impacts might there be.
  3. What features were added. Why was this function added?
  4. Which code was refactored and why was this part of the code refactored?
  5. Which functions were optimized and what is the difference before and after the optimization?

@englefly
Contributor Author

run buildall

…r_distinct_through_join_one_side_cust.groovy
@englefly
Contributor Author

run buildall

starocean999
starocean999 previously approved these changes Dec 18, 2024
@github-actions github-actions bot added the approved Indicates a PR has been approved by one committer. label Dec 18, 2024
Contributor

PR approved by at least one committer and no changes requested.

Contributor

PR approved by anyone and no changes requested.

@github-actions github-actions bot removed the approved Indicates a PR has been approved by one committer. label Dec 18, 2024
@englefly
Contributor Author

run buildall

@doris-robot

TPC-H: Total hot run time: 40064 ms
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/tpch-tools
Tpch sf100 test result on commit 25f1894ceff11d8f89e9bfc92777829008bcc9f9, data reload: false

------ Round 1 ----------------------------------
query	cold (ms)	hot 1 (ms)	hot 2 (ms)	min hot (ms)
q1	17571	7517	7315	7315
q2	2048	177	170	170
q3	10608	1077	1209	1077
q4	10573	745	707	707
q5	7610	2753	2744	2744
q6	242	155	147	147
q7	988	625	600	600
q8	9270	1869	1909	1869
q9	6673	6452	6428	6428
q10	7022	2283	2351	2283
q11	476	268	258	258
q12	418	234	220	220
q13	17790	2983	2991	2983
q14	252	215	212	212
q15	577	510	509	509
q16	659	594	599	594
q17	1004	610	638	610
q18	7315	6619	6740	6619
q19	1354	1077	961	961
q20	461	180	178	178
q21	4010	3344	3269	3269
q22	379	315	311	311
Total cold run time: 107300 ms
Total hot run time: 40064 ms
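
For Round 1, the reported totals match the sum of the first numeric column (cold) and the sum of the smallest hot timing per query (hot). A minimal sketch of that arithmetic, assuming whitespace-separated rows of the form `name cold hot...`; the function name `summarize` is hypothetical and is not part of the Doris benchmark scripts:

```python
from typing import Tuple


def summarize(rows: str) -> Tuple[int, int]:
    """Return (total_cold_ms, total_hot_ms) for benchmark rows.

    Each row is "name cold hot1 hot2 ..."; the hot total adds up the
    smallest hot timing of every query.
    """
    total_cold = 0
    total_hot = 0
    for line in rows.strip().splitlines():
        _name, cold, *hot = line.split()
        total_cold += int(cold)
        total_hot += min(int(value) for value in hot)
    return total_cold, total_hot


# First two rows of Round 1 above.
sample = "q1\t17571\t7517\t7315\t7315\nq2\t2048\t177\t170\t170"
print(summarize(sample))  # (19619, 7485)
```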

----- Round 2, with runtime_filter_mode=off -----
query	cold (ms)	hot 1 (ms)	hot 2 (ms)	min hot (ms)
q1	7256	7255	7255	7255
q2	331	233	232	232
q3	2972	2807	2972	2807
q4	2117	1837	1887	1837
q5	5703	5735	5676	5676
q6	228	145	148	145
q7	2229	1859	1821	1821
q8	3461	3609	3543	3543
q9	8978	9018	8949	8949
q10	3628	3552	3560	3552
q11	609	526	523	523
q12	849	612	608	608
q13	11575	3098	3058	3058
q14	325	276	270	270
q15	562	507	503	503
q16	699	666	655	655
q17	1851	1644	1650	1644
q18	8347	7772	7652	7652
q19	1789	1630	1656	1630
q20	2042	1827	1888	1827
q21	5782	5616	5507	5507
q22	650	608	626	608
Total cold run time: 71983 ms
Total hot run time: 60302 ms

@doris-robot

TPC-DS: Total hot run time: 196310 ms
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/tpcds-tools
TPC-DS sf100 test result on commit 25f1894ceff11d8f89e9bfc92777829008bcc9f9, data reload: false

query	cold (ms)	hot 1 (ms)	hot 2 (ms)	min hot (ms)
query1	1278	1040	934	934
query2	6248	2217	2286	2217
query3	11145	4791	4674	4674
query4	33682	23431	23437	23431
query5	4017	481	473	473
query6	301	204	184	184
query7	3997	306	305	305
query8	293	240	224	224
query9	9363	2713	2712	2712
query10	475	237	250	237
query11	17760	15079	15249	15079
query12	151	110	105	105
query13	1553	400	418	400
query14	9823	7590	7402	7402
query15	306	197	192	192
query16	8256	495	477	477
query17	1742	654	628	628
query18	2174	322	317	317
query19	371	166	179	166
query20	122	114	117	114
query21	208	110	107	107
query22	4900	4582	4310	4310
query23	35727	33762	33737	33737
query24	11549	2566	2537	2537
query25	653	426	392	392
query26	1776	152	153	152
query27	2917	335	330	330
query28	7737	2506	2503	2503
query29	1029	433	417	417
query30	244	149	149	149
query31	1068	824	861	824
query32	97	58	58	58
query33	773	298	285	285
query34	1031	534	549	534
query35	879	800	760	760
query36	1136	978	955	955
query37	278	74	79	74
query38	4249	4330	4120	4120
query39	1538	1449	1465	1449
query40	251	104	104	104
query41	44	44	44	44
query42	118	106	98	98
query43	530	478	483	478
query44	1430	847	844	844
query45	199	180	171	171
query46	1205	732	741	732
query47	2001	1931	1904	1904
query48	425	324	336	324
query49	1205	413	411	411
query50	854	388	391	388
query51	7460	7294	7161	7161
query52	109	92	99	92
query53	266	185	196	185
query54	1168	430	446	430
query55	83	84	78	78
query56	249	259	250	250
query57	1289	1174	1169	1169
query58	240	227	217	217
query59	3339	3049	3165	3049
query60	278	233	269	233
query61	107	104	108	104
query62	845	668	668	668
query63	225	197	185	185
query64	4801	667	628	628
query65	3272	3238	3228	3228
query66	1171	302	304	302
query67	15852	15422	15337	15337
query68	5750	557	555	555
query69	403	250	259	250
query70	1175	1171	1133	1133
query71	398	254	248	248
query72	6678	4004	4033	4004
query73	779	363	367	363
query74	9940	8728	8972	8728
query75	3416	2666	2698	2666
query76	3358	1059	1137	1059
query77	419	278	276	276
query78	10335	9818	9364	9364
query79	1768	625	601	601
query80	863	433	434	433
query81	542	230	233	230
query82	554	122	120	120
query83	253	161	149	149
query84	230	73	78	73
query85	1562	307	309	307
query86	472	301	306	301
query87	4421	4310	4397	4310
query88	4116	2219	2195	2195
query89	425	289	291	289
query90	2038	192	185	185
query91	138	106	102	102
query92	69	51	52	51
query93	2095	544	543	543
query94	766	282	296	282
query95	348	249	248	248
query96	635	287	284	284
query97	2843	2675	2724	2675
query98	222	198	191	191
query99	1523	1341	1317	1317
Total cold run time: 308396 ms
Total hot run time: 196310 ms

@doris-robot

ClickBench: Total hot run time: 32.69 s
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/clickbench-tools
ClickBench test result on commit 25f1894ceff11d8f89e9bfc92777829008bcc9f9, data reload: false

query	cold (s)	hot 1 (s)	hot 2 (s)
query1	0.03	0.03	0.03
query2	0.06	0.04	0.03
query3	0.24	0.07	0.06
query4	1.62	0.11	0.11
query5	0.42	0.41	0.40
query6	1.16	0.66	0.65
query7	0.02	0.02	0.01
query8	0.04	0.03	0.03
query9	0.59	0.50	0.49
query10	0.57	0.58	0.55
query11	0.15	0.11	0.10
query12	0.13	0.10	0.11
query13	0.61	0.61	0.58
query14	2.76	2.73	2.77
query15	0.90	0.84	0.83
query16	0.40	0.39	0.38
query17	1.05	1.00	1.04
query18	0.22	0.21	0.22
query19	1.83	1.80	1.87
query20	0.01	0.01	0.01
query21	15.36	0.60	0.60
query22	2.94	2.09	1.66
query23	16.96	0.87	0.91
query24	3.21	1.21	1.28
query25	0.18	0.24	0.05
query26	0.44	0.13	0.14
query27	0.05	0.04	0.04
query28	10.32	1.13	1.09
query29	12.58	3.25	3.25
query30	0.24	0.06	0.06
query31	2.88	0.40	0.39
query32	3.23	0.47	0.46
query33	3.16	3.14	3.11
query34	17.03	4.46	4.46
query35	4.48	4.45	4.43
query36	0.68	0.48	0.48
query37	0.09	0.07	0.06
query38	0.04	0.04	0.03
query39	0.04	0.02	0.03
query40	0.17	0.12	0.12
query41	0.08	0.02	0.02
query42	0.03	0.03	0.02
query43	0.03	0.03	0.03
Total cold run time: 107.03 s
Total hot run time: 32.69 s

@github-actions github-actions bot added the approved Indicates a PR has been approved by one committer. label Dec 18, 2024
Contributor

PR approved by at least one committer and no changes requested.

@englefly englefly merged commit a886463 into apache:master Dec 18, 2024
29 of 30 checks passed
@englefly englefly deleted the fix-limit-topn branch December 18, 2024 06:33
englefly added a commit that referenced this pull request Jan 5, 2025
Labels
approved Indicates a PR has been approved by one committer. dev/3.0.4-merged p0_b reviewed
7 participants