[enhancement](nereids) improve lots of values in insert into values statement #40202

Merged
morrySnow merged 17 commits into apache:master from opt_insert_into_values on Dec 23, 2024

Conversation


@924060929 924060929 commented Aug 30, 2024

Proposed changes

Speed up INSERT INTO ... VALUES statements that carry a large number of values by bypassing NereidsPlanner.

The main logic is:

  1. InsertUtils.normalizePlan uses FoldConstantRuleOnFE to reduce expressions, e.g. values(date(now())).
  2. FastInsertIntoValuesPlanner skips most rules when analyzing and rewriting LogicalInlineTable into LogicalUnion or LogicalOneRowRelation.
  3. Date-time strings are parsed on a fast path that does not use a date format.
  4. getHintMap and the normal lexer share the same tokens.
  5. Setting enable_fast_analyze_into_values=false forces all optimizer rules to run, as a fallback if we hit bugs in FastInsertIntoValuesPlanner.
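
To illustrate item 3, here is a minimal sketch of parsing a date-time literal without a format object. It assumes the literal has a fixed `yyyy-MM-dd` or `yyyy-MM-dd HH:mm:ss` layout; the class and method names are hypothetical and this is not the code from this PR, which may handle more layouts.

```java
import java.time.LocalDateTime;

/** Hypothetical sketch: read digits at fixed offsets instead of using a date format. */
final class FastDateTimeSketch {
    private FastDateTimeSketch() {}

    /** Reads len ASCII digits starting at index from and returns their integer value. */
    private static int digits(String s, int from, int len) {
        int value = 0;
        for (int i = from; i < from + len; i++) {
            char c = s.charAt(i);
            if (c < '0' || c > '9') {
                throw new IllegalArgumentException("expected digit at index " + i + ": " + s);
            }
            value = value * 10 + (c - '0');
        }
        return value;
    }

    /** Checks that the separator at the given index is the expected character. */
    private static void expect(String s, int index, char expected) {
        if (s.charAt(index) != expected) {
            throw new IllegalArgumentException("expected '" + expected + "' at index " + index + ": " + s);
        }
    }

    /** Parses "yyyy-MM-dd" or "yyyy-MM-dd HH:mm:ss"; any other layout is rejected. */
    static LocalDateTime parse(String s) {
        if (s.length() != 10 && s.length() != 19) {
            throw new IllegalArgumentException("unsupported layout: " + s);
        }
        int year = digits(s, 0, 4);
        expect(s, 4, '-');
        int month = digits(s, 5, 2);
        expect(s, 7, '-');
        int day = digits(s, 8, 2);
        if (s.length() == 10) {
            return LocalDateTime.of(year, month, day, 0, 0, 0);
        }
        expect(s, 10, ' ');
        int hour = digits(s, 11, 2);
        expect(s, 13, ':');
        int minute = digits(s, 14, 2);
        expect(s, 16, ':');
        int second = digits(s, 17, 2);
        // LocalDateTime.of validates field ranges (e.g. month 13 throws DateTimeException).
        return LocalDateTime.of(year, month, day, hour, minute, second);
    }

    public static void main(String[] args) {
        System.out.println(parse("2024-08-30 12:34:56"));
    }
}
```

The point of the sketch is that a fixed layout needs neither pattern compilation nor per-value formatter state, which matters when a statement carries very many literals.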

Test: insert 1000 rows with 1000 columns; the column types include int, bigint, decimal(26,7), date, datetime, and varchar (10 Chinese characters).
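
A workload of this shape could be generated with something like the sketch below. It is an assumption-laden simplification: the table name is hypothetical, every column is filled with an integer literal rather than the mixed types listed above, and it is not the script used for the measurements in this PR.

```java
/** Hypothetical sketch: build one INSERT statement with rows x cols integer literals. */
final class WideInsertSketch {
    private WideInsertSketch() {}

    static String buildInsert(String table, int rows, int cols) {
        StringBuilder sql = new StringBuilder("INSERT INTO ").append(table).append(" VALUES ");
        for (int r = 0; r < rows; r++) {
            if (r > 0) {
                sql.append(", ");
            }
            sql.append('(');
            for (int c = 0; c < cols; c++) {
                if (c > 0) {
                    sql.append(", ");
                }
                // Each cell gets a distinct integer so rows are easy to tell apart.
                sql.append(r * cols + c);
            }
            sql.append(')');
        }
        return sql.toString();
    }

    public static void main(String[] args) {
        // 1000 x 1000 matches the shape described in the test above.
        String sql = buildInsert("t", 1000, 1000);
        System.out.println(sql.length() + " characters");
    }
}
```

For example, `buildInsert("t", 2, 3)` yields `INSERT INTO t VALUES (0, 1, 2), (3, 4, 5)`.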

| FastInsertIntoValuesPlanner | NereidsPlanner (enable_fast_analyze_into_values=false) | Legacy optimizer in 2.1.6 | Nereids planner in 2.1.6 |
| --- | --- | --- | --- |
| 16s (bottleneck is ANTLR's lexer) | 32s | 16s | 80s |

If you use FastInsertIntoValuesPlanner with group commit in a transaction, the time can be reduced to 12s.

TODO: build a custom lexer. In my test with a hand-written lexer, FastInsertIntoValuesPlanner without group commit drops from 16s to 12s, but a full solution will take more effort: RegularExpression -> NFA -> DFA -> minimal DFA -> Lexer codegen.
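
For a sense of what a hand-written lexer for a values list might look like, here is a minimal sketch. It dispatches directly on the current character rather than following the RegularExpression -> NFA -> DFA -> codegen pipeline named in the TODO, and it is not the lexer used in the test above. The token kinds and class name are hypothetical, and it deliberately ignores escapes inside strings, signs, comments, and backquoted identifiers.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical sketch: single-pass, character-dispatch lexer for a VALUES list. */
final class ValuesLexerSketch {
    private ValuesLexerSketch() {}

    enum Kind { NUMBER, STRING, WORD, SYMBOL }

    static final class Token {
        final Kind kind;
        final String text;

        Token(Kind kind, String text) {
            this.kind = kind;
            this.text = text;
        }

        @Override
        public String toString() {
            return kind + ":" + text;
        }
    }

    static List<Token> tokenize(String sql) {
        List<Token> tokens = new ArrayList<>();
        int n = sql.length();
        int i = 0;
        while (i < n) {
            char c = sql.charAt(i);
            if (Character.isWhitespace(c)) {
                i++;
                continue;
            }
            int start = i;
            if (Character.isDigit(c)) {
                // Digits and dots only; no sign or exponent handling in this sketch.
                while (i < n && (Character.isDigit(sql.charAt(i)) || sql.charAt(i) == '.')) {
                    i++;
                }
                tokens.add(new Token(Kind.NUMBER, sql.substring(start, i)));
            } else if (c == '\'') {
                i++; // skip the opening quote
                while (i < n && sql.charAt(i) != '\'') {
                    i++;
                }
                if (i == n) {
                    throw new IllegalArgumentException("unterminated string starting at " + start);
                }
                i++; // skip the closing quote
                tokens.add(new Token(Kind.STRING, sql.substring(start + 1, i - 1)));
            } else if (Character.isLetter(c) || c == '_') {
                while (i < n && (Character.isLetterOrDigit(sql.charAt(i)) || sql.charAt(i) == '_')) {
                    i++;
                }
                tokens.add(new Token(Kind.WORD, sql.substring(start, i)));
            } else {
                // Everything else is emitted as a one-character symbol, e.g. '(' ')' ','.
                i++;
                tokens.add(new Token(Kind.SYMBOL, String.valueOf(c)));
            }
        }
        return tokens;
    }

    public static void main(String[] args) {
        System.out.println(tokenize("values (1, 'ab', 2.5)"));
    }
}
```

A generated minimal-DFA lexer would cover the full grammar; the sketch only shows why a narrow, special-purpose scanner can be cheap for literal-heavy input.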

@doris-robot

Thank you for your contribution to Apache Doris.
Don't know what should be done next? See How to process your PR

Since 2024-03-18, the Document has been moved to doris-website.
See Doris Document.

@924060929
Contributor Author

run buildall

@doris-robot

TPC-H: Total hot run time: 38038 ms
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/tpch-tools
Tpch sf100 test result on commit c221b9226095f726f9751c308f10241dab3aaa85, data reload: false

------ Round 1 ----------------------------------
q1	17640	4840	4305	4305
q2	2018	188	178	178
q3	11669	954	1149	954
q4	10517	749	768	749
q5	7737	2866	2811	2811
q6	228	139	141	139
q7	969	628	610	610
q8	9338	2074	2086	2074
q9	7097	6507	6528	6507
q10	7005	2145	2260	2145
q11	460	240	245	240
q12	406	238	236	236
q13	18291	3037	3054	3037
q14	283	234	230	230
q15	517	495	484	484
q16	599	529	503	503
q17	983	690	709	690
q18	7313	6926	6963	6926
q19	1399	983	1009	983
q20	666	338	332	332
q21	3913	3001	2893	2893
q22	1101	1032	1012	1012
Total cold run time: 110149 ms
Total hot run time: 38038 ms

----- Round 2, with runtime_filter_mode=off -----
q1	4327	4361	4349	4349
q2	386	279	282	279
q3	2936	2661	2691	2661
q4	1991	1681	1676	1676
q5	5677	5715	5736	5715
q6	235	145	143	143
q7	2257	1867	1860	1860
q8	3310	3428	3457	3428
q9	8912	8910	8849	8849
q10	3628	3377	3394	3377
q11	596	520	507	507
q12	810	683	656	656
q13	15351	3332	3453	3332
q14	345	328	299	299
q15	548	504	496	496
q16	643	610	602	602
q17	1899	1560	1560	1560
q18	8712	8413	7960	7960
q19	2619	1593	1589	1589
q20	2178	1990	1910	1910
q21	5724	5513	5527	5513
q22	1132	1054	1025	1025
Total cold run time: 74216 ms
Total hot run time: 57786 ms
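
The report does not label its columns. In Round 1 above, the first numeric column sums to the reported cold total (110149 ms) and the last column sums to the reported hot total (38038 ms), so the last column appears to be the better of the two hot runs. Under that inferred reading, the totals could be recomputed with a sketch like the following; the class name is hypothetical and this is not the robot's own script.

```java
/** Hypothetical sketch: recompute cold/hot totals from tab-separated report rows. */
final class BenchTotalsSketch {
    private BenchTotalsSketch() {}

    /** Returns {coldTotal, hotTotal} in ms, assuming rows of the form: qN cold hot1 hot2 best. */
    static long[] totals(String report) {
        long cold = 0;
        long hot = 0;
        for (String line : report.split("\n")) {
            String[] fields = line.trim().split("\\s+");
            // Only query rows have exactly five fields and start with 'q'; headers and totals are skipped.
            if (fields.length != 5 || !fields[0].startsWith("q")) {
                continue;
            }
            cold += Long.parseLong(fields[1]);
            hot += Long.parseLong(fields[4]);
        }
        return new long[] {cold, hot};
    }

    public static void main(String[] args) {
        // Two rows copied from Round 1 above.
        long[] t = totals("q2\t2018\t188\t178\t178\nq6\t228\t139\t141\t139");
        System.out.println("cold=" + t[0] + " ms, hot=" + t[1] + " ms");
    }
}
```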

@doris-robot

TPC-DS: Total hot run time: 192483 ms
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/tpcds-tools
TPC-DS sf100 test result on commit c221b9226095f726f9751c308f10241dab3aaa85, data reload: false

query1	1282	898	870	870
query2	6359	2041	1922	1922
query3	10604	3992	3919	3919
query4	60222	27296	23342	23342
query5	5485	515	479	479
query6	433	166	162	162
query7	5782	303	302	302
query8	297	216	206	206
query9	8961	2466	2462	2462
query10	488	268	261	261
query11	17233	15030	15251	15030
query12	155	104	100	100
query13	1556	396	379	379
query14	10831	7215	7040	7040
query15	250	175	186	175
query16	7645	456	487	456
query17	1114	629	611	611
query18	2087	309	315	309
query19	301	158	156	156
query20	125	112	109	109
query21	206	108	103	103
query22	4501	4408	4266	4266
query23	34243	33570	33340	33340
query24	5929	2879	2871	2871
query25	548	410	398	398
query26	698	161	162	161
query27	1778	288	281	281
query28	3816	2106	2090	2090
query29	730	427	417	417
query30	237	151	160	151
query31	935	764	767	764
query32	87	55	60	55
query33	486	299	278	278
query34	848	490	482	482
query35	838	724	711	711
query36	1059	941	931	931
query37	149	91	98	91
query38	3977	3868	3908	3868
query39	1471	1420	1399	1399
query40	198	121	122	121
query41	48	47	45	45
query42	116	98	97	97
query43	525	491	475	475
query44	1109	755	754	754
query45	203	165	168	165
query46	1091	788	765	765
query47	1867	1802	1795	1795
query48	371	310	302	302
query49	774	442	439	439
query50	809	424	441	424
query51	7207	7085	6983	6983
query52	102	87	88	87
query53	258	190	193	190
query54	580	467	462	462
query55	83	84	83	83
query56	280	269	271	269
query57	1161	1093	1081	1081
query58	227	243	238	238
query59	3083	2789	2733	2733
query60	304	280	364	280
query61	101	99	102	99
query62	758	654	657	654
query63	214	188	190	188
query64	2855	677	630	630
query65	3185	3117	3161	3117
query66	632	334	351	334
query67	15360	15537	15183	15183
query68	4397	563	571	563
query69	416	275	270	270
query70	1163	1124	1133	1124
query71	357	277	280	277
query72	6604	3954	4059	3954
query73	740	328	336	328
query74	9178	8819	8756	8756
query75	3383	2670	2701	2670
query76	1762	990	965	965
query77	539	324	316	316
query78	10175	9089	9287	9089
query79	1551	545	540	540
query80	1063	538	501	501
query81	563	230	238	230
query82	386	150	148	148
query83	195	144	144	144
query84	270	79	120	79
query85	932	292	302	292
query86	360	304	273	273
query87	4405	4301	4229	4229
query88	3368	2304	2303	2303
query89	395	289	284	284
query90	1942	200	190	190
query91	130	100	101	100
query92	63	50	52	50
query93	1944	547	545	545
query94	799	304	286	286
query95	347	265	261	261
query96	594	263	265	263
query97	3179	3090	3027	3027
query98	233	205	205	205
query99	1553	1281	1289	1281
Total cold run time: 310233 ms
Total hot run time: 192483 ms

@doris-robot

ClickBench: Total hot run time: 31.62 s
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/clickbench-tools
ClickBench test result on commit c221b9226095f726f9751c308f10241dab3aaa85, data reload: false

query1	0.05	0.04	0.04
query2	0.08	0.04	0.04
query3	0.23	0.06	0.05
query4	1.66	0.10	0.09
query5	0.50	0.50	0.48
query6	1.12	0.72	0.74
query7	0.02	0.01	0.01
query8	0.05	0.04	0.04
query9	0.55	0.50	0.48
query10	0.55	0.55	0.54
query11	0.15	0.12	0.12
query12	0.15	0.12	0.12
query13	0.61	0.59	0.59
query14	2.15	2.05	2.07
query15	0.90	0.81	0.83
query16	0.37	0.38	0.37
query17	0.98	1.04	0.97
query18	0.21	0.21	0.21
query19	1.91	1.82	1.78
query20	0.01	0.00	0.01
query21	15.40	0.66	0.65
query22	4.33	7.74	1.46
query23	18.29	1.49	1.39
query24	2.08	0.23	0.23
query25	0.15	0.08	0.06
query26	0.27	0.18	0.17
query27	0.08	0.08	0.08
query28	13.17	1.02	1.01
query29	12.62	3.30	3.31
query30	0.24	0.06	0.05
query31	2.86	0.41	0.40
query32	3.24	0.48	0.48
query33	2.99	3.04	2.98
query34	17.29	4.33	4.38
query35	4.46	4.44	4.48
query36	0.66	0.49	0.49
query37	0.18	0.16	0.15
query38	0.15	0.14	0.14
query39	0.04	0.04	0.04
query40	0.15	0.13	0.13
query41	0.10	0.05	0.04
query42	0.06	0.06	0.05
query43	0.05	0.04	0.05
Total cold run time: 111.11 s
Total hot run time: 31.62 s

@924060929 924060929 force-pushed the opt_insert_into_values branch from c221b92 to 1b6854e on November 22, 2024 03:42
@924060929
Contributor Author

run buildall


@924060929 924060929 marked this pull request as ready for review November 29, 2024 06:38
@924060929 924060929 force-pushed the opt_insert_into_values branch from 2a30e59 to 4fe7f08 on December 6, 2024 04:05
@924060929
Contributor Author

run buildall


@924060929 924060929 force-pushed the opt_insert_into_values branch from 61d85d6 to 5c8f0a7 on December 9, 2024 06:59
@924060929
Contributor Author

run buildall


@924060929 924060929 force-pushed the opt_insert_into_values branch from 13e1f0f to bee14a8 on December 10, 2024 03:06
@924060929
Contributor Author

run buildall


@924060929 924060929 force-pushed the opt_insert_into_values branch from 0436c7f to 61d54ab on December 23, 2024 05:12
@924060929
Contributor Author

run buildall

@doris-robot

TeamCity be ut coverage result:
Function Coverage: 38.88% (10131/26056)
Line Coverage: 29.81% (85254/286016)
Region Coverage: 28.93% (43533/150455)
Branch Coverage: 25.45% (22182/87152)
Coverage Report: http://coverage.selectdb-in.cc/coverage/61d54ab8624eb2d9d5f31330a07a6dfc51225ac6_61d54ab8624eb2d9d5f31330a07a6dfc51225ac6/report/index.html

@doris-robot

TPC-H: Total hot run time: 39636 ms
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/tpch-tools
Tpch sf100 test result on commit 61d54ab8624eb2d9d5f31330a07a6dfc51225ac6, data reload: false

------ Round 1 ----------------------------------
q1	17578	7347	7237	7237
q2	2051	175	182	175
q3	10695	1065	1180	1065
q4	10571	705	765	705
q5	7587	2691	2620	2620
q6	243	152	151	151
q7	983	621	596	596
q8	9249	1876	1908	1876
q9	6601	6387	6384	6384
q10	6964	2340	2322	2322
q11	469	264	261	261
q12	435	224	225	224
q13	17767	2973	2981	2973
q14	248	214	207	207
q15	566	522	501	501
q16	665	597	602	597
q17	980	535	511	511
q18	7255	6715	6616	6616
q19	1369	993	930	930
q20	479	185	186	185
q21	4029	3463	3185	3185
q22	388	321	315	315
Total cold run time: 107172 ms
Total hot run time: 39636 ms

----- Round 2, with runtime_filter_mode=off -----
q1	7257	7213	7251	7213
q2	326	224	231	224
q3	2949	2842	2926	2842
q4	2099	1840	1866	1840
q5	5667	5693	5652	5652
q6	227	136	147	136
q7	2240	1865	1779	1779
q8	3411	3558	3470	3470
q9	8802	8993	8911	8911
q10	3899	3575	3562	3562
q11	608	527	520	520
q12	883	614	612	612
q13	12284	3200	3093	3093
q14	308	266	296	266
q15	576	500	500	500
q16	692	641	642	641
q17	1849	1660	1636	1636
q18	8298	7908	7671	7671
q19	1705	1500	1443	1443
q20	2089	1865	1961	1865
q21	5631	5504	5429	5429
q22	619	578	591	578
Total cold run time: 72419 ms
Total hot run time: 59883 ms

@doris-robot

TPC-DS: Total hot run time: 197168 ms
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/tpcds-tools
TPC-DS sf100 test result on commit 61d54ab8624eb2d9d5f31330a07a6dfc51225ac6, data reload: false

query1	1272	977	940	940
query2	6234	2436	2469	2436
query3	11079	4832	4848	4832
query4	33070	23405	23393	23393
query5	4624	486	460	460
query6	291	199	200	199
query7	3996	306	321	306
query8	310	242	236	236
query9	9425	2704	2703	2703
query10	477	273	252	252
query11	18158	15172	15034	15034
query12	163	103	110	103
query13	1620	436	425	425
query14	9927	7333	7601	7333
query15	271	199	201	199
query16	7560	479	470	470
query17	1755	584	585	584
query18	1565	313	309	309
query19	371	165	160	160
query20	118	115	113	113
query21	207	105	105	105
query22	4741	4661	4490	4490
query23	34659	35041	33851	33851
query24	10475	2497	2561	2497
query25	526	392	387	387
query26	695	157	153	153
query27	2178	339	345	339
query28	6538	2461	2420	2420
query29	663	424	435	424
query30	231	147	152	147
query31	1082	842	854	842
query32	94	59	55	55
query33	784	292	303	292
query34	1245	562	521	521
query35	934	775	803	775
query36	1140	958	978	958
query37	128	80	76	76
query38	4271	4227	4298	4227
query39	1495	1457	1472	1457
query40	207	102	103	102
query41	45	41	44	41
query42	120	111	102	102
query43	547	513	514	513
query44	1269	849	862	849
query45	205	175	175	175
query46	1209	730	728	728
query47	2079	1955	1942	1942
query48	428	339	337	337
query49	838	400	404	400
query50	845	394	400	394
query51	7354	7011	7135	7011
query52	100	94	89	89
query53	262	190	187	187
query54	1390	436	417	417
query55	86	79	79	79
query56	265	247	247	247
query57	1274	1179	1155	1155
query58	237	230	258	230
query59	3541	3275	3133	3133
query60	286	250	265	250
query61	130	102	111	102
query62	859	713	682	682
query63	233	203	195	195
query64	3710	676	646	646
query65	3311	3284	3264	3264
query66	822	322	312	312
query67	16516	15649	15467	15467
query68	5362	554	564	554
query69	477	254	256	254
query70	1215	1137	1133	1133
query71	493	254	271	254
query72	6688	4101	4143	4101
query73	808	366	361	361
query74	10506	8812	8912	8812
query75	3573	2645	2599	2599
query76	3767	1081	1097	1081
query77	668	281	274	274
query78	10357	9435	9346	9346
query79	2146	616	607	607
query80	1298	433	426	426
query81	521	229	221	221
query82	804	117	115	115
query83	206	157	149	149
query84	285	74	72	72
query85	1278	307	309	307
query86	431	297	315	297
query87	4722	4365	4484	4365
query88	3816	2235	2170	2170
query89	436	292	287	287
query90	2004	194	188	188
query91	140	107	104	104
query92	63	53	55	53
query93	2759	557	548	548
query94	756	296	249	249
query95	362	262	252	252
query96	642	282	281	281
query97	2883	2721	2665	2665
query98	222	189	199	189
query99	1613	1330	1302	1302
Total cold run time: 303841 ms
Total hot run time: 197168 ms

@doris-robot

ClickBench: Total hot run time: 32.48 s
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/clickbench-tools
ClickBench test result on commit 61d54ab8624eb2d9d5f31330a07a6dfc51225ac6, data reload: false

query1	0.03	0.03	0.03
query2	0.09	0.04	0.03
query3	0.23	0.08	0.07
query4	1.61	0.10	0.10
query5	0.42	0.39	0.42
query6	1.17	0.65	0.64
query7	0.02	0.02	0.01
query8	0.04	0.03	0.03
query9	0.59	0.51	0.50
query10	0.54	0.57	0.55
query11	0.15	0.10	0.11
query12	0.14	0.11	0.12
query13	0.62	0.61	0.60
query14	2.76	2.87	2.74
query15	0.91	0.83	0.83
query16	0.38	0.40	0.38
query17	1.05	1.06	1.02
query18	0.23	0.21	0.20
query19	1.90	1.75	1.95
query20	0.02	0.00	0.01
query21	15.35	0.60	0.57
query22	2.46	2.22	2.11
query23	17.00	0.93	0.80
query24	3.48	0.76	0.61
query25	0.26	0.12	0.05
query26	0.44	0.14	0.14
query27	0.05	0.04	0.03
query28	10.98	1.11	1.08
query29	12.56	3.25	3.23
query30	0.25	0.07	0.06
query31	2.85	0.39	0.38
query32	3.23	0.46	0.46
query33	3.04	3.07	3.06
query34	16.95	4.52	4.52
query35	4.52	4.51	4.50
query36	0.68	0.48	0.51
query37	0.09	0.06	0.07
query38	0.05	0.03	0.04
query39	0.04	0.02	0.03
query40	0.18	0.13	0.13
query41	0.08	0.03	0.03
query42	0.03	0.02	0.02
query43	0.03	0.03	0.03
Total cold run time: 107.5 s
Total hot run time: 32.48 s

@doris-robot

TeamCity be ut coverage result:
Function Coverage: 38.89% (10134/26056)
Line Coverage: 29.81% (85280/286039)
Region Coverage: 28.93% (43537/150483)
Branch Coverage: 25.46% (22189/87164)
Coverage Report: http://coverage.selectdb-in.cc/coverage/61d54ab8624eb2d9d5f31330a07a6dfc51225ac6_61d54ab8624eb2d9d5f31330a07a6dfc51225ac6/report/index.html

@github-actions github-actions bot added the "approved" label (indicates a PR has been approved by one committer) on Dec 23, 2024
Contributor

PR approved by at least one committer and no changes requested.

@morrySnow morrySnow merged commit 81f3c48 into apache:master Dec 23, 2024
22 of 25 checks passed