branch-3.0: [opt](parquet-reader)Implement late materialization of parquet complex types. #44098 #45985

Open: wants to merge 1 commit into base: branch-3.0


github-actions[bot]
Contributor

Cherry-picked from #44098

…x types. (#44098)

### What problem does this PR solve?

Problem Summary:
The Parquet reader does not support late materialization when a query reads
fields with complex types.
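
For readers unfamiliar with the term, the idea behind late materialization can be sketched as follows. This is a minimal Python illustration and not the Doris C++ implementation: `late_materialize`, `row_group`, and the in-memory lists are hypothetical stand-ins for Parquet column chunks, and Python lists and dicts stand in for complex-typed values. The sketch filters on a predicate column first and only then reads the surviving rows of the remaining columns.

```python
from typing import Any, Callable, Dict, List


def late_materialize(
    columns: Dict[str, List[Any]],
    predicate_column: str,
    predicate: Callable[[Any], bool],
    lazy_columns: List[str],
) -> Dict[str, List[Any]]:
    """Filter on one column first, then read only surviving rows of the rest."""
    # Step 1: read the predicate column eagerly and build a selection vector
    # holding the indices of the rows that pass the filter.
    selected = [i for i, v in enumerate(columns[predicate_column]) if predicate(v)]

    # Step 2: keep the predicate column values for the surviving rows.
    result = {predicate_column: [columns[predicate_column][i] for i in selected]}

    # Step 3: only now touch the lazy columns, and only at the selected indices.
    # Nested values (lists and dicts here) are treated as one value per row.
    for name in lazy_columns:
        result[name] = [columns[name][i] for i in selected]
    return result


# Hypothetical row group: one scalar column and two nested columns.
row_group = {
    "id": [1, 2, 3, 4],
    "tags": [["a"], ["b", "c"], [], ["d"]],
    "attrs": [{"k": 1}, {"k": 2}, {"k": 3}, {"k": 4}],
}
out = late_materialize(row_group, "id", lambda v: v > 2, ["tags", "attrs"])
```

In this toy setup only two of the four rows of `tags` and `attrs` are read. In a real reader the saving would come from skipping the decoding of the filtered-out nested values, which this sketch does not model.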

### Release note
[opt](parquet-reader)Implement late materialization of parquet complex types.
@hello-stephen
Contributor

Thank you for your contribution to Apache Doris.
Don't know what should be done next? See How to process your PR.

Please clearly describe your PR:

  1. What problem was fixed (ideally with the specific error message), and how it was fixed.
  2. Which behaviors were modified: the previous behavior, the new behavior, why it changed, and the possible impact.
  3. What features were added, and why.
  4. Which code was refactored, and why.
  5. Which functions were optimized, and how they differ before and after the optimization.

@dataroaring dataroaring reopened this Dec 26, 2024
@hello-stephen
Contributor

run buildall

@doris-robot

TPC-H: Total hot run time: 40995 ms
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/tpch-tools
Tpch sf100 test result on commit 3c9b7d5ea42bff453188123fe58aa29550122a61, data reload: false

------ Round 1 ----------------------------------
query	cold run (ms)	hot run 1 (ms)	hot run 2 (ms)	best hot run (ms)
q1	17605	7452	7266	7266
q2	2059	166	181	166
q3	10577	1112	1181	1112
q4	10564	725	770	725
q5	7772	2923	2836	2836
q6	242	151	147	147
q7	999	637	621	621
q8	9369	2017	2010	2010
q9	6607	6445	6480	6445
q10	7001	2326	2331	2326
q11	462	269	261	261
q12	414	212	212	212
q13	17788	2998	3042	2998
q14	245	210	211	210
q15	563	527	510	510
q16	681	622	614	614
q17	992	563	567	563
q18	7215	6629	6743	6629
q19	1377	1064	960	960
q20	473	199	196	196
q21	4187	3185	3306	3185
q22	1109	1017	1003	1003
Total cold run time: 108301 ms
Total hot run time: 40995 ms

----- Round 2, with runtime_filter_mode=off -----
query	cold run (ms)	hot run 1 (ms)	hot run 2 (ms)	best hot run (ms)
q1	7280	7307	7319	7307
q2	324	232	230	230
q3	2961	2960	2973	2960
q4	2070	1869	1811	1811
q5	5761	5734	5811	5734
q6	230	140	142	140
q7	2208	1842	1792	1792
q8	3346	3571	3478	3478
q9	9002	8944	8904	8904
q10	3602	3588	3579	3579
q11	606	500	499	499
q12	873	646	609	609
q13	10432	3238	3204	3204
q14	324	286	295	286
q15	584	525	521	521
q16	722	665	655	655
q17	1841	1622	1610	1610
q18	8291	7721	7458	7458
q19	1711	1469	1599	1469
q20	2131	1878	1853	1853
q21	5556	5365	5428	5365
q22	1209	1056	1024	1024
Total cold run time: 71064 ms
Total hot run time: 60488 ms

@doris-robot

TPC-DS: Total hot run time: 199127 ms
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/tpcds-tools
TPC-DS sf100 test result on commit 3c9b7d5ea42bff453188123fe58aa29550122a61, data reload: false

query	cold run (ms)	hot run 1 (ms)	hot run 2 (ms)	best hot run (ms)
query1	1273	944	904	904
query2	6209	2096	2040	2040
query3	10903	4473	4484	4473
query4	67071	28817	23851	23851
query5	4919	437	442	437
query6	403	172	186	172
query7	5686	311	321	311
query8	310	231	238	231
query9	9250	2704	2676	2676
query10	487	270	251	251
query11	17669	15377	15893	15377
query12	160	98	101	98
query13	1570	442	427	427
query14	10758	7123	6895	6895
query15	206	180	180	180
query16	7324	481	517	481
query17	1119	627	597	597
query18	1662	332	326	326
query19	236	158	153	153
query20	115	110	116	110
query21	58	45	49	45
query22	4826	4887	4668	4668
query23	34719	34826	34937	34826
query24	6168	2922	2991	2922
query25	524	422	406	406
query26	671	167	170	167
query27	1877	312	321	312
query28	4228	2545	2511	2511
query29	692	463	462	462
query30	259	179	166	166
query31	1028	807	864	807
query32	63	55	57	55
query33	430	284	276	276
query34	930	514	511	511
query35	845	755	741	741
query36	1128	952	1004	952
query37	118	71	75	71
query38	4094	4077	4031	4031
query39	1534	1507	1472	1472
query40	140	82	80	80
query41	51	46	45	45
query42	107	97	104	97
query43	535	492	511	492
query44	1202	822	840	822
query45	191	171	171	171
query46	1138	739	725	725
query47	2047	1944	1972	1944
query48	475	389	399	389
query49	729	389	389	389
query50	839	433	422	422
query51	7333	7258	7296	7258
query52	94	83	82	82
query53	258	182	178	178
query54	550	439	448	439
query55	77	70	76	70
query56	262	237	245	237
query57	1227	1159	1123	1123
query58	204	201	203	201
query59	3199	2970	2916	2916
query60	279	255	255	255
query61	109	103	108	103
query62	784	673	680	673
query63	213	193	192	192
query64	1433	648	622	622
query65	3283	3182	3186	3182
query66	627	300	302	300
query67	16071	15818	15627	15627
query68	4259	572	554	554
query69	409	266	261	261
query70	1205	1119	1064	1064
query71	325	254	257	254
query72	6132	4061	3971	3971
query73	755	345	343	343
query74	10224	9061	8999	8999
query75	3367	2652	2634	2634
query76	1757	1123	1120	1120
query77	476	270	270	270
query78	10633	9930	9750	9750
query79	1469	624	600	600
query80	862	453	418	418
query81	526	245	237	237
query82	1055	112	112	112
query83	160	139	154	139
query84	284	83	73	73
query85	838	302	285	285
query86	340	280	294	280
query87	4528	4390	4303	4303
query88	3740	2446	2353	2353
query89	419	290	286	286
query90	1930	182	178	178
query91	185	143	146	143
query92	68	50	48	48
query93	1797	545	546	545
query94	732	266	273	266
query95	347	243	249	243
query96	608	285	274	274
query97	3360	3198	3233	3198
query98	213	205	205	205
query99	1613	1314	1296	1296
Total cold run time: 319595 ms
Total hot run time: 199127 ms

@doris-robot

ClickBench: Total hot run time: 33.08 s
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/clickbench-tools
ClickBench test result on commit 3c9b7d5ea42bff453188123fe58aa29550122a61, data reload: false

query	cold run (s)	hot run 1 (s)	hot run 2 (s)
query1	0.04	0.03	0.02
query2	0.07	0.03	0.03
query3	0.23	0.06	0.07
query4	1.62	0.10	0.10
query5	0.51	0.52	0.51
query6	1.13	0.72	0.73
query7	0.02	0.02	0.01
query8	0.04	0.05	0.03
query9	0.57	0.52	0.49
query10	0.56	0.56	0.56
query11	0.14	0.11	0.10
query12	0.14	0.11	0.11
query13	0.60	0.60	0.60
query14	2.91	3.07	3.03
query15	0.90	0.82	0.82
query16	0.38	0.39	0.38
query17	1.00	1.05	1.00
query18	0.24	0.21	0.20
query19	1.95	1.81	1.91
query20	0.02	0.01	0.00
query21	15.37	0.61	0.58
query22	2.33	2.56	2.01
query23	16.81	1.16	0.70
query24	3.37	0.92	2.08
query25	0.22	0.21	0.12
query26	0.42	0.15	0.14
query27	0.05	0.04	0.04
query28	9.82	1.10	1.08
query29	12.56	3.27	3.23
query30	0.25	0.05	0.06
query31	2.88	0.38	0.38
query32	3.27	0.45	0.46
query33	3.04	3.05	3.02
query34	17.04	4.49	4.57
query35	4.54	4.48	4.53
query36	0.65	0.51	0.51
query37	0.10	0.06	0.06
query38	0.04	0.03	0.03
query39	0.04	0.02	0.02
query40	0.16	0.12	0.12
query41	0.07	0.02	0.02
query42	0.03	0.02	0.02
query43	0.03	0.03	0.04
Total cold run time: 106.16 s
Total hot run time: 33.08 s
