branch-3.0: [fix](iceberg)Bring field_id with parquet files And fix map type's key optional #44470 #44827

Merged
morningman merged 1 commit into branch-3.0 from auto-pick-44470-branch-3.0
Dec 2, 2024

@github-actions github-actions bot commented Dec 1, 2024

Cherry-picked from #44470

…y optional (#44470)

### What problem does this PR solve?

1. Iceberg requires column IDs to be stored as [field IDs](http://github.com/apache/parquet-format/blob/40699d05bd24181de6b1457babbee2c16dce3803/src/main/thrift/parquet.thrift#L459) in the Parquet schema (ref: https://iceberg.apache.org/spec/?h=field+id#parquet), so this PR adds field IDs to the Parquet files it writes.
2. For `MapType`, the key is always required, so the map key must not be written as optional.
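
To make the two points concrete, here is a minimal sketch of the schema shape they imply. It is plain Python and is not the Doris writer code. The `Node` class, the `map_node` helper, the column names, and the ID values are all hypothetical and chosen only for illustration. It renders a Parquet-style message schema where each Iceberg column carries a field ID and the map key is forced to `required`.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Node:
    """One node of a Parquet schema tree (hypothetical helper, not a Doris class)."""
    repetition: str                    # "required", "optional", or "repeated"
    name: str
    physical: Optional[str] = None     # primitive type; None means this is a group
    annotation: Optional[str] = None   # logical annotation such as MAP or STRING
    field_id: Optional[int] = None     # Iceberg column ID stored as the Parquet field_id
    children: List["Node"] = field(default_factory=list)

    def render(self, indent: int = 1) -> str:
        pad = "  " * indent
        ann = f" ({self.annotation})" if self.annotation else ""
        fid = f" = {self.field_id}" if self.field_id is not None else ""
        if self.physical is not None:
            return f"{pad}{self.repetition} {self.physical} {self.name}{ann}{fid};"
        body = "\n".join(child.render(indent + 1) for child in self.children)
        return f"{pad}{self.repetition} group {self.name}{ann}{fid} {{\n{body}\n{pad}}}"


def map_node(name: str, map_id: int, key: Node, value: Node) -> Node:
    """Build a MAP group; the key is always marked required, the value keeps its repetition."""
    key.repetition = "required"
    key_value = Node("repeated", "key_value", children=[key, value])
    return Node("optional", name, annotation="MAP", field_id=map_id, children=[key_value])


# Hypothetical two-column table: an int column and a map<string, int> column.
schema = [
    Node("optional", "id", physical="int32", field_id=1),
    map_node(
        "attrs", 2,
        key=Node("optional", "key", physical="binary", annotation="STRING", field_id=3),
        value=Node("optional", "value", physical="int32", field_id=4),
    ),
]
text = "message table {\n" + "\n".join(node.render() for node in schema) + "\n}"
print(text)
```

Even though the key is declared `optional` when it is passed in, the rendered schema shows it as `required`, and every column, including the map's key and value, carries its own ID.
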
@doris-robot

Thank you for your contribution to Apache Doris.
Don't know what should be done next? See How to process your PR.

Please clearly describe your PR:

  1. What problem was fixed (it's best to include specific error reporting information). How it was fixed.
  2. Which behaviors were modified. What was the previous behavior, what is it now, why was it modified, and what possible impacts might there be.
  3. What features were added. Why was this function added?
  4. Which code was refactored and why was this part of the code refactored?
  5. Which functions were optimized and what is the difference before and after the optimization?

@dataroaring dataroaring closed this Dec 1, 2024
@dataroaring dataroaring reopened this Dec 1, 2024
@doris-robot

run buildall

@doris-robot

TPC-H: Total hot run time: 40465 ms
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/tpch-tools
TPC-H sf100 test result on commit a82fac0a57aca6b3341898a488f0f800375b1eb8, data reload: false

------ Round 1 ----------------------------------
query	cold (ms)	hot run 1 (ms)	hot run 2 (ms)	best hot (ms)
q1	18170	8134	7361	7361
q2	2604	170	184	170
q3	11432	1110	1160	1110
q4	10727	808	781	781
q5	8100	2849	2835	2835
q6	248	156	156	156
q7	998	618	610	610
q8	9764	1850	1944	1850
q9	6699	6380	6354	6354
q10	7010	2274	2302	2274
q11	472	261	259	259
q12	401	222	206	206
q13	17812	2984	2967	2967
q14	243	215	211	211
q15	554	509	512	509
q16	692	596	602	596
q17	963	565	523	523
q18	7350	6593	6490	6490
q19	1630	986	1061	986
q20	478	195	201	195
q21	3861	3170	3048	3048
q22	1062	1005	974	974
Total cold run time: 111270 ms
Total hot run time: 40465 ms

----- Round 2, with runtime_filter_mode=off -----
query	cold (ms)	hot run 1 (ms)	hot run 2 (ms)	best hot (ms)
q1	7479	7256	7200	7200
q2	336	226	244	226
q3	3186	2955	2855	2855
q4	1967	1761	1785	1761
q5	5554	5647	5648	5647
q6	224	143	142	142
q7	2146	1792	1771	1771
q8	3295	3441	3450	3441
q9	8858	8840	8780	8780
q10	3548	3502	3532	3502
q11	594	485	522	485
q12	821	623	595	595
q13	17044	3018	3004	3004
q14	286	269	256	256
q15	547	500	491	491
q16	681	641	644	641
q17	1783	1577	1570	1570
q18	7740	7437	7318	7318
q19	4227	1618	1590	1590
q20	2018	1783	1788	1783
q21	5163	4975	5085	4975
q22	1079	983	955	955
Total cold run time: 78576 ms
Total hot run time: 58988 ms
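
For readers checking the totals: in these reports the first numeric column appears to be the cold run, the next two are hot runs, and the last is the smaller of the two hot runs. "Total cold run time" and "Total hot run time" are then the sums of the first and last columns. The sketch below reproduces that arithmetic on three rows copied from Round 1. The `summarize` function is a hypothetical helper, not part of the Doris tooling.

```python
# Three rows copied from Round 1 above: query, cold, hot run 1, hot run 2, best hot.
SAMPLE = """
q1 18170 8134 7361 7361
q2 2604 170 184 170
q6 248 156 156 156
"""


def summarize(report: str):
    """Return (total_cold_ms, total_hot_ms) for whitespace-separated report rows."""
    total_cold = 0
    total_hot = 0
    for line in report.splitlines():
        parts = line.split()
        if len(parts) != 5:
            continue  # skip blank lines and anything that is not a result row
        cold, hot1, hot2, best = (int(p) for p in parts[1:])
        if best != min(hot1, hot2):
            raise ValueError(f"{parts[0]}: last column is not the best hot run")
        total_cold += cold
        total_hot += best
    return total_cold, total_hot


total_cold, total_hot = summarize(SAMPLE)
print(f"Total cold run time: {total_cold} ms")
print(f"Total hot run time: {total_hot} ms")
```

Applied to all 22 rows of Round 1, the same sums give the reported 111270 ms cold and 40465 ms hot.
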

@doris-robot

TPC-DS: Total hot run time: 187224 ms
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/tpcds-tools
TPC-DS sf100 test result on commit a82fac0a57aca6b3341898a488f0f800375b1eb8, data reload: false

query	cold (ms)	hot run 1 (ms)	hot run 2 (ms)	best hot (ms)
query1	970	380	371	371
query2	6497	2028	1992	1992
query3	6701	211	209	209
query4	33850	23336	23467	23336
query5	4328	436	440	436
query6	249	169	157	157
query7	4603	308	312	308
query8	283	219	221	219
query9	9675	2641	2630	2630
query10	450	257	259	257
query11	18140	15135	15143	15135
query12	153	98	99	98
query13	1626	435	417	417
query14	9267	6392	6200	6200
query15	206	175	168	168
query16	7792	456	452	452
query17	1381	569	556	556
query18	1971	309	308	308
query19	189	153	145	145
query20	113	113	109	109
query21	201	104	102	102
query22	4668	4243	4322	4243
query23	34549	33793	33816	33793
query24	12439	2778	2849	2778
query25	729	400	408	400
query26	1869	160	163	160
query27	3134	300	299	299
query28	8437	2478	2458	2458
query29	1195	444	436	436
query30	324	167	158	158
query31	1019	771	787	771
query32	109	57	60	57
query33	778	287	275	275
query34	1008	493	504	493
query35	850	740	723	723
query36	1086	956	960	956
query37	282	71	70	70
query38	3933	3851	3845	3845
query39	1458	1402	1417	1402
query40	288	98	103	98
query41	52	48	48	48
query42	113	100	100	100
query43	524	484	484	484
query44	1200	796	788	788
query45	183	169	166	166
query46	1145	732	701	701
query47	1873	1807	1826	1807
query48	449	365	361	361
query49	1284	398	387	387
query50	804	404	416	404
query51	7061	7041	7099	7041
query52	108	88	96	88
query53	256	184	180	180
query54	1256	461	448	448
query55	78	79	78	78
query56	263	249	243	243
query57	1211	1118	1113	1113
query58	239	202	209	202
query59	3187	2745	3132	2745
query60	293	265	264	264
query61	102	97	100	97
query62	833	665	663	663
query63	211	180	177	177
query64	5208	635	591	591
query65	3263	3168	3176	3168
query66	1438	321	298	298
query67	15929	15075	15495	15075
query68	4973	551	543	543
query69	417	250	249	249
query70	1197	1131	1127	1127
query71	438	253	256	253
query72	6684	2472	3698	2472
query73	744	352	339	339
query74	10375	8887	8905	8887
query75	3394	2624	2643	2624
query76	3202	1026	1098	1026
query77	384	257	256	256
query78	10520	9650	9347	9347
query79	8791	584	589	584
query80	2040	423	401	401
query81	567	243	243	243
query82	1451	114	114	114
query83	254	135	135	135
query84	285	77	72	72
query85	2379	292	281	281
query86	503	299	296	296
query87	4423	4237	4298	4237
query88	5712	2373	2384	2373
query89	550	294	293	293
query90	2204	175	181	175
query91	175	138	144	138
query92	63	48	46	46
query93	6347	543	538	538
query94	994	270	286	270
query95	343	240	241	240
query96	632	280	277	277
query97	3311	3138	3149	3138
query98	213	201	202	201
query99	1944	1312	1314	1312
Total cold run time: 320872 ms
Total hot run time: 187224 ms

@morningman morningman merged commit 47fbbfa into branch-3.0 Dec 2, 2024
18 of 21 checks passed
@github-actions github-actions bot deleted the auto-pick-44470-branch-3.0 branch December 2, 2024 02:49
4 participants