branch-3.0: [fix](hudi) upgrade hudi to 0.15.0 #44267 #44961

Open
wants to merge 1 commit into
base: branch-3.0

@github-actions github-actions bot commented Dec 4, 2024

Cherry-picked from #44267

### What problem does this PR solve?

1. Upgrade Hudi to 0.15.0.
2. Implement a new Hudi JNI reader based on hudi-hadoop-mr.
3. Add the session variable `hudi_jni_scanner` to choose which Hudi JNI
reader to use: "hadoop" selects HadoopHudiJniReader, "spark" selects the old
HudiJniReader. The default value is "hadoop".
4. Support the session variable `force_jni_scanner` for Hudi.
5. Add more cases for Hudi p2.
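
As a rough illustration of item 3, the sketch below builds the `SET` statements a client might send to pick a reader. The variable names and the "hadoop"/"spark" values come from the description above; the helper `hudi_session_statements` is hypothetical, and treating `force_jni_scanner` as a boolean is an assumption rather than something this PR states.

```python
# Hypothetical helper, not part of Doris: it only formats SET statements.
# Reader names and allowed values are taken from the PR description.
HUDI_JNI_SCANNERS = {
    "hadoop": "HadoopHudiJniReader",  # new reader, the default
    "spark": "HudiJniReader",         # old reader
}


def hudi_session_statements(scanner="hadoop", force_jni=False):
    """Return SET statements that select a Hudi JNI reader for the session."""
    if scanner not in HUDI_JNI_SCANNERS:
        raise ValueError(f"unknown hudi_jni_scanner value: {scanner!r}")
    # Assumption: force_jni_scanner takes a boolean literal.
    force_literal = "true" if force_jni else "false"
    return [
        f"SET hudi_jni_scanner = '{scanner}';",
        f"SET force_jni_scanner = {force_literal};",
    ]


if __name__ == "__main__":
    for stmt in hudi_session_statements("spark", force_jni=True):
        print(stmt)
```

Rejecting unknown values in the helper keeps a typo such as "sparks" from silently reaching the server.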

### Release note
[opt](hudi) upgrade hudi to 0.15 and support hadoop jni reader
@doris-robot

Thank you for your contribution to Apache Doris.
Don't know what should be done next? See How to process your PR.

Please clearly describe your PR:

  1. What problem was fixed (it's best to include specific error reporting information). How it was fixed.
  2. Which behaviors were modified. What was the previous behavior, what is it now, why was it modified, and what possible impacts might there be.
  3. What features were added. Why was this function added?
  4. Which code was refactored and why was this part of the code refactored?
  5. Which functions were optimized and what is the difference before and after the optimization?

@dataroaring dataroaring closed this Dec 4, 2024
@dataroaring dataroaring reopened this Dec 4, 2024
@doris-robot

run buildall

@doris-robot

TPC-H: Total hot run time: 40830 ms
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/tpch-tools
Tpch sf100 test result on commit ef46681c73accc2fd2f818fb4588b84fd91943c0, data reload: false

------ Round 1 ----------------------------------
query	cold (ms)	hot run 1 (ms)	hot run 2 (ms)	best hot (ms)
q1	17578	7622	7369	7369
q2	2064	167	174	167
q3	10697	1091	1181	1091
q4	10559	782	719	719
q5	7747	2843	2831	2831
q6	238	147	145	145
q7	973	620	605	605
q8	9601	2005	2056	2005
q9	8132	6422	6390	6390
q10	7012	2272	2315	2272
q11	456	261	259	259
q12	404	213	214	213
q13	17780	2974	2993	2974
q14	243	206	219	206
q15	562	526	519	519
q16	697	619	583	583
q17	1007	607	565	565
q18	7267	6452	6553	6452
q19	1929	1101	1082	1082
q20	487	197	197	197
q21	3973	3246	3198	3198
q22	1063	996	988	988
Total cold run time: 110469 ms
Total hot run time: 40830 ms
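
The bot output does not label its columns. The sketch below assumes the first timing in each row is the cold run and the last is the best hot run, then sums them over the Round 1 rows; under that assumption the sums reproduce the two reported totals. The function name `totals` is made up for this illustration.

```python
# Round 1 rows copied from the bot comment: query name, then four timings in ms.
ROUND_1 = """
q1 17578 7622 7369 7369
q2 2064 167 174 167
q3 10697 1091 1181 1091
q4 10559 782 719 719
q5 7747 2843 2831 2831
q6 238 147 145 145
q7 973 620 605 605
q8 9601 2005 2056 2005
q9 8132 6422 6390 6390
q10 7012 2272 2315 2272
q11 456 261 259 259
q12 404 213 214 213
q13 17780 2974 2993 2974
q14 243 206 219 206
q15 562 526 519 519
q16 697 619 583 583
q17 1007 607 565 565
q18 7267 6452 6553 6452
q19 1929 1101 1082 1082
q20 487 197 197 197
q21 3973 3246 3198 3198
q22 1063 996 988 988
"""


def totals(table):
    """Sum the first timing column (assumed cold) and the last (assumed best hot)."""
    cold = hot = 0
    for line in table.strip().splitlines():
        _query, first, *_middle, best = line.split()
        cold += int(first)
        hot += int(best)
    return cold, hot


if __name__ == "__main__":
    cold, hot = totals(ROUND_1)
    print(f"Total cold run time: {cold} ms")
    print(f"Total hot run time: {hot} ms")
```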

----- Round 2, with runtime_filter_mode=off -----
query	cold (ms)	hot run 1 (ms)	hot run 2 (ms)	best hot (ms)
q1	7460	7256	7248	7248
q2	324	223	226	223
q3	3058	2926	2929	2926
q4	2107	1779	1805	1779
q5	5719	5753	5751	5751
q6	225	149	144	144
q7	2234	1780	1790	1780
q8	3400	3547	3454	3454
q9	8979	8941	8870	8870
q10	3567	3536	3550	3536
q11	616	507	488	488
q12	834	582	595	582
q13	16800	3183	3169	3169
q14	297	272	275	272
q15	569	506	520	506
q16	719	662	672	662
q17	1863	1645	1672	1645
q18	8166	7960	7630	7630
q19	2334	1603	1499	1499
q20	2064	1870	1845	1845
q21	5558	5270	5380	5270
q22	1113	1037	1032	1032
Total cold run time: 78006 ms
Total hot run time: 60311 ms

@doris-robot

TPC-DS: Total hot run time: 195593 ms
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/tpcds-tools
TPC-DS sf100 test result on commit ef46681c73accc2fd2f818fb4588b84fd91943c0, data reload: false

query	cold (ms)	hot run 1 (ms)	hot run 2 (ms)	best hot (ms)
query1	1225	939	918	918
query2	6284	2146	2021	2021
query3	10921	4420	4192	4192
query4	67348	29087	23463	23463
query5	5238	456	455	455
query6	446	173	179	173
query7	5675	309	316	309
query8	324	232	234	232
query9	9280	2711	2681	2681
query10	498	265	257	257
query11	17551	15247	15613	15247
query12	153	105	106	105
query13	1541	423	412	412
query14	10643	7632	6964	6964
query15	225	187	185	185
query16	6931	484	488	484
query17	1270	582	598	582
query18	1766	339	316	316
query19	250	155	153	153
query20	109	112	111	111
query21	213	101	103	101
query22	4451	4590	4364	4364
query23	34608	34022	34356	34022
query24	6135	2972	2864	2864
query25	520	397	403	397
query26	675	171	171	171
query27	1786	295	309	295
query28	4453	2578	2523	2523
query29	669	443	433	433
query30	245	178	162	162
query31	1054	811	833	811
query32	62	54	54	54
query33	409	285	291	285
query34	909	519	513	513
query35	875	719	725	719
query36	1096	955	978	955
query37	114	68	72	68
query38	4102	4026	4021	4021
query39	1517	1481	1459	1459
query40	210	104	103	103
query41	51	54	49	49
query42	109	98	99	98
query43	539	499	504	499
query44	1223	804	832	804
query45	182	173	169	169
query46	1165	727	720	720
query47	1951	1878	1901	1878
query48	467	373	391	373
query49	744	402	410	402
query50	855	432	434	432
query51	7341	7121	6865	6865
query52	99	90	91	90
query53	267	189	194	189
query54	581	467	459	459
query55	78	77	77	77
query56	261	250	248	248
query57	1179	1062	1087	1062
query58	222	210	208	208
query59	3134	2796	2887	2796
query60	280	261	261	261
query61	147	128	130	128
query62	766	662	662	662
query63	219	191	194	191
query64	1583	774	635	635
query65	3287	3173	3197	3173
query66	712	296	304	296
query67	15772	15495	15219	15219
query68	4846	538	548	538
query69	427	260	262	260
query70	1160	1139	1125	1125
query71	439	245	252	245
query72	6469	4086	3921	3921
query73	759	338	342	338
query74	10159	9022	8914	8914
query75	3353	2650	2633	2633
query76	2205	1039	1012	1012
query77	483	269	271	269
query78	10702	9840	9398	9398
query79	10242	593	600	593
query80	2209	432	422	422
query81	569	243	236	236
query82	1452	117	111	111
query83	310	159	138	138
query84	284	81	77	77
query85	1325	304	291	291
query86	488	252	304	252
query87	4380	4274	4199	4199
query88	5876	2422	2441	2422
query89	536	289	287	287
query90	2112	183	186	183
query91	179	146	147	146
query92	70	46	48	46
query93	7108	539	544	539
query94	969	291	297	291
query95	362	248	245	245
query96	635	292	286	286
query97	3362	3158	3126	3126
query98	222	201	200	200
query99	1659	1300	1287	1287
Total cold run time: 340346 ms
Total hot run time: 195593 ms

3 participants