[Feature](orc-reader) Implement new merge io facility for orc reader. #45966

Open
kaka11chen wants to merge 1 commit into master from new_merge_io_for_orc_reader


kaka11chen
Contributor

@kaka11chen kaka11chen commented Dec 25, 2024

What problem does this PR solve?

Problem Summary:

The original merge IO mechanism, MergeRangeFileReader, requires ranges to be read in order. It cannot serve out-of-order requests, so a range that has already been read cannot be read back.
Enabling late materialization of ORC complex types introduces a read-back scenario on the present stream, for example: select struct_element(info, 'age'), id from test_orc_struct where struct_element(info, 'name') = 'Alice'.
With late materialization on, the present stream of the parent node info is read first, when name is read. When age is read afterwards, the present stream of info has to be read back. As a result, late materialization of ORC complex types cannot currently be turned on.
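
To make the read-back problem concrete, here is a minimal sketch of a forward-only reader. It is a toy model, not the actual MergeRangeFileReader: the `Range` struct, the `ForwardOnlyReader` class, and the byte offsets are all hypothetical and only illustrate why a request for an earlier range fails once the cursor has moved on.

```cpp
#include <cstddef>
#include <cstdint>
#include <utility>
#include <vector>

// Hypothetical byte range [start, end) inside a file.
struct Range {
    uint64_t start;
    uint64_t end;
};

// Toy reader that only moves forward through a sorted list of ranges.
// This models the "ranges must be read in order" restriction; it is not
// the real MergeRangeFileReader.
class ForwardOnlyReader {
public:
    explicit ForwardOnlyReader(std::vector<Range> ranges) : _ranges(std::move(ranges)) {}

    // Returns true if `offset` can be served, false if it falls in a range
    // the cursor has already passed (a read-back) or in no range at all.
    bool read_at(uint64_t offset) {
        // Skip every range that ends at or before the requested offset.
        while (_cur < _ranges.size() && offset >= _ranges[_cur].end) {
            ++_cur;
        }
        return _cur < _ranges.size() && offset >= _ranges[_cur].start;
    }

private:
    std::vector<Range> _ranges;
    size_t _cur = 0;  // index of the current range; never decreases
};
```

If the ranges are laid out as info.present, name, age, then reading info.present, then name, then info.present again makes the third call fail, which mirrors the query in the problem summary.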

Release note

The new merge IO mechanism classifies the ranges read by the streams of an ORC stripe into small ranges and large ranges according to orc_once_max_read_bytes. Ranges smaller than orc_once_max_read_bytes are classified as small ranges, and ranges exceeding orc_once_max_read_bytes are classified as large ranges.
Adjacent small ranges are then merged. The maximum length of a merged range is orc_once_max_read_bytes, and the maximum distance allowed between two ranges being merged is orc_max_merge_distance_bytes. Each merged range is mapped to a reader that caches it in memory, and a corresponding input stream is built on top of it for the underlying ORC reader to read from. Large ranges are read directly through the underlying file reader. With this implementation, reads can occur at arbitrary positions within a merged range.
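
The classification and merging described above can be sketched roughly as follows. This is an illustration under stated assumptions, not the code in this PR: `Range`, `ReadPlan`, and `plan_reads` are hypothetical names, and treating a range exactly equal to the threshold as small is an assumption, since the description does not say which side it falls on.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical byte range [start, end) of one ORC stream read.
struct Range {
    uint64_t start;
    uint64_t end;
    uint64_t size() const { return end - start; }
};

// Hypothetical result: merged small ranges (served from an in-memory cache)
// and large ranges (read directly through the file reader).
struct ReadPlan {
    std::vector<Range> merged_small;
    std::vector<Range> large;
};

inline ReadPlan plan_reads(const std::vector<Range>& ranges, uint64_t once_max_read_bytes,
                           uint64_t max_merge_distance_bytes) {
    ReadPlan plan;
    std::vector<Range> small;
    for (const Range& r : ranges) {
        assert(r.start <= r.end);
        // Assumption: a range exactly at the threshold counts as small.
        if (r.size() <= once_max_read_bytes) {
            small.push_back(r);
        } else {
            plan.large.push_back(r);
        }
    }
    std::sort(small.begin(), small.end(),
              [](const Range& a, const Range& b) { return a.start < b.start; });
    for (const Range& r : small) {
        if (!plan.merged_small.empty()) {
            Range& last = plan.merged_small.back();
            uint64_t gap = r.start > last.end ? r.start - last.end : 0;
            uint64_t new_end = std::max(last.end, r.end);
            // Merge only if the gap and the merged length both stay in bounds.
            if (gap <= max_merge_distance_bytes &&
                new_end - last.start <= once_max_read_bytes) {
                last.end = new_end;
                continue;
            }
        }
        plan.merged_small.push_back(r);
    }
    return plan;
}
```

Because each merged range is held as a whole, a read at any offset inside it can be answered from memory regardless of the order in which the streams are requested.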

Future Work

Currently, implementations such as OrcMergeRangeFileReader and RangeCacheFileReader must memcpy data from the cache into the result slice because of the limitations of the FileReader interface. In theory, the memcpy can be avoided by having the slice point directly at the cache location. This can be refactored and optimized in the future.
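
The difference between the two approaches can be sketched as below. The types here (`CachedRange`, `SliceView`) and both functions are hypothetical stand-ins for illustration; they are not the FileReader interface or the Slice type used in Doris.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <vector>

// Hypothetical cached merged range: file offset plus its bytes in memory.
struct CachedRange {
    uint64_t start;           // file offset of the first cached byte
    std::vector<char> bytes;  // cached contents of the merged range
};

// Hypothetical non-owning view standing in for a result slice.
struct SliceView {
    const char* data;
    size_t size;
};

// Current shape: the caller owns the output buffer, so the cached bytes
// have to be copied into it.
inline size_t read_copy(const CachedRange& c, uint64_t offset, char* out, size_t len) {
    assert(offset >= c.start && offset - c.start + len <= c.bytes.size());
    std::memcpy(out, c.bytes.data() + (offset - c.start), len);
    return len;
}

// Possible future shape: hand back a view into the cache and skip the copy.
// The view is only valid while the cached range is alive and unmodified.
inline SliceView read_view(const CachedRange& c, uint64_t offset, size_t len) {
    assert(offset >= c.start && offset - c.start + len <= c.bytes.size());
    return SliceView{c.bytes.data() + (offset - c.start), len};
}
```

The zero-copy variant trades the memcpy for a lifetime constraint: the cache must outlive every view handed out, which is the kind of interface change the refactoring would need to address.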

Check List (For Author)

  • Test

    • Regression test
    • Unit Test
    • Manual test (add detailed scripts or steps below)
    • No need to test or manual test. Explain why:
      • This is a refactor/code format and no logic has been changed.
      • Previous test can cover this change.
      • No code files have been changed.
      • Other reason
  • Behavior changed:

    • No.
    • Yes.
  • Does this need documentation?

    • No.
    • Yes.

Check List (For Reviewer who merge this PR)

  • Confirm the release note
  • Confirm test cases
  • Confirm document
  • Add branch pick label

@Thearas
Contributor

Thearas commented Dec 25, 2024

Thank you for your contribution to Apache Doris.
Don't know what should be done next? See How to process your PR.

Please clearly describe your PR:

  1. What problem was fixed (it's best to include specific error reporting information). How it was fixed.
  2. Which behaviors were modified. What was the previous behavior, what is it now, why was it modified, and what possible impacts might there be.
  3. What features were added. Why was this function added?
  4. Which code was refactored and why was this part of the code refactored?
  5. Which functions were optimized and what is the difference before and after the optimization?

@kaka11chen
Contributor Author

run buildall

@kaka11chen kaka11chen force-pushed the new_merge_io_for_orc_reader branch from 95af48c to 772ffb6 Compare December 25, 2024 15:54
@kaka11chen
Contributor Author

run buildall

@kaka11chen kaka11chen force-pushed the new_merge_io_for_orc_reader branch from 772ffb6 to 7df1d9d Compare December 25, 2024 17:34
@kaka11chen
Contributor Author

run buildall

@kaka11chen kaka11chen force-pushed the new_merge_io_for_orc_reader branch from 7df1d9d to 2fecd9c Compare December 25, 2024 18:12
@kaka11chen
Contributor Author

run buildall

@kaka11chen kaka11chen force-pushed the new_merge_io_for_orc_reader branch from 2fecd9c to ee35b47 Compare December 26, 2024 01:21
@kaka11chen
Contributor Author

run buildall

@kaka11chen kaka11chen force-pushed the new_merge_io_for_orc_reader branch from ee35b47 to 5b1e090 Compare December 27, 2024 08:55
@kaka11chen
Contributor Author

run buildall

@doris-robot

TPC-H: Total hot run time: 32432 ms
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/tpch-tools
Tpch sf100 test result on commit 5b1e0902555b2fac4d90160d04ecd0671bd2b6ad, data reload: false

------ Round 1 ----------------------------------
q1	17658	6117	6041	6041
q2	2055	308	162	162
q3	10498	1276	706	706
q4	10240	876	427	427
q5	7920	2244	1976	1976
q6	202	183	150	150
q7	912	730	620	620
q8	9252	1373	1172	1172
q9	5293	4896	4992	4896
q10	6747	2339	1865	1865
q11	473	277	241	241
q12	347	359	219	219
q13	17788	3658	2956	2956
q14	226	238	226	226
q15	563	491	505	491
q16	633	632	591	591
q17	575	860	322	322
q18	7258	6484	6358	6358
q19	2198	969	574	574
q20	301	311	182	182
q21	2836	2154	1946	1946
q22	365	325	311	311
Total cold run time: 104340 ms
Total hot run time: 32432 ms

----- Round 2, with runtime_filter_mode=off -----
q1	6319	6206	6230	6206
q2	240	328	236	236
q3	2230	2650	2325	2325
q4	1421	1854	1343	1343
q5	4370	4781	4960	4781
q6	182	175	141	141
q7	2133	2008	1788	1788
q8	2624	2803	2674	2674
q9	7485	7339	7238	7238
q10	3046	3384	2812	2812
q11	580	539	494	494
q12	671	753	616	616
q13	3332	3762	3070	3070
q14	288	325	291	291
q15	571	514	507	507
q16	662	694	666	666
q17	1210	1712	1245	1245
q18	7730	7350	6953	6953
q19	787	997	1099	997
q20	1979	2032	1789	1789
q21	5502	5005	4771	4771
q22	594	631	546	546
Total cold run time: 53956 ms
Total hot run time: 51489 ms

@doris-robot

TPC-DS: Total hot run time: 190964 ms
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/tpcds-tools
TPC-DS sf100 test result on commit 5b1e0902555b2fac4d90160d04ecd0671bd2b6ad, data reload: false

query1	998	416	383	383
query2	6513	2475	2468	2468
query3	6712	228	219	219
query4	33586	23619	23448	23448
query5	4326	623	460	460
query6	298	210	197	197
query7	4614	493	323	323
query8	307	253	235	235
query9	9697	2751	2742	2742
query10	481	312	267	267
query11	18287	15459	15137	15137
query12	165	107	114	107
query13	1685	536	425	425
query14	11039	6770	7251	6770
query15	249	188	187	187
query16	8090	578	469	469
query17	1552	743	550	550
query18	2135	390	290	290
query19	204	188	148	148
query20	120	111	106	106
query21	203	120	101	101
query22	4329	4509	4321	4321
query23	34489	33612	33710	33612
query24	6466	2254	2210	2210
query25	502	444	384	384
query26	1188	266	155	155
query27	2037	463	342	342
query28	5353	2452	2424	2424
query29	751	550	425	425
query30	225	180	153	153
query31	980	924	798	798
query32	99	61	59	59
query33	498	354	288	288
query34	774	843	511	511
query35	815	801	745	745
query36	1018	1052	947	947
query37	121	96	78	78
query38	4126	4137	4155	4137
query39	1501	1461	1389	1389
query40	212	118	103	103
query41	52	47	51	47
query42	120	108	105	105
query43	523	534	499	499
query44	1337	815	831	815
query45	188	181	172	172
query46	895	1041	695	695
query47	1946	1955	1860	1860
query48	385	409	322	322
query49	764	464	382	382
query50	621	656	390	390
query51	7145	7100	6960	6960
query52	108	102	93	93
query53	232	258	182	182
query54	473	493	399	399
query55	81	78	82	78
query56	259	252	236	236
query57	1225	1186	1139	1139
query58	228	218	226	218
query59	3164	3184	3082	3082
query60	271	271	243	243
query61	109	108	111	108
query62	867	810	744	744
query63	273	192	190	190
query64	4584	982	651	651
query65	3259	3242	3270	3242
query66	1057	408	315	315
query67	15958	15863	15541	15541
query68	8984	774	509	509
query69	468	280	245	245
query70	1226	1106	1155	1106
query71	441	282	249	249
query72	5785	3849	3825	3825
query73	658	754	361	361
query74	9898	9079	9104	9079
query75	4566	3166	2638	2638
query76	4182	1166	809	809
query77	807	372	296	296
query78	9976	10089	9612	9612
query79	3547	907	596	596
query80	720	525	438	438
query81	469	265	243	243
query82	609	153	123	123
query83	202	169	146	146
query84	282	94	77	77
query85	844	370	319	319
query86	353	331	299	299
query87	4628	4635	4653	4635
query88	4433	2216	2194	2194
query89	412	327	299	299
query90	1896	194	194	194
query91	138	139	114	114
query92	65	60	51	51
query93	995	882	524	524
query94	741	386	292	292
query95	333	271	258	258
query96	483	615	289	289
query97	2752	2824	2696	2696
query98	234	203	190	190
query99	1721	1582	1437	1437
Total cold run time: 295717 ms
Total hot run time: 190964 ms

@doris-robot

ClickBench: Total hot run time: 30.76 s
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/clickbench-tools
ClickBench test result on commit 5b1e0902555b2fac4d90160d04ecd0671bd2b6ad, data reload: false

query1	0.03	0.03	0.03
query2	0.07	0.03	0.03
query3	0.24	0.07	0.07
query4	1.60	0.11	0.12
query5	0.43	0.42	0.39
query6	1.19	0.65	0.66
query7	0.02	0.01	0.01
query8	0.04	0.02	0.03
query9	0.58	0.50	0.50
query10	0.55	0.57	0.55
query11	0.15	0.09	0.10
query12	0.13	0.11	0.12
query13	0.61	0.61	0.59
query14	2.71	2.79	2.74
query15	0.89	0.82	0.84
query16	0.38	0.39	0.38
query17	1.11	1.04	1.03
query18	0.22	0.20	0.20
query19	1.89	1.78	1.99
query20	0.01	0.01	0.02
query21	15.35	0.94	0.58
query22	0.76	0.72	0.71
query23	15.31	1.40	0.57
query24	2.69	0.63	1.73
query25	0.23	0.15	0.08
query26	0.24	0.14	0.12
query27	0.05	0.04	0.05
query28	14.24	1.60	1.05
query29	12.57	3.91	3.21
query30	0.25	0.09	0.06
query31	2.83	0.61	0.38
query32	3.22	0.53	0.47
query33	3.19	3.01	3.08
query34	16.59	5.05	4.48
query35	4.45	4.47	4.46
query36	0.63	0.51	0.48
query37	0.09	0.06	0.06
query38	0.05	0.04	0.04
query39	0.03	0.02	0.02
query40	0.16	0.14	0.14
query41	0.08	0.03	0.02
query42	0.04	0.02	0.02
query43	0.03	0.02	0.03
Total cold run time: 105.93 s
Total hot run time: 30.76 s

@doris-robot

TeamCity be ut coverage result:
Function Coverage: 38.88% (10125/26044)
Line Coverage: 29.88% (85541/286297)
Region Coverage: 29.02% (43719/150669)
Branch Coverage: 25.55% (22296/87270)
Coverage Report: http://coverage.selectdb-in.cc/coverage/5b1e0902555b2fac4d90160d04ecd0671bd2b6ad_5b1e0902555b2fac4d90160d04ecd0671bd2b6ad/report/index.html

@morningman
Contributor

This pull request introduces a new OrcMergeRangeFileReader class and enhances the ORC file reading process with improved profiling and optimized I/O operations. The most important changes include adding new classes and methods, updating existing methods for better performance, and incorporating new profiling capabilities.

Enhancements to ORC file reading:

Updates to ORC reader implementation:

Profiling improvements:

These changes aim to optimize the ORC file reading process by merging small I/O operations, improving profiling, and handling large I/O operations more efficiently.

@kaka11chen kaka11chen force-pushed the new_merge_io_for_orc_reader branch from 5b1e090 to d9c405d Compare January 10, 2025 01:56
@kaka11chen kaka11chen marked this pull request as ready for review January 10, 2025 01:56
@kaka11chen
Contributor Author

run buildall

@doris-robot

TPC-H: Total hot run time: 33560 ms
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/tpch-tools
Tpch sf100 test result on commit d9c405db1cbb8caea5f8ab3c683e1af67d2d55a8, data reload: false

------ Round 1 ----------------------------------
q1	17629	6335	6279	6279
q2	2070	320	179	179
q3	10488	1312	775	775
q4	10218	924	460	460
q5	7703	2308	2087	2087
q6	218	187	150	150
q7	937	760	611	611
q8	9240	1510	1293	1293
q9	5401	5069	5028	5028
q10	6814	2322	1875	1875
q11	526	297	268	268
q12	362	388	232	232
q13	17758	3731	3094	3094
q14	250	246	220	220
q15	574	518	498	498
q16	644	613	575	575
q17	609	902	347	347
q18	7229	6459	6511	6459
q19	2474	1053	565	565
q20	308	335	203	203
q21	3049	2243	2045	2045
q22	369	346	317	317
Total cold run time: 104870 ms
Total hot run time: 33560 ms

----- Round 2, with runtime_filter_mode=off -----
q1	6588	6508	6549	6508
q2	265	344	237	237
q3	2308	2728	2316	2316
q4	1471	1868	1367	1367
q5	4413	4901	4971	4901
q6	212	182	148	148
q7	2195	1985	1848	1848
q8	2725	2967	2844	2844
q9	7357	7262	7242	7242
q10	3042	3273	2922	2922
q11	611	521	526	521
q12	701	753	630	630
q13	3553	3855	3221	3221
q14	292	306	271	271
q15	577	517	517	517
q16	688	683	661	661
q17	1283	1801	1299	1299
q18	7773	7653	7430	7430
q19	855	1159	1309	1159
q20	2037	2041	1933	1933
q21	5822	5174	5192	5174
q22	639	629	603	603
Total cold run time: 55407 ms
Total hot run time: 53752 ms

@doris-robot

TPC-DS: Total hot run time: 194809 ms
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/tpcds-tools
TPC-DS sf100 test result on commit d9c405db1cbb8caea5f8ab3c683e1af67d2d55a8, data reload: false

query1	1320	971	925	925
query2	6338	2331	2332	2331
query3	10994	4573	4657	4573
query4	32968	23904	23160	23160
query5	4586	608	455	455
query6	303	202	193	193
query7	3983	489	304	304
query8	291	243	234	234
query9	9435	2710	2698	2698
query10	481	324	245	245
query11	17857	15185	15121	15121
query12	151	103	102	102
query13	1528	517	421	421
query14	10767	7334	7109	7109
query15	245	199	193	193
query16	8032	644	447	447
query17	1585	768	610	610
query18	2093	424	333	333
query19	208	206	180	180
query20	130	119	123	119
query21	209	128	105	105
query22	4504	4443	4246	4246
query23	34018	33359	33261	33261
query24	6450	2312	2306	2306
query25	489	457	405	405
query26	773	278	155	155
query27	2120	460	332	332
query28	5893	2487	2449	2449
query29	619	563	425	425
query30	210	182	162	162
query31	955	875	769	769
query32	72	61	57	57
query33	481	356	331	331
query34	751	855	522	522
query35	793	796	767	767
query36	1033	1043	959	959
query37	127	106	78	78
query38	4018	4261	4343	4261
query39	1522	1442	1481	1442
query40	204	121	106	106
query41	53	54	47	47
query42	132	107	103	103
query43	536	546	486	486
query44	1371	852	850	850
query45	188	171	161	161
query46	891	1051	666	666
query47	1897	1876	1868	1868
query48	385	416	333	333
query49	733	494	400	400
query50	645	651	386	386
query51	7140	6980	6956	6956
query52	104	99	91	91
query53	228	260	179	179
query54	485	510	423	423
query55	96	84	81	81
query56	258	251	248	248
query57	1212	1177	1160	1160
query58	236	242	219	219
query59	3156	3397	3225	3225
query60	285	296	266	266
query61	123	110	113	110
query62	859	793	719	719
query63	226	190	186	186
query64	3437	1014	702	702
query65	3419	3242	3198	3198
query66	782	403	306	306
query67	15955	15719	15391	15391
query68	7785	708	529	529
query69	496	295	259	259
query70	1212	1144	1088	1088
query71	444	303	253	253
query72	6516	3831	3876	3831
query73	658	745	358	358
query74	10442	9079	8560	8560
query75	4088	3150	2667	2667
query76	3698	1171	769	769
query77	773	366	281	281
query78	10088	9996	9356	9356
query79	3776	791	584	584
query80	716	523	448	448
query81	508	264	225	225
query82	645	154	118	118
query83	170	176	146	146
query84	252	93	80	80
query85	795	369	384	369
query86	401	303	302	302
query87	4495	4513	4426	4426
query88	4764	2157	2174	2157
query89	426	330	288	288
query90	1800	191	186	186
query91	134	134	169	134
query92	63	56	52	52
query93	2490	867	521	521
query94	674	397	281	281
query95	335	266	250	250
query96	500	606	281	281
query97	2851	2931	2770	2770
query98	223	199	192	192
query99	1476	1509	1397	1397
Total cold run time: 297062 ms
Total hot run time: 194809 ms

@doris-robot

TeamCity be ut coverage result:
Function Coverage: 39.34% (10259/26080)
Line Coverage: 30.52% (87450/286573)
Region Coverage: 29.56% (44564/150779)
Branch Coverage: 26.11% (22811/87374)
Coverage Report: http://coverage.selectdb-in.cc/coverage/d9c405db1cbb8caea5f8ab3c683e1af67d2d55a8_d9c405db1cbb8caea5f8ab3c683e1af67d2d55a8/report/index.html

@doris-robot

ClickBench: Total hot run time: 30.93 s
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/clickbench-tools
ClickBench test result on commit d9c405db1cbb8caea5f8ab3c683e1af67d2d55a8, data reload: false

query1	0.03	0.03	0.04
query2	0.08	0.03	0.04
query3	0.23	0.06	0.07
query4	1.63	0.11	0.10
query5	0.42	0.42	0.42
query6	1.15	0.66	0.66
query7	0.03	0.02	0.01
query8	0.05	0.03	0.03
query9	0.59	0.49	0.51
query10	0.56	0.56	0.55
query11	0.15	0.11	0.10
query12	0.14	0.11	0.11
query13	0.60	0.61	0.61
query14	2.86	2.72	2.73
query15	0.90	0.82	0.83
query16	0.39	0.39	0.38
query17	0.99	1.03	1.04
query18	0.23	0.20	0.21
query19	1.97	1.95	1.88
query20	0.01	0.01	0.01
query21	15.39	0.90	0.59
query22	0.77	0.74	0.69
query23	15.29	1.45	0.56
query24	3.36	1.65	0.56
query25	0.25	0.08	0.05
query26	0.25	0.15	0.13
query27	0.06	0.04	0.07
query28	13.57	1.50	1.04
query29	12.61	3.98	3.26
query30	0.25	0.09	0.07
query31	2.85	0.61	0.39
query32	3.23	0.55	0.47
query33	3.17	3.08	3.10
query34	16.92	5.07	4.48
query35	4.41	4.49	4.48
query36	0.64	0.49	0.49
query37	0.09	0.07	0.06
query38	0.04	0.04	0.04
query39	0.04	0.03	0.02
query40	0.16	0.13	0.12
query41	0.08	0.04	0.02
query42	0.04	0.02	0.02
query43	0.04	0.03	0.03
Total cold run time: 106.52 s
Total hot run time: 30.93 s
