[opt](orc-reader)Turn on late materialization of orc complex types. #44514

Draft
wants to merge 2 commits into
base: master
from


kaka11chen
Contributor

What problem does this PR solve?

Issue Number: close #xxx

Related PR: #xxx

Problem Summary:

Release note

None

Check List (For Author)

  • Test

    • Regression test
    • Unit Test
    • Manual test (add detailed scripts or steps below)
    • No need to test or manual test. Explain why:
      • This is a refactor/code format and no logic has been changed.
      • Previous test can cover this change.
      • No code files have been changed.
      • Other reason
  • Behavior changed:

    • No.
    • Yes.
  • Does this need documentation?

    • No.
    • Yes.

Check List (For Reviewer who merge this PR)

  • Confirm the release note
  • Confirm test cases
  • Confirm document
  • Add branch pick label

@doris-robot

Thank you for your contribution to Apache Doris.
Don't know what should be done next? See How to process your PR.

Please clearly describe your PR:

  1. What problem was fixed (it's best to include specific error reporting information). How it was fixed.
  2. Which behaviors were modified. What was the previous behavior, what is it now, why was it modified, and what possible impacts might there be.
  3. What features were added. Why was this function added?
  4. Which code was refactored and why was this part of the code refactored?
  5. Which functions were optimized and what is the difference before and after the optimization?

@kaka11chen
Contributor Author

run buildall

Contributor

clang-tidy review says "All clean, LGTM! 👍"

@doris-robot

TeamCity be ut coverage result:
Function Coverage: 38.31% (9980/26050)
Line Coverage: 29.44% (83554/283823)
Region Coverage: 28.60% (42993/150330)
Branch Coverage: 25.17% (21835/86734)
Coverage Report: http://coverage.selectdb-in.cc/coverage/6111a5ea834ccf7604739abc0a64be89959ac102_6111a5ea834ccf7604739abc0a64be89959ac102/report/index.html

@kaka11chen kaka11chen force-pushed the orc_complex_type_late_mat branch from 6111a5e to 681da76 Compare November 25, 2024 13:33
@kaka11chen
Contributor Author

run buildall

Contributor

clang-tidy review says "All clean, LGTM! 👍"

@doris-robot

TeamCity be ut coverage result:
Function Coverage: 38.33% (9977/26032)
Line Coverage: 29.43% (83514/283761)
Region Coverage: 28.59% (42972/150288)
Branch Coverage: 25.17% (21828/86712)
Coverage Report: http://coverage.selectdb-in.cc/coverage/681da7631bd8a59e147e6c783b577bf53c98f299_681da7631bd8a59e147e6c783b577bf53c98f299/report/index.html

@kaka11chen kaka11chen force-pushed the orc_complex_type_late_mat branch from 681da76 to e27d61b Compare November 25, 2024 14:21
@kaka11chen
Contributor Author

run buildall

Contributor

clang-tidy review says "All clean, LGTM! 👍"

@doris-robot

TeamCity be ut coverage result:
Function Coverage: 38.34% (9980/26032)
Line Coverage: 29.43% (83513/283761)
Region Coverage: 28.61% (42994/150288)
Branch Coverage: 25.19% (21842/86712)
Coverage Report: http://coverage.selectdb-in.cc/coverage/e27d61b29c9c8fc2730e3204dd45da8899b67fdc_e27d61b29c9c8fc2730e3204dd45da8899b67fdc/report/index.html

@kaka11chen kaka11chen force-pushed the orc_complex_type_late_mat branch from e27d61b to fef13f6 Compare November 27, 2024 14:08
@kaka11chen
Contributor Author

run buildall

@kaka11chen kaka11chen force-pushed the orc_complex_type_late_mat branch from fef13f6 to fd84533 Compare November 27, 2024 15:11
@kaka11chen
Contributor Author

run buildall

@doris-robot

TeamCity be ut coverage result:
Function Coverage: 38.34% (9975/26017)
Line Coverage: 29.43% (83502/283748)
Region Coverage: 28.58% (42986/150423)
Branch Coverage: 25.19% (21833/86672)
Coverage Report: http://coverage.selectdb-in.cc/coverage/fd845335b244beb87458b79e220c01eef3ec4f39_fd845335b244beb87458b79e220c01eef3ec4f39/report/index.html

@kaka11chen kaka11chen force-pushed the orc_complex_type_late_mat branch from fd84533 to a02991c Compare December 30, 2024 08:35
@kaka11chen
Contributor Author

run buildall

@doris-robot

TPC-H: Total hot run time: 32748 ms
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/tpch-tools
Tpch sf100 test result on commit a02991c9aedf7a75ca659a586517d5673b6a153e, data reload: false

------ Round 1 ----------------------------------
query	cold (ms)	hot 1 (ms)	hot 2 (ms)	best hot (ms)
q1	17580	6207	6059	6059
q2	2046	312	182	182
q3	10503	1290	779	779
q4	10206	868	430	430
q5	7516	2209	2020	2020
q6	206	182	148	148
q7	887	737	604	604
q8	9257	1383	1157	1157
q9	5325	4929	4972	4929
q10	6751	2324	1866	1866
q11	504	281	257	257
q12	347	351	217	217
q13	17774	3649	2960	2960
q14	239	227	222	222
q15	558	498	491	491
q16	639	632	601	601
q17	583	856	325	325
q18	7157	6525	6517	6517
q19	2174	977	566	566
q20	297	319	184	184
q21	2887	2285	1925	1925
q22	356	346	309	309
Total cold run time: 103792 ms
Total hot run time: 32748 ms

----- Round 2, with runtime_filter_mode=off -----
query	cold (ms)	hot 1 (ms)	hot 2 (ms)	best hot (ms)
q1	6359	6295	6270	6270
q2	240	335	243	243
q3	2264	2630	2353	2353
q4	1395	1856	1356	1356
q5	4338	4787	4824	4787
q6	190	178	145	145
q7	2055	1972	1843	1843
q8	2705	2856	2754	2754
q9	7303	7314	7324	7314
q10	3087	3350	2791	2791
q11	574	513	495	495
q12	667	742	607	607
q13	3431	3822	3074	3074
q14	290	301	278	278
q15	565	504	522	504
q16	647	689	654	654
q17	1231	1808	1277	1277
q18	7815	7557	7273	7273
q19	892	1308	1090	1090
q20	2035	2023	1906	1906
q21	5876	5383	5035	5035
q22	637	604	592	592
Total cold run time: 54596 ms
Total hot run time: 52641 ms
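
The totals in the tables above appear to follow a simple rule: the cold total is the sum of the first timing column, and the hot total is the sum of each query's faster hot run (the last column). The sketch below reproduces that arithmetic on three rows copied from Round 1. The names `summarize` and `SAMPLE` are illustrative assumptions and are not part of the Doris tpch-tools scripts.

```python
def summarize(rows: str) -> tuple[int, int]:
    """Return (total_cold_ms, total_hot_ms) for whitespace-separated result rows.

    Assumed row layout: query name, cold run, hot run 1, hot run 2,
    and optionally the already-computed best hot run (ignored here).
    """
    total_cold = 0
    total_hot = 0
    for line in rows.strip().splitlines():
        fields = line.split()
        cold, hot1, hot2 = (int(value) for value in fields[1:4])
        total_cold += cold
        # The hot total counts only the faster of the two hot runs.
        total_hot += min(hot1, hot2)
    return total_cold, total_hot


# Three rows taken verbatim from the Round 1 table.
SAMPLE = """\
q1	17580	6207	6059	6059
q2	2046	312	182	182
q9	5325	4929	4972	4929
"""

if __name__ == "__main__":
    cold_ms, hot_ms = summarize(SAMPLE)
    print(f"Total cold run time: {cold_ms} ms")
    print(f"Total hot run time: {hot_ms} ms")
```

Applied to all 22 rows of Round 1, the same rule gives the reported 103792 ms cold and 32748 ms hot.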

@doris-robot

TPC-DS: Total hot run time: 196594 ms
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/tpcds-tools
TPC-DS sf100 test result on commit a02991c9aedf7a75ca659a586517d5673b6a153e, data reload: false

query	cold (ms)	hot 1 (ms)	hot 2 (ms)	best hot (ms)
query1	1315	988	941	941
query2	6491	2434	2244	2244
query3	11104	4825	4876	4825
query4	32891	23408	23309	23309
query5	4170	619	451	451
query6	268	195	186	186
query7	3987	479	307	307
query8	291	233	230	230
query9	9249	2747	2729	2729
query10	454	303	263	263
query11	17868	15508	15098	15098
query12	146	105	106	105
query13	1578	549	397	397
query14	9107	6983	7766	6983
query15	241	217	203	203
query16	8164	618	451	451
query17	1544	816	618	618
query18	2115	441	329	329
query19	215	198	172	172
query20	131	120	123	120
query21	208	128	108	108
query22	4770	4626	4460	4460
query23	34646	34404	33469	33469
query24	6505	2312	2306	2306
query25	466	454	379	379
query26	786	262	150	150
query27	2225	461	327	327
query28	5582	2520	2512	2512
query29	552	566	460	460
query30	214	184	151	151
query31	1005	931	865	865
query32	77	59	56	56
query33	486	351	318	318
query34	760	856	542	542
query35	816	840	776	776
query36	1050	1077	990	990
query37	135	104	76	76
query38	4238	4132	4099	4099
query39	1513	1481	1463	1463
query40	204	116	101	101
query41	48	49	46	46
query42	120	106	110	106
query43	501	545	498	498
query44	1371	832	835	832
query45	191	189	170	170
query46	895	1054	671	671
query47	1991	2037	1975	1975
query48	387	414	319	319
query49	715	477	377	377
query50	670	693	396	396
query51	7274	7290	7175	7175
query52	103	105	96	96
query53	224	258	199	199
query54	476	509	430	430
query55	80	81	78	78
query56	262	256	241	241
query57	1230	1218	1169	1169
query58	254	226	219	219
query59	3346	3257	3120	3120
query60	280	274	243	243
query61	113	106	111	106
query62	868	795	755	755
query63	264	189	200	189
query64	3136	1027	670	670
query65	3331	3278	3234	3234
query66	777	414	324	324
query67	16662	15807	15610	15610
query68	9815	749	519	519
query69	493	297	251	251
query70	1204	1141	1112	1112
query71	434	283	309	283
query72	5844	3875	3837	3837
query73	799	753	363	363
query74	10028	9236	9102	9102
query75	4576	3162	2635	2635
query76	5522	1183	779	779
query77	1016	418	277	277
query78	10142	10281	9470	9470
query79	2880	925	579	579
query80	725	503	480	480
query81	500	266	234	234
query82	354	153	126	126
query83	190	156	157	156
query84	285	89	77	77
query85	732	350	310	310
query86	349	312	304	304
query87	4607	4578	4366	4366
query88	3308	2260	2221	2221
query89	411	343	306	306
query90	2018	187	185	185
query91	135	132	106	106
query92	70	54	52	52
query93	1661	897	537	537
query94	671	401	258	258
query95	331	283	256	256
query96	488	612	281	281
query97	2765	2975	2668	2668
query98	220	203	195	195
query99	1844	1553	1452	1452
Total cold run time: 297553 ms
Total hot run time: 196594 ms

@doris-robot

ClickBench: Total hot run time: 31.72 s
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/clickbench-tools
ClickBench test result on commit a02991c9aedf7a75ca659a586517d5673b6a153e, data reload: false

query	cold (s)	hot 1 (s)	hot 2 (s)
query1	0.03	0.04	0.04
query2	0.07	0.03	0.03
query3	0.24	0.06	0.07
query4	1.62	0.10	0.11
query5	0.42	0.40	0.42
query6	1.15	0.65	0.65
query7	0.02	0.01	0.02
query8	0.04	0.03	0.03
query9	0.60	0.50	0.49
query10	0.54	0.56	0.56
query11	0.15	0.11	0.11
query12	0.14	0.11	0.11
query13	0.60	0.62	0.60
query14	2.69	2.74	2.72
query15	0.89	0.82	0.82
query16	0.37	0.40	0.40
query17	1.04	1.07	1.06
query18	0.23	0.21	0.20
query19	1.94	1.81	2.02
query20	0.01	0.01	0.02
query21	15.35	0.93	0.57
query22	0.76	0.83	0.65
query23	15.27	1.44	0.56
query24	2.62	1.43	1.41
query25	0.18	0.10	0.22
query26	0.25	0.15	0.13
query27	0.07	0.06	0.05
query28	14.38	1.52	1.04
query29	12.57	4.00	3.27
query30	0.24	0.09	0.06
query31	2.82	0.60	0.37
query32	3.23	0.55	0.46
query33	3.02	3.17	3.03
query34	16.84	5.12	4.51
query35	4.54	4.50	4.54
query36	0.63	0.52	0.48
query37	0.10	0.07	0.07
query38	0.04	0.04	0.03
query39	0.04	0.02	0.02
query40	0.17	0.13	0.13
query41	0.08	0.02	0.02
query42	0.04	0.02	0.02
query43	0.03	0.03	0.03
Total cold run time: 106.06 s
Total hot run time: 31.72 s

@doris-robot

TeamCity be ut coverage result:
Function Coverage: 38.90% (10129/26039)
Line Coverage: 29.89% (85560/286277)
Region Coverage: 29.03% (43744/150708)
Branch Coverage: 25.56% (22311/87304)
Coverage Report: http://coverage.selectdb-in.cc/coverage/a02991c9aedf7a75ca659a586517d5673b6a153e_a02991c9aedf7a75ca659a586517d5673b6a153e/report/index.html

2 participants