branch-3.0: [feat](metrics) Unify metrics of thread pool #43144 #46239

Merged

zhiqiang-hhhh
Contributor

cherry pick from #43144

Add metrics for all thread pools, more specifically for all `ThreadPool`
objects.
Every thread pool will expose the following metrics:
1. thread_pool_active_threads
2. thread_pool_queue_size
3. thread_pool_max_queue_size
4. thread_pool_max_threads
5. task_execution_time_ns_avg_in_last_1000_times
6. task_wait_worker_ns_avg_in_last_1000_times

A new class `IntervalHistogramStat` is created for interval histogram
calculation.
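
As a rough illustration of what an "average over the last 1000 samples" statistic involves, here is a minimal sketch. It is not the `IntervalHistogramStat` code from this PR: the class name `IntervalAverage`, its methods, and the ring-buffer approach are all assumptions made for the example.

```cpp
#include <cstddef>
#include <cstdint>
#include <mutex>
#include <vector>

// Hypothetical sketch, NOT the Doris IntervalHistogramStat implementation.
// Keeps the most recent `capacity` samples in a ring buffer and reports
// their integer mean (e.g. capacity = 1000 for "avg_in_last_1000_times").
class IntervalAverage {
public:
    explicit IntervalAverage(std::size_t capacity) : _samples(capacity, 0) {}

    // Record one sample, evicting the oldest one once the buffer is full.
    void add(int64_t value) {
        std::lock_guard<std::mutex> guard(_lock);
        if (_samples.empty()) {
            return; // zero capacity: nothing can be stored
        }
        if (_count == _samples.size()) {
            _sum -= _samples[_next]; // slot about to be overwritten holds the oldest sample
        } else {
            ++_count;
        }
        _samples[_next] = value;
        _sum += value;
        _next = (_next + 1) % _samples.size();
    }

    // Mean of the retained samples; 0 when nothing has been recorded yet.
    int64_t mean() const {
        std::lock_guard<std::mutex> guard(_lock);
        return _count == 0 ? 0 : _sum / static_cast<int64_t>(_count);
    }

private:
    mutable std::mutex _lock;
    std::vector<int64_t> _samples;
    std::size_t _next = 0;  // index of the next slot to write
    std::size_t _count = 0; // number of valid samples, at most capacity
    int64_t _sum = 0;       // running sum of the valid samples
};
```

Keeping a running sum makes both `add` and `mean` constant-time, at the cost of one mutex acquisition per recorded task.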

Metrics are updated by a `hook` method, which runs only when Prometheus requests them.
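
The hook-based update can be pictured with the sketch below. It is only an illustration of the pattern under stated assumptions: `MetricRegistrySketch`, `register_hook`, `set_gauge`, and `scrape` are hypothetical names, not the Doris metrics API, and the output format is simplified.

```cpp
#include <cstdint>
#include <functional>
#include <map>
#include <sstream>
#include <string>
#include <utility>

// Hypothetical sketch of hook-driven metrics, NOT the Doris metrics API.
// Gauges are refreshed lazily: every registered hook runs right before the
// values are rendered for a scrape, so nothing is computed between scrapes.
class MetricRegistrySketch {
public:
    // Register a callback that refreshes one or more gauges on demand.
    void register_hook(const std::string& name, std::function<void()> hook) {
        _hooks[name] = std::move(hook);
    }

    void set_gauge(const std::string& name, int64_t value) { _gauges[name] = value; }

    // Run all hooks, then render "name value" lines (simplified text format).
    std::string scrape() {
        for (auto& entry : _hooks) {
            entry.second();
        }
        std::ostringstream out;
        for (const auto& entry : _gauges) {
            out << entry.first << ' ' << entry.second << '\n';
        }
        return out.str();
    }

private:
    std::map<std::string, std::function<void()>> _hooks;
    std::map<std::string, int64_t> _gauges;
};
```

In this arrangement a thread pool would register one hook that copies its current queue size, active thread count, and similar values into gauges; the copy happens per scrape rather than per task.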

- [ ] Confirm the release note
- [ ] Confirm test cases
- [ ] Confirm document
- [ ] Add branch pick label <!-- Add branch pick label that this PR should merge into -->
@zhiqiang-hhhh
Contributor Author

run buildall

@Thearas
Contributor

Thearas commented Jan 1, 2025

Thank you for your contribution to Apache Doris.
Don't know what should be done next? See How to process your PR.

Please clearly describe your PR:

  1. What problem was fixed (it's best to include specific error reporting information). How it was fixed.
  2. Which behaviors were modified. What was the previous behavior, what is it now, why was it modified, and what possible impacts might there be.
  3. What features were added. Why was this function added?
  4. Which code was refactored and why was this part of the code refactored?
  5. Which functions were optimized and what is the difference before and after the optimization?

@zhiqiang-hhhh
Contributor Author

run buildall

@zhiqiang-hhhh
Contributor Author

run buildall

@doris-robot

TPC-H: Total hot run time: 40435 ms
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/tpch-tools
Tpch sf100 test result on commit 6406df878bc41c14878d9b832864e8665340e9ba, data reload: false

------ Round 1 ----------------------------------
q1	17560	7430	7247	7247
q2	2054	173	162	162
q3	10645	1047	1173	1047
q4	10556	728	746	728
q5	7735	2818	2763	2763
q6	240	148	142	142
q7	981	598	596	596
q8	9384	1904	2011	1904
q9	6589	6419	6405	6405
q10	7074	2389	2357	2357
q11	471	263	270	263
q12	407	218	213	213
q13	17910	3041	2998	2998
q14	238	222	203	203
q15	569	527	528	527
q16	697	602	634	602
q17	974	534	543	534
q18	7127	6749	6522	6522
q19	1392	1084	967	967
q20	470	208	195	195
q21	4023	3105	3075	3075
q22	1096	988	985	985
Total cold run time: 108192 ms
Total hot run time: 40435 ms

----- Round 2, with runtime_filter_mode=off -----
q1	7254	7205	7228	7205
q2	328	242	233	233
q3	2873	2946	2920	2920
q4	2024	1852	1827	1827
q5	5659	5677	5717	5677
q6	225	140	139	139
q7	2231	1805	1826	1805
q8	3308	3548	3511	3511
q9	8736	8809	8837	8809
q10	3585	3546	3530	3530
q11	614	513	490	490
q12	812	606	637	606
q13	11060	3171	3160	3160
q14	296	279	267	267
q15	577	518	536	518
q16	732	670	674	670
q17	1861	1629	1585	1585
q18	8133	7726	7642	7642
q19	1668	1616	1537	1537
q20	2096	1870	1881	1870
q21	5605	5486	5413	5413
q22	1115	1037	1037	1037
Total cold run time: 70792 ms
Total hot run time: 60451 ms

@doris-robot

TPC-DS: Total hot run time: 197854 ms
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/tpcds-tools
TPC-DS sf100 test result on commit 6406df878bc41c14878d9b832864e8665340e9ba, data reload: false

query1	1262	914	908	908
query2	6267	2159	2097	2097
query3	10838	3979	4347	3979
query4	66274	29035	23594	23594
query5	4985	454	455	454
query6	419	178	181	178
query7	5564	322	312	312
query8	324	236	227	227
query9	8725	2686	2684	2684
query10	458	284	272	272
query11	17192	15163	15902	15163
query12	155	100	103	100
query13	1466	473	417	417
query14	10466	7710	7697	7697
query15	208	181	197	181
query16	7129	488	495	488
query17	1273	583	596	583
query18	1830	340	336	336
query19	212	158	163	158
query20	122	113	114	113
query21	64	47	47	47
query22	4743	4790	4543	4543
query23	34796	34111	34239	34111
query24	6729	2895	2923	2895
query25	517	412	424	412
query26	659	169	167	167
query27	1937	307	307	307
query28	4397	2507	2488	2488
query29	713	459	424	424
query30	262	163	164	163
query31	970	847	861	847
query32	69	53	59	53
query33	432	287	291	287
query34	927	499	517	499
query35	858	756	745	745
query36	1086	967	985	967
query37	125	74	76	74
query38	4161	3980	4006	3980
query39	1531	1476	1459	1459
query40	137	81	80	80
query41	51	48	47	47
query42	116	101	101	101
query43	549	512	513	512
query44	1214	840	824	824
query45	185	167	176	167
query46	1156	724	744	724
query47	2021	1938	1903	1903
query48	471	395	375	375
query49	745	383	373	373
query50	853	431	427	427
query51	7285	7257	7264	7257
query52	93	86	86	86
query53	260	187	183	183
query54	548	444	433	433
query55	72	71	75	71
query56	253	249	224	224
query57	1236	1131	1064	1064
query58	205	205	206	205
query59	3243	3148	3024	3024
query60	274	254	253	253
query61	106	108	109	108
query62	799	659	670	659
query63	213	185	192	185
query64	1379	668	629	629
query65	3241	3182	3176	3176
query66	712	300	292	292
query67	15881	15593	15675	15593
query68	3981	580	572	572
query69	435	269	261	261
query70	1187	1121	1091	1091
query71	362	254	251	251
query72	6489	3939	4067	3939
query73	744	345	348	345
query74	10181	8987	9026	8987
query75	3357	2652	2683	2652
query76	1854	1094	1007	1007
query77	494	269	274	269
query78	10536	9671	9549	9549
query79	1704	596	592	592
query80	1449	432	419	419
query81	525	243	237	237
query82	1263	111	116	111
query83	165	173	143	143
query84	285	80	80	80
query85	971	302	288	288
query86	408	308	293	293
query87	4495	4323	4330	4323
query88	3717	2407	2347	2347
query89	412	288	293	288
query90	1981	184	183	183
query91	184	148	142	142
query92	65	50	52	50
query93	2047	543	539	539
query94	858	274	280	274
query95	357	264	256	256
query96	612	284	280	280
query97	3341	3254	3211	3211
query98	209	203	201	201
query99	1604	1290	1309	1290
Total cold run time: 319715 ms
Total hot run time: 197854 ms

@doris-robot

ClickBench: Total hot run time: 33.22 s
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/clickbench-tools
ClickBench test result on commit 6406df878bc41c14878d9b832864e8665340e9ba, data reload: false

query1	0.03	0.03	0.04
query2	0.07	0.04	0.02
query3	0.24	0.07	0.06
query4	1.61	0.11	0.10
query5	0.53	0.49	0.52
query6	1.13	0.73	0.72
query7	0.02	0.01	0.01
query8	0.04	0.05	0.03
query9	0.56	0.50	0.49
query10	0.54	0.54	0.54
query11	0.14	0.10	0.10
query12	0.14	0.11	0.11
query13	0.61	0.60	0.60
query14	2.96	2.90	2.91
query15	0.89	0.81	0.82
query16	0.38	0.38	0.38
query17	1.04	0.98	1.09
query18	0.23	0.21	0.21
query19	1.82	1.80	2.11
query20	0.02	0.00	0.01
query21	15.38	0.59	0.58
query22	2.44	2.55	1.85
query23	16.85	1.16	0.81
query24	3.02	1.66	1.52
query25	0.32	0.12	0.05
query26	0.50	0.15	0.15
query27	0.04	0.04	0.04
query28	9.53	1.13	1.07
query29	12.65	3.23	3.18
query30	0.24	0.06	0.06
query31	2.85	0.38	0.38
query32	3.24	0.46	0.47
query33	2.96	3.00	3.03
query34	16.95	4.43	4.44
query35	4.52	4.48	4.51
query36	0.69	0.49	0.48
query37	0.10	0.06	0.06
query38	0.04	0.04	0.03
query39	0.03	0.02	0.03
query40	0.16	0.13	0.12
query41	0.07	0.03	0.02
query42	0.04	0.02	0.02
query43	0.04	0.03	0.03
Total cold run time: 105.66 s
Total hot run time: 33.22 s

@yiguolei yiguolei merged commit 5c55700 into apache:branch-3.0 Jan 2, 2025
19 of 20 checks passed
@zhiqiang-hhhh zhiqiang-hhhh deleted the pick_43144_to_upstream_branch-3.0 branch January 2, 2025 02:38
@zhiqiang-hhhh zhiqiang-hhhh restored the pick_43144_to_upstream_branch-3.0 branch January 2, 2025 03:46
@zhiqiang-hhhh zhiqiang-hhhh deleted the pick_43144_to_upstream_branch-3.0 branch January 2, 2025 03:46