[feat](docker)Add a BE ENV item 'SKIP_CHECK_ULIMIT' for Docker to start quickly #45267

Merged
merged 38 commits into from
Dec 16, 2024

Conversation

FreeOnePlus
Contributor

@FreeOnePlus FreeOnePlus commented Dec 10, 2024

What problem does this PR solve?

Issue Number: close #xxx

Related PR: #xxx

Problem Summary:

The BE process enforces a check in storage_engine.cpp that the ulimit value is greater than 60,000. When BE is started with Docker or Docker Compose, this value has to be raised on the host machine beforehand. That requirement is unfriendly to Docker: it undermines Docker's advantage of being able to "build quickly and anywhere" and prevents rapid startup from working out of the box. This PR therefore adds an environment variable, SKIP_CHECK_ULIMIT, that controls whether the check is skipped. The default is false, which enforces the check. When set to true, BE skips the check and starts directly.
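
The gating described above can be sketched in shell. This is only an illustration, not the actual C++ code in storage_engine.cpp: the function name `check_open_files_limit` and the `MIN_OPEN_FILES` constant are hypothetical, and the exact boundary (here, failing when the limit is below 60000) is an assumption.

```shell
#!/usr/bin/env bash
# Hypothetical sketch; names below are illustrative, not taken from Doris.
MIN_OPEN_FILES=60000   # threshold mentioned in the PR description

# check_open_files_limit LIMIT
# Returns 0 if startup may proceed, 1 if the limit is too small.
check_open_files_limit() {
    local limit="$1"
    # Default is false: the check is enforced unless explicitly skipped.
    if [ "${SKIP_CHECK_ULIMIT:-false}" = "true" ]; then
        echo "skip ulimit check"
        return 0
    fi
    # An unlimited value always satisfies the check.
    if [ "$limit" = "unlimited" ]; then
        echo "ulimit check passed"
        return 0
    fi
    # Assumed boundary: anything below the threshold is rejected.
    if [ "$limit" -lt "$MIN_OPEN_FILES" ]; then
        echo "open files limit $limit is below $MIN_OPEN_FILES"
        return 1
    fi
    echo "ulimit check passed"
    return 0
}
```

A caller would typically pass the current limit, for example `check_open_files_limit "$(ulimit -n)"`, and abort startup on a non-zero return.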

Release note

None

Check List (For Author)

  • Test
    • Regression test
    • Unit Test
    • Manual test (add detailed scripts or steps below)
    • No need to test or manual test. Explain why:
      • This is a refactor/code format and no logic has been changed.
      • Previous test can cover this change.
      • No code files have been changed.
      • Other reason
  1. Execute ulimit -n 50000 so that the limit is below the BE check value of 60000.
  2. Comment out the part of the start_be.sh script that checks the ulimit size.
  3. Set the environment variable: export SKIP_CHECK_ULIMIT=true.
  4. The BE process starts normally.
  5. Change the environment variable to export SKIP_CHECK_ULIMIT=false.
  6. The BE fails to start and reports that the ulimit value is too small.
  • Behavior changed:

    • No.
    • Yes.
  • Does this need documentation?

    • No.
    • Yes.

Check List (For Reviewer who merge this PR)

  • Confirm the release note
  • Confirm test cases
  • Confirm document
  • Add branch pick label

@doris-robot

Thank you for your contribution to Apache Doris.
Don't know what should be done next? See How to process your PR.

Please clearly describe your PR:

  1. What problem was fixed (it's best to include specific error reporting information). How it was fixed.
  2. Which behaviors were modified. What was the previous behavior, what is it now, why was it modified, and what possible impacts might there be.
  3. What features were added. Why was this function added?
  4. Which code was refactored and why was this part of the code refactored?
  5. Which functions were optimized and what is the difference before and after the optimization?

@FreeOnePlus FreeOnePlus changed the title from "[feat](storage)Add a BE configuration item 'skip_check_ulimit' for Docker to start quickly" to "[feat](docker)Add a BE configuration item 'skip_check_ulimit' for Docker to start quickly" on Dec 10, 2024
Contributor

clang-tidy review says "All clean, LGTM! 👍"

@FreeOnePlus
Contributor Author

run buildall

@doris-robot

TPC-H: Total hot run time: 40068 ms
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/tpch-tools
Tpch sf100 test result on commit ef686621f9f4363a7377e02332a27869349f3e27, data reload: false

------ Round 1 ----------------------------------
q1	17617	7471	7352	7352
q2	2053	180	167	167
q3	10624	1162	1229	1162
q4	10378	807	715	715
q5	7657	2972	2730	2730
q6	235	149	152	149
q7	1008	626	610	610
q8	9226	1891	1979	1891
q9	6788	6480	6531	6480
q10	7016	2264	2298	2264
q11	468	263	279	263
q12	412	221	225	221
q13	17760	3034	2981	2981
q14	236	212	209	209
q15	571	539	515	515
q16	639	593	584	584
q17	971	508	588	508
q18	7475	6625	6849	6625
q19	1340	1096	937	937
q20	468	186	184	184
q21	4041	3355	3212	3212
q22	388	322	309	309
Total cold run time: 107371 ms
Total hot run time: 40068 ms

----- Round 2, with runtime_filter_mode=off -----
q1	7275	7281	8111	7281
q2	328	235	234	234
q3	2961	2994	3134	2994
q4	2126	1832	1818	1818
q5	5624	5657	5638	5638
q6	218	141	142	141
q7	2213	1834	1882	1834
q8	3364	3562	3542	3542
q9	9093	9014	9036	9014
q10	3596	3545	3585	3545
q11	620	490	488	488
q12	841	657	619	619
q13	10410	3159	3218	3159
q14	312	265	262	262
q15	554	514	505	505
q16	676	644	631	631
q17	1824	1600	1576	1576
q18	7791	7552	7561	7552
q19	1654	1606	1481	1481
q20	2063	1815	1810	1810
q21	5548	5250	5265	5250
q22	666	563	557	557
Total cold run time: 69757 ms
Total hot run time: 59931 ms

@FreeOnePlus
Contributor Author

run buildall

@doris-robot

TPC-H: Total hot run time: 39724 ms
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/tpch-tools
Tpch sf100 test result on commit cf9f3fd3e02ec0f19a412d07a7d8bbe94bcfe29f, data reload: false

------ Round 1 ----------------------------------
q1	17695	7437	7229	7229
q2	2054	178	171	171
q3	10634	1107	1119	1107
q4	10234	735	781	735
q5	7579	3009	2712	2712
q6	239	153	150	150
q7	981	609	618	609
q8	9243	1837	1941	1837
q9	6657	6485	6382	6382
q10	6968	2296	2323	2296
q11	460	257	261	257
q12	411	226	228	226
q13	17792	2978	2944	2944
q14	233	205	213	205
q15	558	520	521	520
q16	641	570	600	570
q17	974	617	531	531
q18	7385	6622	6580	6580
q19	1345	1070	947	947
q20	489	180	181	180
q21	3974	3218	3408	3218
q22	383	319	318	318
Total cold run time: 106929 ms
Total hot run time: 39724 ms

----- Round 2, with runtime_filter_mode=off -----
q1	7244	7229	7248	7229
q2	340	228	233	228
q3	2877	2798	2780	2780
q4	1998	1741	1712	1712
q5	5343	5393	5411	5393
q6	223	141	143	141
q7	2135	1734	1726	1726
q8	3239	3401	3433	3401
q9	8686	8635	8604	8604
q10	3505	3418	3424	3418
q11	595	494	491	491
q12	761	577	599	577
q13	10593	2999	2986	2986
q14	289	281	257	257
q15	561	508	521	508
q16	677	638	646	638
q17	1791	1580	1548	1548
q18	7882	7383	7448	7383
q19	1690	1492	1525	1492
q20	2030	1814	1813	1813
q21	5453	5267	5233	5233
q22	624	550	538	538
Total cold run time: 68536 ms
Total hot run time: 58096 ms

@doris-robot

TeamCity be ut coverage result:
Function Coverage: 38.74% (10101/26073)
Line Coverage: 29.68% (84744/285536)
Region Coverage: 28.74% (43488/151332)
Branch Coverage: 25.30% (22095/87334)
Coverage Report: http://coverage.selectdb-in.cc/coverage/cf9f3fd3e02ec0f19a412d07a7d8bbe94bcfe29f_cf9f3fd3e02ec0f19a412d07a7d8bbe94bcfe29f/report/index.html

@doris-robot

TPC-DS: Total hot run time: 189468 ms
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/tpcds-tools
TPC-DS sf100 test result on commit cf9f3fd3e02ec0f19a412d07a7d8bbe94bcfe29f, data reload: false

query1	961	396	376	376
query2	6529	2073	2117	2073
query3	6713	213	214	213
query4	34384	23353	23317	23317
query5	4365	458	448	448
query6	275	178	200	178
query7	4622	298	307	298
query8	312	231	235	231
query9	9650	2720	2741	2720
query10	452	260	264	260
query11	18057	15020	14959	14959
query12	152	106	101	101
query13	1693	430	423	423
query14	9499	7501	6918	6918
query15	299	184	177	177
query16	7473	471	470	470
query17	1722	589	572	572
query18	1980	300	293	293
query19	373	147	143	143
query20	125	111	109	109
query21	204	100	112	100
query22	4603	4246	4310	4246
query23	34578	33717	33769	33717
query24	11172	2534	2438	2438
query25	685	363	395	363
query26	1648	146	148	146
query27	2732	281	288	281
query28	7969	2426	2421	2421
query29	918	398	398	398
query30	302	149	147	147
query31	1053	804	808	804
query32	93	58	55	55
query33	774	283	282	282
query34	986	504	511	504
query35	892	740	728	728
query36	1096	948	941	941
query37	138	77	72	72
query38	4332	4169	4146	4146
query39	1463	1397	1445	1397
query40	281	118	98	98
query41	49	43	43	43
query42	114	99	97	97
query43	541	507	479	479
query44	1309	828	831	828
query45	177	165	164	164
query46	1162	701	710	701
query47	1924	1821	1826	1821
query48	411	326	325	325
query49	1245	381	385	381
query50	798	374	374	374
query51	7242	7003	7084	7003
query52	100	94	88	88
query53	252	175	183	175
query54	1290	410	411	410
query55	82	79	86	79
query56	273	247	227	227
query57	1256	1140	1099	1099
query58	249	216	217	216
query59	3260	2979	2806	2806
query60	265	247	251	247
query61	110	109	101	101
query62	881	678	671	671
query63	208	182	184	182
query64	5150	649	622	622
query65	3243	3161	3221	3161
query66	1458	335	302	302
query67	16053	15575	15608	15575
query68	5269	565	560	560
query69	480	249	243	243
query70	1182	1121	1082	1082
query71	341	268	252	252
query72	6300	4040	4173	4040
query73	761	359	365	359
query74	10235	8868	8867	8867
query75	3425	2648	2644	2644
query76	3310	1098	1025	1025
query77	501	276	317	276
query78	10323	9475	9424	9424
query79	2118	604	603	603
query80	1234	418	407	407
query81	542	235	224	224
query82	904	115	122	115
query83	237	147	143	143
query84	235	67	77	67
query85	1329	310	293	293
query86	419	272	309	272
query87	4617	4574	4462	4462
query88	3583	2238	2220	2220
query89	409	288	292	288
query90	2128	183	188	183
query91	143	102	140	102
query92	68	52	50	50
query93	1754	561	560	560
query94	1043	286	300	286
query95	353	246	258	246
query96	607	283	288	283
query97	2835	2688	2653	2653
query98	222	195	199	195
query99	1536	1304	1323	1304
Total cold run time: 303100 ms
Total hot run time: 189468 ms

@doris-robot

ClickBench: Total hot run time: 33.56 s
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/clickbench-tools
ClickBench test result on commit cf9f3fd3e02ec0f19a412d07a7d8bbe94bcfe29f, data reload: false

query1	0.04	0.02	0.03
query2	0.07	0.04	0.03
query3	0.22	0.08	0.07
query4	1.62	0.10	0.10
query5	0.46	0.42	0.41
query6	1.16	0.64	0.64
query7	0.02	0.02	0.02
query8	0.04	0.04	0.03
query9	0.58	0.51	0.50
query10	0.57	0.57	0.57
query11	0.13	0.10	0.10
query12	0.14	0.11	0.12
query13	0.61	0.60	0.61
query14	2.77	2.80	2.72
query15	0.90	0.82	0.82
query16	0.40	0.37	0.39
query17	1.06	0.97	0.96
query18	0.21	0.21	0.21
query19	1.92	1.84	1.99
query20	0.01	0.01	0.02
query21	15.35	0.61	0.60
query22	2.65	2.37	2.79
query23	17.04	1.02	0.91
query24	2.94	1.54	1.95
query25	0.21	0.31	0.12
query26	0.38	0.12	0.14
query27	0.05	0.05	0.04
query28	9.66	1.11	1.07
query29	12.52	3.24	3.20
query30	0.24	0.06	0.06
query31	2.87	0.37	0.38
query32	3.28	0.46	0.45
query33	3.00	2.99	3.05
query34	16.87	4.41	4.49
query35	4.45	4.40	4.46
query36	0.68	0.48	0.50
query37	0.09	0.06	0.06
query38	0.04	0.04	0.03
query39	0.03	0.02	0.02
query40	0.16	0.13	0.12
query41	0.08	0.02	0.02
query42	0.04	0.02	0.03
query43	0.03	0.03	0.03
Total cold run time: 105.59 s
Total hot run time: 33.56 s

Contributor

clang-tidy review says "All clean, LGTM! 👍"

@github-actions github-actions bot added the approved Indicates a PR has been approved by one committer. label Dec 12, 2024
Contributor

PR approved by at least one committer and no changes requested.

zy-kkk pushed a commit that referenced this pull request Dec 13, 2024
…uirements for rapid Docker startup. (#45269)

Related PR: #45267

Problem Summary:

To support rapid Docker startup, I adjusted two scripts used in the Docker startup process. First, I added an environment variable, `SKIP_CHECK_ULIMIT`, to the `start_be.sh` script, which makes the script skip the size checks for `swap`, `ulimit`, and `max_map_count`. I also start the process with `--console` so that logs are printed in the foreground. I did not use `--daemon` because running in the foreground with logs printed to the console is the correct and reliable approach inside a Docker container.

I also added check logic for a `be.conf` configuration item to the `init_be.sh` script: on first startup, it appends `export SKIP_CHECK_ULIMIT=true` so that the BE process skips the `ulimit` value check. Together, these adjustments meet the basic requirements for rapid Docker startup.
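
The first-start step described above might look roughly like the following. This is a sketch under assumptions, not the actual `init_be.sh` code: the function name `ensure_skip_check_ulimit` is hypothetical, the target file is passed in as a parameter because the commit message does not say exactly where the line is written, and "first start" is approximated here by checking whether the line is already present.

```shell
#!/usr/bin/env bash
# Hypothetical sketch of a first-start step; names are illustrative.

# ensure_skip_check_ulimit CONF_FILE
# Appends the export line only once, so repeated container restarts
# do not add duplicate entries.
ensure_skip_check_ulimit() {
    local conf_file="$1"
    local line="export SKIP_CHECK_ULIMIT=true"
    # grep -qxF: quiet, whole-line, fixed-string match
    if ! grep -qxF "$line" "$conf_file" 2>/dev/null; then
        echo "$line" >> "$conf_file"
    fi
}
```

Making the append idempotent is a design choice of this sketch: it keeps the file stable no matter how many times the container entrypoint runs.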
@zy-kkk
Member

zy-kkk commented Dec 13, 2024

run buildall

github-actions bot pushed a commit that referenced this pull request Dec 13, 2024
…uirements for rapid Docker startup. (#45269)

Related PR: #45267

@doris-robot

TeamCity be ut coverage result:
Function Coverage: 38.75% (10107/26082)
Line Coverage: 29.69% (84829/285762)
Region Coverage: 28.77% (43565/151443)
Branch Coverage: 25.32% (22130/87410)
Coverage Report: http://coverage.selectdb-in.cc/coverage/d9b2917c3b479635ac970275a59a41cd1392be8f_d9b2917c3b479635ac970275a59a41cd1392be8f/report/index.html

@zy-kkk zy-kkk merged commit 8cba7a2 into apache:master Dec 16, 2024
24 of 28 checks passed
github-actions bot pushed a commit that referenced this pull request Dec 16, 2024
…rt quickly (#45267)

The BE process enforces a check in storage_engine.cpp that the `ulimit`
value is greater than 60,000. When BE is started with Docker or
Docker Compose, this value has to be raised on the host machine
beforehand. That requirement is unfriendly to Docker: it undermines
Docker's advantage of being able to "build quickly and anywhere" and
prevents rapid startup from working out of the box. This PR therefore
adds an environment variable that controls whether the check is
skipped. The default is false, which enforces the check. When set to
true, BE skips the check and starts directly.
yiguolei pushed a commit that referenced this pull request Dec 19, 2024
…ocker to start quickly #45267 (#45468)

Cherry-picked from #45267

Co-authored-by: FreeOnePlus <[email protected]>
morningman pushed a commit that referenced this pull request Dec 23, 2024
…ocker to start quickly #45267 (#45467)

Cherry-picked from #45267

Co-authored-by: FreeOnePlus <[email protected]>
yiguolei pushed a commit that referenced this pull request Dec 25, 2024
…uirements for rapid Docker startup(Merge 2.1). (#45858)

### What problem does this PR solve?

Issue Number: close #xxx

Related PR: #45267

Master PR: #45269

Problem Summary:

To support rapid Docker startup, I adjusted two scripts used in the
Docker startup process. First, I added an environment variable,
`SKIP_CHECK_ULIMIT`, to the `start_be.sh` script, which makes the
script skip the size checks for `swap`, `ulimit`, and `max_map_count`.
I also start the process with `--console` so that logs are printed in
the foreground. I did not use `--daemon` because running in the
foreground with logs printed to the console is the correct and
reliable approach inside a Docker container.

I also added check logic for a `be.conf` configuration item to the
`init_be.sh` script: on first startup, it appends
`export SKIP_CHECK_ULIMIT=true` so that the BE process skips the
`ulimit` value check. Together, these adjustments meet the basic
requirements for rapid Docker startup.
Labels
approved Indicates a PR has been approved by one committer. dev/2.1.8-merged dev/3.0.4-merged reviewed

7 participants