[feat](docker)Modify the init_be and start_be scripts to meet the requirements for rapid Docker startup. #45269

Merged
merged 34 commits into from
Dec 13, 2024

Conversation

FreeOnePlus
Contributor

@FreeOnePlus FreeOnePlus commented Dec 10, 2024

What problem does this PR solve?

Issue Number: close #xxx

Related PR: #45267

Problem Summary:

To support rapid Docker startup, I adjusted two scripts used in the Docker startup process. First, I added an environment variable, SKIP_CHECK_ULIMIT, to the start_be.sh script; when it is set to true, the script skips the checks for swap, ulimit, and max_map_count. The process is started with --console so that it runs in the foreground and prints its logs. I did not use --daemon because, inside a Docker container, running in the foreground with log output is the correct and reliable approach.

Second, I added a check for a be.conf configuration item to the init_be.sh script: on first start, it appends export SKIP_CHECK_ULIMIT=true so that the BE process skips the ulimit value check. Together, these adjustments meet the basic requirements for rapid Docker startup.
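
The first-start step in init_be.sh might look roughly like the sketch below. This is not the code from the PR: the function name append_skip_flag, the grep-based detection of an earlier start, and passing the config path as an argument are all assumptions made for illustration.

```shell
#!/usr/bin/env bash
# Hypothetical sketch of the first-start step described above.
# append_skip_flag and its argument are illustrative names, not taken from init_be.sh.

append_skip_flag() {
    local conf_file="$1"
    # Append the export line only if no SKIP_CHECK_ULIMIT export is present yet,
    # so that container restarts do not add duplicate lines.
    if ! grep -q '^export SKIP_CHECK_ULIMIT=' "${conf_file}" 2>/dev/null; then
        echo 'export SKIP_CHECK_ULIMIT=true' >>"${conf_file}"
    fi
}
```

Calling the function twice on the same file leaves a single export line, which is the property a container entrypoint needs when it is re-run on every restart.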

Release note

None

Check List (For Author)

  • Test
    • Regression test
    • Unit Test
    • Manual test (add detailed scripts or steps below)
    • No need to test or manual test. Explain why:
      • This is a refactor/code format and no logic has been changed.
      • Previous test can cover this change.
      • No code files have been changed.
      • Other reason
  1. On the host machine, set ulimit, swap, and max_map_count to values that do not meet the requirements.
  2. Run export SKIP_CHECK_ULIMIT=true, then sh start_be.sh --console: the BE process starts normally.
  3. Run export SKIP_CHECK_ULIMIT=false, then start with either --daemon or --console: the BE process fails to start.
  • Behavior changed:

    • No.
    • Yes.
  • Does this need documentation?

    • No.
    • Yes.
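
The manual test steps above can be replayed without changing host settings by feeding the observed values into a stand-in for the checks. The sketch below is an assumption-laden model, not start_be.sh itself: preflight and its three arguments are hypothetical, the input values are made up, and the thresholds are copied from the start_be.sh excerpt that appears in the shfmt report later in this thread.

```shell
#!/usr/bin/env bash
# Hypothetical stand-in for the pre-start checks; preflight and its arguments are
# illustrative. Thresholds follow the start_be.sh excerpt quoted in the CI reports.

preflight() {
    local max_map_count="$1" swapon_lines="$2" max_open_files="$3"

    # Skip every check when the flag is exactly "true".
    if [[ "${SKIP_CHECK_ULIMIT:-false}" == "true" ]]; then
        return 0
    fi
    if [[ "${max_map_count}" -lt 2000000 ]]; then
        echo "vm.max_map_count too low: ${max_map_count}" >&2
        return 1
    fi
    # `swapon -s` prints a header line, so more than one line means swap is active.
    if [[ "${swapon_lines}" -gt 1 ]]; then
        echo "swap is enabled" >&2
        return 1
    fi
    if [[ "${max_open_files}" -lt 60000 ]]; then
        echo "open file limit too low: ${max_open_files}" >&2
        return 1
    fi
    return 0
}

# Steps 2 and 3 with deliberately bad values (65530, 3, 1024 are made-up inputs).
SKIP_CHECK_ULIMIT=true
preflight 65530 3 1024 && echo "skip=true: start allowed"
SKIP_CHECK_ULIMIT=false
preflight 65530 3 1024 || echo "skip=false: start refused"
```

Passing the values as arguments keeps the decision logic separate from the commands that read /proc and ulimit, so both outcomes can be exercised on any machine.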

Check List (For Reviewer who merge this PR)

  • Confirm the release note
  • Confirm test cases
  • Confirm document
  • Add branch pick label

@doris-robot

Thank you for your contribution to Apache Doris.
Don't know what should be done next? See How to process your PR.

Please clearly describe your PR:

  1. What problem was fixed (it's best to include specific error reporting information). How it was fixed.
  2. Which behaviors were modified. What was the previous behavior, what is it now, why was it modified, and what possible impacts might there be.
  3. What features were added. Why was this function added?
  4. Which code was refactored and why was this part of the code refactored?
  5. Which functions were optimized and what is the difference before and after the optimization?


sh-checker report

To get the full details, please check in the job output.

shellcheck errors

'shellcheck ' returned error 1 finding the following syntactical issues:

----------

In bin/start_be.sh line 192:
if [[ "${RUN_SKIP}" -eq 0 ]]]; then
^-- SC1073 (error): Couldn't parse this if expression. Fix to allow more checks.
                            ^-- SC1050 (error): Expected 'then'.
                            ^-- SC1072 (error): Expected 'then'. Fix any mentioned problems and try again.
                            ^-- SC1140 (error): Unexpected parameters after condition. Missing &&/||, or bad expression?

For more information:
  https://www.shellcheck.net/wiki/SC1050 -- Expected 'then'.
  https://www.shellcheck.net/wiki/SC1140 -- Unexpected parameters after condi...
  https://www.shellcheck.net/wiki/SC1072 -- Expected 'then'. Fix any mentione...
----------

You can address the above issues in one of three ways:
1. Manually correct the issue in the offending shell script;
2. Disable specific issues by adding the comment:
  # shellcheck disable=NNNN
above the line that contains the issue, where NNNN is the error code;
3. Add '-e NNNN' to the SHELLCHECK_OPTS setting in your .yml action file.



shfmt errors

'shfmt ' returned error 1 finding the following formatting issues:

----------
bin/start_be.sh:192:27: not a valid test operator: ]]]
----------

You can reformat the above files to meet shfmt's requirements by typing:

  shfmt  -w filename
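
Both tools are reporting the same typo: the test on line 192 is closed with ]]] instead of ]]. A corrected form of that line is shown below in isolation. The RUN_SKIP assignment and the CHECKS_ENABLED placeholder are additions made only so the snippet runs on its own; they are not part of start_be.sh.

```shell
#!/usr/bin/env bash
# RUN_SKIP is set here only so the snippet is self-contained.
RUN_SKIP=0
CHECKS_ENABLED="no"

# A [[ ... ]] test closes with exactly two brackets, followed by `; then`.
if [[ "${RUN_SKIP}" -eq 0 ]]; then
    CHECKS_ENABLED="yes" # placeholder for the real pre-start checks
fi
echo "checks enabled: ${CHECKS_ENABLED}"
```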



sh-checker report

To get the full details, please check in the job output.

shellcheck errors
'shellcheck ' found no issues.

shfmt errors

'shfmt ' returned error 1 finding the following formatting issues:

----------
--- bin/start_be.sh.orig
+++ bin/start_be.sh
@@ -190,28 +190,28 @@
 fi
 
 if [[ "${RUN_SKIP}" -eq 0 ]]; then
-  if [[ "$(uname -s)" != 'Darwin' ]]; then
-      MAX_MAP_COUNT="$(cat /proc/sys/vm/max_map_count)"
-      if [[ "${MAX_MAP_COUNT}" -lt 2000000 ]]; then
-          echo "Set kernel parameter 'vm.max_map_count' to a value greater than 2000000, example: 'sysctl -w vm.max_map_count=2000000'"
-          exit 1
-      fi
+    if [[ "$(uname -s)" != 'Darwin' ]]; then
+        MAX_MAP_COUNT="$(cat /proc/sys/vm/max_map_count)"
+        if [[ "${MAX_MAP_COUNT}" -lt 2000000 ]]; then
+            echo "Set kernel parameter 'vm.max_map_count' to a value greater than 2000000, example: 'sysctl -w vm.max_map_count=2000000'"
+            exit 1
+        fi
 
-      if [[ "$(swapon -s | wc -l)" -gt 1 ]]; then
-          echo "Disable swap memory before starting be"
-          exit 1
-      fi
-  fi
+        if [[ "$(swapon -s | wc -l)" -gt 1 ]]; then
+            echo "Disable swap memory before starting be"
+            exit 1
+        fi
+    fi
 
-  MAX_FILE_COUNT="$(ulimit -n)"
-  if [[ "${MAX_FILE_COUNT}" -lt 60000 ]]; then
-      echo "Set max number of open file descriptors to a value greater than 60000."
-      echo "Ask your system manager to modify /etc/security/limits.conf and append content like"
-      echo "  * soft nofile 655350"
-      echo "  * hard nofile 655350"
-      echo "and then run 'ulimit -n 655350' to take effect on current session."
-      exit 1
-  fi
+    MAX_FILE_COUNT="$(ulimit -n)"
+    if [[ "${MAX_FILE_COUNT}" -lt 60000 ]]; then
+        echo "Set max number of open file descriptors to a value greater than 60000."
+        echo "Ask your system manager to modify /etc/security/limits.conf and append content like"
+        echo "  * soft nofile 655350"
+        echo "  * hard nofile 655350"
+        echo "and then run 'ulimit -n 655350' to take effect on current session."
+        exit 1
+    fi
 fi
 
 # add java libs
----------

You can reformat the above files to meet shfmt's requirements by typing:

  shfmt  -w filename


@FreeOnePlus
Contributor Author

run buildall

@doris-robot

TeamCity be ut coverage result:
Function Coverage: 38.80% (10106/26046)
Line Coverage: 29.69% (84720/285307)
Region Coverage: 28.77% (43497/151206)
Branch Coverage: 25.32% (22099/87264)
Coverage Report: http://coverage.selectdb-in.cc/coverage/a6ac46e28bf3d381b1fab57497913e98960d4b5b_a6ac46e28bf3d381b1fab57497913e98960d4b5b/report/index.html

@github-actions github-actions bot removed the approved Indicates a PR has been approved by one committer. label Dec 12, 2024

sh-checker report

To get the full details, please check in the job output.

shellcheck errors

'shellcheck ' returned error 1 finding the following syntactical issues:

----------

In bin/start_be.sh line 185:
IS_SKIP_CHECK_ULIMIT=${SKIP_CHECK_ULIMIT}
                     ^------------------^ SC2154 (warning): SKIP_CHECK_ULIMIT is referenced but not assigned.

For more information:
  https://www.shellcheck.net/wiki/SC2154 -- SKIP_CHECK_ULIMIT is referenced b...
----------

You can address the above issues in one of three ways:
1. Manually correct the issue in the offending shell script;
2. Disable specific issues by adding the comment:
  # shellcheck disable=NNNN
above the line that contains the issue, where NNNN is the error code;
3. Add '-e NNNN' to the SHELLCHECK_OPTS setting in your .yml action file.



shfmt errors
'shfmt ' found no issues.
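
SC2154 is raised because SKIP_CHECK_ULIMIT is read without being assigned anywhere in the script. Giving the expansion an explicit default makes the unset case well defined, and the revision checked in the next report uses the same kind of default-value expansion. A minimal sketch, in which the helper name resolve_skip_flag is hypothetical:

```shell
#!/usr/bin/env bash
# resolve_skip_flag is a hypothetical helper used only for illustration.
resolve_skip_flag() {
    # ${VAR:-false} expands to "false" when VAR is unset or empty.
    echo "${SKIP_CHECK_ULIMIT:-false}"
}

IS_SKIP_CHECK_ULIMIT="$(resolve_skip_flag)"
echo "skip checks: ${IS_SKIP_CHECK_ULIMIT}"
```

With this form, a container that never sets the variable still runs the checks, and only an explicit true disables them.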


sh-checker report

To get the full details, please check in the job output.

shellcheck errors
'shellcheck ' found no issues.

shfmt errors

'shfmt ' returned error 1 finding the following formatting issues:

----------
--- bin/start_be.sh.orig
+++ bin/start_be.sh
@@ -182,8 +182,6 @@
     exit 0
 fi
 
-
-
 if [[ "${SKIP_CHECK_ULIMIT:-"false"}" != "true" ]]; then
     if [[ "$(uname -s)" != 'Darwin' ]]; then
         MAX_MAP_COUNT="$(cat /proc/sys/vm/max_map_count)"
----------

You can reformat the above files to meet shfmt's requirements by typing:

  shfmt  -w filename


@FreeOnePlus
Contributor Author

run buildall

@doris-robot

TeamCity be ut coverage result:
Function Coverage: 38.75% (10103/26073)
Line Coverage: 29.68% (84747/285526)
Region Coverage: 28.76% (43514/151325)
Branch Coverage: 25.31% (22104/87330)
Coverage Report: http://coverage.selectdb-in.cc/coverage/334a99ad6467b90754befa25e7520bafa0a1e93e_334a99ad6467b90754befa25e7520bafa0a1e93e/report/index.html

@doris-robot

TPC-H: Total hot run time: 40034 ms
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/tpch-tools
Tpch sf100 test result on commit 334a99ad6467b90754befa25e7520bafa0a1e93e, data reload: false

------ Round 1 ----------------------------------
q1	17595	7368	7214	7214
q2	2049	178	173	173
q3	10843	1108	1228	1108
q4	10230	710	633	633
q5	7605	2655	2695	2655
q6	237	146	144	144
q7	1010	626	620	620
q8	9270	1896	1925	1896
q9	6627	6443	6479	6443
q10	6999	2293	2304	2293
q11	481	255	259	255
q12	473	222	225	222
q13	17813	3029	3029	3029
q14	250	216	219	216
q15	573	512	507	507
q16	669	572	583	572
q17	986	625	614	614
q18	7196	6759	6683	6683
q19	1335	967	985	967
q20	469	187	180	180
q21	4664	3360	3297	3297
q22	384	313	314	313
Total cold run time: 107758 ms
Total hot run time: 40034 ms

----- Round 2, with runtime_filter_mode=off -----
q1	7306	7294	7298	7294
q2	328	228	226	226
q3	2959	2896	2994	2896
q4	2115	1932	1946	1932
q5	5688	5662	5716	5662
q6	227	141	145	141
q7	2221	1870	1789	1789
q8	3410	3605	3574	3574
q9	9040	9302	9305	9302
q10	3695	3656	3607	3607
q11	622	502	513	502
q12	866	672	656	656
q13	16496	3208	3313	3208
q14	324	267	274	267
q15	585	529	520	520
q16	696	644	657	644
q17	1877	1676	1644	1644
q18	8347	7681	7688	7681
q19	1728	1548	1417	1417
q20	2079	1896	1864	1864
q21	5753	5474	5557	5474
q22	649	552	577	552
Total cold run time: 77011 ms
Total hot run time: 60852 ms

@doris-robot

TPC-DS: Total hot run time: 196721 ms
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/tpcds-tools
TPC-DS sf100 test result on commit 334a99ad6467b90754befa25e7520bafa0a1e93e, data reload: false

query1	1330	970	943	943
query2	6236	2040	2060	2040
query3	11056	4687	4600	4600
query4	67395	28886	23322	23322
query5	5008	489	475	475
query6	427	198	178	178
query7	5664	313	301	301
query8	341	240	245	240
query9	9460	2739	2735	2735
query10	463	275	279	275
query11	17587	15227	15698	15227
query12	151	105	113	105
query13	1577	444	448	444
query14	11039	7233	7449	7233
query15	211	190	200	190
query16	7367	470	527	470
query17	1123	625	563	563
query18	1442	308	309	308
query19	236	157	174	157
query20	117	112	118	112
query21	232	97	105	97
query22	4659	4723	4432	4432
query23	34288	33788	33996	33788
query24	5615	2628	2426	2426
query25	497	385	394	385
query26	637	161	163	161
query27	1829	281	293	281
query28	4305	2509	2487	2487
query29	670	439	419	419
query30	214	153	148	148
query31	1005	827	852	827
query32	74	56	54	54
query33	395	306	296	296
query34	957	529	535	529
query35	868	757	748	748
query36	1083	980	1012	980
query37	140	80	73	73
query38	4470	4389	4409	4389
query39	1513	1462	1458	1458
query40	201	102	98	98
query41	47	41	45	41
query42	117	101	101	101
query43	561	544	516	516
query44	1234	864	848	848
query45	199	171	174	171
query46	1181	772	744	744
query47	2046	1916	1967	1916
query48	432	325	338	325
query49	730	411	391	391
query50	865	410	409	409
query51	7379	7315	7105	7105
query52	103	90	91	90
query53	255	179	188	179
query54	525	402	409	402
query55	79	82	79	79
query56	254	271	245	245
query57	1247	1083	1098	1083
query58	230	217	219	217
query59	3280	3219	3023	3023
query60	263	247	248	247
query61	112	111	106	106
query62	783	672	662	662
query63	223	191	190	190
query64	1428	688	667	667
query65	3257	3168	3199	3168
query66	634	292	294	292
query67	15989	15929	15505	15505
query68	4111	583	574	574
query69	433	251	255	251
query70	1231	1152	1158	1152
query71	360	250	253	250
query72	6382	4042	4133	4042
query73	779	366	361	361
query74	10130	8907	8887	8887
query75	3367	2638	2640	2638
query76	1958	1087	1182	1087
query77	472	280	281	280
query78	10495	9477	9402	9402
query79	1178	612	603	603
query80	819	434	450	434
query81	493	229	240	229
query82	1338	121	121	121
query83	272	141	145	141
query84	290	69	71	69
query85	863	311	305	305
query86	317	296	295	295
query87	4713	4557	4597	4557
query88	3522	2251	2212	2212
query89	425	293	300	293
query90	2034	187	184	184
query91	143	103	105	103
query92	59	49	50	49
query93	1324	549	545	545
query94	765	296	290	290
query95	353	251	263	251
query96	607	276	291	276
query97	2838	2682	2638	2638
query98	230	195	199	195
query99	1601	1332	1291	1291
Total cold run time: 319237 ms
Total hot run time: 196721 ms

@doris-robot

ClickBench: Total hot run time: 32.77 s
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/clickbench-tools
ClickBench test result on commit 334a99ad6467b90754befa25e7520bafa0a1e93e, data reload: false

query1	0.03	0.03	0.05
query2	0.06	0.04	0.02
query3	0.24	0.08	0.06
query4	1.63	0.10	0.10
query5	0.44	0.42	0.39
query6	1.16	0.66	0.64
query7	0.02	0.02	0.01
query8	0.04	0.04	0.03
query9	0.56	0.52	0.50
query10	0.56	0.59	0.55
query11	0.14	0.11	0.10
query12	0.14	0.11	0.11
query13	0.61	0.59	0.60
query14	2.81	2.88	2.75
query15	0.90	0.83	0.82
query16	0.39	0.39	0.39
query17	1.06	1.02	1.01
query18	0.23	0.20	0.20
query19	1.92	1.84	1.97
query20	0.02	0.01	0.01
query21	15.36	0.59	0.58
query22	2.39	2.36	1.76
query23	16.83	1.21	0.84
query24	3.48	1.66	1.24
query25	0.28	0.05	0.07
query26	0.59	0.14	0.13
query27	0.05	0.04	0.05
query28	9.93	1.11	1.06
query29	12.60	3.31	3.28
query30	0.24	0.06	0.06
query31	2.88	0.37	0.38
query32	3.28	0.45	0.46
query33	3.03	3.02	3.02
query34	17.08	4.46	4.53
query35	4.52	4.50	4.49
query36	0.67	0.49	0.51
query37	0.09	0.06	0.06
query38	0.05	0.03	0.03
query39	0.04	0.02	0.03
query40	0.17	0.12	0.12
query41	0.08	0.02	0.03
query42	0.03	0.02	0.02
query43	0.03	0.03	0.03
Total cold run time: 106.66 s
Total hot run time: 32.77 s

Contributor

@morningman morningman left a comment


LGTM

@github-actions github-actions bot added the approved Indicates a PR has been approved by one committer. label Dec 12, 2024

PR approved by at least one committer and no changes requested.

@zy-kkk zy-kkk merged commit ad58cd1 into apache:master Dec 13, 2024
29 of 31 checks passed
github-actions bot pushed a commit that referenced this pull request Dec 13, 2024
…uirements for rapid Docker startup. (#45269)

Related PR: #45267

dataroaring pushed a commit that referenced this pull request Dec 19, 2024
…meet the requirements for rapid Docker startup. #45269 (#45402)

Cherry-picked from #45269

Co-authored-by: FreeOnePlus <[email protected]>
FreeOnePlus added a commit to FreeOnePlus/doris that referenced this pull request Dec 24, 2024
yiguolei pushed a commit that referenced this pull request Dec 25, 2024
…uirements for rapid Docker startup(Merge 2.1). (#45858)

### What problem does this PR solve?

Issue Number: close #xxx

Related PR: #45267

Master PR: #45269

Labels
approved Indicates a PR has been approved by one committer. dev/2.1.8-merged dev/3.0.4-merged reviewed
7 participants