Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[fix](mem) heap-buffer-overflow for function convert_to #46405

Merged
merged 1 commit into from
Jan 6, 2025

Conversation

cambyzju
Copy link
Contributor

@cambyzju cambyzju commented Jan 3, 2025

What problem does this PR solve?

Reproduce SQL with ASAN version:
select convert('装装装装装' using gbk);

Then be crashed:

=================================================================
==1830466==ERROR: AddressSanitizer: heap-buffer-overflow on address 0x606002aeec20 at pc 0x560826fb3e66 bp 0x7fc3816a5890 sp 0x7fc3816a5058

WRITE of size 10 at 0x606002aeec20 thread T711 (brpc_light)

But if we use release version, we found the result is not correct, and the memory maybe already corrupted:

> select convert('装装装装装' using gbk);
+---------------------------------------------------------+
| convert_to('装装装装装', 'gbk')                         |
+---------------------------------------------------------+
| ~zhuangdang~zhuangdang~zhuangdang~zhuangdang~zhu        |
+---------------------------------------------------------+
1 row in set (0.04 sec)

The correct answer should be:

> select convert('装装装装装' using gbk);
+--------------------------------------+
| convert_to('装装装装装', 'gbk')      |
+--------------------------------------+
| ~zhuang~zhuang~zhuang~zhuang~zhuang  |
+--------------------------------------+
1 row in set (0.06 sec)

Release note

None

Check List (For Author)

  • Test

    • Regression test
    • Unit Test
    • Manual test (add detailed scripts or steps below)
    • No need to test or manual test. Explain why:
      • This is a refactor/code format and no logic has been changed.
      • Previous test can cover this change.
      • No code files have been changed.
      • Other reason
  • Behavior changed:

    • No.
    • Yes.
  • Does this need documentation?

    • No.
    • Yes.

Check List (For Reviewer who merge this PR)

  • Confirm the release note
  • Confirm test cases
  • Confirm document
  • Add branch pick label

@Thearas
Copy link
Contributor

Thearas commented Jan 3, 2025

Thank you for your contribution to Apache Doris.
Don't know what should be done next? See How to process your PR.

Please clearly describe your PR:

  1. What problem was fixed (it's best to include specific error reporting information). How it was fixed.
  2. Which behaviors were modified. What was the previous behavior, what is it now, why was it modified, and what possible impacts might there be.
  3. What features were added. Why was this function added?
  4. Which code was refactored and why was this part of the code refactored?
  5. Which functions were optimized and what is the difference before and after the optimization?

@cambyzju
Copy link
Contributor Author

cambyzju commented Jan 3, 2025

run buildall

@doris-robot
Copy link

TeamCity be ut coverage result:
Function Coverage: 38.88% (10127/26048)
Line Coverage: 29.92% (85701/286405)
Region Coverage: 29.02% (43717/150670)
Branch Coverage: 25.56% (22315/87316)
Coverage Report: http://coverage.selectdb-in.cc/coverage/d940ac512a56cea8223fe39520ffeb433ecb1417_d940ac512a56cea8223fe39520ffeb433ecb1417/report/index.html

be/src/clucene Outdated
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Seems that this file should not be updated in this PR.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done, thanks

@wm1581066 wm1581066 added the p0_c label Jan 4, 2025
@cambyzju
Copy link
Contributor Author

cambyzju commented Jan 5, 2025

run buildall

@doris-robot
Copy link

TPC-H: Total hot run time: 32346 ms
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/tpch-tools
Tpch sf100 test result on commit c22c24317aba07289abf055c496aee193049f743, data reload: false

------ Round 1 ----------------------------------
q1	17603	6159	5983	5983
q2	2044	307	167	167
q3	10414	1237	718	718
q4	10215	872	439	439
q5	7525	2157	1978	1978
q6	202	178	144	144
q7	886	729	620	620
q8	9232	1297	1158	1158
q9	5288	4820	4934	4820
q10	6727	2266	1849	1849
q11	464	282	251	251
q12	343	354	225	225
q13	17746	3633	3065	3065
q14	233	218	221	218
q15	545	509	501	501
q16	619	633	582	582
q17	554	837	317	317
q18	7023	6509	6356	6356
q19	1217	950	538	538
q20	319	314	186	186
q21	2745	2103	1934	1934
q22	365	330	297	297
Total cold run time: 102309 ms
Total hot run time: 32346 ms

----- Round 2, with runtime_filter_mode=off -----
q1	6166	6206	6195	6195
q2	236	331	229	229
q3	2182	2601	2341	2341
q4	1377	1821	1353	1353
q5	4336	4748	4631	4631
q6	177	174	137	137
q7	2067	1970	1837	1837
q8	2569	2805	2643	2643
q9	7322	7245	7215	7215
q10	3023	3326	2771	2771
q11	589	513	503	503
q12	676	731	604	604
q13	3401	3848	3199	3199
q14	287	294	295	294
q15	569	499	504	499
q16	641	682	639	639
q17	1184	1719	1250	1250
q18	7646	7519	7293	7293
q19	771	1110	1021	1021
q20	1919	1929	1832	1832
q21	5416	5048	4824	4824
q22	624	593	554	554
Total cold run time: 53178 ms
Total hot run time: 51864 ms

@doris-robot
Copy link

TPC-DS: Total hot run time: 190695 ms
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/tpcds-tools
TPC-DS sf100 test result on commit c22c24317aba07289abf055c496aee193049f743, data reload: false

query1	950	404	371	371
query2	6526	2267	2207	2207
query3	6708	213	212	212
query4	34706	23740	23763	23740
query5	5073	625	447	447
query6	295	215	188	188
query7	4628	501	312	312
query8	311	247	245	245
query9	9527	2664	2628	2628
query10	487	312	240	240
query11	18598	15589	15167	15167
query12	152	104	101	101
query13	1663	529	397	397
query14	10723	7698	7421	7421
query15	214	198	183	183
query16	7981	571	466	466
query17	1502	733	558	558
query18	2089	402	291	291
query19	232	171	147	147
query20	124	109	108	108
query21	206	120	106	106
query22	4455	4455	4171	4171
query23	34308	33325	34363	33325
query24	7134	2222	2242	2222
query25	445	458	380	380
query26	1084	255	151	151
query27	2004	465	334	334
query28	5247	2443	2384	2384
query29	509	529	401	401
query30	231	181	148	148
query31	987	906	810	810
query32	84	62	60	60
query33	506	342	305	305
query34	732	832	509	509
query35	797	815	733	733
query36	1012	1055	948	948
query37	112	91	73	73
query38	4069	4229	4237	4229
query39	1500	1468	1605	1468
query40	216	115	100	100
query41	46	42	47	42
query42	118	102	102	102
query43	525	514	481	481
query44	1288	812	813	812
query45	179	168	167	167
query46	852	1031	636	636
query47	1902	1907	1832	1832
query48	395	396	349	349
query49	773	481	395	395
query50	607	644	392	392
query51	7178	7218	6987	6987
query52	99	101	91	91
query53	228	253	182	182
query54	476	510	427	427
query55	83	81	80	80
query56	268	254	245	245
query57	1217	1177	1096	1096
query58	245	221	235	221
query59	3114	3018	3096	3018
query60	278	266	251	251
query61	114	127	134	127
query62	854	799	724	724
query63	228	198	195	195
query64	4506	1009	650	650
query65	3297	3226	3250	3226
query66	1010	426	307	307
query67	15999	15717	15630	15630
query68	7105	718	512	512
query69	485	298	259	259
query70	1219	1126	1062	1062
query71	388	294	254	254
query72	6142	3825	3946	3825
query73	637	734	348	348
query74	10055	9209	8936	8936
query75	3358	3118	2656	2656
query76	3386	1147	761	761
query77	755	353	277	277
query78	9953	10004	9636	9636
query79	3215	772	580	580
query80	596	507	432	432
query81	478	276	231	231
query82	588	148	130	130
query83	166	166	183	166
query84	242	91	74	74
query85	752	354	298	298
query86	359	320	311	311
query87	4474	4405	4368	4368
query88	4694	2167	2144	2144
query89	400	336	312	312
query90	1899	188	185	185
query91	136	131	112	112
query92	69	57	54	54
query93	1485	890	540	540
query94	695	395	278	278
query95	334	259	250	250
query96	486	611	278	278
query97	2858	3005	2853	2853
query98	220	207	204	204
query99	1721	1566	1418	1418
Total cold run time: 293356 ms
Total hot run time: 190695 ms

@doris-robot
Copy link

ClickBench: Total hot run time: 31.47 s
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/clickbench-tools
ClickBench test result on commit c22c24317aba07289abf055c496aee193049f743, data reload: false

query1	0.04	0.06	0.03
query2	0.08	0.03	0.03
query3	0.24	0.07	0.07
query4	1.60	0.11	0.10
query5	0.43	0.41	0.41
query6	1.16	0.66	0.66
query7	0.03	0.02	0.01
query8	0.04	0.04	0.03
query9	0.56	0.52	0.50
query10	0.57	0.56	0.55
query11	0.14	0.10	0.11
query12	0.14	0.11	0.12
query13	0.61	0.59	0.61
query14	2.72	2.73	2.74
query15	0.89	0.83	0.82
query16	0.39	0.37	0.38
query17	0.96	1.07	1.06
query18	0.24	0.22	0.22
query19	1.96	1.77	1.96
query20	0.01	0.01	0.02
query21	15.35	0.91	0.60
query22	0.75	0.78	0.65
query23	15.32	1.40	0.52
query24	2.93	1.19	1.26
query25	0.13	0.15	0.17
query26	0.28	0.15	0.14
query27	0.07	0.06	0.05
query28	14.31	1.53	1.04
query29	12.54	3.96	3.28
query30	0.25	0.09	0.06
query31	2.80	0.61	0.38
query32	3.22	0.53	0.46
query33	3.05	3.07	3.08
query34	16.81	5.11	4.45
query35	4.49	4.51	4.46
query36	0.65	0.50	0.49
query37	0.09	0.06	0.06
query38	0.05	0.04	0.04
query39	0.03	0.02	0.03
query40	0.17	0.13	0.12
query41	0.08	0.03	0.02
query42	0.04	0.03	0.02
query43	0.04	0.03	0.03
Total cold run time: 106.26 s
Total hot run time: 31.47 s

@doris-robot
Copy link

TeamCity be ut coverage result:
Function Coverage: 38.88% (10127/26048)
Line Coverage: 29.92% (85700/286410)
Region Coverage: 29.02% (43724/150673)
Branch Coverage: 25.55% (22313/87318)
Coverage Report: http://coverage.selectdb-in.cc/coverage/c22c24317aba07289abf055c496aee193049f743_c22c24317aba07289abf055c496aee193049f743/report/index.html

Copy link
Contributor

github-actions bot commented Jan 6, 2025

PR approved by at least one committer and no changes requested.

@github-actions github-actions bot added the approved Indicates a PR has been approved by one committer. label Jan 6, 2025
Copy link
Contributor

github-actions bot commented Jan 6, 2025

PR approved by anyone and no changes requested.

@yiguolei yiguolei merged commit 6bfb3c8 into apache:master Jan 6, 2025
26 of 32 checks passed
github-actions bot pushed a commit that referenced this pull request Jan 6, 2025
### What problem does this PR solve?

Reproduce SQL with ASAN version:
`select convert('装装装装装' using gbk);`

Then be crashed:
```
=================================================================
==1830466==ERROR: AddressSanitizer: heap-buffer-overflow on address 0x606002aeec20 at pc 0x560826fb3e66 bp 0x7fc3816a5890 sp 0x7fc3816a5058

WRITE of size 10 at 0x606002aeec20 thread T711 (brpc_light)
```

But if we use release version, we found the result is not correct, and
the memory maybe already corrupted:
```
> select convert('装装装装装' using gbk);
+---------------------------------------------------------+
| convert_to('装装装装装', 'gbk')                         |
+---------------------------------------------------------+
| ~zhuangdang~zhuangdang~zhuangdang~zhuangdang~zhu        |
+---------------------------------------------------------+
1 row in set (0.04 sec)
```

The correct answer should be:
```
> select convert('装装装装装' using gbk);
+--------------------------------------+
| convert_to('装装装装装', 'gbk')      |
+--------------------------------------+
| ~zhuang~zhuang~zhuang~zhuang~zhuang  |
+--------------------------------------+
1 row in set (0.06 sec)
```
cambyzju added a commit to cambyzju/incubator-doris that referenced this pull request Jan 7, 2025
Reproduce SQL with ASAN version:
`select convert('装装装装装' using gbk);`

Then be crashed:
```
=================================================================
==1830466==ERROR: AddressSanitizer: heap-buffer-overflow on address 0x606002aeec20 at pc 0x560826fb3e66 bp 0x7fc3816a5890 sp 0x7fc3816a5058

WRITE of size 10 at 0x606002aeec20 thread T711 (brpc_light)
```

But if we use release version, we found the result is not correct, and
the memory maybe already corrupted:
```
> select convert('装装装装装' using gbk);
+---------------------------------------------------------+
| convert_to('装装装装装', 'gbk')                         |
+---------------------------------------------------------+
| ~zhuangdang~zhuangdang~zhuangdang~zhuangdang~zhu        |
+---------------------------------------------------------+
1 row in set (0.04 sec)
```

The correct answer should be:
```
> select convert('装装装装装' using gbk);
+--------------------------------------+
| convert_to('装装装装装', 'gbk')      |
+--------------------------------------+
| ~zhuang~zhuang~zhuang~zhuang~zhuang  |
+--------------------------------------+
1 row in set (0.06 sec)
```
yiguolei pushed a commit that referenced this pull request Jan 7, 2025
@yiguolei yiguolei removed the dev/2.1.x label Jan 7, 2025
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
approved Indicates a PR has been approved by one committer. dev/2.0.x dev/2.1.8-merged dev/3.0.x p0_c reviewed
Projects
None yet
Development

Successfully merging this pull request may close these issues.

7 participants