Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix buffer overflow in volk_32fc_x2_square_dist_32f_a_sse3 #645

Merged
merged 1 commit into from
Oct 22, 2023

Conversation

argilo
Copy link
Member

@argilo argilo commented Oct 13, 2023

When the vector length is less than 4, volk_32fc_x2_square_dist_32f_a_sse3 performs an out-of-bounds read. Its main loop is set up to fetch data one iteration ahead, but the edge case is not considered and the last iteration is executed unconditionally.

I tried removing the pre-fetch logic like in #565, but performance was reduced. So I added an if statement around the last iteration instead.

Valgrind report:

==649543== Invalid read of size 16
==649543==    at 0x4B371B1: _mm_load_ps (xmmintrin.h:927)
==649543==    by 0x4B371B1: volk_32fc_x2_square_dist_32f_a_sse3 (volk_32fc_x2_square_dist_32f.h:221)
==649543==    by 0x48D982F: volk_32fc_x2_square_dist_32f_manual (volk.c:9276)
==649543==    by 0x12ADE2: run_cast_test3(void (*)(void*, void*, void*, unsigned int, char const*), std::vector<void*, std::allocator<void*> >&, unsigned int, unsigned int, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >) (qa_utils.cc:279)
==649543==    by 0x128765: run_volk_tests(volk_func_desc, void (*)(), std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, float, std::complex<float>, unsigned int, unsigned int, std::vector<volk_test_results_t, std::allocator<volk_test_results_t> >*, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, bool, bool) (qa_utils.cc:672)
==649543==    by 0x127809: run_volk_tests(volk_func_desc, void (*)(), std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, volk_test_params_t, std::vector<volk_test_results_t, std::allocator<volk_test_results_t> >*, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >) (qa_utils.cc:507)
==649543==    by 0x11E0DD: main (volk_profile.cc:139)
==649543==  Address 0x5401d00 is 0 bytes after a block of size 32 alloc'd
==649543==    at 0x484E120: memalign (in /usr/libexec/valgrind/vgpreload_memcheck-amd64-linux.so)
==649543==    by 0x48BF2C9: volk_malloc (volk_malloc.c:71)
==649543==    by 0x12B2FA: volk_qa_aligned_mem_pool::get_new(unsigned long) (qa_utils.cc:484)
==649543==    by 0x127F8D: run_volk_tests(volk_func_desc, void (*)(), std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, float, std::complex<float>, unsigned int, unsigned int, std::vector<volk_test_results_t, std::allocator<volk_test_results_t> >*, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, bool, bool) (qa_utils.cc:602)
==649543==    by 0x127809: run_volk_tests(volk_func_desc, void (*)(), std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, volk_test_params_t, std::vector<volk_test_results_t, std::allocator<volk_test_results_t> >*, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >) (qa_utils.cc:507)
==649543==    by 0x11E0DD: main (volk_profile.cc:139)
==649543== 
==649543== Invalid read of size 4
==649543==    at 0x4B372AD: volk_32fc_x2_square_dist_32f_a_sse3 (volk_32fc_x2_square_dist_32f.h:236)
==649543==    by 0x48D982F: volk_32fc_x2_square_dist_32f_manual (volk.c:9276)
==649543==    by 0x12ADE2: run_cast_test3(void (*)(void*, void*, void*, unsigned int, char const*), std::vector<void*, std::allocator<void*> >&, unsigned int, unsigned int, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >) (qa_utils.cc:279)
==649543==    by 0x128765: run_volk_tests(volk_func_desc, void (*)(), std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, float, std::complex<float>, unsigned int, unsigned int, std::vector<volk_test_results_t, std::allocator<volk_test_results_t> >*, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, bool, bool) (qa_utils.cc:672)
==649543==    by 0x127809: run_volk_tests(volk_func_desc, void (*)(), std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, volk_test_params_t, std::vector<volk_test_results_t, std::allocator<volk_test_results_t> >*, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >) (qa_utils.cc:507)
==649543==    by 0x11E0DD: main (volk_profile.cc:139)
==649543==  Address 0x5401d10 is 16 bytes after a block of size 32 alloc'd
==649543==    at 0x484E120: memalign (in /usr/libexec/valgrind/vgpreload_memcheck-amd64-linux.so)
==649543==    by 0x48BF2C9: volk_malloc (volk_malloc.c:71)
==649543==    by 0x12B2FA: volk_qa_aligned_mem_pool::get_new(unsigned long) (qa_utils.cc:484)
==649543==    by 0x127F8D: run_volk_tests(volk_func_desc, void (*)(), std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, float, std::complex<float>, unsigned int, unsigned int, std::vector<volk_test_results_t, std::allocator<volk_test_results_t> >*, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, bool, bool) (qa_utils.cc:602)
==649543==    by 0x127809: run_volk_tests(volk_func_desc, void (*)(), std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, volk_test_params_t, std::vector<volk_test_results_t, std::allocator<volk_test_results_t> >*, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >) (qa_utils.cc:507)
==649543==    by 0x11E0DD: main (volk_profile.cc:139)
==649543== 
==649543== Invalid read of size 4
==649543==    at 0x4B372B8: volk_32fc_x2_square_dist_32f_a_sse3 (volk_32fc_x2_square_dist_32f.h:236)
==649543==    by 0x48D982F: volk_32fc_x2_square_dist_32f_manual (volk.c:9276)
==649543==    by 0x12ADE2: run_cast_test3(void (*)(void*, void*, void*, unsigned int, char const*), std::vector<void*, std::allocator<void*> >&, unsigned int, unsigned int, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >) (qa_utils.cc:279)
==649543==    by 0x128765: run_volk_tests(volk_func_desc, void (*)(), std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, float, std::complex<float>, unsigned int, unsigned int, std::vector<volk_test_results_t, std::allocator<volk_test_results_t> >*, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, bool, bool) (qa_utils.cc:672)
==649543==    by 0x127809: run_volk_tests(volk_func_desc, void (*)(), std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, volk_test_params_t, std::vector<volk_test_results_t, std::allocator<volk_test_results_t> >*, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >) (qa_utils.cc:507)
==649543==    by 0x11E0DD: main (volk_profile.cc:139)
==649543==  Address 0x5401d14 is 20 bytes after a block of size 32 alloc'd
==649543==    at 0x484E120: memalign (in /usr/libexec/valgrind/vgpreload_memcheck-amd64-linux.so)
==649543==    by 0x48BF2C9: volk_malloc (volk_malloc.c:71)
==649543==    by 0x12B2FA: volk_qa_aligned_mem_pool::get_new(unsigned long) (qa_utils.cc:484)
==649543==    by 0x127F8D: run_volk_tests(volk_func_desc, void (*)(), std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, float, std::complex<float>, unsigned int, unsigned int, std::vector<volk_test_results_t, std::allocator<volk_test_results_t> >*, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, bool, bool) (qa_utils.cc:602)
==649543==    by 0x127809: run_volk_tests(volk_func_desc, void (*)(), std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, volk_test_params_t, std::vector<volk_test_results_t, std::allocator<volk_test_results_t> >*, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >) (qa_utils.cc:507)
==649543==    by 0x11E0DD: main (volk_profile.cc:139)
==649543== 

Copy link
Contributor

@jdemel jdemel left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM.

This kernel is hard to read. These interleaved loads and stores are difficult to deal with.

Do you have benchmark numbers?

@argilo
Copy link
Member Author

argilo commented Oct 14, 2023

Actually it may have been poor benchmarking on my part (frequency scaling getting in the way). I tested with more iterations and under the same conditions, and removing the pre-fetch seems no worse. I've updated the PR accordingly.

@argilo
Copy link
Member Author

argilo commented Oct 14, 2023

Before:

RUN_VOLK_TESTS: volk_32fc_x2_square_dist_32f(131071,100000)
a_avx2 completed in 3308.81 ms
a_sse3 completed in 4039.83 ms
generic completed in 4848.9 ms
u_avx2 completed in 3495.01 ms

After:

RUN_VOLK_TESTS: volk_32fc_x2_square_dist_32f(131071,100000)
a_avx2 completed in 3314.61 ms
a_sse3 completed in 3996.41 ms
generic completed in 4825.26 ms
u_avx2 completed in 3359.85 ms

@argilo
Copy link
Member Author

argilo commented Oct 14, 2023

The test failures are unrelated; all due to #646.

@jdemel jdemel merged commit d92e2ba into gnuradio:main Oct 22, 2023
29 of 32 checks passed
@argilo argilo deleted the fix-square-dist branch December 2, 2023 16:50
Alesha72003 pushed a commit to Alesha72003/volk that referenced this pull request May 15, 2024
Fix buffer overflow in volk_32fc_x2_square_dist_32f_a_sse3
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants