tcmalloc causes crash during throwing of OpenFHE exception #761

Closed
j2kun opened this issue May 2, 2024 · 0 comments · Fixed by #794

j2kun commented May 2, 2024

I have a bit of a strange situation. Google internally uses clang+tcmalloc by default for all its builds, and in v1.1.4 I've encountered a few crashes that occur whenever an OpenFHE exception is thrown, with a trace like this:

2216 third_party/tcmalloc/tcmalloc.cc:909] size check failed for 0x33c1bfc3e000: claimed 8, actual 1024, class 1
2216 third_party/tcmalloc/tcmalloc.cc:854] CHECK in do_free_with_size: CorrectSize(ptr, size, align) (false)                       
*** SIGABRT received by PID 2216 (TID 2216) on cpu 11 from PID 2216; stack trace: ***                                                                                                          
PC: @     0x7f86ec862347  (unknown)  gsignal                                                   
    @     0x7f86cfed4735       2544  base/process_state.cc:1237 FailureSignalHandler()
    @     0x7f877193d1c0  1281657408  (unknown)                          
    @     0x7f86c0b8b314        912  third_party/tcmalloc/internal/logging.cc:233 tcmalloc::tcmalloc_internal::Crash()                                                                         
    @     0x7f86c0b8ae1d         48  third_party/tcmalloc/internal/logging.cc:238 tcmalloc::tcmalloc_internal::CheckFailed()
    @     0x559c9eb7f962        688  ./third_party/tcmalloc/internal/logging.h:148 tcmalloc::tcmalloc_internal::CheckFailed<>()
    @     0x559c9eac6b1c       2640  third_party/tcmalloc/tcmalloc.cc:854 TCMallocInternalDeleteArraySized
    @     0x7f877128ba93         48  third_party/crosstool/v18/stable/toolchain/bin/../include/c++/v1/__memory/unique_ptr.h:73 std::__u::default_delete<>::operator()()           
    @     0x7f877128b874         64  third_party/crosstool/v18/stable/toolchain/bin/../include/c++/v1/__memory/unique_ptr.h:262 std::__u::unique_ptr<>::~unique_ptr()
    @     0x7f877128b6c6        128  third_party/openfhe/src/core/lib/utils/demangle.cpp:42 demangle()
    @     0x7f877128beb6       4384  third_party/openfhe/src/core/lib/utils/get-call-stack.cpp:84 get_call_stack()
    @     0x7f87743dae0d        352  third_party/openfhe/src/core/include/utils/exception.h:179 lbcrypto::OpenFHEException::OpenFHEException()
    @     0x7f87743e6a74        592  third_party/openfhe/src/core/include/math/nbtheory-impl.h:191 lbcrypto::RootOfUnity<>()                                                                   
    @     0x7f8773092189        544  third_party/openfhe/src/pke/lib/encoding/packedencoding.cpp:493 lbcrypto::PackedEncoding::SetParams_2n()
    @     0x7f8773090b17        912  third_party/openfhe/src/pke/lib/encoding/packedencoding.cpp:241 lbcrypto::PackedEncoding::SetParams()
    @     0x7f87730941a0        816  third_party/openfhe/src/pke/lib/encoding/packedencoding.cpp:329 lbcrypto::PackedEncoding::Pack<>()
    @     0x7f877308eb4f       6752  third_party/openfhe/src/pke/lib/encoding/packedencoding.cpp:117 lbcrypto::PackedEncoding::Encode()
    @     0x7f877442661d        624  ./third_party/openfhe/src/pke/include/encoding/plaintextfactory.h:100 lbcrypto::PlaintextFactory::MakePlaintext<>()
    @     0x7f8774425891        992  ./third_party/openfhe/src/pke/include/cryptocontext.h:246 lbcrypto::CryptoContextImpl<>::MakePlaintext()
    @     0x7f87743cb516        176  ./third_party/openfhe/src/pke/include/cryptocontext.h:1018 lbcrypto::CryptoContextImpl<>::MakePackedPlaintext()

The error comes from here: https://github.com/google/tcmalloc/blob/7d59e25cd84cdce95f137b79466dd4c4d56e6ff2/tcmalloc/tcmalloc.cc#L765
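
One reading of the trace, offered as a hypothesis rather than a confirmed diagnosis: the failing frame is a `std::unique_ptr` destructor inside `demangle()` that ends up in `TCMallocInternalDeleteArraySized`, and the size passed to the sized delete ("claimed 8") does not match what the allocator recorded ("actual 1024"). That pattern would fit a buffer obtained from `malloc` being released through `delete`/`delete[]`. `abi::__cxa_demangle` documents that its result is `malloc`-allocated and must be released with `free`, so if `demangle.cpp` wraps that result in a default-deleter `unique_ptr` (an assumption, since the file's contents are not shown here), a `free`-based deleter would avoid the mismatch. A minimal sketch for GCC/Clang, where `FreeDeleter` and `demangle_sketch` are hypothetical names and not OpenFHE's code:

```cpp
#include <cassert>
#include <cstdlib>
#include <cxxabi.h>
#include <memory>
#include <string>
#include <typeinfo>

// Releases malloc-allocated memory with std::free instead of delete/delete[].
struct FreeDeleter {
    void operator()(char* p) const noexcept { std::free(p); }
};

// Hypothetical helper: demangles an Itanium-ABI symbol or type name.
// Falls back to the input string when demangling fails.
std::string demangle_sketch(const char* name) {
    assert(name != nullptr);
    int status = -1;
    // __cxa_demangle returns a malloc'ed buffer (or nullptr on failure),
    // so the unique_ptr must free() it rather than delete it.
    std::unique_ptr<char, FreeDeleter> result{
        abi::__cxa_demangle(name, nullptr, nullptr, &status)};
    return (status == 0 && result) ? std::string(result.get())
                                   : std::string(name);
}
```

Calling `demangle_sketch(typeid(int).name())` exercises the allocate-then-free path without ever routing the buffer through `operator delete`, so no sized-delete check is involved regardless of `-fsized-deallocation`.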

It's easy to trigger the exception by, say, using a prime plaintext modulus q that does not satisfy the required divisibility condition, m divides (q-1). See the patch below for an example:

diff --git a/src/pke/examples/simple-integers-bgvrns.cpp b/src/pke/examples/simple-integers-bgvrns.cpp
index aaeed9c..d3fd960 100644
--- a/src/pke/examples/simple-integers-bgvrns.cpp
+++ b/src/pke/examples/simple-integers-bgvrns.cpp
@@ -41,7 +41,7 @@ int main() {
     // Sample Program: Step 1 - Set CryptoContext
     CCParams<CryptoContextBGVRNS> parameters;
     parameters.SetMultiplicativeDepth(2);
-    parameters.SetPlaintextModulus(65537);
+    parameters.SetPlaintextModulus(131101);  // a prime with bad divisibility
 
     CryptoContext<DCRTPoly> cryptoContext = GenCryptoContext(parameters);
     // Enable features that you wish to use
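
To make the divisibility condition concrete, here is a small sketch that checks whether m divides (q-1) for the two moduli in the patch. It assumes m is the cyclotomic order, and the value 32768 is purely illustrative (the issue does not state which m the example actually uses). `satisfies_divisibility` and `kAssumedCyclotomicOrder` are hypothetical names.

```cpp
#include <cstdint>

// Illustrative value only; the real m depends on the generated parameters.
constexpr std::uint64_t kAssumedCyclotomicOrder = 32768;

// True when m divides (q - 1). A zero m is rejected to avoid division by zero.
constexpr bool satisfies_divisibility(std::uint64_t q, std::uint64_t m) {
    return m != 0 && (q - 1) % m == 0;
}

// 65537 - 1 = 65536 = 2^16, so any power of two up to 2^16 divides it.
static_assert(satisfies_divisibility(65537, kAssumedCyclotomicOrder),
              "65537 satisfies the condition for the assumed m");

// 131101 - 1 = 131100 = 2^2 * 3 * 5^2 * 19 * 23, so no power of two above 4 divides it.
static_assert(!satisfies_divisibility(131101, kAssumedCyclotomicOrder),
              "131101 violates the condition for the assumed m");
```

Since 131100 has only two factors of 2, the patched modulus fails the check for any power-of-two m of 8 or more, which is consistent with the exception being thrown from `RootOfUnity` in the trace.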

But I am not able to reproduce the actual trace in the CMake build. My attempt (v1.1.4 94fd76a):

  1. Apply the patch above
  2. Configure with tcmalloc enabled
    mkdir build && cd build
    cmake .. -DWITH_TCM=ON -DBUILD_EXAMPLES=ON -DCMAKE_BUILD_TYPE=Debug
    make tcm
    make -j 25
    
  3. Run bin/examples/pke/simple-integers-bgvrns

As with last time, I suspect the issue is differing compiler flags. A few stand out: -fsized-deallocation and -fno-exceptions.

Here is the complete list

How would I test these compiler flags in the CMake config to see if I can reproduce this? Any idea what could be the root cause here?
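
One way to experiment is to pass extra flags through the standard `CMAKE_CXX_FLAGS` variable at configure time (for example `-DCMAKE_CXX_FLAGS=-fsized-deallocation`), assuming the project's CMake files do not override it. To confirm that a flag actually reached the compiler, a translation unit can report the corresponding feature-test macros. The sketch below uses `__cpp_sized_deallocation` and `__cpp_exceptions`, which GCC and Clang define when the respective feature is enabled; `flag_report` is a hypothetical helper name.

```cpp
#include <string>

// Reports which of the two suspected features were enabled when this
// translation unit was compiled. The result is fixed at compile time.
std::string flag_report() {
    std::string report;
#if defined(__cpp_sized_deallocation)
    report += "sized deallocation: on\n";
#else
    report += "sized deallocation: off\n";
#endif
#if defined(__cpp_exceptions)
    report += "exceptions: on\n";
#else
    report += "exceptions: off\n";
#endif
    return report;
}
```

Printing `flag_report()` from one of the example binaries in both build systems would show whether the two builds really differ on these flags before digging further.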
