Concurrency-related deserialization issue when G1GC is enabled #274
Comments
I've been using FST for a few weeks now and it's awesome! I just turned on G1GC garbage collection for the first time this morning and now I'm seeing this exception for the first time:
in heavily concurrent load testing, so it is likely related to the bug reported on this page. See my comment at #282. I'm hesitant to turn off G1GC because it is required for String deduplication, which I need because my app processes user data containing millions of identical Strings. Any progress on the bug?
Thanks for your extensive report. Unfortunately, I am very busy. Anyway, as FST is used in our current software, I will investigate as soon as possible (TM).
If you run with Java 8, it's possible to turn off Unsafe usage, which will throw an exception instead of a SIGSEGV. Unfortunately, Java 9 and higher force use of Unsafe, though.
@RuedigerMoeller As a workaround for the concurrent deserialization issue, I'm thinking that I can stop using the stream from Conf that is designed for reuse and just create a new stream each time. Does this sound reasonable, performance-wise, for my scenario?
For one application, I never have more than about 3 objects serialized in one stream at any time. Presumably the reuse of an existing stream is meant to save building up the class table/index each time, but with only 3 classes in any stream that's probably not going to be an issue. Presumably I can then have multiple streams active at the same time, because they no longer share a stream with a shared class table/index.
Right now I have to synchronize access to my deserialization methods to avoid the concurrency issue, which causes a massive performance bottleneck: even though I only have 3 objects in the stream, one of them is often a byte array of many megabytes.
Looking forward to that. 🙂 It's not the end of the world for us, we have FST (de)serialization behind a configuration flag for now and can easily re-enable it if/when this bug gets resolved.
We are still on Java 8, so disabling unsafe is possible. From what I've seen, though, there are still some cases where FST uses unsafe operations even when unsafe is turned off. Quoting FSTUtil.java:
public static Unsafe unFlaggedUnsafe = FSTUtil.getUnsafe(); // even if unsafe is disabled, use it for memoffset computation
I cannot tell 100% for sure at this point whether we've tried disabling unsafe in production (maybe @slovdahl remembers?), and if it made any difference re. the
This test case passes for me with the following changes to the test:
Essentially, a ThreadLocal FSTConfiguration.
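For anyone unfamiliar with the pattern, here is a minimal sketch of the thread-local idea. It is not the actual change made to the test. `ScratchCodec` and `PerThreadCodec` are hypothetical stand-ins (not FST classes) so that the example runs with only the JDK. In a real application, the `ThreadLocal` would presumably hold an `FSTConfiguration` instead, created once per thread with something like `FSTConfiguration.createDefaultConfiguration()`.

```java
// Hypothetical stand-in for a serializer configuration that reuses
// internal buffers and is therefore not safe to share across threads.
final class ScratchCodec {
    // Reused, unsynchronized state: the reason instances must not be shared.
    private final StringBuilder scratch = new StringBuilder();

    String roundTrip(String value) {
        scratch.setLength(0);      // reset the reused buffer
        scratch.append(value);     // stands in for "serialize"
        return scratch.toString(); // stands in for "deserialize"
    }
}

// One codec per thread: no locking, and no mutable state shared between threads.
final class PerThreadCodec {
    private static final ThreadLocal<ScratchCodec> CODEC =
            ThreadLocal.withInitial(ScratchCodec::new);

    private PerThreadCodec() {
    }

    // Returns the calling thread's own instance, creating it on first use.
    static ScratchCodec get() {
        return CODEC.get();
    }
}
```

Each worker thread calls `PerThreadCodec.get()` and keeps reusing its own instance, so the reuse benefit is kept without a `synchronized` block. The cost is one instance (and its buffers) per thread, which matters mostly with large thread pools.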
Should be fixed by #311. Please check.
Hi,
I've spent a lot of time debugging this issue lately, which started out as random SIGSEGVs on some of our customer servers. Here are the full details, quoting from the README.md in the repro repository.
Please let me know if you have any suggestions or need more info; I'll happily help with the debugging of this.
fst-concurrency-issue
Demonstrates a problem in the fst library.
Details & background
#235 describes an issue related to how the FSTConfiguration class in fst is not thread safe.
#270 describes another issue with a user experiencing SIGSEGV errors in the Java process.
We were seeing the latter in our application while put under load, which led us to start investigating this further. Here are the results of this investigation. This repository illustrates how to reproduce our current bug, which is only displayed when using one of the following:
-XX:+UseG1GC
This repro project does not cause the JVM to crash, but we feel confident enough anyway that our crashes have the same root cause as these (managed) exceptions. When excluding FST from our application, using another serialization library, our SIGSEGV errors go away completely.
How to run
$ ./gradlew test
(or load the project in IntelliJ IDEA/some other Java IDE, and use its test runner).
When run multiple times, the problem is exhibited. From my experience, when running it 10 times it will fail about 3-4 times with fst 2.57.
(If running on the command line, the exceptions will be logged to an HTML file; normally, build/reports/tests/test/index.html - its location will be printed by Gradle. Full exception details are not printed to stdout.)
I also made it easy to test this with older versions of fst - see build.gradle for more details. From what I can tell, the bug seems to be present on all 2.xx versions of fst.
Disabling parallelization
I experimented with setting NUMBER_OF_THREADS to 1, i.e. minimizing parallelism, since we were thinking that the root cause here might not be strictly concurrency-related but rather triggered by unexpected GC behavior (= fst semantics only working reliably with ParallelGC, the default GC in Java 8).
Interestingly enough, I could not reproduce the errors this way, which leads to the conclusion that the error is likely to be a concurrency bug after all.
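To illustrate the kind of experiment described here, the following is a rough sketch of a probe that runs the same task on a configurable number of threads and counts how many invocations throw. It is not the code from the repro repository. `ConcurrencyProbe` and `countFailures` are hypothetical names, and the task passed in would be whatever serialize/deserialize round trip is under suspicion.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

final class ConcurrencyProbe {
    private ConcurrencyProbe() {
    }

    // Runs `task` `iterations` times on each of `threads` workers and
    // returns how many invocations threw a RuntimeException.
    static int countFailures(int threads, int iterations, Runnable task) {
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        try {
            List<Future<Integer>> results = new ArrayList<>();
            for (int t = 0; t < threads; t++) {
                Callable<Integer> worker = () -> {
                    int failures = 0;
                    for (int i = 0; i < iterations; i++) {
                        try {
                            task.run();
                        } catch (RuntimeException e) {
                            failures++; // count the failure and keep going
                        }
                    }
                    return failures;
                };
                results.add(pool.submit(worker));
            }
            int total = 0;
            for (Future<Integer> f : results) {
                try {
                    total += f.get(); // wait for each worker to finish
                } catch (InterruptedException | ExecutionException e) {
                    throw new IllegalStateException(e);
                }
            }
            return total;
        } finally {
            pool.shutdownNow();
        }
    }
}
```

Comparing `countFailures(1, n, task)` against `countFailures(8, n, task)` under the same GC flags gives a crude signal: failures that appear only with several threads point towards a race rather than towards the collector itself. A zero count is weak evidence, though, since races can simply fail to trigger in a given run.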
Example exceptions
Provided for convenience only; these were the exceptions I noted when running the tests about 10-15 times just now.
java.io.IOException: Failed to read the next byte
java.lang.NullPointerException at FSTObjectInput.java:357
java.lang.RuntimeException: unknown object tag -19