Fix for issues 274 and 235 #311

Open
wants to merge 1 commit into
base: master

@theRealAph theRealAph commented Mar 2, 2021

This fixes two bugs caused by race conditions.

The first is in FSTClazzInfo.getSer(), which uses what seems to be a kind of "broken double-checked locking". When multiple threads are racing in getSer(), sometimes FSTSerializerRegistry.NULL is returned, with chaos ensuing. An obvious fix would be to make FSTClazzInfo.getSer() synchronized, but simply making ser volatile and using a local variable is enough to fix it.
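The "volatile field plus local variable" approach can be sketched as below. This is a minimal illustration under stated assumptions, not the FST source: `LazySerializerHolder`, `NONE`, and `lookup()` are hypothetical names, with `NONE` playing the role the description gives to `FSTSerializerRegistry.NULL`. The point is that the shared field is read once into a local, so every later check and the return statement operate on the same value, and the sentinel is translated to `null` before it can escape.

```java
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical stand-in for a class-info object that lazily resolves a serializer.
final class LazySerializerHolder {
    // Sentinel meaning "looked it up, and there is none" (illustrative only).
    static final Object NONE = new Object();

    private final Object resolved;                  // what the hypothetical lookup yields; may be null
    final AtomicInteger lookups = new AtomicInteger();

    // volatile: a write by one thread is visible to later reads by other threads
    private volatile Object ser;

    LazySerializerHolder(Object resolved) { this.resolved = resolved; }

    private Object lookup() {
        lookups.incrementAndGet();
        return resolved;
    }

    Object getSer() {
        Object local = ser;                         // read the shared field exactly once
        if (local == null) {
            local = lookup();
            if (local == null) local = NONE;
            ser = local;                            // publish only the final value
        }
        return local == NONE ? null : local;        // the sentinel never escapes
    }

    public static void main(String[] args) throws InterruptedException {
        LazySerializerHolder holder = new LazySerializerHolder(null);
        Runnable r = () -> {
            if (holder.getSer() != null) throw new IllegalStateException("sentinel escaped");
        };
        Thread a = new Thread(r), b = new Thread(r);
        a.start(); b.start();
        a.join(); b.join();
        System.out.println("lookups performed: " + holder.lookups.get());
    }
}
```

Two racing threads may both call `lookup()` in this sketch; that is harmless as long as the lookup is idempotent, and it avoids taking a lock on every call.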

The second is in FSTObjectOutput.close(). Here, the codec is modified after it has been returned to the shared configuration.
This kind of error is trivially detectable if you null out any references you hold to an object after returning it, so I've done that everywhere. Also, this means that there is less work for the garbage collector to do, and we all should be nice to our garbage collectors.
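The "null out the reference after returning the object" idea can be sketched as follows. All names here (`PooledOutput`, `CodecPool`, `Codec`) are hypothetical and only illustrate the pattern; they are not the FST classes. The sketch finishes all work on the pooled object first, drops its own reference, and only then hands the object back, so any later use through the stale owner fails immediately with a `NullPointerException` instead of silently corrupting an object another thread may already hold.

```java
import java.util.ArrayDeque;

// Hypothetical output stream that borrows a codec from a shared pool.
final class PooledOutput {
    private final CodecPool pool;
    private Codec codec;                 // null once the codec has been returned

    PooledOutput(CodecPool pool) {
        this.pool = pool;
        this.codec = pool.take();
    }

    void write(int b) {
        codec.written++;                 // throws NullPointerException after close()
    }

    void close() {
        Codec c = codec;
        if (c == null) return;           // already closed
        c.reset();                       // finish every modification first
        codec = null;                    // drop our reference
        pool.giveBack(c);                // hand the codec back last
    }

    public static void main(String[] args) {
        CodecPool pool = new CodecPool();
        PooledOutput out = new PooledOutput(pool);
        out.write(1);
        out.close();
        try {
            out.write(2);                // use after close is now detected
        } catch (NullPointerException expected) {
            System.out.println("use after close detected");
        }
    }
}

// Hypothetical shared pool; synchronized so take/giveBack are thread-safe.
final class CodecPool {
    private final ArrayDeque<Codec> free = new ArrayDeque<>();

    synchronized Codec take() {
        Codec c = free.poll();
        return c != null ? c : new Codec();
    }

    synchronized void giveBack(Codec c) {
        free.push(c);
    }
}

// Hypothetical codec with a single piece of mutable state.
final class Codec {
    int written;
    void reset() { written = 0; }
}
```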


chrisco484 commented Jun 20, 2021

Has anyone tried backporting this to 2.56 / Java 8 and tested?

@theRealAph
Author

> Has anyone tried backporting this to 2.56 / Java 8 and tested?

As far as I know this patch has been ignored. I'm amazed. This is a longstanding bug, with (as it turns out) a simple and almost-obviously-correct fix.


chrisco484 commented Jun 20, 2021

> > Has anyone tried backporting this to 2.56 / Java 8 and tested?
>
> As far as I know this patch has been ignored. I'm amazed. This is a longstanding bug, with (as it turns out) a simple and almost-obviously-correct fix.

I've cloned your repo and cherry picked your changes back to the Java-8 branch as we're still using Java 8.

To date we've had to turn off G1 garbage collection (which means no String deduplication 👎) and put concurrency guards around all serializing/deserializing operations to avoid any multithreaded access to the FST classes.

It's not so bad at the moment, because FST is extremely fast at the data sizes we're serializing, but when we scale up this workaround won't cut the mustard. We'll have to remove those guards, and then we'll be able to test whether your changes have fixed the concurrency issue we were having.
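A guard of the kind described above might look roughly like this. It is a sketch under assumptions, not the code actually used: `GuardedCodec` is a hypothetical wrapper, and the `encode` and `decode` functions stand in for whatever serialization calls are being protected. A single lock serializes every call, which is why the approach is safe but stops scaling once many threads serialize at the same time.

```java
import java.nio.charset.StandardCharsets;
import java.util.concurrent.locks.ReentrantLock;
import java.util.function.Function;

// Hypothetical wrapper that lets only one thread at a time into the
// underlying (non-thread-safe) encode/decode functions.
final class GuardedCodec<T> {
    private final ReentrantLock lock = new ReentrantLock();
    private final Function<T, byte[]> encode;
    private final Function<byte[], T> decode;

    GuardedCodec(Function<T, byte[]> encode, Function<byte[], T> decode) {
        this.encode = encode;
        this.decode = decode;
    }

    byte[] toBytes(T value) {
        lock.lock();
        try {
            return encode.apply(value);
        } finally {
            lock.unlock();               // always release, even if encoding throws
        }
    }

    T fromBytes(byte[] bytes) {
        lock.lock();
        try {
            return decode.apply(bytes);
        } finally {
            lock.unlock();
        }
    }

    public static void main(String[] args) {
        // Placeholder functions; a real setup would call the serializer here.
        GuardedCodec<String> codec = new GuardedCodec<>(
                s -> s.getBytes(StandardCharsets.UTF_8),
                b -> new String(b, StandardCharsets.UTF_8));
        System.out.println(codec.fromBytes(codec.toBytes("round trip")));
    }
}
```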


theRealAph commented Jun 21, 2021 via email


istinnstudio commented Dec 31, 2022

Needs investigation. Sorry, but I cannot test it further, as I use a brutally modded 2.57 version to support Java 17 APIs. It is not obvious that the patch would work in every serialization case. This is just a note, not a proper test; maybe this patch is valuable for others.

Exception in thread "AWT-EventQueue-0" java.lang.NullPointerException: Cannot invoke "org.nustaq.serialization.FSTObjectRegistry.clearForWrite(org.nustaq.serialization.FSTConfiguration)" because "this.objects" is null

Exception in thread "AWT-EventQueue-0" java.lang.NullPointerException: Cannot invoke "org.nustaq.serialization.FSTClazzNameRegistry.clear()" because "this.clnames" is null

istinnstudio pushed a commit to istinnstudio/fast-serialization-2.57-java17 that referenced this pull request Jan 1, 2023
Exception in thread "AWT-EventQueue-0" java.lang.NullPointerException: Cannot invoke "org.nustaq.serialization.FSTObjectRegistry.clearForWrite(org.nustaq.serialization.FSTConfiguration)" because "this.objects" is null
---
Exception in thread "AWT-EventQueue-0" java.lang.NullPointerException: Cannot invoke "org.nustaq.serialization.FSTClazzNameRegistry.clear()" because "this.clnames" is null
@theRealAph
Author

> needs investigation, sorry but I cannot test it further as I use a brutally modded 2.57 version to support java 17 APIs. But it is not that obvious that it would work on every serialization case. This is just a note, not a proper test, maybe this patch is valuable for others.
>
> Exception in thread "AWT-EventQueue-0" java.lang.NullPointerException: Cannot invoke "org.nustaq.serialization.FSTObjectRegistry.clearForWrite(org.nustaq.serialization.FSTConfiguration)" because "this.objects" is null
>
> Exception in thread "AWT-EventQueue-0" java.lang.NullPointerException: Cannot invoke "org.nustaq.serialization.FSTClazzNameRegistry.clear()" because "this.clnames" is null

Understood, but this patch fixes some concurrency bugs, even if it doesn't fix them all. So we might as well push it.

If anyone can find a reproducer for more concurrency bugs, I can fix them.
