-
Notifications
You must be signed in to change notification settings - Fork 245
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Fix for issues 274 and 235 #311
base: master
Are you sure you want to change the base?
Conversation
Has anyone tried backporting this to 2.56 / Java 8 and tested? |
As far as I know this patch has been ignored. I'm amazed. This is a longstanding bug, with (as it turns out) a simple and almost-obviously-correct fix. |
I've cloned your repo and cherry picked your changes back to the Java-8 branch as we're still using Java 8. To date we've had to turn off -G1GC garbage collection (which means no String deduplication 👎) and put concurrency guards around all serializing/deserializing operations to avoid any multithreaded access to the FST classes. It's not so bad at the moment because FST is extremely fast at what it does with the size of data we're serializing but when we scale up this workaround won't cut the mustard. We'll have to remove those guards and then we'll be able to test if your changes have fixed the concurrency issue we were having. |
On 6/21/21 12:53 AM, Christopher Colemani wrote:
I've cloned your repo and cherry picked your changes back to the Java-8 branch as we're still using Java 8.
To date we've had to turn off -G1GC garbage collection (which means no String deduplication 👎) and put concurrency guards around an serializing/deserializing operations to avoid any multithreaded access to the FST classes.
It's not so bad at the moment because FST is extremely fast at what it does and with the size of data we're serializing but when we scale up this workaround won't cut the mustard. We'll have to remove those guards and then we'll be able to test if your changes have fixed the concurrency issue we were having.
OK, thanks. From what I see of the comments people have more or less given
up on the idea of fixing the concurrency bugs, But I don't think it's so
bad. I dug into this because a customer was having crashes which looked
like a JVM bug, but it turned our that FST was the problem. It'd be nice
to get this fixed.
…--
Andrew Haley (he/him)
Java Platform Lead Engineer
Red Hat UK Ltd. <https://www.redhat.com>
https://keybase.io/andrewhaley
EAC8 43EB D3EF DB98 CC77 2FAD A5CD 6035 332F A671
|
needs investigation, sorry but I cannot test it further as I use a brutally modded 2.57 version to support java 17 APIs. But it is not that obvious that it would work on every serialization case. This is just a note, not a proper test, maybe this patch is valuable for others.
|
Exception in thread "AWT-EventQueue-0" java.lang.NullPointerException: Cannot invoke "org.nustaq.serialization.FSTObjectRegistry.clearForWrite(org.nustaq.serialization.FSTConfiguration)" because "this.objects" is null --- Exception in thread "AWT-EventQueue-0" java.lang.NullPointerException: Cannot invoke "org.nustaq.serialization.FSTClazzNameRegistry.clear()" because "this.clnames" is null
Exception in thread "AWT-EventQueue-0" java.lang.NullPointerException: Cannot invoke "org.nustaq.serialization.FSTObjectRegistry.clearForWrite(org.nustaq.serialization.FSTConfiguration)" because "this.objects" is null --- Exception in thread "AWT-EventQueue-0" java.lang.NullPointerException: Cannot invoke "org.nustaq.serialization.FSTClazzNameRegistry.clear()" because "this.clnames" is null
Understood, but this patch fixes some concurrency bugs, even if it doesn't fix them all. So we might as well push it. If anyone can find a reproducer for more concurrency bugs I can fix them. |
This fixes two bugs caused by race conditions.
The first is in
FSTClazzInfo.getSer()
, which uses what seems to be a kind of "broken double-checked locking". When multiple threads are racing ingetSer()
, sometimesFSTSerializerRegistry.NULL
is returned, with chaos ensuing. An obvious fix would be to makeFSTClazzInfo.getSer()
synchronized, but simply makingser
volatile and using a local variable is enough to fix it.The second is in
FSTObjectOutput.close()
. Here, the codec is modified after it has been returned to the public configury.This kind of error is trivially detectable if you null out any references you hold to an object after returning it, so I've done that everywhere. Also, this means that there is less work for the garbage collector to do, and we all should be nice to our garbage collectors.