-
Notifications
You must be signed in to change notification settings - Fork 6.4k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Iterator crashes JVM on closed database (RocksJava) #5234
Comments
+1 I don't think it's unreasonable for a parent to have to keep track of its child resources, and doing so would lead to less surprises and a more flat learning curve. There is a non-trivial overhead with the JNI library and the amount of copying it does as is; so adding an "if" statement, where the condition is relatively constant (and predictable after a few runs), would not add that much overhead... the cost of a couple potential branch mispredictions pales in comparison to cache misses that are inevitable under a randomized workload, and, more importantly, with time lost by the individuals debugging the segfaults, in all their opaque glory. An exception to this is when ownership of a resource is explicitly transferred. Both sides have to agree on it though... a case where they do not agree- you can pass a Slice to ReadOptions and ReadOptions will retain the slice... but when you use it to create an iterator in a separate lexical scope, it may be finalized and segfault the process. |
The book-keeping required would not be insignificant. We would need to add an atomic check to every function call. |
Are you concerned with the number of changes or the potential cost of the volatile reads? For the former, you could easily do this by refactoring through IntelliJ. You can encapsulate access to the nativeHandle_ field through a method, and verify the pointer hasn’t been disposed in its body. It wouldn’t prevent abend alone, you would need a reader-writer semaphore with nested lifecycle management for that. I imagine this wouldn’t be too difficult to add through a Since the majority of access is shared, you could optimize for this case by using some form of a spin lock, with a yielding/sleeping wait strategy for the exclusive lock. And if there’s contention on the readers, you can stripe the counters and separate “acquires” from “releases” (exclusive lock waits for their sums to be equal). But that should come after trying out ReentrantReadWriteLock and measuring the overhead with common operations. This won’t address cases where an object allocated in a different lexical scope becomes unreachable and gets GC’d, but I would argue that’s out of scope for synchronizing the life cycle of RocksDB with RocksIterator, which is the most common pain point. For the latter (performance), uncontended atomic reads on modern x64 should be nothing more than a read on L1. You can also disable the checks altogether with a system property on a static final field, and create noop implementations of the Monitor. Oracle’s C2 compiler (server) will recognize and eliminate the dead code. For client and other vendor VMs and other CPU architectures, I have no immediate insight. But I expect the overhead would pale in comparison to the costs incurred by the JNI impl - GetByteArrayElements, for example, does a “malloc” for non empty arrays and copies the memory. Barring the adoption of these changes, there are some simpler changes that would make it feasible for someone to extend your classes to create a safe implementation. Specifically by adding factory classes/methods for creating the RocksDB and RocksIterator instances, or adding a method to transfer ownership of a handle from one object to another. |
I am concerned about the overhead of doing such a check on effectively every operation that RocksDB does. Whilst I understand that for one operation the overhead is low, once you scale this up to millions of operations, I doubt that it is still insignificant. |
You can scale it up to billions and the overhead would still be trivial. Processes will not contend on reads when there are no writers (memory-wise) on amd64/x86-64 architectures (Intel SDM confirms, as does running with Also, using AtomicBoolean adds a level of indirection. Using class AbstractImmutableNativeReference {
private static final AtomicIntegerFieldUpdater<AbstractImmutableNativeReference> UPDATER =
AtomicIntegerFieldUpdater.newUpdater(AbstractImmutableNativeReference.class, "owningHandle_");
private volatile int owningHandle_;
protected void finalize() {
if (owningHandle_ == 1) {
// log a warning
}
}
public boolean isOwningHandle() { return owningHandle_ != 0; }
protected final void disOwnNativeHandle() {
if (!OWNING_HANDLE.compareAndSet(this, 1, 0)) {
throw new IllegalStateException("Cannot transfer ownership of something you don't own.");
}
}
public final void close() {
if (OWNING_HANDLE.compareAndSet(this, 1, 0)) {
disposeInternal();
}
}
} Here, only In C++, when you delete a pointer (and the reference exists outside the lexical scope), you would typically assign As I noted before, you can always use a I'm unfamiliar with ARM or with Android's VM, but I would find it extremely surprising if a single check is more expensive than the JNI overhead (copying to conform to C calling conventions, and adding a safepoint poll after the call), and the actual work done in the method. Searching a memory table acquires a spinlock, which falls back to yielding to the scheduler... this can't possibly be more expensive than a read from memory. If L1 caches become more prevalent, it could be the object layout. |
In an analogous way to owning column family handles, and closing them when the DB itself is closed, we make iterators be “child” objects of the DB. Iterator lifecycle is generally shorter than DB, so we give them a lambda to remove themselves from the DB on iterator close(). Add a few tests to check that iterators which remain open when the DB is requested to close, are now closed when the DB is closed. TODO - other DB subclasses need to add iterators to the list TODO - there are cases where isOwningHandle() should not throw an exception; we need a less sledgehammer way of implementing this TODO - fix the current Java tests TODO - the atomic improvement in facebook#5234
In an analogous way to owning column family handles, and closing them when the DB itself is closed, we make iterators be “child” objects of the DB. Iterator lifecycle is generally shorter than DB, so we give them a lambda to remove themselves from the DB on iterator close(). Add a few tests to check that iterators which remain open when the DB is requested to close, are now closed when the DB is closed. TODO - other DB subclasses need to add iterators to the list TODO - there are cases where isOwningHandle() should not throw an exception; we need a less sledgehammer way of implementing this TODO - fix the current Java tests TODO - the atomic improvement in facebook#5234
In an analogous way to owning column family handles, and closing them when the DB itself is closed, we make iterators be “child” objects of the DB. Iterator lifecycle is generally shorter than DB, so we give them a lambda to remove themselves from the DB on iterator close(). Add a few tests to check that iterators which remain open when the DB is requested to close, are now closed when the DB is closed. TODO - other DB subclasses need to add iterators to the list TODO - there are cases where isOwningHandle() should not throw an exception; we need a less sledgehammer way of implementing this TODO - fix the current Java tests TODO - the atomic improvement in facebook#5234
In an analogous way to owning column family handles, and closing them when the DB itself is closed, we make iterators be “child” objects of the DB. Iterator lifecycle is generally shorter than DB, so we give them a lambda to remove themselves from the DB on iterator close(). Add a few tests to check that iterators which remain open when the DB is requested to close, are now closed when the DB is closed. TODO - other DB subclasses need to add iterators to the list TODO - there are cases where isOwningHandle() should not throw an exception; we need a less sledgehammer way of implementing this TODO - fix the current Java tests TODO - the atomic improvement in facebook#5234
This bug report is for RocksJava JNI wrapper (any recent version).
After closing the database object, if we try to read an iterator associated with the closed database, the JVM crashes. I'm not sure this is by design or the expected behavior would be a null or something like an
IllegalStateException
. In my application, I call the iterator'sisValid()
method right before reading the iterator, but the JVM crash still happens because of a race condition where the database is closed in another thread.Expected behavior
Throw an
IllegalStateException
? Returnnull
? Or users are expected to always synchronize all iterator accesses on the RocksDB database object?Actual behavior
JVM crashes:
Steps to reproduce the behavior
The following test case crashes the JVM.
The text was updated successfully, but these errors were encountered: