-
Notifications
You must be signed in to change notification settings - Fork 86
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
VIVO 3828: Remove SDB in preparation to moving to new Jena. #374
base: jena4-upgrade
Are you sure you want to change the base?
Conversation
The Jena version being switched to (Jena 4) has removed support for SDB. The current forms of `DatasetWrapperFactory` and `StaticDatasetFactory` should no longer be needed. The SDB related code has been stripped out. Many of the classes with "SDB" in the name that actually provide more than just SDB have been re-created. These recreations are generally child classes of a similarly named Jena class. These recreates have "DB" in their name rather than "SDB". The `DatasetFactory.createMem()` is deprecated and may not be available in Jena 4. Attempts to replace this with `createTxnMem()` have revealed problems that are potentially transaction related. Replacing with `createGeneral()` might be possible but this is not done to avoid introducing more potential problems. Notable points in regards to replacing `DatasetFactory.createMem()`: 1) The method is deprecated. 2) The `DatasetFactory.createGeneral()` is the compatible equivalent. 3) The `DatasetFactory.createGeneral()` better supports TDB (which is the main reason for including in this commit set). 4) The `DatasetFactory.createTxnMem()` is the more recommended alternative and is transactional. 5) This better prepares the code for the upcoming Jena upgrade changes. There are some serious existing problems with closing dataset connections (and related). The (now removed) `DatasetWrapperFactory` probably should be rewritten to provide a new dataset each time rather than copying an existing one. The `close()` functionality appears to not be being called for TDB due to the SDB conditions. With this SDB condition the connections are now being closed. This has exposed several problems that both tests and runtime expose. This problem is considered out of scope for the changes and the close code is commented out with a FIXME statement. The documentation for `example.runtime.properties` refers to `VitroConnection.DataSource.*` only being used by SDB but this is incorrect. The OpenSocial code appears to talk directly to an SQL database using these properties. Update the documentation in this regard and replace the references to SDB with OpenSocial. Remove no longer necessary SDB helper code that is added to TDB as well because it "shouldn't hurt". Remove much of the documentation and installation functionality using or referencing SDB. The `IndividualDaoSDB.java` file is implemting `getAllIndividualUris()` and `getUpdatedSinceIterator()` via the now removed `IndivdiualSDB` class. This functionality is used by `RebuildIndexTask` and `UpdateUrisTask` classes. The new *DB classes now have java docs. These java docs have a bare-bones level of detail.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@kaladay thanks a lot for this really great PR. I have posted three small comments/suggestions. Also, I was wondering for a moment whether some *DB files should be continuation of *SDB files (by using mvn rm), but it looks as it is not so similar, it is more re-creation and just extraction of smaller part of SDB classes, therefore no needs for any change there.
api/src/main/java/edu/cornell/mannlib/vitro/webapp/dao/ObjectPropertyStatementDao.java
Outdated
Show resolved
Hide resolved
// FIXME: This is causing NPEs and connection already closed errors in tests. | ||
// Not closing the dataset can result in excessive memory or other resource usage. | ||
/*if (!closed) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
So, it means multiple instances of DatasetWrapper class are used in VIVO, and those instances share the same Dataset instance, right?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That is my understanding of the situation.
In addition to that, I suspect some of the problems are possibly race conditions as well.
The query execution may close its connection but it might do so after the dataset has been closed. This was observed by adding a system out print statement before the query execution and the NPE stopped happening.
api/src/main/java/edu/cornell/mannlib/vitro/webapp/dao/jena/IndividualDaoJena.java
Outdated
Show resolved
Hide resolved
…ropertyStatementDao.java Co-authored-by: Dragan Ivanovic <[email protected]>
…dividualDaoJena.java Co-authored-by: Dragan Ivanovic <[email protected]>
VIVO GitHub issue: vivo-project/VIVO#3828
What does this pull request do?
The Jena version being switched to (Jena 4) has removed support for SDB.
The current forms of
DatasetWrapperFactory
andStaticDatasetFactory
should no longer be needed.The SDB related code has been stripped out.
Many of the classes with "SDB" in the name that actually provide more than just SDB have been re-created.
These recreations are generally child classes of a similarly named Jena class.
These recreates have "DB" in their name rather than "SDB".
The
DatasetFactory.createMem()
is deprecated and may not be available in Jena 4.Attempts to replace this with
createTxnMem()
have revealed problems that are potentially transaction related.Replacing with
createGeneral()
might be possible but this is not done to avoid introducing more potential problems.Notable points in regards to replacing
DatasetFactory.createMem()
:DatasetFactory.createGeneral()
is the compatible equivalent.DatasetFactory.createGeneral()
better supports TDB (which is the main reason for including in this commit set).DatasetFactory.createTxnMem()
is the more recommended alternative and is transactional.There are some serious existing problems with closing dataset connections (and related).
The (now removed)
DatasetWrapperFactory
probably should be rewritten to provide a new dataset each time rather than copying an existing one.The
close()
functionality appears to not be being called for TDB due to the SDB conditions.With this SDB condition remove the connections would the become closed when closed() is called.
This has exposed several problems that both tests and runtime expose.
This problem is considered out of scope for the changes and the close code is commented out with a FIXME statement.
The documentation for
example.runtime.properties
refers toVitroConnection.DataSource.*
only being used by SDB but this is incorrect.The OpenSocial code appears to talk directly to an SQL database using these properties.
Update the documentation in this regard and replace the references to SDB with OpenSocial.
Remove no longer necessary SDB helper code that is added to TDB as well because it "shouldn't hurt".
Remove much of the documentation and installation functionality using or referencing SDB.
The
IndividualDaoSDB.java
file is implemtinggetAllIndividualUris()
andgetUpdatedSinceIterator()
via the now removedIndivdiualSDB
class.This functionality is used by
RebuildIndexTask
andUpdateUrisTask
classes.The new *DB classes now have java docs.
These java docs have a bare-bones level of detail.
How should this be tested?
Load the i18n data and try to use via the web interface.
Additional Notes:
The
DatasetWrapperFactory
that is removed in this PR is a factory ofDatasetWrapper
as described. However, theDatasetWrapper
being passed has a local reference of theDataset
. The sameDataset
, as far as I can tell, is being passed around. Creating a newDatasetWrapper
for the sameDataset
does not seem helpful when if that sameDataset
is closed anywhere, all of the differentDatasetWrapper
instantiations holding a reference to that sameDataset
will in actuality could become closed even though their respective wrapper does not necessirly have the closed boolean set.This behavior is a bit confusing to me so I removed the implementation as I described above under the "What does this pull request do?" section. If this is doing something I am not seeing or undersanding, then the
DatasetWrapperFactory
could be restored. I think it would be worth while to consider creating/instantiating a newDataset
when the a newDatasetWrapper
is created via theDatasetWrapperFactory
. This would possible be of help in resolving the closure problems described above.The
StaticDatasetFactory
is confusing to me. This class appears to provide a non-static instance of aDataset
being shared globally and is treated as if it were static. I am uncertain on the implications of this design decision.I described above about the deprecation of
DatasetFactory.createMem();
.Further investigation has revealed that
DatasetFactory.createMem();
is in fact removed as of Jena 4.0.0 and greater. The migration of that method call will be performed in the subsequent Jena migration related PRs.Interested parties
@brianjlowe
@vivo-project/vivo-committers