-
Notifications
You must be signed in to change notification settings - Fork 3
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Integration of CDK ExhaustiveFragmenter #133
base: production
Are you sure you want to change the base?
Conversation
…fragmentation utility of the CDK, as a fragmentation option
… of the minimal fragment size of the exhaustive fragmenter
…f the exhaustive fragmentation and added the respective test class
…ue to automatic saturation
@ToLeWeiss , I fear performance will be a big issue here. I just started a fragmentation with 1000 natural products and it is still running. We need to discuss this, test it more, and maybe document it somewhere. Final result of my run:
And afterwards, it aborted with an out of memory error... |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks good overall, just some minor things in the comments.
if (!this.shouldBePreprocessed(aMolecule)) { | ||
return aMolecule.clone(); | ||
} | ||
return aMolecule.clone(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The if statement is unnecessary. I guess you wanted to leave room between the if statement and the return statement for actual preprocessing should you need to add it later? If yes, please add a comment, at least. Or simply remove the if.
* <a href="https://cdk.github.io/cdk/latest/docs/api/org/openscience/cdk/fragment/ExhaustiveFragmenter.html"> | ||
* exhaustive fragmentation | ||
* </a> | ||
* from the CDK, available for MORTAR. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
unnecessary comma
/** | ||
* The name of the algorithm used for fragmentation. | ||
*/ | ||
private static final String ALGORITHM_NAME = "Exhaustive Fragmenter"; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
These two should be public.
this.settingNameTooltipTextMap = new HashMap<>(tmpNumberOfSettings, | ||
BasicDefinitions.DEFAULT_HASH_COLLECTION_LOAD_FACTOR); | ||
this.settingNameDisplayNameMap = new HashMap<>(tmpNumberOfSettings, | ||
BasicDefinitions.DEFAULT_HASH_COLLECTION_LOAD_FACTOR); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please use CollectionUtil.calculateInitialHashCollectionCapacity() or HashMap.newHashMap().
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I see now that you only have one setting but are initialising the maps with a capacity of 2, ok. But for the future, please use the ways I indicated.
this.settingNameDisplayNameMap = new HashMap<>(tmpNumberOfSettings, | ||
BasicDefinitions.DEFAULT_HASH_COLLECTION_LOAD_FACTOR); | ||
this.cdkEFInstance = new ExhaustiveFragmenter(); | ||
this.minimumFragmentSize = new SimpleIntegerProperty(this, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I would put -Setting at the end of this variable or -Preference. Otherwise, one might expect an integer here.
} | ||
//</editor-fold> | ||
IAtomContainer tmpMoleculeClone = aMolecule.clone(); | ||
List<IAtomContainer> tmpFragments = new ArrayList<>(0); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Think of a more suiting default initial size, like maybe the number of bonds in the molecule times two?
IAtomContainer tmpMoleculeClone = aMolecule.clone(); | ||
List<IAtomContainer> tmpFragments = new ArrayList<>(0); | ||
try { | ||
SmilesParser tmpSmilesParser = new SmilesParser(DefaultChemObjectBuilder.getInstance()); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please use SilentChemObjectBuilder
SmilesParser tmpSmilesParser = new SmilesParser(DefaultChemObjectBuilder.getInstance()); | ||
this.cdkEFInstance.generateFragments(tmpMoleculeClone); | ||
// there is also an option to extract atom containers directly with getFragmentsAsContainers but this oversaturates | ||
// fragments described in this issue https://github.com/cdk/cdk/issues/1119. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good that you put this here but please flag it with TODO:
IAtomContainer tmpOriginalMolecule; | ||
List<IAtomContainer> tmpFragmentList; | ||
CDKExhaustiveFragmenter tmpFragmenter = new CDKExhaustiveFragmenter(); | ||
tmpFragmenter.setFragmentSaturationSetting(IMoleculeFragmenter.FragmentSaturationOption.HYDROGEN_SATURATION); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This does not make much sense, given that you deactivated that feature.
|
||
@Override | ||
public FragmentSaturationOption getFragmentSaturationSetting() { | ||
//TODO: there is currently no possibility to implement saturation settings for the exhaustive fragmenter. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can you describe this in more detail here like you did below?
And the way to do it here would be to throw an UnsupportedOperationException in these three methods. But please test whether this impacts running MORTAR. Since you do not add the property to the list of settings, we should be fine.
When the new fragmenter is added, we also need to describe it in the tutorial: https://docs.google.com/document/d/1dbpSZfdmOYaQqdYTDZj0NfKuy9TMp6T4SF6lCBB7bFo/edit?usp=sharing |
…xhaustive fragmenter class
@ToLeWeiss ready for re-review? |
Quality Gate failedFailed conditions |
Now it is ready for a review. |
No description provided.