-
Notifications
You must be signed in to change notification settings - Fork 3
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
AlkylStructureFragmenter Implementation #10
base: production
Are you sure you want to change the base?
Conversation
…rst algorithm setting
…rst algorithm setting
…MORTAR into dev-Max-alkylfragmentation
… duplicate rings;
Quality Gate failedFailed conditions |
for (IAtomContainer tmpAC : anACSet.atomContainers()) { | ||
AtomContainerManipulator.percieveAtomTypesAndConfigureAtoms(tmpAC); | ||
tmpAdder.addImplicitHydrogens(tmpAC); | ||
Kekulization.kekulize(tmpAC); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It seems the kekulization of the resulting fragments after successful fragmentation leads to an exception originating from unset implicit hydrogens, even though they should be added by the CDKHydrogenAdder before starting kekulization. The kekulization of the expected fragments does not throw any exceptions as the aromaticity is already set by directly reading from a file.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Apparently, you have two fragments of SMILES code "*C(*)*"
in your resulting fragments set. I have looked at the detected atom types and their implicit hydrogen counts. Somehow, I don't know how, they are treated differently here. In the first structure, every R-/pseudo-atom has a hydrogen count of 0, and in the second structure, it is "null", causing the issue.But the CDKHydrogenAdder should ignore all of the pseudo atoms (see here: https://github.com/cdk/cdk/blob/01d7ab0ea4f5f0f590b42cbb2c45e32e2cf41066/base/valencycheck/src/main/java/org/openscience/cdk/tools/CDKHydrogenAdder.java#L98). An interesting bug or feature: if you use the atom-specific method, it does not check for pseudo-atoms and treats them "right", i.e. assigns an implicit hydrogen count of 0 to them. Then, your test does not throw an exception but the assertions fails.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The kekulization of the expected fragments does not throw any exceptions as the aromaticity is already set by directly reading from a file.
Partly true, kekulization of the expected fragments is successful because explicit bond orders can be assigned correctly. We can discuss the theoretical background on Friday.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I have looked at the detected atom types and their implicit hydrogen counts. Somehow, I don't know how, they are treated differently here. In the first structure, every R-/pseudo-atom has a hydrogen count of 0, and in the second structure, it is "null", causing the issue.
I saw that too and was very confused by it.
An interesting bug or feature: if you use the atom-specific method, it does not check for pseudo-atoms and treats them "right", i.e. assigns an implicit hydrogen count of 0 to them.
I will test this too in the coming days.
We can discuss the theoretical background on Friday.
Gladly, I would like to know the underlying theory.
Quality Gate failedFailed conditions |
@Mila1004, when you have time, please merge the current production branch into your branch and adjust your code to the changes made there in the meantime. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Incomplete review, will continue tomorrow.
/** | ||
* Conjugated Pi System Fragmenter | ||
*/ | ||
private IMoleculeFragmenter ConjPiSysF; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The two new fragmenter variables should go to the final class constants like the others.
TODO: (01|08|24): -implement spiro carbon detection + ring integrity setting | ||
-overhaul test class (with new functionalities) | ||
-large scale testing of fragmentation (regarding correct fragmentation and performance) | ||
*/ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Finish to dos or move them to GH issues or discussion
//</editor-fold> | ||
// | ||
//<editor-fold desc="Public Properties Get"> | ||
@Override |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The code fold is not working for me (some others as well). It helps to add a line break between the fold declaration and the "'at'Override".
/** | ||
* Default boolean value for determining which single ring detection method should be used. | ||
*/ | ||
public static final boolean ALTERNATIVE_SINGLE_RING_DETECTION_SETTING_DEFAULT = false; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We need to discuss the setting naming
/** | ||
* A property that has a constant boolean value defining if rings should be dissected further. | ||
*/ | ||
private final SimpleBooleanProperty keepRingsSetting; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Are all these settings/properties used now? Why the "constant value" statement?
} | ||
} | ||
} catch (Exception anException) { | ||
AlkylStructureFragmenter.this.logger.log(Level.WARNING, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
no need to log
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
And the logger is static, so there should be no "this" inbetween.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
And I will stop commenting "no need to log" now.
} else { | ||
//<editor-fold desc="RingSearch (isolated)"> | ||
//for spiro detection: set array property for each atom belonging to x rings | ||
//--> if more than one ring --> check for connected atoms to determine if spiro config |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
So, the setting switches between detecting isolated rings or detecting all rings again using MCB? Does this actually make a difference?
IAtomContainer tmpIsolatedMultiBondsContainer = new AtomContainer(); | ||
IAtomContainer tmpTertQuatCarbonContainer = new AtomContainer(); | ||
IAtomContainer tmpSpiroCarbonContainer = new AtomContainer(); | ||
IAtomContainer tmpSpiroResidueContainer = new AtomContainer(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
spiro residue container is unused
if (!((boolean) tmpAtom.getProperty(AlkylStructureFragmenter.INTERNAL_ASF_RING_MARKER_KEY) | ||
|| (boolean) tmpAtom.getProperty(AlkylStructureFragmenter.INTERNAL_ASF_CONJ_PI_MARKER_KEY))) { | ||
if (tmpAtom.getProperty(AlkylStructureFragmenter.INTERNAL_ASF_TERTIARY_CARBON_PROPERTY_KEY)) { | ||
tmpRingFragmentationContainer.addAtom(tmpAtom); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why use the ring fragmentation container here when you have a dedicated container for tertiary and quaternary carbons?
if (!this.keepRingsSetting.get()) { | ||
ArrayList<Integer> tmpList = tmpAtom.getProperty(AlkylStructureFragmenter.INTERNAL_ASF_RING_ATOM_LIST_KEY); | ||
if (tmpList.size() > 1) { | ||
tmpSpiroCarbonContainer.addAtom(tmpAtom); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ok, if the atom is part of multiple isolated rings, it must be a spiro carbon. But what if the mcb ring detection is used?
@Mila1004 where did you get the example molecules in the MOL and SD files from? There should be a source/reference given somewhere. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In the fragmenter, it's mostly small things but the test class needs to be improved a lot.
} | ||
ArrayList<Integer> tmpEndList = tmpEndAtom.getProperty(AlkylStructureFragmenter.INTERNAL_ASF_RING_ATOM_LIST_KEY); | ||
if (tmpEndList.size() > 1) { | ||
tmpIsEndSpiro = true; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
see above
//check maxchainlength | ||
switch (tmpMaxChainLengthInteger) { | ||
default -> { | ||
//restrictions > 1 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
(style) default case should be the last one in a switch
default -> { | ||
//restrictions > 1 | ||
for (IAtomContainer tmpAtomContainer : tmpChainACSet.atomContainers()) { | ||
tmpDissectedChainACSet.add(this.separateDisconnectedStructures(this.dissectLinearChain(tmpAtomContainer, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
separating method calls into multiple lines makes it easier to debug.
case 0 -> { | ||
//if int maxChainLength gives 0 throw IllegalArgumentException | ||
//not happy with how this works, preferably a gui warning would be nice | ||
AlkylStructureFragmenter.this.logger.log(Level.WARNING, "Illegal restriction argument", new IllegalArgumentException()); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This should be defined in the constructor where you initialise the setting properties. Check the other fragmenters.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Also, the default value is 0, right? Defaults should not raise exceptions but should be generally well-suited valaues!
IAtomContainer tmpReturnAC = new AtomContainer(); | ||
//starts at 1 for usability, see aLength: 1on1 translation of input to counter | ||
int tmpCounter = 1; | ||
Iterator<IBond> tmpBondIterator = anAC.bonds().iterator(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can you really rely on the iterator starting at one end of the chain, traversing along it, and ending at the other end?
} catch (FileNotFoundException e) { | ||
throw new RuntimeException(e); | ||
} catch (URISyntaxException e) { | ||
throw new RuntimeException(e); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
again, no need to catch. I will not comment this down below.
|
||
//<editor-fold desc="@Test Custom Methods"> | ||
@Test | ||
public void detectRingsWithMCBTest() { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This test does n your functionality but the cycle finder.
|
||
} | ||
@Test | ||
public void detectRingsWithRingSearchTest() { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Again, tests of the used CDK functionalities do not belong here.
return tmpACSet; | ||
} | ||
//</editor-fold> | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
As stated in the comment above from the last review, this test class needs more step-by-step examples with actual molecules (check the SugarRemovalUtility test).
tmpFragmentList = tmpFragmenter.fragmentMolecule(tmpOriginalMolecule); | ||
SmilesGenerator tmpGenerator = new SmilesGenerator(SmiFlavor.Canonical); | ||
for (IAtomContainer tmpFragment : tmpFragmentList) { | ||
System.out.println(tmpGenerator.create(tmpFragment) + " " |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Assert instead of print.
Quality Gate failedFailed conditions |
First steps for a complete alkyl structure fragmentation.
Currently only contains fragmentation logic for cyclic structures and non-cyclic side-chains.