Create OpenBabelReader to convert OpenBabel OBMol to MDAnalysis AtomGroup #5

Open
lunamorrow opened this issue May 31, 2024 · 10 comments

@lunamorrow
Collaborator

The first step of this OpenBabel converter will be to convert OpenBabel OBMols to MDAnalysis AtomGroups. This will enable the indirect parsing of over 100 file types into a format that MDAnalysis tools can analyse.

The OpenBabelReader will take an OBMol and convert it to an AtomGroup. This class will need to account for the different attributes present in OBMol objects created from different file types, and will use the OpenBabel Python wrappers for easy access to those attributes. The resulting AtomGroup can be analysed as is, or assigned to a Topology or a Residue/Segment.

During the creation of this converter class, I will be reaching out to active OpenBabel contributors to gain advice and input about how best to develop it.

For more information and suggested implementation please see GSoC Project.

@hmacdope
Member

hmacdope commented Jun 2, 2024

@lunamorrow @cbouy you have the conversion here going to an AtomGroup.

What you probably want is to convert to a Universe, no? See the example in the RDKit reader here: https://github.com/MDAnalysis/mdanalysis/blob/develop/package/MDAnalysis/converters/RDKit.py#L35C1-L47C53

Going directly to an AtomGroup is probably not what you want.

@hmacdope
Member

hmacdope commented Jun 2, 2024

Also important here is that RDKitReader is a subclass of MemoryReader.

@cbouysset
Collaborator

Just to make sure we all are on the same page in case there's any misunderstanding on the goal of the different classes that are set up for converters:

  • the Parser creates a topology from the "foreign" object (here an openbabel mol). The class should inherit from TopologyReaderBase and define a parse method that returns a Topology with all the atom-level and residue-level attributes. For historical reasons, the attribute under which the foreign object is available is self.filename.
  • the Reader reads a trajectory. In the case of an openbabel mol, that means parsing the coordinates from each conformer. Because OpenBabel is not really meant to process huge files, it's fine to assume everything will fit in memory, hence the use of MemoryReader as a base class.
  • both Reader and Parser combined will automagically allow you to create a Universe with u = mda.Universe(obmol)
  • the Converter does the opposite step from the above, i.e. it converts an AtomGroup or Universe to a foreign object. You can directly inherit from ConverterBase and define a convert method, which can then be automagically used with obmol = my_atomgroup.convert_to.openbabel(<optional parameters>)
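
The split of responsibilities in the list above can be sketched without either library. This is only an illustration under stated assumptions: FakeMol, ToyParser, ToyReader and ToyConverter are hypothetical stand-ins, not MDAnalysis or OpenBabel classes, and the topology is a plain dict rather than a real Topology. A real implementation would subclass TopologyReaderBase, MemoryReader and ConverterBase as described.

```python
from dataclasses import dataclass, field


@dataclass
class FakeMol:
    """Hypothetical stand-in for a foreign molecule object (e.g. an OBMol)."""
    elements: list
    # One coordinate set per conformer; each set holds one (x, y, z) per atom.
    conformers: list = field(default_factory=list)


class ToyParser:
    """Parser role: foreign object -> topology (atom-level attributes only)."""

    def __init__(self, mol):
        # Mirrors the convention mentioned in the thread: the foreign object
        # is stored under an attribute called `filename`.
        self.filename = mol

    def parse(self):
        mol = self.filename
        return {"n_atoms": len(mol.elements), "elements": list(mol.elements)}


class ToyReader:
    """Reader role: foreign object -> coordinates, one frame per conformer."""

    def __init__(self, mol):
        # Everything is held in memory, as a MemoryReader-style class would do.
        self.frames = [[tuple(xyz) for xyz in conf] for conf in mol.conformers]

    @property
    def n_frames(self):
        return len(self.frames)


class ToyConverter:
    """Converter role: the opposite direction, back to a foreign object."""

    def convert(self, topology, frames):
        return FakeMol(
            elements=list(topology["elements"]),
            conformers=[list(frame) for frame in frames],
        )


# A three-atom molecule with two conformers (coordinates are made up).
mol = FakeMol(
    elements=["O", "H", "H"],
    conformers=[
        [(0.0, 0.0, 0.0), (0.96, 0.0, 0.0), (-0.24, 0.93, 0.0)],
        [(0.0, 0.0, 0.1), (0.96, 0.0, 0.1), (-0.24, 0.93, 0.1)],
    ],
)
topology = ToyParser(mol).parse()      # topology only, no positions
reader = ToyReader(mol)                # positions only, no atom attributes
round_trip = ToyConverter().convert(topology, reader.frames)
```

The point of the sketch is that the Parser never touches coordinates and the Reader never touches atom attributes; the two are only combined when a Universe is built.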

Hope this helps!

@lunamorrow
Collaborator Author

Ahhh ok, thanks @hmacdope and @exs-cbouy. I was planning to have the Parser make a Universe and the Reader an AtomGroup, but I see the redundancy now. What you've said makes sense @exs-cbouy, as I need both the topology and the positions/trajectory to create a Universe. I just had a quick look at the documentation and it appears that MemoryReader is for topologies with a trajectory, while SingleFrameReaderBase is for topologies with just one set of positions. The only trajectory format accepted by OpenBabel seems to be xtc, which MDAnalysis already reads. I assume it is best practice to inherit from MemoryReader though, so that the converter can capture all possible info? I'll change that over now.

I suspect it would be best for me to start on the Parser before the Reader too. What would you suggest @exs-cbouy, seeing as you have done it before?

@cbouysset
Collaborator

I haven't used openbabel much, but I'm guessing it can store coordinates for each conformer on the same molecule object (like RDKit does), in which case the MemoryReader makes sense (since you won't always have a single set of coordinates for a given molecule).

Yes, I would suggest doing the Parser first. I don't remember if you really need the Reader to start playing around and constructing a Universe from an openbabel mol, but in the worst case you could just use dummy coordinates in the Reader to begin with.

@hmacdope
Member

hmacdope commented Jun 6, 2024

To clarify this further, @lunamorrow: by trajectory here we just mean "any set of coordinate data", which can be present in ANY format, not just one with more than one frame or a traditional MD format like xtc. For example, using the MemoryReader you can make a trajectory from a raw numpy array. You will, conceptually at least, do the same, but after extracting the data from OpenBabel.
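
As a rough sketch of that extraction step, the coordinates of each conformer can be gathered into a frame-major layout of shape (n_frames, n_atoms, 3), which corresponds to the "fac" (frame, atom, coordinate) order that MemoryReader uses by default. The helper names stack_conformers and shape, and the use of plain nested lists as input, are assumptions made for illustration; in practice the result would be turned into a numpy array before being handed to MemoryReader.

```python
def stack_conformers(conformers):
    """Return coordinates as nested lists shaped (n_frames, n_atoms, 3).

    `conformers` is any iterable of per-conformer coordinate sets, each an
    iterable of (x, y, z) triples. Raises ValueError on inconsistent input.
    """
    frames = [[[float(c) for c in xyz] for xyz in conf] for conf in conformers]
    if not frames:
        raise ValueError("need at least one set of coordinates")
    n_atoms = len(frames[0])
    for i, frame in enumerate(frames):
        # Every frame must describe the same atoms, in the same order.
        if len(frame) != n_atoms:
            raise ValueError(
                f"frame {i} has {len(frame)} atoms, expected {n_atoms}"
            )
        if any(len(xyz) != 3 for xyz in frame):
            raise ValueError(f"frame {i} has a position that is not (x, y, z)")
    return frames


def shape(frames):
    """Shape of a validated result from stack_conformers."""
    return (len(frames), len(frames[0]), 3)


# A single set of coordinates is still a (one-frame) trajectory.
single = stack_conformers([[(0, 0, 0), (1, 0, 0)]])

# Two conformers of the same two-atom molecule give two frames.
multi = stack_conformers([
    [(0, 0, 0), (1, 0, 0)],
    [(0, 0, 1), (1, 0, 1)],
])
```

Treating the single-conformer case as a one-frame trajectory is what lets the same code path serve both cases.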

@lunamorrow
Collaborator Author

> I'm guessing it can store coordinates for each conformer on the same molecule object

Yes, it appears so. I will double-check their API to be safe.

> Yes I would suggest doing the Parser before,

Great, I'll get going on that first then.

> To clarify this further, @lunamorrow: by trajectory here we just mean "any set of coordinate data", which can be present in ANY format, not just one with more than one frame or a traditional MD format like xtc. For example, using the MemoryReader you can make a trajectory from a raw numpy array. You will, conceptually at least, do the same, but after extracting the data from OpenBabel.

Thanks for the clarification @hmacdope! I didn't know you could just feed in a numpy array too, that is really cool.

@lunamorrow
Collaborator Author

The Parser component (which converts atom attributes) of the conversion from OBMol to Universe is done, so I am now starting on the Reader to convert the positions and trajectory.

@lunamorrow
Collaborator Author

All done, as per #16

@lunamorrow lunamorrow reopened this Aug 2, 2024
@lunamorrow
Collaborator Author

Oops, closed the wrong issue.


3 participants