Request for symmetries #15
Comments
Thanks for the kind words - do let me know of any other issues you run into or functionality you would like.
I thought about this, but am not sure how rigid/fixed this information is. I could add a better error message to
You are correct in saying that PDB headers contain the symmetry information, but this information is also in the mmCIF dictionary. See for example https://pdb101.rcsb.org/learn/guide-to-understanding-pdb-data/biological-assemblies#Anchor-Biol. You could get the values using something like:
    using BioStructures
    downloadpdb("3C70", file_format=MMCIF)
    d = MMCIFDict("3C70.cif")
    d["_pdbx_struct_oper_list.matrix[1][1]"] # etc
If you want your parser to work on all PDB entries that have that information, you could parse from the mmCIF dictionary. The primacy of the mmCIF file is one reason BioStructures doesn't do any PDB header parsing.
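A minimal sketch of the dictionary-based approach described above, not part of BioStructures itself. It assumes every value in the dictionary is a vector of strings with one entry per operator, and that the `_pdbx_struct_oper_list.id`, `matrix[i][j]` and `vector[i]` items are all present. The helper names `struct_opers` and `apply_oper` are hypothetical, and the demo dictionary holds made-up values rather than data from 3C70.

```julia
# Hypothetical helper: collect the operators in _pdbx_struct_oper_list as
# named tuples (id, R, t). `d` can be any dictionary-like object that maps
# String keys to Vector{String} values (assumed to include MMCIFDict).
function struct_opers(d)
    prefix = "_pdbx_struct_oper_list."
    ids = d[prefix * "id"]
    # Parse one column of the category into Float64 values
    col(key) = parse.(Float64, d[prefix * key])
    m = [col("matrix[$(i)][$(j)]") for i in 1:3, j in 1:3]
    v = [col("vector[$(i)]") for i in 1:3]
    return [(id = ids[k],
             R = [m[i, j][k] for i in 1:3, j in 1:3],
             t = [v[i][k] for i in 1:3])
            for k in eachindex(ids)]
end

# Hypothetical helper: apply x' = R * x + t to a 3xN coordinate matrix
apply_oper(op, coords) = op.R * coords .+ op.t

# Made-up stand-in for a parsed file: operator 1 is the identity, operator 2
# is a 180 degree rotation about z followed by a shift of 10 along x.
demo = Dict{String, Vector{String}}("_pdbx_struct_oper_list.id" => ["1", "2"])
for i in 1:3, j in 1:3
    first_val = (i == j) ? "1.0" : "0.0"
    second_val = (i != j) ? "0.0" : ((i == 3) ? "1.0" : "-1.0")
    demo["_pdbx_struct_oper_list.matrix[$(i)][$(j)]"] = [first_val, second_val]
end
for i in 1:3
    shift = (i == 1) ? "10.0" : "0.0"
    demo["_pdbx_struct_oper_list.vector[$(i)]"] = ["0.0", shift]
end

ops = struct_opers(demo)
coords = [1.0 4.0; 2.0 5.0; 3.0 6.0]   # two atoms, one per column
moved = apply_oper(ops[2], coords)     # the two atoms after operator 2
```

With BioStructures loaded, the same helper could presumably be called as `struct_opers(MMCIFDict("3C70.cif"))`, and `coordarray` should give coordinates in the 3xN layout that `apply_oper` expects; both points are assumptions to check against the installed version.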
Thankfully, if you update to BioStructures v0.8.0 you will be able to use recently-added transformation functions to help with this. See the docstrings for If you do come up with something that works, let me know and I can add an example/explanation to the documentation.
Thank you for all the information! I'll update and begin working with the mmCIF data, and have a look at the transformation functionality. Thanks again for a great library!
This is a great library and I'm already using it lots. I would like to evaluate positions of a protein's atoms with the full expression of symmetry in the complete unit cell or entire crystal. I can't seem to find it in the documentation for this library, nor in the code.
Expected Behavior
I would like to see documented examples of how to retrieve the atoms' positions along with all the symmetries of those positions.
Current Behavior
I've written my own basic parser (below), but it only works for PDB files. Molecules in the PDB can often be downloaded in either PDB format or mmCIF format, but some appear to be downloadable only as mmCIF, e.g. "6PEM". In these cases, downloadpdb throws an exception unless the mmCIF format is specified.
Possible Solution / Implementation
Perhaps two functions could be added to the library: one that tells the user which file types a protein can be downloaded in, and another that lets the user read symmetries regardless of the file type. However, there may be more natural solutions that integrate better with the BioStructures.Model (e.g. adding a field for parsed symmetries). I would be happy with anything that supported both formats and allowed me to ultimately get all the atoms in the unit cell or entire crystal.
By way of example of the second function, the code that I implemented as a workaround follows. It assumes that you have already downloaded a PDB file to a pdb_cache_directory, and it only works with .pdb format files.
Your Environment