-
Notifications
You must be signed in to change notification settings - Fork 114
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
pdb_selaltloc removes residues with (only) alternate locations #153
Comments
Thanks for this @mgiulini |
thank you @joaomcteixeira |
Hi, sorry for the delay in addressing this issue. I have been a bit overwhelmed on other fronts. I will address it in the next few weeks. It seems something that needs to be inspected carefully. |
hi Joao! no worries, I could actually try to give a look into it by myself, as I would like to use |
This is a serious issue that should be solved.
|
I'll look into it! |
Go for it. Look carefully at the way @JoaoRodrigues and I designed the tests. I suggest you first write the tests for this new case and then try to correct the code for that. Be sure first that the PDB is formatted well. Some issues with PDBs are unsolvable because the PDB itself is wrongly created. |
I understood the problem, but I think it would take ages for me to solve it without breaking the code, as the algorithm is very complicated. the problem occurs when pdb-tools/pdbtools/pdb_selaltloc.py Lines 302 to 318 in 2a070bb
You can clearly see from the code from the code that if the input dictionary
according to the function, the first key (' ') is selected and the others are simply ignored. Possible solutions could be:
I tried the second option, but it broke the tests.. |
I would say the first option sounds better. No problem accommodating the Also, feel free to adventure yourself rewriting the inner code if you feel so 😈 . As long as the already implemented test cases pass and the API itself is kept. But first, i suggest trying the if-clause to see what happens. Good luck and many thanks |
Thanks for all this input Marco. Addressing it in steps: 1)
I revisited the tests, and they are correct to my knowledge on altlocs.
In this way, 2) There are several errors associated with this PDB when using This PDB has an My question: Is this a correct altloc formatting? @JoaoRodrigues @amjjbonvin need your help with this. If yes, I will need to rewrite the script considerably. It's not a problem, but I want to double-check with you before doing it. Cheers! |
This is a weird one… But if it comes from the PDB then we should ideally handle it.
… My question: Is this a correct altloc formatting? @JoaoRodrigues <https://github.com/JoaoRodrigues> @amjjbonvin <https://github.com/amjjbonvin> need your help with this.
|
As usual, I went to bed and woke up thinking about it. I am tempted to say it is complicated (or even impossible) to cover this scenario in the current I am tempted to rewrite the whole script and return it to something similar to the original implementation: first collect, then flush; contrarily, to flush-on-the-fly as it is now. Obviously, considering all the new tests and edge cases encountered in the last years that contributed to this change in the algorithm. I will keep you posted. I will work on this today. |
Hi there! not sure I agree about this: ILE 25 does not show any alternate locations, right? in my opinion it should be kept in the pdb by
have you checked my solution in #154? probably not super elegant, but it might be a good starting point |
Is there? Or are both LEU25 and ILE25 at the same position in space? This calls for some visual inspection
… Hi there! not sure I agree about this: ILE 25 does not show any alternate locations, right? in my opinion it should be kept in the pdb by pdb_selaltloc, since in this case there's probably only a numbering issue..there're examples where two
|
It is there indeed (ILE25 and LEU25 are almost superimposed), but in the original file https://files.rcsb.org/view/3U7T.pdb ILE25 has the alternate location label A, so basically there are AILE and BLEU at position 25. It is in this case that it makes sense to select only one of them. |
We are talking about the test case The way I see it, pdb-tools/tests/data/dummy_altloc.pdb Lines 34 to 49 in 2a070bb
It is the same as for
Yes, I saw. Thanks. But it's not clean in the sense of the implemented algorithm because it is a patch code trying to overcome a problem at the |
here is the question: should we consider the simple space
sure, as you wish |
I think there's some confusion about the heterogeneity possible in PDB
files. Altlocs, specifically and by design, are meant to represent multiple
instances of the same atom (same chain, resnum, inscode, atom name) with
different occupancies. Indeed, the spec says that if there are multiple
altlocs, they all should have non-blank altloc letters, but this is clearly
not the case in several files in the wild.
In this case specifically that you mention Marco, I would consider these
two instances of the same residue. Basically, my guess is that the authors
couldn't figure out it this is a ILE or a LEU and decided to put both
possibilities in.
My solution for this is to treat those ILE and LEU as different
conformations of the same residue (they do have partial occupancies that
add to 1) and simply pick the one with highest occupancy. If you abuse the
format, well, tough luck.. We should identify residues as (chain, resnum,
inscode), which is what the spec says.
I'll chat with João offline about a good implementation.
|
Hi Joao, yes, your solution makes sense..if the user submits a pdb with two amino acids having same chain-resnum..well, there's nothing you can do about it |
We are having some discussions on slack, but this one is relevant to register here: In the case PDB @mgiulini sent. For example, for
|
Pick the first.
… Message ID: ***@***.***>
|
Simply select the 1st one
|
Describe the bug
pdb_selaltloc removes residues in which all atoms show alternate locations
To Reproduce
Expected behavior
pdb_selaltloc
should not delete the residue but rather pick one of the two available alternate locationsDesktop (please complete the following information):
pdb4xoj.pdb.zip
The text was updated successfully, but these errors were encountered: