Improve lovd_getVariantInfo() and lovd_fixHGVS() #574
Merged
NOTE: This is test-driven development; many of these tests currently fail. The goal of this branch is to improve the function so that the tests no longer fail.
To make getVariantInfo simpler and easier to use, the two regular expressions have been merged into one. Secondly, priority is given to finding the positions, so that variants are sorted correctly even if the variant as a whole does not follow HGVS syntax and is therefore ambiguous or implausible. Thirdly, more warning messages are returned to help the user understand where the problems in their variant description lie.
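The positions-first idea described above can be illustrated with a minimal Python sketch. It is not the LOVD implementation (which is PHP); the function name `positions_first`, the two patterns, the returned keys, and the warning texts are all assumptions made for illustration only.

```python
import re

# Hypothetical single pattern: positions are captured first, and whatever
# follows is kept as "rest" so it can be judged separately.
PATTERN = re.compile(
    r"^(?P<prefix>[cgmn])\.(?P<start>\d+)(?:_(?P<end>\d+))?(?P<rest>.*)$"
)
# Hypothetical list of suffixes this sketch considers well-formed.
KNOWN = re.compile(r"^(?:[ACGT]>[ACGT]|del|dup|inv|ins[ACGT]+|delins[ACGT]+)$")


def positions_first(variant):
    """Return positions plus warnings, even when the rest is not valid."""
    match = PATTERN.match(variant)
    if match is None:
        return None  # No positions found at all.
    start = int(match["start"])
    end = int(match["end"]) if match["end"] else start
    warnings = []
    if start > end:
        # Still report usable positions, but tell the user what was wrong.
        start, end = end, start
        warnings.append("positions were given in the wrong order")
    if not KNOWN.match(match["rest"]):
        warnings.append("could not interpret %r" % match["rest"])
    return {"position_start": start, "position_end": end, "warnings": warnings}
```

With this sketch, `positions_first("g.100_200foo")` still yields positions 100 and 200 together with one warning, so a malformed description can be sorted by position before it is fixed.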
This commit fixes the bugs in getVariantInfo() and adds several features. The function can now handle repeats, and recognises when suffixes are rightly given to variants of a type other than ins or delins. Most importantly, getVariantInfo() can now return meaningful warning and error messages when given poorly formatted variants. In addition, many cases have been added to the getVariantInfo test case, so that even the strangest formats are tested.
ifokkema
requested changes
Nov 22, 2021
Went through part of the code, not done yet, but enough for today 😝
ifokkema changed the title from "Improve/get variant info" to "WIP: Improve lovd_getVariantInfo() and lovd_fixHGVS()" on Nov 24, 2021
ifokkema
requested changes
Nov 24, 2021
Just two minor things that I noticed.
- Testcase has been extended
- Multiple fixes have been added to fixHGVS
ifokkema
reviewed
Nov 29, 2021
g.?_(?_?)insAAAA can still be fixed.
These are constructed from VCF files and are translated into substitutions. We already recognize the NN>N and N>NN types. Now we also recognize variants where either the REF or the ALT was left empty, like N>. or even .>N, both of which are actually bugs in the VCF file generator.
... based on a start position (given in the format of lovd_getVariantInfo()) and a variant length (1 being the minimum).
These are variants taken straight from VCF fields, like g.100A>AT that should be g.100_101insT, or even more complex variants like g.100ATGA>AA that should be g.101_102del. Empty ALTs are supported, so g.100A>. becomes g.100del. However, empty REFs are not supported as it's unclear where the insertion should take place. Either way, an empty ALT is not valid in a VCF file.
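The conversion described above can be sketched by trimming the bases that REF and ALT share and classifying what remains. This is a minimal Python illustration, not the code from this pull request; the function name `vcf_style_to_hgvs` and the pattern are hypothetical, and the sketch assumes genomic (g.) descriptions only and does no 3' shifting or duplication detection.

```python
import re

# Hypothetical pattern for VCF-style descriptions such as g.100A>AT or g.100A>.
VCF_STYLE = re.compile(r"^g\.(\d+)([ACGT]+|\.)>([ACGT]+|\.)$")


def vcf_style_to_hgvs(variant):
    """Rewrite a VCF-style REF>ALT description as ins, del, subst or delins."""
    match = VCF_STYLE.match(variant)
    if match is None:
        raise ValueError("not a VCF-style description: %r" % variant)
    pos = int(match.group(1))
    ref = "" if match.group(2) == "." else match.group(2)
    alt = "" if match.group(3) == "." else match.group(3)
    if not ref:
        # Empty REF: unclear where the insertion should take place.
        raise ValueError("empty REF is not supported")
    if ref == alt:
        raise ValueError("REF and ALT are identical")
    # Trim shared leading bases, moving the start position along.
    while ref and alt and ref[0] == alt[0]:
        ref, alt, pos = ref[1:], alt[1:], pos + 1
    # Trim shared trailing bases; the start position is unaffected.
    while ref and alt and ref[-1] == alt[-1]:
        ref, alt = ref[:-1], alt[:-1]
    if not ref:
        # Nothing is deleted, so ALT is inserted between pos - 1 and pos.
        return "g.%d_%dins%s" % (pos - 1, pos, alt)
    end = pos + len(ref) - 1
    span = str(pos) if end == pos else "%d_%d" % (pos, end)
    if not alt:
        return "g.%sdel" % span
    if len(ref) == 1 and len(alt) == 1:
        return "g.%d%s>%s" % (pos, ref, alt)
    return "g.%sdelins%s" % (span, alt)
```

For g.100ATGA>AA, the shared leading A moves the start to 101 and the shared trailing A is dropped, which leaves TG deleted and nothing inserted, hence g.101_102del.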
These now throw an ENOTSUPPORTED as long as they don't match the regex; otherwise they already threw a WSUFFIXGIVEN. We currently don't prevent this as that's for a later step when we decide to properly support them. At least like this, we'll get positions and we can recognize them and therefore allow these variants to be entered in LOVD.
The first is extracted and processed; then the type is overwritten by ";" and an ENOTSUPPORTED is added. Possible warnings that can occur from correct HGVS descriptions of combined variants are removed. This suffices for now.
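A rough Python sketch of this handling of combined variants follows. It is not the LOVD code: `parse_simple` is a hypothetical stand-in for the single-variant parser, the allele pattern and the returned keys are assumptions, and the pruning of warnings mentioned above is left out.

```python
import re

# Hypothetical pattern for allele notation such as g.[100A>T;200_201del].
ALLELE = re.compile(r"^([cgmn])\.\[([^;\]]+);[^\]]+\]$")
# Hypothetical, deliberately tiny single-variant pattern.
SIMPLE = re.compile(r"^(\d+)(?:_(\d+))?(del|dup|ins[ACGT]+|[ACGT]>[ACGT])$")


def parse_simple(body):
    """Stand-in single-variant parser; returns None when nothing matches."""
    match = SIMPLE.match(body)
    if match is None:
        return None
    start = int(match.group(1))
    end = int(match.group(2)) if match.group(2) else start
    operation = match.group(3)
    variant_type = "subst" if ">" in operation else operation[:3]
    return {"position_start": start, "position_end": end,
            "type": variant_type, "warnings": {}, "errors": {}}


def parse_combined(variant):
    """Process only the first variant, then mark the whole as unsupported."""
    match = ALLELE.match(variant)
    if match is None:
        return None
    info = parse_simple(match.group(2))
    if info is None:
        return None
    info["type"] = ";"  # Overwrite the type of the first variant.
    info["errors"]["ENOTSUPPORTED"] = "Combined variants are not supported yet."
    return info
```

The positions of the first variant survive, so a description like g.[100A>T;200_201del] can still be recognized and sorted by position 100 while being flagged as unsupported.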
Added tests for lovd_fixHGVS() that were recently added for lovd_getVariantInfo().
Positions in the 3' UTR weren't handled well yet.
Removed some unneeded code, re-sorted two tests, re-added one test.
ifokkema changed the title from "WIP: Improve lovd_getVariantInfo() and lovd_fixHGVS()" to "Improve lovd_getVariantInfo() and lovd_fixHGVS()" on Feb 10, 2022
ifokkema
approved these changes
Feb 10, 2022
Big update on getVariantInfo:
Related to #550, #580, #581.
Closes #566.