Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

parsing converts QUAL == zero to period, inconsistent numeric type breaks downstream filtering #329

Open
katrinakalantar opened this issue May 3, 2021 · 1 comment

Comments

@katrinakalantar
Copy link

Hello,

When parsing an input .vcf file with a QUAL value = 0.0, i.e.: MN908947.3 12739 . T TA 0.0 PASS DP=3.0;DPS=3.0,0.0 GT:GQ 1:0

the float QUAL value is converted to a . which breaks float comparison filters in downstream applications (specifically working with artic minion pipeline where the above line is parsed to create the following: MN908947.3 12739 . T TA . PASS DP=3.0;DPS=3.0,0.0;Pool=nCoV-2019_2 GT:GQ 1:0)

It appears that this occurs as a function of this logic: https://github.com/jamescasbon/PyVCF/blob/master/vcf/parser.py#L696

Would it be reasonable to require consistent types for the QUAL field when parsing (all numeric)?

@cjw85
Copy link

cjw85 commented Jun 24, 2021

I don't think it correct to require that all QUAL fields be numeric, there is no constraint in the VCF specification to this end. A QUAL of . is perfectly valid for a single record, and stands independent of the content of other records.

It is certainly incorrect however that QUALs of 0.0 are being coerced to Boolean false in that code and then replaced with the missing value ., which has a very different meaning to zero.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants