Hello,
I was recently trying to use Genotyphi for subtyping S. Typhi genomes and ran into some issues integrating it with the other tools in a workflow, so I forked the code and refactored it, both to make it compatible with Python 3.7+ and to understand it a bit better.
I hope the changes help the project.
Thanks,
Peter
Changes:
- `subprocess` piping of output from `samtools mpileup` to `bcftools call` to reduce disk IO
- ... `\r` characters) ... `$ genotyphi --args ... > genotyphi-output.txt` or pipe it into other programs

Also, one thing I noticed is that multiple lists are used to describe linked data where one dict of tuples could be used instead, e.g.
The above could be converted to the following:
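As a minimal sketch of that conversion, here are both layouts side by side. Only `loci` and `snp_markers` are names taken from this description; `snp_alleles` and `groups` are assumed list names, and every position, allele, and group label below is a made-up placeholder rather than a real genotyping marker.

```python
# Before: three parallel lists linked only by index position.
# All values are illustrative placeholders, not real markers.
loci = [1001, 2002, 3003]
snp_alleles = ["T", "A", "C"]    # assumed list name
groups = ["0.1", "1.2", "2.3"]   # assumed list name

# After: one dict keyed by position; each value is an (allele, group) tuple.
snp_markers = {
    1001: ("T", "0.1"),
    2002: ("A", "1.2"),
    3003: ("C", "2.3"),
}

# Both layouts hold the same information.
assert snp_markers == dict(zip(loci, zip(snp_alleles, groups)))
```

With the dict, adding or correcting a marker touches a single line, so the position, allele, and group cannot drift out of sync the way three separately edited lists can.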
This change would make it easier to add or update loci, groups, and alleles. Also, when checking whether a variant from a VCF file is one of the genotyping markers, searching for the key in a `snp_markers` dict would take constant time rather than linear time with `loci.index(variant_location)`, e.g.
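A rough sketch of what such a lookup might look like is below. The helper name `genotype_for` and its arguments are hypothetical, and the marker values are placeholders rather than real genotyping markers.

```python
# Placeholder markers: position -> (allele, group). Not real data.
snp_markers = {
    1001: ("T", "0.1"),
    2002: ("A", "1.2"),
    3003: ("C", "2.3"),
}


def genotype_for(variant_location, alt_base):
    """Return the group label if the variant matches a marker, else None."""
    # Hash lookup: constant time on average, unlike list.index(),
    # which scans the list and raises ValueError when nothing matches.
    marker = snp_markers.get(variant_location)
    if marker is None:
        return None
    allele, group = marker
    return group if alt_base == allele else None


print(genotype_for(2002, "A"))  # 1.2
print(genotype_for(2002, "G"))  # None
print(genotype_for(9999, "A"))  # None
```

Using `dict.get` also removes the need for a `try`/`except ValueError` (or a separate `in` check) around `loci.index(...)` for positions that are not markers.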