decompose

Brent Pedersen edited this page Jun 17, 2022 · 2 revisions

echtvar requires each variant record to have a single alternate allele. Many variant callers output multi-allelic variants, with multiple alternate alleles on one line. For example, a record with REF `A` and ALT `G,T` must be split into two records, one for `A>G` and one for `A>T`. In gnomAD, more than 10% of variants are multi-allelic; however, the distributed files are already decomposed, so there is only ever one alternate allele per VCF record.

Users can decompose and normalize their variants for use by echtvar and other tools with the following command:

bcftools norm -m - -w 10000 -f $fasta -O b -o $clean_bcf $input_vcf
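As a sanity check after decomposition, it can help to confirm that no record still carries more than one alternate allele. The helper below is a minimal sketch, not part of echtvar or bcftools: `count_multiallelic` is a hypothetical name, and it assumes uncompressed, tab-delimited VCF text on standard input, where the ALT field is the fifth column and multiple alleles are comma-separated.

```shell
# Hypothetical helper: count VCF records that still have more than one ALT allele.
# Reads uncompressed VCF text on stdin and prints the number of such records.
count_multiallelic() {
    # Skip header lines (starting with '#'); ALT is column 5, and a comma
    # there means the record lists several alternate alleles.
    awk -F'\t' '!/^#/ && $5 ~ /,/ { n++ } END { print n + 0 }'
}
```

Assuming `$clean_bcf` is the output of the command above, `bcftools view $clean_bcf | count_multiallelic` should print `0` if every record was split.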

More information about this process can be found in this excellent paper: https://academic.oup.com/bioinformatics/article/31/13/2202/196142