Merging gvcf with GLnexus introduces non-zero heterozygous PL in hemizygous PAR #811

ASLeonard · 2024-04-23T13:25:51Z

This is a follow up from an email thread with @pichuan, where DeepVariant correctly produces only hemizygous genotypes in regions indicated as PAR, but through GLnexus merging and later imputation can gain "impossible" heterozygous genotypes.

The probably mainly seems to originate in GLnexus merging, where uncalled alleles (e.g., for "GT:DP:AD:GQ:PL:RNC" "./.:5:5,0:0:14,0,104:II" where I = gVCF input site is non-called) has a non-zero likelihood of being heterozygous according the PL tag, which is later imputed to a "0/1" genotype. But this also occurs for samples called as reference (0/0) but with non-zero PL tags for 0/1.

For example, the third sample gains a heterozygous genotype at site Y:6899016 after imputation, and has the gvcf block

Y       6898989 .       C       <*>     0       .       END=6899117     GT:GQ:MIN_DP:MED_DP:PL  0/0:1:0:0:0,3,29

leads to the GLnexus merged GT

Y       6899016 Y_6899016_T_C   T       C       47      .      0/0:0:0,0:1:0,3,29:..

and then the imputed

Y       6899016 Y_6899016_T_C   T       C       47      .      0/1:1:0,1,0

variant calling with DeepVariant

Running with singularity image v1.6.1.
This was done with the "WGS" model, using --channels "insert_size" --include_med_dp --gvcf <gvcf> for make_examples and --haploid_contigs "X,Y" --par_regions_bed <PAR.bed> for postprocess.

merging with GLnexus

Running with singularity image v1.4.1-0-g68e25e5.
This uses a modified DeepVariant config, where there is no genotype revision (revise_genotypes: false).

/usr/local/bin/glnexus_cli \
        --dir $TMPDIR/GLnexus.DB \
        --config <deepvariant_preset_with_revise_genotypes_false> \
        --threads 4 \
        --mem-gbytes 20 \
        *.g.vcf |\
        bcftools view - |\
        bgzip -@ 8 -c > Yhap.Unrevised.vcf.gz

imputing with beagle4

This is where the "./." genotypes with non-zero het GL turn into "0/1".

java -jar -Xss25m -Xmx50G beagle.27Jan18.7e1.jar ne=100 gl=Yhap.Unrevised.vcf.gz out=Yhap.beagle4

The unrevised (glnexus output, with no hets) and imputed (beagle output, with hets) variants are here, along with the gvcfs. I can provide the alignments, references, etc. if you need as these samples are already public.
Y_haploid_vcf.tar.gz

Best,
Alex

The text was updated successfully, but these errors were encountered:

pichuan · 2024-05-06T23:56:22Z

Hi @ASLeonard ,
A quick update: We think this is because we need to adjust our PL in our gVCF as well.
I'm working on a code change, but it might take a bit longer than I thought.
I just want to let you know that this is still in my queue!

pichuan · 2024-07-18T05:44:45Z

Update: Internally our team has been making other improvements to postprocess_variants, so I've not been actively looking into this issue. I'll plan to resume in the next few weeks.

pichuan · 2024-08-19T23:21:40Z

Update: I've done some investigation, and I think we need to change the logic in make_examples, specifically somewhere around here:

https://github.com/google/deepvariant/blob/r1.6.1/deepvariant/variant_caller.py#L207-L230

I've done a prototype but the behavior isn't quite what I expected yet. I'll continue to work on this.

pichuan self-assigned this Apr 23, 2024

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Merging gvcf with GLnexus introduces non-zero heterozygous PL in hemizygous PAR #811

Merging gvcf with GLnexus introduces non-zero heterozygous PL in hemizygous PAR #811

ASLeonard commented Apr 23, 2024

pichuan commented May 6, 2024

pichuan commented Jul 18, 2024

pichuan commented Aug 19, 2024

Merging gvcf with GLnexus introduces non-zero heterozygous PL in hemizygous PAR #811

Merging gvcf with GLnexus introduces non-zero heterozygous PL in hemizygous PAR #811

Comments

ASLeonard commented Apr 23, 2024

variant calling with DeepVariant

merging with GLnexus

imputing with beagle4

pichuan commented May 6, 2024

pichuan commented Jul 18, 2024

pichuan commented Aug 19, 2024