DeepVariant 1.1.0
The v1.1 release introduces DeepTrio, which uses a model specifically trained to call a mother-father-child trio or parent-child duo. DeepTrio has superior accuracy compared to DeepVariant. Pre-trained models are available for Illumina WGS, Illumina exome, and PacBio HiFi.
In addition, DeepVariant v1.1 contains the following improvements:
- Accuracy improvements on PacBio, reducing Indel errors by ~21% on the case study. This is achieved by adding an input channel which specifically encodes haplotype information, as opposed to only sorting by haplotype in v1.0. The flag is --add_hp_channel, which is enabled by default for PacBio.
- Speed improvements for long read data by more efficient handling of long CIGAR strings.
- New functionality to add detailed logs for runtime of make_examples by genomic region, viewable in an interactive visualization.
- We now fully withhold HG003 from all training, and report all accuracy evaluations on HG003. We continue to withhold chromosome 20 from training in all samples.
New optional flags to increase speed:
A team at Intel has adapted DeepVariant to use the OpenVINO toolkit, which further accelerates TensorFlow applications. This speeds up the call_variants stage by ~25% for any model when run in CPU mode on an Intel machine. Runs with OpenVINO have the same accuracy as runs without it, and their outputs are nearly identical. Repeated runs with OpenVINO are fully reproducible.
To use OpenVINO, add the following flag to the DeepVariant command:
--call_variants_extra_args "use_openvino=true"
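As a rough sketch of where this flag sits in a full invocation, the script below assembles a Docker-based run_deepvariant command and prints it as a dry run. The input and output directories, the file names (reference.fasta, sample.bam, sample.vcf.gz), the shard count, and the RUN switch are all hypothetical placeholders chosen for illustration. It also assumes that the image being used was built with OpenVINO support, which is worth verifying for your setup.

```shell
#!/bin/sh
set -eu

# Hypothetical locations and version tag; replace with your own.
INPUT_DIR="${INPUT_DIR:-$PWD/input}"
OUTPUT_DIR="${OUTPUT_DIR:-$PWD/output}"
BIN_VERSION="1.1.0"

# Builds the DeepVariant command. Any arguments passed to this function are
# used as a prefix: pass `echo` to print the command instead of running it.
dv_command() {
  "$@" docker run \
    -v "${INPUT_DIR}:/input" \
    -v "${OUTPUT_DIR}:/output" \
    "google/deepvariant:${BIN_VERSION}" \
    /opt/deepvariant/bin/run_deepvariant \
    --model_type=WGS \
    --ref=/input/reference.fasta \
    --reads=/input/sample.bam \
    --output_vcf=/output/sample.vcf.gz \
    --num_shards=4 \
    --call_variants_extra_args "use_openvino=true"
}

# Dry run by default; set RUN=1 to actually launch the container.
if [ "${RUN:-0}" = "1" ]; then
  dv_command
else
  dv_command echo
fi
```

Printing the command first makes it easy to confirm that the OpenVINO setting is passed through --call_variants_extra_args before committing to a long run. Note that the dry-run output drops the quotes around use_openvino=true, which is harmless here because the value contains no spaces.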
We thank Intel for their contribution and acknowledge the extensive work their team put in, captured in (#363).