4.0.10.1
This is a small release that improves the calculation of the MQ (mapping quality) annotation, which provides an estimate of the overall mapping quality of reads supporting a variant call. It also introduces a number of experimental improvements to the CNV workflows, as well as a bug fix to LocusWalkerSpark.
As usual, a docker image for this release can be downloaded from https://hub.docker.com/r/broadinstitute/gatk/
Full list of changes in this release:
- Improve MQ calculation accuracy (#4969)
  - Change raw MQ to a tuple of (sumSquaredMQs, totalDepth) for better accuracy where there are lots of uninformative reads or called single-sample variants with homRef genotypes.
  - Note that adopting this change in a pipeline requires updating GenomicsDBImport and GenotypeGVCFs to this version at the same time.
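The tuple representation can be illustrated with a short sketch. It assumes the final MQ is the root mean square of the contributing mapping qualities, i.e. sqrt(sumSquaredMQs / totalDepth), and that raw tuples from several samples are combined by component-wise addition. The function names and sample values are hypothetical and are not taken from the GATK code.

```python
import math
from typing import Iterable, Tuple

# Raw MQ as described above: (sumSquaredMQs, totalDepth)
RawMQ = Tuple[int, int]


def combine_raw_mq(raw_values: Iterable[RawMQ]) -> RawMQ:
    """Add the tuples component-wise (assumed combination rule)."""
    total_sq = 0
    total_depth = 0
    for sum_sq, depth in raw_values:
        total_sq += sum_sq
        total_depth += depth
    return total_sq, total_depth


def finalize_mq(raw: RawMQ) -> float:
    """Root-mean-square mapping quality; NaN when no reads contributed."""
    sum_sq, depth = raw
    if depth == 0:
        return float("nan")
    return math.sqrt(sum_sq / depth)


# Hypothetical data: one sample with ten MQ-60 reads,
# and one sample that contributes no reads to the annotation.
sample_a = (10 * 60 ** 2, 10)
sample_b = (0, 0)

combined = combine_raw_mq([sample_a, sample_b])
mq = finalize_mq(combined)

# Dividing the same sum of squares by a larger, unrelated depth
# (here a made-up 30) would understate the mapping quality.
understated = math.sqrt(combined[0] / 30)
```

Carrying the depth alongside the sum of squares keeps the denominator consistent with the reads that actually contributed to the numerator, which is the accuracy issue the change targets.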
- Updated SimpleGermlineTagger and the somatic CNV experimental post-processing workflow with several experimental changes that improve precision results, and expand possible evaluations, of GATK CNV (#5252)
  - New script combine_tracks.wdl for post-processing somatic CNV calls. This WDL performs two operations:
    - Increases precision by removing:
      - Germline segments. As a result, the WDL requires the matched normal segments.
      - Areas of common germline activity or error from other cancer studies.
    - Converts the tumor model seg file to the same format as AllelicCapSeg, which can be read by ABSOLUTE. This is currently done inline in the WDL.
      - This is not a trivial conversion, since each segment must be called whether it is balanced or not (MAF =? 0.5). The current algorithm relies on hard filtering and may need updating pending evaluation.
      - For more information about AllelicCapSeg and ABSOLUTE, see:
        - Carter et al. Absolute quantification of somatic DNA alterations in human cancer, Nat Biotechnol. 2012 May; 30(5): 413–421
        - https://software.broadinstitute.org/cancer/cga/absolute
        - Brastianos, P.K., Carter S.L., et al. Genomic Characterization of Brain Metastases Reveals Branched Evolution and Potential Therapeutic Targets (2015) Cancer Discovery PMID:26410082
  - Changes to GATK tools to support the above:
    - SimpleGermlineTagger now uses reciprocal overlap in addition to breakpoint matching when determining a possible germline event. This greatly improved results in areas near centromeres.
    - Added tool MergeAnnotatedRegionsByAnnotation. This simple tool will merge genomic regions (specified in a tsv) when the given annotations (columns) contain exactly matching values in neighboring segments and the segments are within a specified maximum genomic distance.
  - New scripts multi_combine_tracks.wdl and aggregate_combine_tracks.wdl, which run combine_tracks.wdl on multiple pairs and combine the results into one seg file for easy consumption by IGV.
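The two tool changes above can be sketched roughly as follows. This is an illustration only, not the GATK implementation: the `Segment` type, the function names, the inclusive 1-based coordinates, and the definition of genomic distance as the gap between one segment's end and the next segment's start are all assumptions. The merge function also assumes its input is already sorted by contig and start.

```python
from typing import Dict, List, NamedTuple, Sequence


class Segment(NamedTuple):
    contig: str
    start: int  # 1-based, inclusive (assumed convention)
    end: int    # 1-based, inclusive
    annotations: Dict[str, str]


def reciprocal_overlap(a: Segment, b: Segment) -> float:
    """Smaller of the two fractions: overlap / len(a) and overlap / len(b)."""
    if a.contig != b.contig:
        return 0.0
    overlap = min(a.end, b.end) - max(a.start, b.start) + 1
    if overlap <= 0:
        return 0.0
    len_a = a.end - a.start + 1
    len_b = b.end - b.start + 1
    return min(overlap / len_a, overlap / len_b)


def merge_by_annotation(
    segments: Sequence[Segment], keys: Sequence[str], max_distance: int
) -> List[Segment]:
    """Merge neighboring segments whose chosen annotations match exactly
    and whose gap is at most max_distance. Input must be sorted."""
    merged: List[Segment] = []
    for seg in segments:
        if merged:
            prev = merged[-1]
            same_values = all(
                prev.annotations.get(k) == seg.annotations.get(k) for k in keys
            )
            close = (
                seg.contig == prev.contig
                and seg.start - prev.end <= max_distance
            )
            if same_values and close:
                # Extend the previous segment instead of appending a new one.
                merged[-1] = prev._replace(end=max(prev.end, seg.end))
                continue
        merged.append(seg)
    return merged


# Hypothetical tumor and normal segments that half-overlap each other.
tumor = Segment("1", 101, 200, {"call": "+"})
normal = Segment("1", 151, 250, {"call": "+"})
overlap_fraction = reciprocal_overlap(tumor, normal)

# Hypothetical annotated regions, sorted by contig and start.
regions = [
    Segment("1", 1, 100, {"call": "+"}),
    Segment("1", 150, 300, {"call": "+"}),
    Segment("1", 301, 400, {"call": "0"}),
    Segment("2", 1, 50, {"call": "0"}),
]
merged_regions = merge_by_annotation(regions, keys=["call"], max_distance=100)
```

A tagger built on this idea would compare `overlap_fraction` against a threshold; the threshold the tool actually uses is not stated here. Unlike breakpoint matching, reciprocal overlap does not require segment ends to line up, which may be why it helps in regions where breakpoints are placed inconsistently.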
- LocusWalkerSpark: fix issue where intervals with no reads were being dropped (#5222)
  - This fixes the bug reported in #3823
- Added SparkTestUtils.roundTripThroughJavaSerialization() method for better serialization testing on Spark (#5257)
- Build system: set the same compiler flags for all gradle JavaCompile tasks (#5256)