4.0.10.1

@droazen released this 09 Oct 19:29
· 1149 commits to master since this release
4dd7ba8

This is a small release that improves the calculation of the MQ (mapping quality) annotation, which provides an estimate of the overall mapping quality of reads supporting a variant call. It also introduces a number of experimental improvements to the CNV workflows, as well as a bug fix to LocusWalkerSpark.

As usual, a Docker image for this release can be downloaded from https://hub.docker.com/r/broadinstitute/gatk/

Full list of changes in this release:

  • Improve MQ calculation accuracy (#4969)

    • Change raw MQ to a tuple of (sumSquaredMQs, totalDepth), which is more accurate at sites with many uninformative reads or for called single-sample variants with homRef genotypes.
    • Note that adopting this change in a pipeline requires updating GenomicsDBImport and GenotypeGVCFs to this version at the same time.
  • Updated SimpleGermlineTagger and the experimental somatic CNV post-processing workflow with several experimental changes that improve the precision of GATK CNV and expand the evaluations that can be run on it (#5252)

    • New script combine_tracks.wdl for post-processing somatic CNV calls. This WDL performs two operations:
      • Increases precision by removing:
        • Germline segments. As a result, the WDL requires the matched normal segments.
        • Areas of common germline activity or error from other cancer studies.
      • Converts the tumor model seg file to the same format as AllelicCapSeg, which can be read by ABSOLUTE. This is currently done inline in the WDL.
        • This is not a trivial conversion, since each segment must be called as balanced or not (i.e., whether MAF = 0.5). The current algorithm relies on hard filtering and may need updating pending evaluation.
        • For more information about AllelicCapSeg and ABSOLUTE, see:
          • Carter et al. Absolute quantification of somatic DNA alterations in human cancer, Nat Biotechnol. 2012 May; 30(5): 413–421
          • https://software.broadinstitute.org/cancer/cga/absolute
          • Brastianos, P.K., Carter S.L., et al. Genomic Characterization of Brain Metastases Reveals Branched Evolution and Potential Therapeutic Targets (2015) Cancer Discovery PMID:26410082
    • Changes to GATK tools to support the above:
      • SimpleGermlineTagger now uses reciprocal overlap in addition to breakpoint matching when determining a possible germline event. This greatly improved results in areas near centromeres.
      • Added tool MergeAnnotatedRegionsByAnnotation. This simple tool merges genomic regions (specified in a TSV) when the given annotations (columns) have identical values in neighboring segments and the segments are within a specified maximum genomic distance.
    • New scripts multi_combine_tracks.wdl and aggregate_combine_tracks.wdl which run combine_tracks.wdl on multiple pairs and combine the results into one seg file for easy consumption by IGV.
  • LocusWalkerSpark: fix issue where intervals with no reads were being dropped (#5222)

    • This fixes the bug reported in #3823
  • Added SparkTestUtils.roundTripThroughJavaSerialization() method for better serialization testing on Spark (#5257)

  • Build system: set the same compiler flags for all gradle JavaCompile tasks (#5256)
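
To illustrate the raw MQ change in #4969, the following is a minimal Python sketch of how a (sumSquaredMQs, totalDepth) tuple could be accumulated per sample, merged, and finalized. It is not GATK code: the function names are hypothetical, and it assumes the final MQ is the root mean square of the mapping qualities, i.e. sqrt(sumSquaredMQs / totalDepth), and that tuples from different samples are merged by element-wise addition.

```python
import math


def raw_mq(mapping_qualities):
    """Return a raw (sum_squared_mqs, total_depth) tuple for one sample's reads."""
    return (sum(q * q for q in mapping_qualities), len(mapping_qualities))


def combine_raw_mq(raw_values):
    """Merge per-sample raw tuples by element-wise addition (assumed merge rule)."""
    total_sq = sum(sq for sq, _ in raw_values)
    total_depth = sum(depth for _, depth in raw_values)
    return (total_sq, total_depth)


def finalize_mq(raw):
    """Root-mean-square mapping quality from a raw tuple; None when there are no reads."""
    sum_sq, depth = raw
    if depth == 0:
        return None
    return math.sqrt(sum_sq / depth)


# Hypothetical mapping qualities for two samples at the same site.
sample_a = [60, 60, 60, 60]
sample_b = [20, 60]

combined = combine_raw_mq([raw_mq(sample_a), raw_mq(sample_b)])
pooled_mq = finalize_mq(combined)

# Finalizing the merged tuple matches computing the RMS over all reads at once.
direct_mq = finalize_mq(raw_mq(sample_a + sample_b))
```

Because the depth travels with the sum of squares, the denominator in this sketch always counts exactly the reads that contributed to the numerator, which is the property the tuple representation is meant to provide.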