[DRAFT] Merge master to ah var store again [VS-1178] #8890

Draft
wants to merge 66 commits into base: ah_var_store

Commits on Oct 11, 2023

  1. d40a485
  2. Fixed Funcotator VCF output renderer to correctly preserve B37 contig names on output for B37 aligned files (#8539)
    jamesemery authored Oct 11, 2023
    423d106

Commits on Nov 3, 2023

  1. 2900e01

Commits on Nov 13, 2023

  1. Ultima.flow annotations.fix (#8442)

    * hmer ondel must have mon length
    
    * Revert "hmer ondel must have mon length"
    
    This reverts commit 7852871.
    
    * remove superfluous variant type condition
    
    * fix error message to actually reflect missing argument
    
    * fixed unittest to include variant type
    
    * Remove conflict
    dror27 authored Nov 13, 2023
    683eaa8
  2. Removes unnecessary and buggy validation check (#8580)

    * Additional fix + logging fixes
    * Added missing initialization
    ilyasoifer authored Nov 13, 2023
    e6e4dea

Commits on Nov 14, 2023

  1. 1dc7ee4

Commits on Nov 15, 2023

  1. 7a08754

Commits on Nov 26, 2023

  1. Add option to AnalyzeSaturationMutagenesis to keep disjoint mates (#8557)
    
    * Add option for keeping disjoint mates in ASM
    
    * Better name and fixing reports
    
    * Finish fixing report
    
    * Fix report name
    odcambc authored Nov 26, 2023
    fa3dfed

Commits on Nov 28, 2023

  1. New/Updated Flow Based Read tools (#8579)

    New Tool: GroundTruthScorer
    Update: FlowFeatureMapper
    dror27 authored Nov 28, 2023
    0da6409

Commits on Dec 6, 2023

  1. e37b344

Commits on Dec 8, 2023

  1. Add a native GATK implementation for 2bit references, and remove the dependency on the ADAM library (#8606)
    
    * Add a native GATK implementation for 2bit references, with comprehensive unit tests
    
    * For now, this is only hooked up to the Spark codepath, but it could easily be hooked up to ReferenceDataSource and the Walker codepath as well
    
    * Remove the dependency on the ADAM library, to resolve conflicts with future dependency upgrades
    droazen authored Dec 8, 2023
    bf24519
  2. Update dependencies to address security vulnerabilities, and add a security scanner to build.gradle (#8607)
    
    * Updated many GATK dependencies to address known security vulnerabilities
    
    * Added a security scanner to build.gradle
    
    * There are still some remaining vulnerabilities in GATK dependencies, but this eliminates most of them
    droazen authored Dec 8, 2023
    e2c5fab

Commits on Dec 9, 2023

  1. Update http-nio and wire its new settings (#8611)

    * Update http-nio and wire it so it's configured at startup along with GCS settings.
    lbergelson authored Dec 9, 2023
    3b8b5bf
  2. PrintFileDiagnostics for cram, crai and bai. (#8577)

    * New experimental tool to print out human readable file diagnostics for cram/crai/bai files.
    cmnbroad authored Dec 9, 2023
    5839cbd
  3. Allow GenomicsDBImport to connect to az:// files without interference (#8438)
    
    * GATK's lack of support for az:// URIs means that although GenomicsDB can
      natively read them, parts of the Java code crash when interacting with them
    * Adding --avoid-nio and --header arguments
      These allow disabling all of the Java interaction with the az:// links
      and simply passing them through to GenomicsDB
      This disables some safeguards but allows operating on files in Azure
    * Update GenomicsDB version to 1.5.1 for improved Azure support
    
    * There are no direct tests on Azure since we do not yet have any infrastructure
      to generate the necessary tokens. There is a disabled test which requires
      #8612 before we can enable it.
    
    ---------
    
    Co-authored-by: Nalini Ganapati <[email protected]>
    Co-authored-by: Nalini Ganapati <[email protected]>
    3 people authored Dec 9, 2023
    2ad4a3e

Commits on Dec 11, 2023

  1. e29cbc3
  2. Support for custom ploidy regions in HaplotypeCaller (#8609)

    To support variable ploidy in different regions, such as making haploid calls outside the PAR on chrX or chrY,
    there is now a --ploidy-regions flag. The -ploidy flag sets the default ploidy to use everywhere, and --ploidy-regions
    should be a .bed or .interval_list file whose "name" column contains the desired ploidy to use in that region
    when genotyping. Note that variants near a boundary may not have the matching ploidy, since the ploidy used is determined by the following precedence:
    
    * ploidy given in --ploidy-regions for all intervals overlapping the active region when calling your variant
      (with ties broken by using the largest ploidy); note that a ploidy interval need only overlap the active region to determine
      the ploidy of your variant, even if the end coordinate written for your variant lies outside the given region
    * ploidy given via the global -ploidy flag
    * ploidy determined by the default global built-in constant for humans (2).
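
The precedence above can be sketched roughly as follows. This is a minimal illustration, not GATK's implementation: the function name `resolve_ploidy`, the tuple-based interval representation, and the closed-interval overlap test are all assumptions made for the example.

```python
from typing import List, Optional, Tuple

# Hypothetical representation: (contig, start, end), treated as closed intervals.
Interval = Tuple[str, int, int]

# The built-in default for humans named in the commit message.
DEFAULT_PLOIDY = 2


def overlaps(a: Interval, b: Interval) -> bool:
    """True if two closed intervals on the same contig share at least one base."""
    return a[0] == b[0] and a[1] <= b[2] and b[1] <= a[2]


def resolve_ploidy(
    active_region: Interval,
    ploidy_regions: List[Tuple[Interval, int]],
    global_ploidy: Optional[int] = None,
) -> int:
    """Pick a ploidy for an active region using the precedence described above."""
    # 1. Any --ploidy-regions interval overlapping the active region wins;
    #    ties between several overlapping intervals go to the largest ploidy.
    matching = [p for interval, p in ploidy_regions if overlaps(interval, active_region)]
    if matching:
        return max(matching)
    # 2. Otherwise fall back to the global -ploidy value, if one was given.
    if global_ploidy is not None:
        return global_ploidy
    # 3. Otherwise use the built-in default.
    return DEFAULT_PLOIDY
```

Because the lookup is keyed on the active region rather than on the variant's own coordinates, a variant just outside a ploidy interval can still inherit that interval's ploidy, which is the boundary effect the note above describes.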
    
    ---------
    
    Co-authored-by: Ty Kay <[email protected]>
    Co-authored-by: rickymagner <[email protected]>
    3 people authored Dec 11, 2023
    0b18579

Commits on Dec 12, 2023

  1. Update the GATK base image to a newer LTS ubuntu release (#8610)

    * Update the GATK base image to the latest Ubuntu LTS release (22.04)
    
    * Add some additional useful utilities to the base image
    
    * Switch to a newer conda version with a much faster solver
    
    * Update the scripts and documentation for building the base image
    
    * Update the VETS integration tests to allow for a small epsilon during numeric comparisons, and include the full diff output in exception messages when a mismatch is detected
    droazen authored Dec 12, 2023
    85d13d4

Commits on Dec 13, 2023

  1. build_docker_remote: add ability to specify the RELEASE arg to the cloud-based docker build, and add a release script (#8247)
    
    * Added a -r argument to build_docker_remote.sh to toggle the RELEASE flag during
      docker builds
    
    * Added a release_prebuilt_docker_image.sh to release a prebuilt docker image to the
      official repos
    droazen authored Dec 13, 2023
    75f5104
  2. Update to htsjdk 4.1.0 (#8620)

    * update to htsjdk 4.1.0 which enables http-nio in more cases
    * remove several test cases handling genomicsdb path parsing which were testing nonsensical paths that are now illegal
    lbergelson authored Dec 13, 2023
    23c8071
  3. Fix the Spark version in the GATK jar manifest, and use the right constant in build.gradle (#8625)
    droazen authored Dec 13, 2023
    70ee553
  4. Update http-nio to 1.1.0 which implements Path.resolve() methods (#8626)

    * This should make http access seamless in many places
    
    * The way this handles query parameters is not ideal for signed url cases so we'll need to revisit that
    lbergelson authored Dec 13, 2023
    8317d8b

Commits on Dec 14, 2023

  1. Fix GT header in PostprocessGermlineCNVCalls's --output-genotyped-intervals output (#8621)
    
    * Write gCNV interval output ID=GT header as Type=String
    
    Incorrectly writing this as Type=Integer causes bcftools to misparse
    the genotype field.
    
    * Use correct header types and numbers in test VCF file
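
For reference, the VCF specification declares the GT FORMAT field with Number=1 and Type=String. A small sketch of building such a header line is below; the helper name `format_header_line` is hypothetical and is not part of GATK or htsjdk.

```python
def format_header_line(field_id: str, number: str, field_type: str, description: str) -> str:
    """Build a VCF ##FORMAT meta-information line from its four standard keys."""
    return (
        f"##FORMAT=<ID={field_id},Number={number},"
        f'Type={field_type},Description="{description}">'
    )


# GT is a string such as "0/1" or "1|1", so it must not be declared as Integer.
GT_HEADER = format_header_line("GT", "1", "String", "Genotype")
print(GT_HEADER)
```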
    jmarshall authored Dec 14, 2023
    fd873e9
  2. 39cfbba
  3. b68fadc

Commits on Jan 23, 2024

  1. e796d20

Commits on Jan 26, 2024

  1. Improvements to Mutect2's Permutect training data mode (#8663)

    * include normal seq error log likelihood in Permutect dataset
    
    * handle different allele representations in multiallelic / indel variants for Permutect training data mode
    
    * set the default artifact to non-artifact ratio to 1 in Permutect training data mode
    davidbenjamin authored Jan 26, 2024
    2d50cf8

Commits on Jan 30, 2024

  1. dd73036

Commits on Feb 7, 2024

  1. bbc028b

Commits on Feb 15, 2024

  1. cfd4d87

Commits on Mar 5, 2024

  1. Move to GenomicsDB 1.5.2 which supports M1 macs (#8710)

    * Support for MacOS universal builds (intel AND M1)
    * Catch JNI importer exceptions and propagate them as java IOExceptions
    * Turn off HDFS support by default
    
    Co-authored-by: Nalini Ganapati <[email protected]>
    nalinigans and nalinigans authored Mar 5, 2024
    c97faf6

Commits on Mar 7, 2024

  1. Standardize test results directory between normal/docker tests (#8718)

    Normalize the test results location between docker and non docker tests
    lbergelson authored Mar 7, 2024
    a2ebb37
  2. fix no data hom refs (#8715)

    Output GQ0 genotypes from reference blocks as no-call rather than hom-ref
    ldgauthier authored Mar 7, 2024
    b0463e4

Commits on Mar 8, 2024

  1. Update the setup_cloud github action (#8651)

    * update setup-gcloud@v0 -> v2 since v0 is deprecated
    lbergelson authored Mar 8, 2024
    c3599a0

Commits on Mar 11, 2024

  1. 9af1be3
  2. added --inverted-read-filter argument to allow for selecting reads that fail read filters from the command line easily (#8724)
    jamesemery authored Mar 11, 2024
    b81a638
  3. 2640404

Commits on Mar 12, 2024

  1. Fix to long deletions that overhang into the assembly window causing exceptions in HaplotypeCaller (#8731)
    jamesemery authored Mar 12, 2024
    8ee86e7

Commits on Mar 18, 2024

  1. 141529b

Commits on Mar 19, 2024

  1. Update README to include list of popular software included in docker image (#8745)
    
    * Update README to include list of popular software included in docker image
    rickymagner authored Mar 19, 2024
    dcfaa06

Commits on Mar 20, 2024

  1. 47a97ae

Commits on Mar 25, 2024

  1. Make M2 haplotype and clustered events filters smarter about germline events (#8717)
    
    * M2 bad haplotype filter does not filter variants that share a haplotype with a germline event
    
    * two ECNT annotations -- haplotype and region -- and clustered events filter looks at both
    davidbenjamin authored Mar 25, 2024
    105b63e

Commits on Apr 1, 2024

  1. Funcotator: suppress a log message about b37 contigs when not doing b37/hg19 conversion (#8758)
    
    Don't print the very long and misleading "The following contigs are present in b37 and
    missing in the input VCF sequence dictionary" log message when we're not even doing b37/hg19
    conversion.
    droazen authored Apr 1, 2024
    724b5bc

Commits on Apr 4, 2024

  1. SNVQ recalibration tool added for flow based reads (#8697)

    Co-authored-by: Dror Kessler <[email protected]>
    ilyasoifer and Dror Kessler authored Apr 4, 2024
    6739e6d

Commits on Apr 10, 2024

  1. Several GQ0 cleanup changes: (#8741)

    Set GGVCFs --all-sites GQ0 hom-refs to no-calls
    Set regular GGVCFs GQ0 hom-refs to no-calls (any DP, PL) for better AF/AN annotations
    Remove PLs in "no data" case where DP=0 for more accurate QUAL score
    ldgauthier authored Apr 10, 2024
    7cdc985

Commits on Apr 11, 2024

  1. Re-commit large files as lfs stubs (#8769)

    Several files tracked by git lfs were accidentally reimported as normal files.
    This makes them stubs again.
    lbergelson authored Apr 11, 2024
    47c4858

Commits on Apr 12, 2024

  1. Enable ReblockGVCF to subset AS annotations that aren't "raw" (pipe-delimited) (#8771)
    
    * Enable ReblockGVCF to subset AS annotations that aren't "raw" (i.e. pipe-delimited)
    
    * Fix tests by removing AssemblyComplexity from default annotations
    ldgauthier authored Apr 12, 2024
    986cb15

Commits on Apr 18, 2024

  1. Gc getpipeupsummaries use mappingqualityreadfilter (#8781)

    * Add MappingQualityReadFilter
    
    * Added additional warnings for mmq
    
    * Fixed doc typo
    gokalpcelik authored Apr 18, 2024
    aed8b1b

Commits on May 1, 2024

  1. Add malaria spanning deletion exception regression test with fix (#8802)

    * Add malaria spanning deletion exception regression test with fix
    
    * Disabling codecov.
    
    ---------
    
    Co-authored-by: Jonn Smith <[email protected]>
    ldgauthier and jonn-smith authored May 1, 2024
    ec39c37

Commits on May 2, 2024

  1. Bug fix in flow allele filtering (#8775)

    * Fixed a bug that prevented filtering by SOR in many cases
    ilyasoifer authored May 2, 2024
    5c32785

Commits on May 6, 2024

  1. Allow for GT to be a nocall if GQ and PL[0] are zero instead of homref in GenomicsDB (#8759)
    
    * Allow for GT to be a nocall if GQ and PL[0] are zero instead of homref in GenomicsDB
    
    * Move to 1.5.3 release from snapshot
    
    ---------
    
    Co-authored-by: Nalini Ganapati <[email protected]>
    Co-authored-by: Nalini Ganapati <[email protected]>
    3 people authored May 6, 2024
    24f93b5

Commits on May 9, 2024

  1. Reduced docker layers in GATK image from 44 to 16 (#8808)

    * Reduced total layers in the GATK docker image from 44 down to 16.
    
    * Reduced GATK base image layers from 20 to 3.
    
    * This might be a better solution than a full squash down to a single layer, because:
    
    If we are hosting this in a premium ACR, the limit is 10,000 readOps per minute, so with 16 layers you get around 625 pulls per minute. This can also still take advantage of parallel pulls (the default is 3, but at most 16 threads in this case, I believe), as opposed to one big layer, which will not download in parallel. A single layer could be a lot slower, with subsequent jobs falling into the same "minute" because others are not done, making it easier to hit that 10k readOps limit. Lastly, with a single layer, people using GATK outside data pipelines would not be able to take advantage of layer caching either.
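
The pull-rate estimate above is simple integer arithmetic. The sketch below reproduces it under the assumption that each layer download counts as one read operation; the function name is hypothetical and the real registry accounting may differ.

```python
def max_pulls_per_minute(read_ops_limit: int, layers: int) -> int:
    """Estimate full image pulls per minute, assuming one readOp per layer."""
    return read_ops_limit // layers


READ_OPS_LIMIT = 10_000  # per-minute limit quoted in the commit message

for layer_count in (44, 16, 1):
    print(layer_count, "layers ->", max_pulls_per_minute(READ_OPS_LIMIT, layer_count), "pulls/min")
```

By this estimate a single layer maximizes the raw pull count, which is why the argument above rests on parallel downloads and layer caching rather than on the readOps limit alone.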
    
    Resolves #8684
    kevinpalis authored May 9, 2024
    c6daf7d

Commits on May 15, 2024

  1. VariantFiltration: added arg to write custom mask filter description in VCF header (#8831)
    
    Added a --mask-description argument to VariantFiltration to write a custom description for the mask filter in the VCF header
    meganshand authored May 15, 2024
    a3bbfc4

Commits on May 17, 2024

  1. Bigger Permutect tensors and Permutect test datasets can be annotated with truth VCF (#8836)
    
    * added 20 more Permutect read features
    
    * Permutect test data can, like training data, be annotated with a truth VCF
    davidbenjamin authored May 17, 2024
    4ed93fe

Commits on Jun 3, 2024

  1. [BIOIN-1570] Fixed edge case in variant annotation (#8810)

    * [BIOIN-1570] Fixed edge case in variant annotation when the variant is close to the edge of the reference
    ilyasoifer authored Jun 3, 2024
    d4744f7
  2. 0be44f2

Commits on Jun 4, 2024

  1. 2878ce5
  2. 2a420e4

Commits on Jun 13, 2024

  1. ab98a5d

Commits on Jun 20, 2024

  1. d633a57
  2. abef8e1
  3. Merge remote-tracking branch 'origin/master' into vs_1178_merge_master_to_ah_var_store_again
    mcovarr committed Jun 20, 2024
    e170f1b
  4. fix compilation

    mcovarr committed Jun 20, 2024
    d4b680c
  5. update Dockers

    mcovarr committed Jun 20, 2024
    8e097e1
  6. Dockstore

    mcovarr committed Jun 20, 2024
    e548bd5

Commits on Jun 24, 2024

  1. Merge remote-tracking branch 'origin/ah_var_store' into vs_1178_merge_master_to_ah_var_store_again
    mcovarr committed Jun 24, 2024
    164aa5a