Phenopackets and Structural Variants

@julesjacobsen julesjacobsen released this 23 Sep 13:53
· 314 commits to master since this release

This release is primarily focussed on enabling simultaneous prioritisation of structural and non-structural variation
with as consistent an API as possible for both types of variation. It also introduces a new API for specifying richer
information about a Sample based on the v1 GA4GH Phenopacket.

This release requires data version >= 2109 and Java version >= 11 (Java 17 recommended).

Until we're able to upload the data to the usual data.monarchinitiative.org/exomiser/latest location, you can download the data using these links:

2109_hg19
2109_hg38
2109_phenotype

CLI changes

  • Minimum Java version is now set to Java 11
  • New structural variant interpretation alongside small variants - requires data version 2109 or higher. This has
    been tested using the Manta and Canvas short-read callers and the pbsv long-read caller.
  • New command line options for more flexible input: --sample, --output, --vcf, --batch, --preset, --assembly, --ped.
    Run --help for details.
  • Phenopackets v1.0 can be used to input sample phenotype data
  • Added ability to specify proband age and sex in input options either via a phenopacket or the 'sample' format
  • Improved MOI disease-phenotype matching with added Orphanet MOIs
  • Improved incomplete penetrance calculation when using the ANY mode of inheritance option
  • Added a minExomiserGeneScore option for limiting the output to genes with a minimum Exomiser combined score. This is
    disabled by default. If enabling it, we recommend using a minimum score of 0.7
  • BREAKING CHANGE - JSON output changes: pos renamed to start, chromosomeName renamed to contigName.
    Deleted the chromosome field (use contigName). New fields: end, length, changeLength and variantType
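
The JSON renames above could be applied to previously saved output with a small key-mapping step. The following is a minimal sketch only: the class VariantJsonKeyMigration and its migrate method are hypothetical and not part of Exomiser, the record is modelled as a plain Map rather than parsed JSON, and the old key is assumed to be spelled chromosomeName. The new fields (end, length, changeLength, variantType) cannot be derived from the old keys alone, so they are not produced here.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical helper (not part of Exomiser): renames old variant JSON keys to the new names. */
final class VariantJsonKeyMigration {

    private VariantJsonKeyMigration() {
    }

    /**
     * Returns a copy of an old-style variant record using the new key names.
     * Only the two renames and the one deletion listed in the release notes are applied.
     */
    static Map<String, Object> migrate(Map<String, Object> oldRecord) {
        Map<String, Object> migrated = new LinkedHashMap<>();
        for (Map.Entry<String, Object> entry : oldRecord.entrySet()) {
            switch (entry.getKey()) {
                case "pos":
                    // pos renamed to start
                    migrated.put("start", entry.getValue());
                    break;
                case "chromosomeName":
                    // chromosomeName renamed to contigName (old spelling is an assumption)
                    migrated.put("contigName", entry.getValue());
                    break;
                case "chromosome":
                    // deleted field: contigName identifies the contig instead
                    break;
                default:
                    // every other key is copied unchanged
                    migrated.put(entry.getKey(), entry.getValue());
            }
        }
        return migrated;
    }

    public static void main(String[] args) {
        // Illustrative values only, not taken from real Exomiser output
        Map<String, Object> old = new LinkedHashMap<>();
        old.put("chromosome", 10);
        old.put("chromosomeName", "10");
        old.put("pos", 123256215);
        old.put("ref", "T");
        old.put("alt", "G");
        System.out.println(migrate(old));
    }
}
```

Downstream scripts that read the old keys directly would need the equivalent change wherever they reference pos, chromosomeName or chromosome.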

Core API

API breaking changes:

  • New target Java version set to 11
  • Exomiser.run() now requires Sample and Analysis arguments
  • AnalysisRunner interface now requires Sample and Analysis arguments
  • Analysis fields vcfPath, pedigree, probandSampleName and genomeAssembly moved to new Sample class
  • PedigreeSampleValidator moved from util into new sample package
  • Replaced SampleIdentifierUtil with SampleIdentifiers class
  • Replaced SampleIdentifier with SampleData
  • Variant now extends org.monarchinitiative.svart.Variant - see https://github.com/exomiser/svart/ for details
  • Deprecated VariantCoordinates - replaced by org.monarchinitiative.svart.Variant
  • VariantEvaluation.getSampleGenotypes() now returns a SampleGenotypes class
  • Changed VariantAnnotation from implementing Variant to implementing new VariantAnnotations interface
  • Updated variant coordinates getChromosome(), chromosomeName(), getPosition(), getRef(), getAlt() to
    use Svart contigId(), contigName(), start(), end(), ref() and alt() signatures
  • Replaced RsId with String type in FrequencyData constructors and return from hasDbSnpRsID() method
  • Replaced Contig class with new Contigs class
  • VariantAnnotator interface changed to List<VariantAnnotation> annotate(@Nullable Variant variant)
  • VariantContextSampleGenotypeConverter.createAlleleSampleGenotypes() method now returns a SampleGenotypes object
  • VariantFactory is now a @FunctionalInterface with a createVariantEvaluations() method
  • VariantFactoryImpl now requires VariantAnnotator and VcfReader input arguments
  • VcfCodecs now requires List rather than Set inputs
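
To illustrate the accessor renames listed above, here is a minimal sketch using a stand-in class. CoordinatesSketch is hypothetical and is NOT the svart API; it only mirrors the new accessor names, with comments giving the likely old counterpart. The length() calculation assumes 1-based, fully-closed coordinates (as in VCF), which may differ from how svart handles its coordinate systems; see https://github.com/exomiser/svart/ for the real definitions.

```java
/** Stand-in value class, NOT the svart API: it only mirrors the accessor names from the release notes. */
final class CoordinatesSketch {
    private final int contigId;
    private final String contigName;
    private final int start;
    private final int end;
    private final String ref;
    private final String alt;

    CoordinatesSketch(int contigId, String contigName, int start, int end, String ref, String alt) {
        if (end < start) {
            throw new IllegalArgumentException("end must not be before start");
        }
        this.contigId = contigId;
        this.contigName = contigName;
        this.start = start;
        this.end = end;
        this.ref = ref;
        this.alt = alt;
    }

    int contigId() { return contigId; }        // previously getChromosome()
    String contigName() { return contigName; } // previously chromosomeName()
    int start() { return start; }              // previously getPosition()
    int end() { return end; }                  // new: needed to describe structural variants
    String ref() { return ref; }               // previously getRef()
    String alt() { return alt; }               // previously getAlt()

    /** Assumption: 1-based, fully-closed coordinates, so a single base has length 1. */
    int length() {
        return end - start + 1;
    }

    public static void main(String[] args) {
        // Illustrative coordinates only
        CoordinatesSketch snv = new CoordinatesSketch(7, "7", 117559590, 117559590, "A", "G");
        CoordinatesSketch del = new CoordinatesSketch(7, "7", 117559000, 117559999, "N", "<DEL>");
        System.out.println(snv.contigName() + ":" + snv.start() + " length=" + snv.length());
        System.out.println(del.contigName() + ":" + del.start() + "-" + del.end() + " length=" + del.length());
    }
}
```

The explicit end() is what lets one type describe both a single-base change and a multi-kilobase deletion, which a position-only accessor could not do.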

New APIs:

  • New protobuf schemas for Job, Analysis, Sample, OutputOptions
  • New Exomiser.run(JobProto.Job job) entry point
  • New FluentAnalysisBuilder interface implemented by AnalysisBuilder and new AnalysisProtoBuilder for consistent
    API between proto and domain classes
  • New AnalysisGroup class extracted from AbstractAnalysisRunner
  • New Sample class to encapsulate data about the sample, such as Age and Sex
  • New Age class
  • New Phenopacket... classes for reading and converting sample data from v1 phenopackets
  • New Proto converter classes
  • New SampleIdentifiers class
  • New SampleData class to contain sampleIdentifier, SampleGenotype and CopyNumber
  • New SampleGenotypes class to handle
  • New CopyNumber class for handling copy number variation data from VCF
  • New AbstractVariant class
  • New VariantAnnotations interface
  • New AlleleCall.parseAlleleCall() method
  • New Pedigree justProband(String id, Individual.Sex sex) and anscestorsOf(Pedigree.Individual individual)
    methods
  • New SvFrequencyDao, SvPathogenicityDao and SvDaoUtil classes
  • New VariantWhiteListLoader class
  • New JannovarAnnotationService.annotateSvGenomeVariant() method
  • New JannovarSmallVariantAnnotator class
  • New JannovarStructuralVariantAnnotator class
  • New TranscriptModelUtil class
  • New VcfReader interface with VcfFileReader and NoOpVcfReader implementations
  • New VariantContextConverter class for converting VariantContext objects into Variant

Other changes:

  • Updated Spring Boot to version 2.5.3
  • Updated Jannovar to version 0.30
  • Updated HTSJDK to version 2.24.1
  • AnalysisResults now hold references to original Sample and Analysis objects
  • GenomeAnalysisService can now return a VariantAnnotator object
  • GenomeAssembly now wraps two GenomicAssembly objects
  • Added ClinVarData starRating() and isSecondaryAssociationRiskFactorOrOther() methods
  • Added DBVAR, DECIPHER, DGV, GNOMAD_SV and GONL SV FrequencySource
  • Updated VariantEffectPathogenicityScore to become final and added default inversion score
  • Numerous small changes to improve performance.