The module consists of rules to call mitochondrial variants from .bam
files. Mitochondrial SNVs and indels are called using Mutect2, following the GATK Best Practice Workflow for mitochondrial short variants.
In order to use this module, the following dependencies are required:
Input data should be added to samples.tsv and units.tsv.
The following information needs to be added to these files:
Column Id | Description |
---|---|
samples.tsv | |
sample | unique sample/patient id, one per row |
units.tsv | |
sample | same sample/patient id as in samples.tsv |
type | data type identifier (one letter), one of T (Tumor), N (Normal), R (RNA) |
platform | type of sequencing platform, e.g. NovaSeq |
machine | specific machine id, e.g. NovaSeq instruments have @Axxxxx |
flowcell | identifier of the flowcell used |
lane | flowcell lane number |
barcode | sequence library barcode/index; connect forward and reverse indices with +, e.g. ATGC+ATGC |
fastq1/2 | absolute paths to forward and reverse reads |
adapter | adapter sequences to be trimmed, separated by comma |
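
Before running the workflow, it can help to check that units.tsv carries every column from the table above. The following is a minimal sketch, not part of the module: the helper name `missing_columns`, the assumption that `fastq1/2` stands for two separate columns `fastq1` and `fastq2`, and all values in the example row are hypothetical.

```python
import csv
import io

# Columns taken from the table above. Assumption: "fastq1/2" means two
# separate columns named "fastq1" and "fastq2".
REQUIRED_UNITS_COLUMNS = [
    "sample", "type", "platform", "machine", "flowcell",
    "lane", "barcode", "fastq1", "fastq2", "adapter",
]


def missing_columns(tsv_text, required=REQUIRED_UNITS_COLUMNS):
    """Return the required columns that are absent from the TSV header."""
    reader = csv.DictReader(io.StringIO(tsv_text), delimiter="\t")
    header = reader.fieldnames or []
    return [column for column in required if column not in header]


# Hypothetical example row; every value here is made up for illustration.
example_row = {
    "sample": "sample1",
    "type": "N",
    "platform": "NovaSeq",
    "machine": "@A00000",
    "flowcell": "FLOWCELL1",
    "lane": "1",
    "barcode": "ATGC+ATGC",
    "fastq1": "/path/to/sample1_R1.fastq.gz",
    "fastq2": "/path/to/sample1_R2.fastq.gz",
    "adapter": "ACGT,ACGT",
}
example_tsv = (
    "\t".join(REQUIRED_UNITS_COLUMNS)
    + "\n"
    + "\t".join(example_row[column] for column in REQUIRED_UNITS_COLUMNS)
    + "\n"
)
```

Calling `missing_columns(open("units.tsv").read())` returns an empty list when the header is complete, and otherwise names the columns still to be added.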
The workflow repository contains a small test dataset in .tests/integration, which can be run like so:

    $ cd .tests/integration
    $ snakemake -s ../../Snakefile -j1 --use-singularity

To use this module in your workflow, follow the description in the
Snakemake docs.
Add the module to your Snakefile like so:

    module mitochondrial:
        snakefile:
            github(
                "hydra-genetics/mitochondrial",
                path="workflow/Snakefile",
                tag="1.0.0",
            )
        config:
            config


    use rule * from mitochondrial as mitochondrial_*

The following output files should be targeted via another rule:
File | Description |
---|---|
mitochondrial/gatk_select_variants_final/{sample}_{type}.vcf | mitochondrial .vcf from mutect2 |
mitochondrial/gatk_collect_wgs_metrics/{sample}_{type}_mt.metrics.txt | mitochondrial coverage metrics .txt from CollectWgsMetrics |
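
One way to target these files is to build the concrete paths for every sample/type pair and hand the resulting list to the `input` of a target rule. The sketch below only illustrates filling in the `{sample}` and `{type}` wildcards from the table; the function name `mitochondrial_targets` and the example pairs are hypothetical and not provided by the module.

```python
def mitochondrial_targets(units):
    """Build both output paths for each unique (sample, type) pair.

    `units` is an iterable of (sample, type) tuples, e.g. collected from
    the sample and type columns of units.tsv.
    """
    targets = []
    for sample, unit_type in sorted(set(units)):
        prefix = f"{sample}_{unit_type}"
        # Final mitochondrial VCF from mutect2
        targets.append(f"mitochondrial/gatk_select_variants_final/{prefix}.vcf")
        # Coverage metrics from CollectWgsMetrics
        targets.append(
            f"mitochondrial/gatk_collect_wgs_metrics/{prefix}_mt.metrics.txt"
        )
    return targets
```

Duplicate pairs (for example, one sample sequenced on several lanes) are collapsed, so each output file appears only once in the list.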