Merge branch 'hotfix/fastq-tbprofiler-and-spotyping'
abhi18av committed Dec 27, 2024
2 parents 013614f + 7925d4d commit ee4fece
Showing 13 changed files with 972 additions and 34 deletions.
2 changes: 2 additions & 0 deletions CHANGELOG.md
@@ -1,4 +1,6 @@
# CHANGELOG FOR THE MAGMA PIPELINE VERSIONS
<!-- https://keepachangelog.com/en/1.1.0/ -->


## v2.0.0

48 changes: 24 additions & 24 deletions README.md
@@ -33,7 +33,7 @@ The `java` version should NOT be an `internal jdk` release! You can check the re
Notice the `LTS` next to `OpenJDK` line.


```bash

$ java -version
openjdk version "17.0.7" 2023-04-18 LTS
@@ -90,7 +90,7 @@ S0002,/full_path_to_directory_of_fastq_files/S0002_01_R1.fastq.gz,full_path_to_d
S0003,/full_path_to_directory_of_fastq_files/S0003_01_R1.fastq.gz,
```

If you have the metadata from the sequencing instrument, you can specify further information in the samplesheet:

```csv
Study,Sample,Library,Attempt,R1,R2,Flowcell,Lane,Index Sequence
@@ -156,15 +156,15 @@ Which could be provided to the pipeline using `-params-file` parameter as shown

```console
nextflow run 'https://github.com/TORCH-Consortium/MAGMA' \
-profile conda_local,server \
-r v1.1.1 \
-params-file my_parameters_1.yml
```

# Analysis

## Running MAGMA using Nextflow Tower

You can also use Seqera Platform (aka Nextflow Tower) to run the pipeline on any of the supported cloud platforms and to monitor the pipeline execution.

@@ -181,11 +181,11 @@ You can run the pipeline using Conda, Mamba or Micromamba package managers to in
You can find out the location of conda environments using `conda env list`. [Here's](https://docs.conda.io/projects/conda/en/4.6.0/_downloads/52a95608c49671267e40c689e0bc00ca/conda-cheatsheet.pdf) a useful cheatsheet for conda operations.


You can use the `conda`-based setup to run MAGMA:
- On a local Linux machine (e.g. your laptop or a university server)
- On an HPC cluster (e.g. SLURM, PBS) in case you don't have access to container systems like Singularity, Podman or Docker

All the required software is provided as `conda` recipes (i.e. `yml` files):
- [magma-env-1.yml](./conda_envs/magma-env-1.yml)
- [magma-env-2.yml](./conda_envs/magma-env-2.yml)

@@ -208,7 +208,7 @@ $ conda env create -n magma-env-2 --file magma-env-2.yml

Once the environments are created, you can make use of the pipeline parameter `conda_envs_location` to inform the pipeline of the names and location of the conda envs.

Next, you need to load the WHO Resistance Catalog within `tb-profiler`, following the same [instructions](https://github.com/TORCH-Consortium/MAGMA/blob/master/conda_envs/setup_conda_envs.sh#L20-L23) that are used to build the necessary containers.

1. Download [magma_resistance_db_who_v1.zip](https://github.com/TORCH-Consortium/MAGMA/files/14559680/resistance_db_who_v1.zip) and unzip it

@@ -250,7 +250,7 @@ We provide [two docker containers](https://github.com/orgs/TORCH-Consortium/pack

> 🚧 **Container build script**: The script used to build these containers is provided [here](./containers/build.sh).

You don't need to pull the containers manually, but should you need to, you can use the following commands to pull the pre-built containers:

```console
docker pull ghcr.io/torch-consortium/magma/magma-container-1:1.1.1
@@ -262,13 +262,13 @@ docker pull ghcr.io/torch-consortium/magma/magma-container-2:1.1.1
> :memo: **Have singularity or podman instead?**: <br>
If you do have access to Singularity or Podman, then owing to their compatibility with Docker, you can still use the provided docker containers.

Here's the command which should be used

```console
nextflow run 'https://github.com/torch-consortium/magma' \
-params-file my_parameters_2.yml \
-profile docker,pbs \
-r v1.1.1
```

> :bulb: **Hint**: <br>
@@ -307,7 +307,7 @@ errors. Including these is optional, if unknown or irrelevant,
just fill in with a '1' as shown in example_MAGMA_samplesheet.csv)
```

## (Optional) GVCF datasets

We also provide some reference GVCF files which you could use for specific use-cases.

@@ -319,7 +319,7 @@ containing GVCF reference dataset for ~600 samples is provided for augmenting sm

```
use_ref_gvcf = false
ref_gvcf = "/path/to/FILE.g.vcf.gz"
ref_gvcf_tbi = "/path/to/FILE.g.vcf.gz.tbi"
```
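
The two path parameters above point at a GVCF file and its `.tbi` index. A minimal sketch of a pre-flight check that could be run before setting `use_ref_gvcf = true`, assuming the index sits next to the GVCF unless given explicitly. The function name `check_ref_gvcf` is a hypothetical helper and not part of the pipeline:

```shell
# Hypothetical helper: verify that a reference GVCF and its tabix index
# both exist and are non-empty before pointing the pipeline at them.
check_ref_gvcf() {
    gvcf="$1"
    # Default the index path to "<gvcf>.tbi" when no second argument is given.
    tbi="${2:-$1.tbi}"
    for f in "$gvcf" "$tbi"; do
        if [ ! -s "$f" ]; then
            echo "missing or empty: $f" >&2
            return 1
        fi
    done
    echo "ok: $gvcf"
}

# Example call (the path is a placeholder, as in the snippet above):
# check_ref_gvcf /path/to/FILE.g.vcf.gz /path/to/FILE.g.vcf.gz.tbi
```

The check only confirms that the files are present; it does not validate their contents.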

@@ -335,7 +335,7 @@ Tim Huepink and Lennert Verboven created an in-depth tutorial of the features of

We have also included a presentation (in PDF format) of the logic and workflow of the MAGMA pipeline as well as posters that have been presented at conferences. Please refer to the [docs](./docs) folder.

# Interpretation

The results directory produced by MAGMA is as follows:

@@ -347,7 +347,7 @@ The results directory produced by MAGMA is as follows:
└── vcf_files
```

## QC Statistics Directory

In this directory you will find files related to the quality control carried out by the MAGMA pipeline. The structure is as follows:

@@ -412,7 +412,7 @@ MAGMA also notes the presence of all variants in tier 1 and tier 2 drug resis

- **Phylogeny**

Contains the outputs of the IQTree phylogenetic tree construction.

> :memo: By default we recommend that you use the **ExDRIncComplex** files as MAGMA was optimized to be able to accurately call positions on the edges of complex regions in the *Mtb* genome

@@ -422,7 +422,7 @@ Contains the SNP distance tables.

> :memo: By default we recommend that you use the **ExDRIncComplex** files as MAGMA was optimized to be able to accurately call positions on the edges of complex regions in the *Mtb* genome

## `vcf_files` Directory

```bash
/path/to/results_dir/vcf_files
@@ -463,7 +463,7 @@ Contains the SNP distance tables.

> Unfiltered structural variants detected by the MAGMA pipeline

## Libraries Directory

> Contains files related to FASTQ validation and FASTQC analysis

@@ -472,9 +472,9 @@ Contains the SNP distance tables.
> Contains vcf files for major|minor|structural variants for each individual sample


# Citations

The MAGMA paper has been published here: https://doi.org/10.1371/journal.pcbi.1011648

The XBS variant calling core was published here: https://doi.org/10.1099%2Fmgen.0.000689

65 changes: 65 additions & 0 deletions conf/apptainer.config
@@ -0,0 +1,65 @@
/*
* Copyright (c) 2021-2024 MAGMA pipeline authors, see https://doi.org/10.1371/journal.pcbi.1011648
*
* This file is part of MAGMA pipeline, see https://github.com/TORCH-Consortium/MAGMA
*
* For quick overview of GPL-3 license, please refer
* https://www.tldrlegal.com/license/gnu-general-public-license-v3-gpl-3
*
* - You MUST keep this license with original authors in your copy
* - You MUST acknowledge the original source of this software
* - You MUST state significant changes made to the original software
*
* This program is free software: you can redistribute it and/or modify
* it under the terms of the GNU General Public License as published by
* the Free Software Foundation, either version 3 of the License, or
* (at your option) any later version.
*
* This program is distributed in the hope that it will be useful,
* but WITHOUT ANY WARRANTY; without even the implied warranty of
* MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
* GNU General Public License for more details.
*
* You should have received a copy of the GNU General Public License
* along with this program. If not, see <http://www.gnu.org/licenses/>.
*/
process {


withName:
'.*SPOTYPING.*' {
container = "quay.io/biocontainers/spotyping:2.1--hdfd78af_4"
}

withName:
'.*RDANALYZER.*' {
container = "quay.io/biocontainers/rd-analyzer:1.01--hdfd78af_0"
}


withName:
'.*TBPROFILER.*' {
container = "ghcr.io/torch-consortium/magma/biocontainer-tbprofiler:6.3.0--1"
}

withName:
'NTMPROFILER.*' {
container = "ghcr.io/torch-consortium/magma/biocontainer-ntmprofiler:0.4.0"
}

withName:
'ISMAPPER.*|GATK.*|LOFREQ.*|DELLY.*|MULTIQC.*|FASTQC.*|UTILS.*|FASTQ.*|SAMPLESHEET.*' {
container = "ghcr.io/torch-consortium/magma/magma-container-1:2.0.0"
}

withName:
'BWA.*|IQTREE.*|SNPDISTS.*|SNPSITES.*|BCFTOOLS.*|BGZIP.*|SAMTOOLS.*|SNPEFF.*|CLUSTERPICKER.*' {
container = "ghcr.io/torch-consortium/magma/magma-container-2:1.1.1"
}

}


apptainer {
enabled = true
}
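
The selectors in this file mix patterns with a leading `.*` (for example `'.*TBPROFILER.*'`) and patterns without one (for example `'NTMPROFILER.*'`). A minimal sketch of why that can matter, assuming a selector has to match the whole process name and that names may carry a workflow prefix. Both points are assumptions to be checked against the Nextflow documentation, and the prefixed name `QC_WF:TBPROFILER_FASTQ_PROFILE` is a hypothetical example:

```shell
# Return success when the whole NAME is matched by the selector PATTERN.
# Assumption: selectors behave like anchored (whole-string) regular expressions.
selector_matches() {
    # $1 = selector pattern, $2 = process name
    printf '%s\n' "$2" | grep -Eq "^($1)\$"
}

# Compare an unprefixed and a prefixed selector against two example names.
for name in "TBPROFILER_FASTQ_PROFILE" "QC_WF:TBPROFILER_FASTQ_PROFILE"; do
    for pattern in 'TBPROFILER.*' '.*TBPROFILER.*'; do
        if selector_matches "$pattern" "$name"; then
            echo "match:    $pattern  $name"
        else
            echo "no match: $pattern  $name"
        fi
    done
done
```

Under these assumptions, only the selector with the leading `.*` also covers the prefixed name.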
14 changes: 12 additions & 2 deletions conf/docker.config
@@ -26,8 +26,18 @@
process {

withName:
-'TBPROFILER.*' {
-container = "ghcr.io/torch-consortium/magma/biocontainer-tbprofiler:6.3.0"
+'.*SPOTYPING.*' {
+container = "quay.io/biocontainers/spotyping:2.1--hdfd78af_4"
+}
+
+withName:
+'.*RDANALYZER.*' {
+container = "quay.io/biocontainers/rd-analyzer:1.01--hdfd78af_0"
+}
+
+withName:
+'.*TBPROFILER.*' {
+container = "ghcr.io/torch-consortium/magma/biocontainer-tbprofiler:6.3.0--1"
}

withName:
12 changes: 12 additions & 0 deletions conf/singularity.config
@@ -25,6 +25,18 @@
*/
process {


withName:
'.*SPOTYPING.*' {
container = "quay.io/biocontainers/spotyping:2.1--hdfd78af_4"
}

withName:
'.*RDANALYZER.*' {
container = "quay.io/biocontainers/rd-analyzer:1.01--hdfd78af_0"
}


withName:
'TBPROFILER.*' {
container = "ghcr.io/torch-consortium/magma/biocontainer-tbprofiler:6.3.0"
59 changes: 54 additions & 5 deletions default_params.config
@@ -18,6 +18,7 @@
* This program is distributed in the hope that it will be useful,
* but WITHOUT ANY WARRANTY; without even the implied warranty of
* MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the

* GNU General Public License for more details.
*
* You should have received a copy of the GNU General Public License
@@ -101,6 +102,24 @@ skip_phylogeny_and_clustering = false //OR true
skip_complex_regions = false //OR true



// Enable execution of MAGMA's tbprofiler container (with who+ database) on
// FASTQ files

skip_ntmprofiler = false // OR true

skip_tbprofiler_fastq = true // OR false

skip_spotyping = true

// Flags for experimental features

//NOTE: NOT working yet
skip_rdanalyzer = true
ref_fasta_rdanalyzer = "${projectDir}/resources/rdanalyzer/RDs30.fasta"



//NOTE: PICK ONE of the following parameters related to IQTREE.
iqtree_standard_bootstrap= false
iqtree_fast_ml_only= false
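
The flags added in this hunk leave the FASTQ-level `tb-profiler` and SpoTyping steps skipped by default. A minimal sketch of how they might be switched on, assuming they can be overridden as plain top-level keys in a YAML file passed via `-params-file`, in the same way as the README's `my_parameters_1.yml`. The file name `my_parameters_fastq.yml` is a hypothetical example:

```shell
# Hypothetical params file name; adjust to your own setup.
params_file="my_parameters_fastq.yml"

# Override the defaults from default_params.config (assumed to be plain
# top-level keys, like the other pipeline parameters).
cat > "$params_file" <<'EOF'
skip_tbprofiler_fastq: false
skip_spotyping: false
# skip_rdanalyzer stays true: the config marks it as not working yet
skip_rdanalyzer: true
EOF

echo "wrote $params_file"
```

The file could then be passed with `-params-file my_parameters_fastq.yml` on the `nextflow run` command line.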
@@ -186,7 +205,7 @@ fastq_validator_path = "fastq_validator.sh"


//NOTE:Control the global publishing behavior, which is used as default in case there is no process specific config provided
-save_mode = 'symlink'
+save_mode = 'symlink' // 'copy'
should_publish = true

//NOTE: If enabled, the BAM results from HaplotypeCaller processes would be published
@@ -371,7 +390,7 @@ DELLY_CALL {
}

NTMPROFILER_PROFILE {
-results_dir = "${params.outdir}/non-tuberculous_mycobacteria/per_sample/"
+results_dir = "${params.outdir}/analyses/non-tuberculous_mycobacteria/per_sample/"
}


@@ -443,7 +462,7 @@ UTILS_MERGE_COHORT_STATS {
//-----------------------

NTMPROFILER_COLLATE {
-results_dir = "${params.outdir}/non-tuberculous_mycobacteria/cohort"
+results_dir = "${params.outdir}/analyses/non-tuberculous_mycobacteria/cohort"

prefix = "ntmprofiler.collate"
}
@@ -461,7 +480,7 @@ GATK_GENOTYPE_GVCFS {

arguments = " -G StandardAnnotation -G AS_StandardAnnotation --sample-ploidy 1 "

-should_publish = false
+should_publish = true
}


@@ -470,7 +489,7 @@ SNPEFF {

arguments = " -nostats -ud 100 Mycobacterium_tuberculosis_h37rv "

-should_publish = false
+should_publish = true
}


@@ -678,6 +697,36 @@ TBPROFILER_COLLATE__COHORT {
prefix = "major_variants"
}


TBPROFILER_FASTQ_PROFILE {
results_dir = "${params.outdir}/analyses/others/per_sample/tbprofiler_fastq/"
arguments = "--csv"
should_publish = false
}

TBPROFILER_FASTQ_COLLATE {
results_dir = "${params.outdir}/analyses/drug_resistance/tbprofiler_fastq/"
prefix = "fastq"
}


SPOTYPING {
results_dir = "${params.outdir}/analyses/spotyping/results_excel"
arguments = "" // Or "--noQuery"
}

UTILS_CAT_SPOTYPING {
results_dir = "${params.outdir}/analyses/spotyping/"
arguments = ""
}


RDANALYZER {
results_dir = "${params.outdir}/analyses/others/per_sample/rdanalyzer/"
arguments = ""
}


TBPROFILER_VCF_PROFILE__LOFREQ {
results_dir = "${params.outdir}/analyses/drug_resistance/minor_variants_lofreq/"
arguments = " --depth 0,0 --af 0,0 --strand 0 --sv_depth 0,0 --sv_af 0,0 --sv_len 100000,50000 "