Merge pull request #65 from JD2112/v2.0
v2.0 - Pangolin container separated from Illumina and Nanopore workflows
JD2112 authored Aug 4, 2022
2 parents 05c412d + 94180d7 commit 3c0533b
Showing 160 changed files with 50,638 additions and 1,392 deletions.
15 changes: 15 additions & 0 deletions .github/workflows/black-check.yml
@@ -0,0 +1,15 @@
name: black
on: pull_request
jobs:
black:
runs-on: ubuntu-20.04
steps:
- uses: actions/checkout@v2
- uses: actions/setup-python@v1
with:
python-version: 3.9
- run: |
python -m pip install --upgrade pip
pip install git+https://github.com/psf/black
- run: |
black --check --verbose .
4 changes: 2 additions & 2 deletions .github/workflows/build_dockerfile.yml
@@ -5,7 +5,7 @@ on:
- cron: '0 0 * * *'
push:
branches:
- update_nanopore_container
- update_nanopore_container
jobs:
get-version:
runs-on: ubuntu-latest
@@ -78,7 +78,7 @@ jobs:
#Build docker for nanopore
docker build --no-cache -f environments/nanopore/Dockerfile -t genomicmedicinesweden/gms-artic-nanopore:latest -t genomicmedicinesweden/gms-artic-nanopore:${{ steps.date.outputs.date }}-p-${REPO_VER}-d-${pangolin_data_VER}-c-${constellations_VER}-s-${scorpio_VER} .
#Build docker for pangolin-check for specific requirements
docker build --no-cache -f environments/nanopore/pangolin/Dockerfile -t genomicmedicinesweden/gms-artic-pangolin:latest -t genomicmedicinesweden/gms-artic-pangolin:${{ steps.date.outputs.date }}-p-${REPO_VER}-d-${pangolin_data_VER}-c-${constellations_VER}-s-${scorpio_VER} --build-arg PANGOLIN_VER=v${REPO_VER} .
docker build --no-cache -f environments/pangolin/Dockerfile -t genomicmedicinesweden/gms-artic-pangolin:latest -t genomicmedicinesweden/gms-artic-pangolin:${{ steps.date.outputs.date }}-p-${REPO_VER}-d-${pangolin_data_VER}-c-${constellations_VER}-s-${scorpio_VER} --build-arg PANGOLIN_VER=v${REPO_VER} .
- name: Push Docker image to DockerHub
shell: bash
3 changes: 1 addition & 2 deletions .gitignore
@@ -1,8 +1,7 @@
.DS_Store
.nextflow*
nextflow
results
*.sif
work
environments/.DS_Store
.idea/
.DS_Store
63 changes: 63 additions & 0 deletions .v2releaseprocesses
@@ -0,0 +1,63 @@
gms-artic v2.0 release workflow processes

Nanopore medaka processes
1. versions
2. pangoversions
3. fastqcNanopore
4. multiqcNanopore
5. articDownloadScheme
6. articGuppyPlex
7. articMinIONMedaka
8. articRemoveUnmappedReads
9. makeQCCSV
10. writeQCSummaryCSV
11. collateSamples -
12. pangolinTyping
13. nextclade
14. getVariantDefinitions
15. makeReport

Nanopore nanopolish processes
1. versions
2. pangoversions
3. fastqcNanopore
4. multiqcNanopore
5. pycoqc
6. articDownloadScheme
7. articGuppyPlex
8. articMinIONNanopolish
9. articRemoveUnmappedReads
10. makeQCCSV
11. writeQCSummaryCSV
12. collateSamples
13. pangolinTyping
14. nextclade
15. getVariantDefinitions
16. makeReport

Illumina processes
1. articDownloadScheme
2. indexReference
3. versions
4. pangoversions
5. fastqc
6. readTrimming
7. readMapping
8. flagStat
9. trimPrimerSequences
10. depth
11. callConsensusFreebayes
12. annotationVEP
13. callVariants
14. makeConsensus
15. makeQCCSV
16. writeQCSummaryCSV
17. statsCoverage
18. statsInsert
19. statsAlignment
20. multiqc
21. collateSamples
22. pangolinTyping
23. nextclade
24. getVariantDefinitions
25. makeReport
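
The three process lists above overlap heavily, which is easier to see when they are compared programmatically. The following is a minimal sketch, not part of the commit: the process names are transcribed from the listing, while the helper names `shared_processes` and `only_in` are made up for illustration.

```python
# Process names transcribed from the three lists in .v2releaseprocesses.
MEDAKA = [
    "versions", "pangoversions", "fastqcNanopore", "multiqcNanopore",
    "articDownloadScheme", "articGuppyPlex", "articMinIONMedaka",
    "articRemoveUnmappedReads", "makeQCCSV", "writeQCSummaryCSV",
    "collateSamples", "pangolinTyping", "nextclade",
    "getVariantDefinitions", "makeReport",
]
NANOPOLISH = [
    "versions", "pangoversions", "fastqcNanopore", "multiqcNanopore",
    "pycoqc", "articDownloadScheme", "articGuppyPlex",
    "articMinIONNanopolish", "articRemoveUnmappedReads", "makeQCCSV",
    "writeQCSummaryCSV", "collateSamples", "pangolinTyping", "nextclade",
    "getVariantDefinitions", "makeReport",
]
ILLUMINA = [
    "articDownloadScheme", "indexReference", "versions", "pangoversions",
    "fastqc", "readTrimming", "readMapping", "flagStat",
    "trimPrimerSequences", "depth", "callConsensusFreebayes",
    "annotationVEP", "callVariants", "makeConsensus", "makeQCCSV",
    "writeQCSummaryCSV", "statsCoverage", "statsInsert", "statsAlignment",
    "multiqc", "collateSamples", "pangolinTyping", "nextclade",
    "getVariantDefinitions", "makeReport",
]


def shared_processes(first, *others):
    """Processes present in every workflow, in the order of the first list."""
    common = set(first).intersection(*others)
    return [name for name in first if name in common]


def only_in(target, other):
    """Processes listed for `target` but not for `other`, in listing order."""
    excluded = set(other)
    return [name for name in target if name not in excluded]


if __name__ == "__main__":
    print("Shared by all workflows:", shared_processes(MEDAKA, NANOPOLISH, ILLUMINA))
    print("Nanopolish only (vs. medaka):", only_in(NANOPOLISH, MEDAKA))
```

According to the listing, the nanopolish workflow differs from the medaka workflow only by `pycoqc` and its own `articMinION*` step. The typing and reporting steps (`pangolinTyping`, `nextclade`, `getVariantDefinitions`, `makeReport`) appear in all three.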
170 changes: 133 additions & 37 deletions README.md
@@ -1,64 +1,160 @@
# GMS-artic (ncov2019-artic-nf)

![logo](workflow-image/logo.png)

A nextflow pipeline with a GMS touch for running the ARTIC network's fieldbioinformatics tools (https://github.com/artic-network/fieldbioinformatics).

### Table of contents -
- [Version updates](#Version-updates)
- [Pipeline Diagram](#Pipeline-Diagram)
- [Requirements](#Requirements)
- [Quick start guide](#Quick-Start-Guide)
- [parameters setup](#Parameters-setup)
- [Test Data](#Test-Data)
- [Run on local server](#-Run-on-local-server)
- [Requirements](#Requirements)
- [Illumina pipeline](#Illumina-pipeline)
- [Nanopore nanopolish pipeline](#Nanopore-nanopolish-pipeline)
- [Nanopore medaka pipeline](#Nanopore-medaka-pipeline)
- [How to run in NGP server](#How-to-run-in-NGP-server)
- [Datafile structure](#Datafile-structure)
- [Pipeline run command](#Manual-running-of-analysis-pipeline)
- [Illumina pipeline](#Run-Illumina-pipeline)
- [Nanopore pipeline](#Run-Nanopore-Pipeline)
- [Useful information](#Useful-information)
------------
#### Major changes
# Version updates
## v2.0.0
### Major updates
- Docker container separated for Pangolin typing
- Illumina container: [gms-artic-illumina](https://hub.docker.com/repository/docker/genomicmedicinesweden/gms-artic-illumina)
- Nanopore container: [gms-artic-nanopore](https://hub.docker.com/repository/docker/genomicmedicinesweden/gms-artic-nanopore)
- Pangolin container: [gms-artic-pangolin](https://hub.docker.com/repository/docker/genomicmedicinesweden/gms-artic-pangolin)
- pycoQC container : [pycoqc](https://hub.docker.com/repository/docker/jd21/pycoqc)
- Added separate package version files for each workflow
- versions: for Illumina and Nanopore
- pangoversion: for pangolin typing
- Illumina analysis additional features
- flagstat
- depth
- VEP annotation
- Illumina results work with sc2reporter visualization
- Nanopore analysis additional features (artic & medaka)
- [fastqc](https://github.com/s-andrews/FastQC)
- [multiqc](https://multiqc.info)
- [pycoQC](https://github.com/a-slide/pycoQC) *(only for artic)*

## v1.8.0
### Minor updates

- Pangolin v4 support
- Updated Picard arguments
- FastQC commands can be added from config
- Added version of pangolin to build_dockerfile

### Bug fixes
- Fixed build_dockerfile
- Fixed R issue
- Fixed mamba issue

### Major changes
* The illumina and nanopore tracks automatically run pangolin and nextclade.
* Generates report for base changes.

###### 1. gms-artic in ngp-gms
# Pipeline Diagram
![gms-artic package](workflow-image/GMS-Artic_workflow.png)

Find DAG and other figures [here](workflow-image/)

# Requirements
- Nextflow version >=20.10, <22.0 (tested OK on Nextflow 20.10.0 and 21.10.6)
- Singularity version 3.7.1 (tested OK)
- Conda version >= 4.13.0 (tested OK)
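
The Nextflow range above (>=20.10, <22.0) can be checked before launching a run. The following is a minimal sketch under stated assumptions: the helper names `parse_version` and `nextflow_supported` are hypothetical, and the version string is assumed to be supplied by the caller, since its source is not specified in this README.

```python
import re


def parse_version(text):
    """Return the first dotted number in `text` as a tuple of ints."""
    match = re.search(r"\d+(?:\.\d+)*", text)
    if match is None:
        raise ValueError(f"no version number found in {text!r}")
    return tuple(int(part) for part in match.group(0).split("."))


def nextflow_supported(version, minimum=(20, 10), below=(22, 0)):
    """True if `version` falls in the documented range >=20.10, <22.0."""
    return minimum <= parse_version(version) < below


if __name__ == "__main__":
    # The first two versions are the ones the README reports as tested.
    for candidate in ("20.10.0", "21.10.6", "22.04.0"):
        status = "supported" if nextflow_supported(candidate) else "outside range"
        print(candidate, status)
```

Comparing integer tuples rather than raw strings avoids the pitfall that `"20.4"` sorts after `"20.10"` as text.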

# Quick Start Guide
## Test Data
To test the pipeline, an [example dataset](./.github/data) with both Illumina and Nanopore (nanopolish, medaka) data files (from ConnerLab) is provided.

# parameters setup
## primer scheme
##### --scheme: To select a primer scheme, add --scheme to the command line, e.g. 'nCoV-2019/V3' for ARTIC primers or 'midnight-primer/V1'

*for nanopore analysis (default is "midnight")*
```
sample_name
|___ fast5_pass/
|___ fastq_pass/
|___ sequencing_summary.txt
```
*for illumina analysis*
```
sample_name
|___ fastq/
--scheme nCoV-2019/V3/
--scheme midnight-primers/V1/
--scheme eden-primers/V1/
```
#### Manual running of analysis pipeline
###### 2. Run Illumina pipeline
**To run the artic pipeline, please change the [nanopore.config](https://github.com/JD2112/gms-artic/blob/master/conf/nanopore.config) 'min_length' (default = 400) and 'max_length' (default = 700)**

**For more parameter settings, please see the [ConnerLab documentation](ConnerLab-README.md)**

## Run on local server
### Requirements
1. Containers: [Singularity](https://singularity-tutorial.github.io/01-installation/), [Docker](https://docs.docker.com/engine/install/)
2. [Nextflow>=20](https://www.nextflow.io/docs/latest/getstarted.html)

### Illumina pipeline
```
$ nextflow run main.nf -profile singularity,sge \
nextflow run main.nf -profile singularity \
--illumina --prefix "test_illumina" \
--directory .github/data/fastqs/ \
--outdir illumina_test
```

###### 3. Run Nanopore Pipeline
###### **Default is "midnight" protocol**
### Nanopore nanopolish pipeline
```
$ nextflow run main.nf -profile singularity \
--nanopolish --prefix "midnight" \
--basecalled_fastq /home/test/fastq_pass/ \
--fast5_pass /home/test/fast5_pass/ \
--sequencing_summary /home/test/sequencing_summary_FAP82331_657703c9.txt \
--scheme-directory midnight-primer/V1/ \
--outdir /home/test/midnight_test -with-report midnight
nextflow run main.nf -profile singularity \
--nanopolish --prefix "test_nanopore_nanopolish" \
--basecalled_fastq .github/data/nanopore/20200311_1427_X1_FAK72834_a3787181/fastq_pass/ \
--fast5_pass .github/data/nanopore/20200311_1427_X1_FAK72834_a3787181/fast5_pass/ \
--sequencing_summary .github/data/nanopore/20200311_1427_X1_FAK72834_a3787181/sequencing_summary_FAK72834_298b7829.txt \
--outdir nanopore_nanopolish
```

###### --scheme: To use the primer list, add --scheme to the CLI, eg., use 'nCoV-2019/V3' for artic primers or 'midnight-primer/V1'

#### Nanopore medaka pipeline
```
--scheme nCoV-2019/V3/
--scheme midnight-primers/V1/
--scheme eden-primers/V1/
nextflow run main.nf -profile singularity \
--medaka --prefix "test_nanopore_medaka" \
--basecalled_fastq .github/data/nanopore/20200311_1427_X1_FAK72834_a3787181/fastq_pass/ \
--outdir nanopore_medaka
```
###### **To run the artic pipeline, please change the [nanopore.config](https://github.com/JD2112/gms-artic/blob/master/conf/nanopore.config) 'min_length' (default = 400) and 'max_length' (default = 700)**

## Run on NGP server
### Datafile structure
1. *for Nanopore analysis (default is "midnight")*
```
sample_name
|___ fast5_pass/
|___ fastq_pass/
|___ sequencing_summary.txt
```
$ nextflow run main.nf -profile singularity,sge \
#### Run Nanopolish pipeline
```
nextflow run main.nf -profile singularity,sge \
--nanopolish --prefix "test_nanopore" \
--basecalled_fastq .github/data/nanopore/20200311_1427_X1_FAK72834_a3787181/fastq_pass/ \
--fast5_pass .github/data/nanopore/20200311_1427_X1_FAK72834_a3787181/fast5_pass/ \
--sequencing_summary .github/data/nanopore/20200311_1427_X1_FAK72834_a3787181/sequencing_summary_FAK72834_298b7829.txt \
--outdir nanopore_test
--outdir nanopore_test
```

#### Run medaka pipeline
```
#### To update your container image to the latest version from [dockerhub](https://hub.docker.com/orgs/genomicmedicinesweden/repositories), please delete your local image first before running the analysis pipeline.
nextflow run main.nf -profile singularity,sge \
--medaka --prefix "test_nanopore_medaka" \
--basecalled_fastq .github/data/nanopore/20200311_1427_X1_FAK72834_a3787181/fastq_pass/ \
--outdir nanopore_medaka
```
2. *for Illumina analysis*
```
sample_name
|___ fastq/
```
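
The Nanopore and Illumina layouts shown in this section can be verified before a run is submitted. The following is a minimal sketch, not part of the pipeline: the function name `missing_entries` and the layout dictionaries are hypothetical. The summary file name is taken literally from the layout, although the example commands use a run-specific name (`sequencing_summary_FAK72834_298b7829.txt`), so the expected name may need adjusting.

```python
from pathlib import Path

# Expected entries per sample directory, as drawn in the README layouts.
NANOPORE_LAYOUT = {
    "dirs": ["fast5_pass", "fastq_pass"],
    "files": ["sequencing_summary.txt"],  # assumption: literal name from the layout
}
ILLUMINA_LAYOUT = {"dirs": ["fastq"], "files": []}


def missing_entries(sample_dir, layout):
    """List the expected sub-directories and files absent from `sample_dir`."""
    root = Path(sample_dir)
    missing = [name + "/" for name in layout["dirs"] if not (root / name).is_dir()]
    missing += [name for name in layout["files"] if not (root / name).is_file()]
    return missing


if __name__ == "__main__":
    problems = missing_entries("sample_name", NANOPORE_LAYOUT)
    if problems:
        print("Missing from sample_name:", ", ".join(problems))
    else:
        print("sample_name matches the Nanopore layout")
```

An empty list means the directory matches the layout. Returning the missing names, rather than a boolean, makes the failure message actionable.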
#### Run Illumina pipeline
```
nextflow run main.nf -profile singularity,sge \
--illumina --prefix "test_illumina" \
--directory .github/data/fastqs/ \
--outdir illumina_test
```


# Useful information
1. To update your container image to the latest version from [dockerhub](https://hub.docker.com/orgs/genomicmedicinesweden/repositories), delete your local image before running the analysis pipeline.
9 changes: 5 additions & 4 deletions conf/base.config
@@ -15,18 +15,19 @@ params{
scheme = false
tmpdir = "~/tmp"


// Repo to download your primer scheme from
schemeRepoURL = 'https://github.com/genomic-medicine-sweden/gms-artic.git'
// schemeRepoURL = 'https://github.com/artic-network/primer-schemes.git'
//schemeRepoURL = 'https://github.com/genomic-medicine-sweden/gms-artic.git'
schemeRepoURL = 'https://github.com/jd2112/gms-artic.git'

// Directory within schemeRepoURL that contains primer schemes
schemeDir = 'gms-artic'

// Scheme name
// scheme = 'midnight-primer'
scheme = 'nCoV-2019-primer'

// Scheme version
schemeVersion = 'V1'
schemeVersion = 'V3'

// Run experimental medaka pipeline? Specify in the command using "--medaka"
medaka = false
4 changes: 2 additions & 2 deletions conf/illumina.config
@@ -1,6 +1,7 @@
// Illumina specific params

params {

// Repo to download your primer scheme from
schemeRepoURL = 'https://github.com/artic-network/primer-schemes.git'

@@ -12,8 +13,7 @@ params {

// Scheme version
schemeVersion = 'V3'



// Instead of using the ivar-compatible bed file in the scheme repo, the
// full path to a previously-created ivar bed file. Must also supply
// ref.
1 change: 1 addition & 0 deletions conf/nanopore.config
@@ -16,6 +16,7 @@ params {
// IF SET TO false THIS WILL USE artic minion DEFAULT (100)
normalise = 500


// Use bwa not minimap2? Specify in the command using "--bwa"
bwa = false

4 changes: 2 additions & 2 deletions environments/illumina/Dockerfile
@@ -2,8 +2,8 @@ FROM continuumio/miniconda3:latest AS condabuild
LABEL authors="Matt Bull" \
description="Docker image containing all requirements for an Illumina ncov2019 pipeline"

COPY environments/extras.yml /extras.yml
COPY environments/illumina/environment.yml /environment.yml
COPY extras.yml /extras.yml
COPY environment.yml /environment.yml
RUN /opt/conda/bin/conda update conda && \
/opt/conda/bin/conda install mamba -c conda-forge && \
/opt/conda/bin/conda update mamba -c conda-forge && \
11 changes: 4 additions & 7 deletions environments/illumina/environment.yml
@@ -3,13 +3,12 @@ channels:
- conda-forge
- bioconda
- defaults
- r
dependencies:
- biopython=1.74
- libxcb
- matplotlib>=3.3.3
- python>=3.7
- bwa=0.7.17=pl5.22.0_2
- bwa=0.7.17
- samtools=1.10
- bcftools=1.10
- trim-galore=0.6.5
@@ -28,11 +27,9 @@ dependencies:
- fastqc=0.11.9
- multiqc=1.11
- nextclade=1.10.2
- r=3.6.0
- sambamba=0.8.0
- ensembl-vep>=102.0
- conda-forge::r-base
- pip:
- pandas >= 1.1
- scikit-learn >= 0.23.1
- git+https://github.com/cov-lineages/pangolin.git
- git+https://github.com/cov-lineages/constellations.git
- git+https://github.com/cov-lineages/scorpio.git
- git+https://github.com/cov-lineages/pangolin-data.git