Installation & Testing

NGS-Pipe is a pipeline for the core analysis of DNA and RNA sequencing samples generated in the context of precision oncology. One of its main design goals is to provide an easy-to-use and robust toolkit for users with bioinformatics expertise. Like any other pipeline, NGS-Pipe relies on underlying software that has to be installed before an analysis can be performed. This section describes the options for installing the required bioinformatics software and shows how to run the pipeline on the example data that we provide.

Installation

The pipeline comprises a large number of software tools, ranging from aligners and quality control tools to variant callers. We believe that there are currently two viable options for providing these tools in your environment.

Conda

Conda is a package manager that automatically installs software and encapsulates it in an environment. Because the bioconda channel provides the majority of the tools needed by NGS-Pipe, we provide conda environment files for DNA and RNA analysis. We recommend using conda for the installation.

Manual installation

Installing the tools by hand is also possible, but cumbersome. You are responsible for finding each tool in the correct version and installing it on your own system. You will also need to adjust the tool paths in the Snakemake config files.
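
For a manual installation, the conda environment files can double as a version checklist. The following is a minimal sketch, not part of NGS-Pipe: the helper names `pinned_packages` and `missing_tools` are hypothetical, and the first helper assumes that the environment files list their dependencies as `- name=version`.

```shell
# Print "name version" for every active (not commented-out) pinned
# dependency in a conda environment file given as $1.
pinned_packages() {
    sed -n 's/^[[:space:]]*-[[:space:]]*\([^=[:space:]]*\)=\(.*\)$/\1 \2/p' "$1"
}

# Print every tool from the argument list that cannot be found on PATH.
# Returns 1 if at least one tool is missing, 0 otherwise.
missing_tools() {
    local rc=0
    local tool
    for tool in "$@"; do
        if ! command -v "$tool" >/dev/null 2>&1; then
            printf '%s\n' "$tool"
            rc=1
        fi
    done
    return "$rc"
}
```

A possible use is `pinned_packages environments/rna_environment.yaml` to list the expected versions, followed by `missing_tools fastqc samtools STAR trimmomatic featureCounts snakemake`. The executable names in that call are assumptions and may differ from how the tools are named on your system.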

Why not Docker?

We have decided not to integrate our pipeline into Docker. Docker is a neat tool for packaging software and its dependencies into a simple container. However, running Docker containers in an HPC environment has several drawbacks, such as privilege escalation and performance issues. These can be addressed, e.g. by "translating" the container to Singularity. Overall, though, the overhead of making a Docker container HPC-ready is similar to installing the tools by hand on bare metal.

Conda Installation and Examples

A large fraction of the tools required by NGS-Pipe is covered by conda and the bioconda channel. The tools are installed with a single command.

RNA

Installation of Tools

All tools required for the analysis of RNASeq experiments are provided by conda. The commands below install the tools into a conda environment and activate it.

#The RNA environment (environments/rna_environment.yaml)
channels:
  - bioconda
  - conda-forge
  - defaults
dependencies:
  - fastqc=0.11.5
  - samtools=1.2
  - star=2.5.3a
  - trimmomatic=0.36
  - subread=1.5.2
  - snakemake=3.13.3

#Install tools from rna_environment.yaml
conda env create -n ngs-pipe-rna --file environments/rna_environment.yaml

#Activate environment
conda activate ngs-pipe-rna

After the environment is activated, all tools are available on the command line and ready to be executed by the pipeline.
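
Before starting a run, it can be useful to confirm that the intended environment is really the active one. The sketch below relies on the `CONDA_DEFAULT_ENV` variable that conda sets on activation; the function name `require_conda_env` is hypothetical and not part of NGS-Pipe.

```shell
# Succeed only if the conda environment named in $1 is currently active.
# conda exports the name of the active environment as CONDA_DEFAULT_ENV.
require_conda_env() {
    if [ "${CONDA_DEFAULT_ENV:-}" != "$1" ]; then
        printf 'Expected conda environment "%s", but "%s" is active\n' \
            "$1" "${CONDA_DEFAULT_ENV:-none}" >&2
        return 1
    fi
}
```

For example, `require_conda_env ngs-pipe-rna && ./run_analysis_locally.sh` would refuse to start the analysis from the wrong environment.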

Test Run of RNASeq Pipeline

We provide data and a test script to help you get familiar with how the raw data has to be formatted and how to execute the pipeline.

#1. Go to examples folder:
cd examples/rna
#2. Download test data: We provide an additional snakemake pipeline to 
#   download test sequences, databases and adapter files:
./run_prepare_data_locally.sh
# This will download 8 test data sets, the adapters, the human reference 
# and build the STAR database index
#3. Execute the RNASeq Pipeline:
./run_analysis_locally.sh
# This will execute: RAW-->Trimmomatic-->STAR-->FeatureCounts

DNA

Installation of Tools

All core tools required for the analysis of DNA sequencing experiments are provided by conda. The commands below install them into a conda environment and activate it. However, some tools are not provided by conda and need to be installed by hand (see "Tools to be installed by hand" below).

#The DNA environment (environments/dna_environment.yaml). 
#The commented-out dependencies are not needed for the example data and can be enabled when needed
channels:
  - bioconda
  - conda-forge
  - defaults
dependencies:
  - snakemake=3.13.3
  - fastqc=0.11.5
  - samtools=1.4
  - trimmomatic=0.36
  - bwa=0.7.15
  - picard=2.9.2
  - gatk=3.5
  - varscan=2.4.2
  - qualimap=2.2
  - sra-tools=2.8.1
  #- bowtie2=2.3.2
  #- yara=0.9.6
  #- snpeff=4.3
  #- snpsift=4.3
  #- freebayes=1.1.0
  #- somatic-sniper=1.0.5.0
  #- pindel=0.2.5b8
  #- bioconductor-deepsnv=1.20.0
  #- vardict-java=1.4.10
  #- vardict=2017.04.18

#Install tools from dna_environment.yaml
conda env create -n ngs-pipe-dna --file environments/dna_environment.yaml

#Activate environment
conda activate ngs-pipe-dna

After the environment is activated, all tools are available on the command line and ready to be executed by the pipeline.
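
The commented-out dependencies in the DNA environment file can be enabled by removing the leading `#`. This can be done in any editor; the sketch below shows one scripted way, assuming the entries keep the `#- name=version` form shown above. The function name `enable_dependency` is hypothetical and not part of NGS-Pipe.

```shell
# Print the environment file $1 with the commented-out dependency $2
# enabled, i.e. "  #- name=version" becomes "  - name=version".
# The input file itself is not modified.
enable_dependency() {
    sed "s/^\([[:space:]]*\)#-[[:space:]]*\(${2}=\)/\1- \2/" "$1"
}
```

For example, `enable_dependency environments/dna_environment.yaml bowtie2 > dna_environment_bowtie2.yaml` writes a copy with bowtie2 enabled (the output file name is only an illustration). An existing environment can then be brought up to date with `conda env update -n ngs-pipe-dna --file dna_environment_bowtie2.yaml`.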

Test Run of DNA Pipeline

We provide data and a test script to help you get familiar with how the raw data has to be formatted and how to execute the pipeline. However, this test script executes only a subset of the pipeline, because not all required tools can be installed via conda. The full pipeline can be executed once all required tools are installed.

#1. Go to examples folder:
cd examples/dna
#2. Download test data: We provide an additional snakemake pipeline to 
#   download test sequences, databases and adapter files:
./run_prepare_data_locally.sh
# This will download 6 test data sets, the adapters, regions file,
# the human reference and build the BWA database index
#3. Execute the DNA Pipeline:
./run_analysis_locally.sh
# This will execute: RAW --> QC(Trimmomatic) --> Mapping(BWA) --> Sort(Picard)
# --> Merge(Picard) --> Remove Secondary Alignments(Samtools) --> MarkDuplicates(Picard)
# --> RemoveDuplicates(Samtools) --> SNV Calling (VarScan2)
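
The RNA and DNA examples follow the same two-step pattern: prepare the data, then run the analysis. A minimal sketch of a wrapper that runs both steps and stops if the preparation fails could look as follows. The function name `run_example` is hypothetical; the script names are the ones used above.

```shell
# Run the data preparation and the analysis in the example folder $1
# (e.g. examples/rna or examples/dna). The analysis is only started if
# the preparation step succeeded. The subshell keeps the caller's
# working directory unchanged.
run_example() {
    (
        cd "$1" || exit 1
        ./run_prepare_data_locally.sh || exit 1
        ./run_analysis_locally.sh
    )
}
```

Called from the repository root, `run_example examples/dna` would be equivalent to steps 1 to 3 above.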

Tools to be installed by hand

Tool	Optional/Mandatory	Comment	Version
GATK	Mandatory	The jar needs to be registered by conda	3.5
JointSNVMix	Optional		0.75
JointSNVMix2	Optional		current
Seqpurge	Optional		current
mutect	Optional		current
dindel	Optional		current
rankCombineVariants	Optional		current
bicseq2	Optional		current
annovar	Optional		current
facets	Optional		current
somaticseq	Optional		v2.1.2
strelka	Optional		current
