SSDS nextflow pipeline :

Initial paper: Khil et al. Genome Research 2012

Technical paper: Brick, Pratto et al., Methods in Enzymology 2018

This nextflow pipeline is configured to work on a SLURM-based HPC with modules. It can be relatively easily configured to run on other systems (see nextflow documentation : https://www.nextflow.io/).

Requirements:

Tools / programs :

bedtools 2.25.0
bwa 0.7.17
deeptools 3.0.1
fastqc 0.11.8
fastqtools 0.8
fastxtoolkit 0.0.14
java 1.8.0_92
nextflow 0.30.2
picard 2.9.2
python 2.7
samtools 1.8
sratoolkit 2.9.2
trimgalore 0.4.5
ucsc 365

NOTE: These are the recommended (and tested) version of each tool, however newer / older versions will work in most cases. One exception is older versions of nextflow, which will not work.

Perl modules:

BioPerl
Bio::DB::Sam
Getopt::Long
Math::Round
Statistics::Descriptive
Time::HiRes

Global variables required:

$NXF_PIPEDIR : Path to folder containing SSDSPipeline_1.8.nf

$NXF_GENOMES : Path to folder containing reference genomes for alignment ** This folder requires a very specific structure (see below) **

$SLURM_JOBID : Specifies the temporary subfolder to use (see Temp folder requirements below)

NXF_GENOMES Folder structure

Each reference genome should be contained in a separate folder (i.e. $NXF_GENOMES/mouse_mm10). The sub-structure within this folder should be as follows:

$NXF_GENOMES/<genome>/genome.fa : Genome fasta file

$NXF_GENOMES/<genome>/genome.fa.fai : Index of genome fasta file (samtools faidx)

$NXF_GENOMES/<genome>/genome.dict : Sequence dictionary for genome fasta file (use picard CreateSequenceDictionary)

$NXF_GENOMES/<genome>/BWAIndex/version0.7.10/ : BWA 0.7 index files (should also contain soft links to the three files above)

** NOTE: The genome files MUST be named genome.XXX - other names will cause errors (i.e. mm10.fa / hg19_genome.fa / etc ...)

Temp folder requirements

The pipeline requires a high-level temporary folder called /lscratch. On a SLURM-based HPC, each job is assigned a global id ($SLURM_JOBID) and this is appended to the temp folder name for each process. This can be modified in the config.nf file. Thus, there is a requirement for :

/lscratch folder for temporary files SLURM_JOBID global variable for each HPC job.

Pipeline execution

RUN ON LOCAL MACHINE

nextflow run -c $NXF_PIPEDIR/config.nf -profile local \
    $NXF_PIPEDIR/SSDSPipeline_1.8.nf \
    --fq1 $NXF_PIPEDIR/tests/fastq/ssdsLong.100k.R1.fastq \
    --fq2 $NXF_PIPEDIR/tests/fastq/ssdsLong.100k.R2.fastq \
    --r1Len 36 \
    --r2Len 40 \
    --genome mm10 \
    --name testSSDS1.8 \
    --outdir SSDS1.8_test

RUN ON SLURM CLUSTER

nextflow run -c $NXF_PIPEDIR/config.nf -profile slurm \
    $NXF_PIPEDIR/SSDSPipeline_1.8.nf \
    --fq1 $NXF_PIPEDIR/tests/fastq/ssdsLong.100k.R1.fastq \
    --fq2 $NXF_PIPEDIR/tests/fastq/ssdsLong.100k.R2.fastq \
    --r1Len 36 \
    --r2Len 40 \
    --genome mm10 \
    --name testSSDS1.8 \
    --outdir SSDS1.8_test

TESTS

The tests/fastq folder contains small fastq files from an SSDS experiment in mouse. The test should take a few minutes on a local machine (64Gb RAM, 16 Cores).

Name		Name	Last commit message	Last commit date
Latest commit History 34 Commits
accessoryFiles/SSDS		accessoryFiles/SSDS
tests		tests
README.md		README.md
SSDSPipeline_1.8.nf		SSDSPipeline_1.8.nf
config.getFQ.nf		config.getFQ.nf
config.nf		config.nf
getFQ.nf		getFQ.nf

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

SSDS nextflow pipeline :

Requirements:

Global variables required:

NXF_GENOMES Folder structure

Temp folder requirements

Pipeline execution

RUN ON LOCAL MACHINE

RUN ON SLURM CLUSTER

TESTS

About

Releases 2

Packages

Languages

kevbrick/SSDSnextflowPipeline

Folders and files

Latest commit

History

Repository files navigation

SSDS nextflow pipeline :

Requirements:

Global variables required:

NXF_GENOMES Folder structure

Temp folder requirements

Pipeline execution

RUN ON LOCAL MACHINE

RUN ON SLURM CLUSTER

TESTS

About

Resources

Stars

Watchers

Forks

Releases 2

Packages 0

Languages

Packages