This repository has been archived by the owner on Jan 27, 2020. It is now read-only.

Commit

Merge pull request #709 from SciLifeLab/dev
Release 2.2.2
maxulysse authored Dec 19, 2018
2 parents 35e7f70 + 5efafce commit 88921e6
Showing 46 changed files with 3,267 additions and 383 deletions.
12 changes: 8 additions & 4 deletions .travis.yml
@@ -12,19 +12,23 @@ env:
- NXF_VER=0.32.0
matrix:
- TEST=SOMATIC
- TEST=GERMLINE
- TEST=ANNOTATEVEP
- TEST=ANNOTATESNPEFF
- TEST=GERMLINE

before_install:
# PRs to master are only ok if coming from dev branch
- '[ $TRAVIS_PULL_REQUEST = "false" ] || [ $TRAVIS_BRANCH != "master" ] || ([ $TRAVIS_PULL_REQUEST_SLUG = $TRAVIS_REPO_SLUG ] && [ $TRAVIS_PULL_REQUEST_BRANCH = "dev" ])'
# Download containers
- "travis_retry ./scripts/containers.sh --profile docker --test $TEST"

install:
# Install Nextflow
- curl -fsSL get.nextflow.io | bash
- chmod +x nextflow
- sudo mv nextflow /usr/local/bin/
# Download big containers for ANNOTATEVEP and ANNOTATESNPEFF tests
- "travis_retry ./scripts/containers.sh --profile docker --test $TEST"

# Build references when needed
# Build references if needed
before_script: "./scripts/test.sh --profile docker --test $TEST --build"

# Actual tests
35 changes: 35 additions & 0 deletions CHANGELOG.md
@@ -5,6 +5,41 @@ All notable changes to this project will be documented in this file.
The format is based on [Keep a Changelog](http://keepachangelog.com/en/1.0.0/)
and this project adheres to [Semantic Versioning](http://semver.org/spec/v2.0.0.html).

## [2.2.2] - 2018-12-19

### `Added`

- [#671](https://github.com/SciLifeLab/Sarek/pull/671) - New `publishDirMode` param and docs (see the sketch after this list)
- [#673](https://github.com/SciLifeLab/Sarek/pull/673), [#675](https://github.com/SciLifeLab/Sarek/pull/675), [#676](https://github.com/SciLifeLab/Sarek/pull/676) - Profiles for BinAC and CFC clusters in Tübingen
- [#679](https://github.com/SciLifeLab/Sarek/pull/679) - Add container for `CreateIntervalBeds`
- [#692](https://github.com/SciLifeLab/Sarek/pull/692), [#697](https://github.com/SciLifeLab/Sarek/pull/697) - Add AWS iGenomes support (within `conf/igenomes.conf`)
- [#694](https://github.com/SciLifeLab/Sarek/pull/694) - Add monochrome and grey logos for light or dark background
- [#698](https://github.com/SciLifeLab/Sarek/pull/698) - Add btb profile for munin server
- [#702](https://github.com/SciLifeLab/Sarek/pull/702) - Add font-ttf-dejavu-sans-mono `2.37` and fontconfig `2.12.6` to container
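
A minimal sketch of how the new `publishDirMode` param is consumed (the process name, paths and channels below are made up for illustration and are not part of this release):

    params.publishDirMode = 'link'  // default; override at run time, e.g. --publishDirMode copy

    process ExampleQC {
        // publish with the user-selected mode instead of a hard-coded 'link'
        publishDir "${params.outDir}/Reports/ExampleQC", mode: params.publishDirMode

        input: file(vcf) from vcfChannel
        output: file("*.stats") into statsChannel

        script: "bcftools stats ${vcf} > ${vcf}.stats"
    }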

### `Changed`

- [#678](https://github.com/SciLifeLab/Sarek/pull/678) - Change VEP to v92 and adjust CPUs for VEP
- [#663](https://github.com/SciLifeLab/Sarek/pull/663) - Update `do_release.sh` script
- [#671](https://github.com/SciLifeLab/Sarek/pull/671) - publishDir modes are now params
- [#677](https://github.com/SciLifeLab/Sarek/pull/677), [#698](https://github.com/SciLifeLab/Sarek/pull/698), [#703](https://github.com/SciLifeLab/Sarek/pull/703) - Update docs
- [#679](https://github.com/SciLifeLab/Sarek/pull/679) - Update old awsbatch configuration
- [#682](https://github.com/SciLifeLab/Sarek/pull/682) - Specify memory and CPUs for awsbatch
- [#693](https://github.com/SciLifeLab/Sarek/pull/693) - Qualimap bamQC is now run after mapping and after recalibration for better QC
- [#700](https://github.com/SciLifeLab/Sarek/pull/700) - Update GATK to `4.0.9.0`
- [#702](https://github.com/SciLifeLab/Sarek/pull/702) - Update FastQC to `0.11.8`
- [#705](https://github.com/SciLifeLab/Sarek/pull/705) - Change `--TMP_DIR` to `--tmp-dir` for GATK `4.0.9.0` BaseRecalibrator (see the example after this list)
- [#706](https://github.com/SciLifeLab/Sarek/pull/706) - Update TravisCI testing
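
For context on the `--tmp-dir` change in #705, below is a hedged sketch of a GATK `4.0.9.0` BaseRecalibrator call; the process name, channels, `${genomeFile}` and `${dbsnp}` references are placeholders, not Sarek's exact command:

    process RunBaseRecalibratorExample {
        input: set idSample, file(bam), file(bai) from bamChannel  // hypothetical channel

        script:
        """
        gatk BaseRecalibrator \
            -I ${bam} \
            -R ${genomeFile} \
            --known-sites ${dbsnp} \
            --tmp-dir /tmp \
            -O ${idSample}.recal.table
        """
    }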

### `Fixed`

- [#665](https://github.com/SciLifeLab/Sarek/pull/665) - Input BAM file now always has the same name in the MarkDuplicates process (whether it comes from a single FASTQ pair or several), so the metrics file does too
- [#672](https://github.com/SciLifeLab/Sarek/pull/672) - Process `PullSingularityContainers` from `buildContainers.nf` now expects a file with the correct `.simg` extension for Singularity images, and no longer the `.img` one
- [#679](https://github.com/SciLifeLab/Sarek/pull/679) - Add publishDirMode for `germlineVC.nf`
- [#700](https://github.com/SciLifeLab/Sarek/pull/700) - Fix [#699](https://github.com/SciLifeLab/Sarek/issues/699): missing DP in the FORMAT column of MuTect2 VCFs
- [#702](https://github.com/SciLifeLab/Sarek/pull/702) - Fix [#701](https://github.com/SciLifeLab/Sarek/issues/701)
- [#705](https://github.com/SciLifeLab/Sarek/pull/705) - Fix [#704](https://github.com/SciLifeLab/Sarek/issues/704)

## [2.2.1] - 2018-10-04

### `Changed`
2 changes: 1 addition & 1 deletion Dockerfile
@@ -7,4 +7,4 @@ LABEL \

COPY environment.yml /
RUN conda env create -f /environment.yml && conda clean -a
ENV PATH /opt/conda/envs/sarek-2.2.1/bin:$PATH
ENV PATH /opt/conda/envs/sarek-2.2.2/bin:$PATH
2 changes: 1 addition & 1 deletion Sarek-data
4 changes: 2 additions & 2 deletions Singularity
@@ -4,10 +4,10 @@ Bootstrap:docker
%labels
MAINTAINER Maxime Garcia <[email protected]>
DESCRIPTION Singularity image containing all requirements for the Sarek pipeline
VERSION 2.1.0
VERSION 2.2.2

%environment
PATH=/opt/conda/envs/sarek-2.2.1/bin:$PATH
PATH=/opt/conda/envs/sarek-2.2.2/bin:$PATH
export PATH

%files
30 changes: 19 additions & 11 deletions annotate.nf
@@ -42,6 +42,13 @@ if (params.help) exit 0, helpMessage()
if (!SarekUtils.isAllowedParams(params)) exit 1, "params unknown, see --help for more information"
if (!checkUppmaxProject()) exit 1, "No UPPMAX project ID found! Use --project <UPPMAX Project ID>"

// Check for awsbatch profile configuration
// make sure queue is defined
if (workflow.profile == 'awsbatch') {
if(!params.awsqueue) exit 1, "Provide the job queue for aws batch!"
}


tools = params.tools ? params.tools.split(',').collect{it.trim().toLowerCase()} : []
annotateTools = params.annotateTools ? params.annotateTools.split(',').collect{it.trim().toLowerCase()} : []
annotateVCF = params.annotateVCF ? params.annotateVCF.split(',').collect{it.trim()} : []
@@ -103,7 +110,7 @@ vcfForVep = vcfForVep.map {
process RunBcftoolsStats {
tag {vcf}

publishDir directoryMap.bcftoolsStats, mode: 'link'
publishDir directoryMap.bcftoolsStats, mode: params.publishDirMode

input:
set variantCaller, file(vcf) from vcfForBCFtools
@@ -124,7 +131,7 @@ if (params.verbose) bcfReport = bcfReport.view {
process RunVcftools {
tag {vcf}

publishDir directoryMap.vcftools, mode: 'link'
publishDir directoryMap.vcftools, mode: params.publishDirMode

input:
set variantCaller, file(vcf) from vcfForVCFtools
@@ -145,10 +152,10 @@ if (params.verbose) vcfReport = vcfReport.view {
process RunSnpeff {
tag {"${variantCaller} - ${vcf}"}

publishDir params.outDir, mode: 'link', saveAs: {
if (it == "${vcf.simpleName}_snpEff.csv") "${directoryMap.snpeffReports}/${it}"
publishDir params.outDir, mode: params.publishDirMode, saveAs: {
if (it == "${vcf.simpleName}_snpEff.csv") "${directoryMap.snpeffReports.minus(params.outDir+'/')}/${it}"
else if (it == "${vcf.simpleName}_snpEff.ann.vcf") null
else "${directoryMap.snpeff}/${it}"
else "${directoryMap.snpeff.minus(params.outDir+'/')}/${it}"
}

input:
@@ -198,8 +205,8 @@ if('merge' in tools) {
process RunVEP {
tag {"${variantCaller} - ${vcf}"}

publishDir params.outDir, mode: 'link', saveAs: {
if (it == "${vcf.simpleName}_VEP.summary.html") "${directoryMap.vep}/${it}"
publishDir params.outDir, mode: params.publishDirMode, saveAs: {
if (it == "${vcf.simpleName}_VEP.summary.html") "${directoryMap.vep.minus(params.outDir+'/')}/${it}"
else null
}

@@ -215,13 +222,14 @@ process RunVEP {
script:
finalannotator = annotator == "snpeff" ? 'merge' : 'vep'
genome = params.genome == 'smallGRCh37' ? 'GRCh37' : params.genome
cache_version = params.genome == 'GRCh38' || params.genome == 'iGRCh38' ? 92 : 91
"""
/opt/vep/src/ensembl-vep/vep --dir /opt/vep/.vep/ \
-i ${vcf} \
-o ${vcf.simpleName}_VEP.ann.vcf \
--assembly ${genome} \
--cache \
--cache_version 91 \
--cache_version ${cache_version} \
--database \
--everything \
--filter_common \
@@ -245,7 +253,7 @@ vcfToCompress = snpeffVCF.mix(vepVCF)
process CompressVCF {
tag {"${annotator} - ${vcf}"}

publishDir "${directoryMap."$finalannotator"}", mode: 'link'
publishDir "${directoryMap."$finalannotator"}", mode: params.publishDirMode

input:
set annotator, variantCaller, file(vcf) from vcfToCompress
@@ -268,14 +276,14 @@ if (params.verbose) vcfCompressedoutput = vcfCompressedoutput.view {
}

process GetVersionSnpeff {
publishDir directoryMap.version, mode: 'link'
publishDir directoryMap.version, mode: params.publishDirMode
output: file("v_*.txt")
when: 'snpeff' in tools || 'merge' in tools
script: QC.getVersionSnpEFF()
}

process GetVersionVEP {
publishDir directoryMap.version, mode: 'link'
publishDir directoryMap.version, mode: params.publishDirMode
output: file("v_*.txt")
when: 'vep' in tools || 'merge' in tools
script: QC.getVersionVEP()
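
A note on the `saveAs` closures introduced above: the `directoryMap` entries already contain `params.outDir`, so stripping that prefix with Groovy's `String.minus` keeps `publishDir params.outDir` from writing the output directory twice. A tiny illustration with made-up values:

    def outDir = 'results'
    def snpeffReports = 'results/Reports/SnpEff'                  // shape of a directoryMap entry
    assert snpeffReports.minus(outDir + '/') == 'Reports/SnpEff'  // relative path handed to saveAs
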
10 changes: 8 additions & 2 deletions buildContainers.nf
@@ -38,6 +38,12 @@ if (params.help) exit 0, helpMessage()
if (!SarekUtils.isAllowedParams(params)) exit 1, "params unknown, see --help for more information"
if (!checkUppmaxProject()) exit 1, "No UPPMAX project ID found! Use --project <UPPMAX Project ID>"

// Check for awsbatch profile configuration
// make sure queue is defined
if (workflow.profile == 'awsbatch') {
if(!params.awsqueue) exit 1, "Provide the job queue for aws batch!"
}

// Define containers to handle (build/push or pull)
containersList = defineContainersList()
containers = params.containers.split(',').collect {it.trim()}
@@ -86,13 +92,13 @@ if (params.verbose) containersBuilt = containersBuilt.view {
process PullSingularityContainers {
tag {"${params.repository}/${container}:${params.tag}"}

publishDir "${params.containerPath}", mode: 'move'
publishDir "${params.containerPath}", mode: params.publishDirMode

input:
val container from singularityContainers

output:
file("${container}-${params.tag}.img") into imagePulled
file("${container}-${params.tag}.simg") into imagePulled

when: params.singularity

14 changes: 10 additions & 4 deletions buildReferences.nf
@@ -40,6 +40,12 @@ if (params.help) exit 0, helpMessage()
if (!SarekUtils.isAllowedParams(params)) exit 1, "params unknown, see --help for more information"
if (!checkUppmaxProject()) exit 1, "No UPPMAX project ID found! Use --project <UPPMAX Project ID>"

// Check for awsbatch profile configuration
// make sure queue is defined
if (workflow.profile == 'awsbatch') {
if(!params.awsqueue) exit 1, "Provide the job queue for aws batch!"
}

ch_referencesFiles = Channel.fromPath("${params.refDir}/*")

/*
@@ -103,7 +109,7 @@ ch_notCompressedfiles
process BuildBWAindexes {
tag {f_reference}

publishDir params.outDir, mode: 'link'
publishDir params.outDir, mode: params.publishDirMode

input:
file(f_reference) from ch_fastaForBWA
@@ -125,7 +131,7 @@ if (params.verbose) bwaIndexes.flatten().view {
process BuildReferenceIndex {
tag {f_reference}

publishDir params.outDir, mode: 'link'
publishDir params.outDir, mode: params.publishDirMode

input:
file(f_reference) from ch_fastaReference
@@ -149,7 +155,7 @@ if (params.verbose) ch_referenceIndex.view {
process BuildSAMToolsIndex {
tag {f_reference}

publishDir params.outDir, mode: 'link'
publishDir params.outDir, mode: params.publishDirMode

input:
file(f_reference) from ch_fastaForSAMTools
@@ -170,7 +176,7 @@ if (params.verbose) ch_samtoolsIndex.view {
process BuildVCFIndex {
tag {f_reference}

publishDir params.outDir, mode: 'link'
publishDir params.outDir, mode: params.publishDirMode

input:
file(f_reference) from ch_vcfFile
50 changes: 43 additions & 7 deletions conf/aws-batch.config
@@ -8,19 +8,55 @@
*/

params {
genome_base = params.genome == 'GRCh37' ? "s3://caw-references/grch37" : params.genome == 'GRCh38' ? "s3://caw-references/grch38" : "s3://caw-references/smallgrch37"
genome_base = params.genome == 'GRCh37' ? "s3://ngi-igenomes/igenomes/Homo_sapiens/GATK/GRCh37" : params.genome == 'GRCh38' ? "s3://ngi-igenomes/igenomes/Homo_sapiens/GATK/GRCh38" : "s3://sarek-references/small"
publishDirMode = 'copy'
singleCPUMem = 7.GB // To make the UPPMAX SLURM copy-paste work.
localReportDir = 'Reports'
}

executor.name = 'awsbatch'
executor.awscli = '/home/ec2-user/miniconda/bin/aws'
executor {
name = 'awsbatch'
awscli = '/home/ec2-user/miniconda/bin/aws'
}

/* Rolling files are currently not supported on s3 */
report.file = "${params.localReportDir}/Sarek_report.html"
timeline.file = "${params.localReportDir}/Sarek_timeline.html"
dag.file = "${params.localReportDir}/Sarek_DAG.svg"
trace.file = "${params.localReportDir}/Sarek_trace.txt"

process {
executor = 'awsbatch'
queue = 'caw-job-queue'
queue = params.awsqueue

errorStrategy = {task.exitStatus == 143 ? 'retry' : 'terminate'}
maxErrors = '-1'
maxRetries = 2
maxRetries = 4
cpus = 2
memory = 7.GB
memory = 8.GB

withName:RunBcftoolsStats {
cpus = 1
memory = {params.singleCPUMem * 2} // Memory is doubled so that two processes won't run on the same instance
// Use a tiny queue for this one, so storage doesn't run out
queue = params.awsqueue_tiny
}
withName:RunVcftools {
cpus = 1
memory = {params.singleCPUMem * 2} // Memory is doubled so that two processes won't run on the same instance
// Use a tiny queue for this one, so storage doesn't run out
queue = params.awsqueue_tiny
}
withName:RunHaplotypecaller {
cpus = 1
// Increase memory quadratically
memory = {params.singleCPUMem * 2} // Memory is doubled so that two processes won't run on the same instance
// Use a tiny queue for this one, so storage doesn't run out
queue = params.awsqueue_tiny
}
withName:RunGenotypeGVCFs {
cpus = 1
memory = {params.singleCPUMem * 2} // Memory is doubled so that two processes won't run on the same instance
// Use a tiny queue for this one, so storage doesn't run out
queue = params.awsqueue_tiny
}
}
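
With the hard-coded `caw-job-queue` removed, the job queues now come from the user. A hedged example of what such a user-side configuration could look like (queue names are placeholders; the values can also be passed on the command line as `--awsqueue` and `--awsqueue_tiny`):

    params {
        awsqueue      = 'my-sarek-job-queue'       // main AWS Batch job queue
        awsqueue_tiny = 'my-sarek-tiny-job-queue'  // smaller queue used by the single-CPU processes above
    }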