Add hifiasm assembler #70

Merged
merged 14 commits into from
Feb 28, 2024
37 changes: 37 additions & 0 deletions .github/workflows/test_pr_lreads_docker_ont_hifi.bkp
@@ -0,0 +1,37 @@
name: Testing long-reads HIFI / docker (ONT) from PR
on:
  pull_request:
    branches: [ master, dev ]
    types: [ opened, synchronize, reopened ]

jobs:
  run_nextflow:
    name: Run pipeline for the upcoming PR
    runs-on: ubuntu-latest

    steps:

      - name: Check out pipeline code
        uses: actions/checkout@v2

      - name: Install Nextflow
        env:
          CAPSULE_LOG: none
        run: |
          wget -qO- get.nextflow.io | bash
          sudo mv nextflow /usr/local/bin/

      - name: Clean environment
        run: |
          sudo rm -rf /usr/local/lib/android # will release about 10 GB if you don't need Android
          sudo rm -rf /usr/share/dotnet # will release about 20 GB if you don't need .NET

      - name: Run tests for long-reads (ont)
        run: |
          nextflow run main.nf -profile docker,test,lreads,ont_hifi --max_memory '6.GB' --max_cpus 4
          rm -r work .nextflow*

      - name: View results
        run: |
          sudo apt-get install -y tree
          tree lreads_test_ont_hifi
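
The test step in this workflow can also be reproduced outside GitHub Actions. The following is a minimal sketch, assuming it is run from a checkout of the repository with Docker and Nextflow installed. The `PROFILES`, `OUTDIR` and `CMD` variable names are illustrative and not part of the PR. The command is printed first and only executed when both `nextflow` and `main.nf` are found.

```shell
#!/bin/sh
# Profiles and resource limits copied from the workflow's test step.
PROFILES="docker,test,lreads,ont_hifi"
# Output directory set by the ont_hifi test profile in conf/test.config.
OUTDIR="lreads_test_ont_hifi"

# Build the command as a string so it can be inspected before running.
CMD="nextflow run main.nf -profile ${PROFILES} --max_memory 6.GB --max_cpus 4"
echo "${CMD}"

# Execute only when Nextflow is installed and we are in the repository root.
if command -v nextflow >/dev/null 2>&1 && [ -f main.nf ]; then
    ${CMD} && ls "${OUTDIR}"
fi
```

Printing before running makes it easy to confirm the profile list matches the workflow before committing CPU time to the assembly.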
4 changes: 2 additions & 2 deletions Dockerfile
@@ -17,8 +17,8 @@ RUN medaka --help

# fix permissions
USER root
-RUN mkdir -p $CONDA_PREFIX/envs/mpgap-3.2/lib/python3.8/site-packages/medaka && \
-    chmod -R 777 $CONDA_PREFIX/envs/mpgap-3.2/lib/python3.8/site-packages/medaka
+RUN mkdir -p $CONDA_PREFIX/envs/mpgap-3.2/lib/python3.9/site-packages/medaka && \
+    chmod -R 777 $CONDA_PREFIX/envs/mpgap-3.2/lib/python3.9/site-packages/medaka

# pre-download BUSCO bacteria database
RUN mkdir -p /opt/busco_db/ && \
2 changes: 1 addition & 1 deletion README.md
@@ -36,7 +36,7 @@ This pipeline wraps up the following software:

|| **Source** |
|:- | :- |
-| **Assemblers** | [Canu](https://github.com/marbl/canu), [Flye](https://github.com/fenderglass/Flye), [Raven](https://github.com/lbcb-sci/raven), [Shasta](https://github.com/chanzuckerberg/shasta), [wtdbg2](https://github.com/ruanjue/wtdbg2), [Haslr](https://github.com/vpc-ccg/haslr), [Unicycler](https://github.com/rrwick/Unicycler), [Spades](https://github.com/ablab/spades), [Shovill](https://github.com/tseemann/shovill), [Megahit](https://github.com/voutcn/megahit) |
+| **Assemblers** | [Hifiasm](https://github.com/chhylp123/hifiasm), [Canu](https://github.com/marbl/canu), [Flye](https://github.com/fenderglass/Flye), [Raven](https://github.com/lbcb-sci/raven), [Shasta](https://github.com/chanzuckerberg/shasta), [wtdbg2](https://github.com/ruanjue/wtdbg2), [Haslr](https://github.com/vpc-ccg/haslr), [Unicycler](https://github.com/rrwick/Unicycler), [Spades](https://github.com/ablab/spades), [Shovill](https://github.com/tseemann/shovill), [Megahit](https://github.com/voutcn/megahit) |
| **Polishers** | [Nanopolish](https://github.com/jts/nanopolish), [Medaka](https://github.com/nanoporetech/medaka), [gcpp](https://github.com/PacificBiosciences/gcpp), [Polypolish](https://github.com/rrwick/Polypolish) and [Pilon](https://github.com/broadinstitute/pilon) |
| **Quality check** | [Quast](https://github.com/ablab/quast), [BUSCO](https://busco.ezlab.org/busco_userguide.html) and [MultiQC](https://multiqc.info/) |

4 changes: 4 additions & 0 deletions assets/lreads_test_ont_hifi.yml
@@ -0,0 +1,4 @@
samplesheet:
  - id: ont_only_hifi
    nanopore: https://github.com/fmalmeida/test_datasets/raw/main/SRR27467590.fq.gz
    genome_size: 4m
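
A samplesheet for other reads would follow the same shape as this test file. The following is a minimal sketch, assuming a local FASTQ file. The sample id `my_sample`, the path `/data/my_reads.fq.gz` and the file name `my_samplesheet.yml` are placeholders, not values from this PR.

```shell
#!/bin/sh
# Write a one-sample samplesheet mirroring assets/lreads_test_ont_hifi.yml.
# The id and the read path are hypothetical placeholders.
SHEET="my_samplesheet.yml"
cat > "${SHEET}" <<'EOF'
samplesheet:
  - id: my_sample
    nanopore: /data/my_reads.fq.gz
    genome_size: 4m
EOF

# Show the result so the YAML indentation can be checked by eye.
cat "${SHEET}"
```

The file would then be passed through the `input` parameter, as the test profiles do. Note that the `ont_hifi` test profile in this PR also sets `high_quality_longreads = true`.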
6 changes: 4 additions & 2 deletions conf/defaults.config
@@ -143,10 +143,12 @@ params {
skip_shasta = false // Nanopore longreads only assemblies
shasta_additional_parameters = null // Must be given as shown in shasta manual. E.g. " --Reads.minReadLength 5000 "

skip_hifiasm = false // Longreads only assemblies
hifiasm_additional_parameters = null // Must be given as shown in Hifiasm manual. E.g. " --ul ul.fq.gz "

skip_pilon = false // Skip pilon polisher when performing hybrid assembly strategy 2
skip_polypolish = false // Skip Polypolish polisher when performing hybrid assembly strategy 2


/*
* Resources controlling parameters
*
@@ -174,4 +176,4 @@ params {
max_cpus = 10
max_time = '40.h'

-}
+}
2 changes: 1 addition & 1 deletion conf/docker.config
@@ -6,6 +6,6 @@ docker.runOptions = '-u \$(id -u):\$(id -g)'
fixOwnership = true
process {
withName: '.*' {
-container = "fmalmeida/mpgap@sha256:f640835dad87d98ded0582271aafaebe609f9196618f52a46ac10d991a0fce27"
+container = "fmalmeida/mpgap@sha256:28223374b5500b09ae467064d825b44d086c99f1ade6afa80dbf8fd0053d760e"
}
}
4 changes: 2 additions & 2 deletions conf/singularity.config
@@ -5,6 +5,6 @@ singularity.enabled = true
singularity.autoMounts = true
process {
withName: '.*' {
-container = "docker://fmalmeida/mpgap@sha256:f640835dad87d98ded0582271aafaebe609f9196618f52a46ac10d991a0fce27"
+container = "docker://fmalmeida/mpgap@sha256:28223374b5500b09ae467064d825b44d086c99f1ade6afa80dbf8fd0053d760e"
}
-}
+}
38 changes: 25 additions & 13 deletions conf/test.config
@@ -9,7 +9,7 @@ profiles {
params {
input = "$baseDir/assets/illumina_test.yml"
output = "sreads_test"
-tracedir = "sreads_test/pipeline_info"
+tracedir = "${params.output}/pipeline_info"
max_memory = 6.GB
max_cpus = 2
max_time = '6.h'
@@ -30,16 +30,26 @@ profiles {

ont {
params {
-input = "$baseDir/assets/lreads_test_ont.yml"
-output = "lreads_test_ont"
-tracedir = "lreads_test_ont/pipeline_info"
+input = "$baseDir/assets/lreads_test_ont.yml"
+output = "lreads_test_ont"
+tracedir = "${params.output}/pipeline_info"
+skip_hifiasm = true
}
}
+ont_hifi {
+params {
+input = "$baseDir/assets/lreads_test_ont_hifi.yml"
+output = "lreads_test_ont_hifi"
+tracedir = "${params.output}/pipeline_info"
+high_quality_longreads = true
+}
+}
pacbio {
params {
-input = "$baseDir/assets/lreads_test_pacbio.yml"
-output = "lreads_test_pacbio"
-tracedir = "lreads_test_pacbio/pipeline_info"
+input = "$baseDir/assets/lreads_test_pacbio.yml"
+output = "lreads_test_pacbio"
+tracedir = "${params.output}/pipeline_info"
+skip_hifiasm = true
}
}

@@ -60,16 +70,18 @@ profiles {

ont {
params {
-input = "$baseDir/assets/hybrid_test_ont.yml"
-output = "hybrid_test_ont"
-tracedir = "hybrid_test_ont/pipeline_info"
+input = "$baseDir/assets/hybrid_test_ont.yml"
+output = "hybrid_test_ont"
+tracedir = "${params.output}/pipeline_info"
+skip_hifiasm = true
}
}
pacbio {
params {
-input = "$baseDir/assets/hybrid_test_pacbio.yml"
-output = "hybrid_test_pacbio"
-tracedir = "hybrid_test_pacbio/pipeline_info"
+input = "$baseDir/assets/hybrid_test_pacbio.yml"
+output = "hybrid_test_pacbio"
+tracedir = "${params.output}/pipeline_info"
+skip_hifiasm = true
}
}

2 changes: 1 addition & 1 deletion docs/index.md
@@ -23,7 +23,7 @@ The pipeline wraps up the following tools and analyses:

| Software | Analysis |
| :------- | :------- |
-| [Canu](https://github.com/marbl/canu), [Flye](https://github.com/fenderglass/Flye), [Unicycler](https://github.com/rrwick/Unicycler), [Raven](https://github.com/lbcb-sci/raven), [Shasta](https://github.com/chanzuckerberg/shasta) and [wtdbg2](https://github.com/ruanjue/wtdbg2) | Long reads assembly |
+| [Hifiasm](https://github.com/chhylp123/hifiasm), [Canu](https://github.com/marbl/canu), [Flye](https://github.com/fenderglass/Flye), [Unicycler](https://github.com/rrwick/Unicycler), [Raven](https://github.com/lbcb-sci/raven), [Shasta](https://github.com/chanzuckerberg/shasta) and [wtdbg2](https://github.com/ruanjue/wtdbg2) | Long reads assembly |
| [Haslr](https://github.com/vpc-ccg/haslr), [Unicycler](https://github.com/rrwick/Unicycler) and [SPAdes](https://github.com/ablab/spades) | Hybrid assembly |
| [Shovill](https://github.com/tseemann/shovill), [Unicycler](https://github.com/rrwick/Unicycler), [Megahit](https://github.com/voutcn/megahit) and [SPAdes](https://github.com/ablab/spades) | Short reads assembly |
| [Nanopolish](https://github.com/jts/nanopolish), [Medaka](https://github.com/nanoporetech/medaka), [gcpp](https://github.com/PacificBiosciences/gcpp), [Polypolish](https://github.com/rrwick/Polypolish) and [Pilon](https://github.com/broadinstitute/pilon) | Assembly polishing |
3 changes: 3 additions & 0 deletions docs/manual.md
@@ -42,6 +42,7 @@ The pipeline is capable of assembling Illumina, ONT and Pacbio reads in three ma
+ Raven
+ Shasta
+ wtdbg2
+ hifiasm

3. **Hybrid assemblies (using both short and long reads)**
+ Unicycler
@@ -162,6 +163,8 @@ However, they can also be set in a sample-specific manner. If a sample has a val
| `--shasta_additional_parameters` | :material-close: | False | Passes additional parameters for Shasta assembler. E.g. `" --Assembly.detangleMethod 1 "`. Must be given as shown in Shasta's manual |
| `--skip_wtdbg2` | :material-close: | False | Skip the execution of wtdbg2 |
| `--wtdbg2_additional_parameters` | :material-close: | False | Passes additional parameters for wtdbg2 assembler. E.g. `" -k 250 "`. Must be given as shown in wtdbg2's manual. Remember, the script called for wtdbg2 is `wtdbg2.pl` thus you must give the parameters used by it |
| `--skip_hifiasm` | :material-close: | False | Skip the execution of hifiasm |
| `--hifiasm_additional_parameters` | :material-close: | False | Passes additional parameters for hifiasm assembler. E.g. `" --ul ul.fq.gz "`. Must be given as shown in hifiasm's manual |
| `--skip_unicycler` | :material-close: | False | Skip the execution of Unicycler |
| `--unicycler_additional_parameters` | :material-close: | False | Passes additional parameters for Unicycler assembler. E.g. `" --mode conservative --no_correct "`. Must be given as shown in Unicycler's manual |
| `--skip_spades` | :material-close: | False | Skip the execution of SPAdes |
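
The two hifiasm rows added to this table could be used on the command line roughly as follows. This is a minimal sketch: `samplesheet.yml` and `results` are placeholder names, and the `--ul ul.fq.gz` value is simply the example given in the table. The command is echoed rather than run, so it can be reviewed without Nextflow installed.

```shell
#!/bin/sh
# Example value taken from the parameter table; the surrounding spaces are kept
# because the documentation shows them inside the quotes.
HIFIASM_OPTS=' --ul ul.fq.gz '

# samplesheet.yml and results are hypothetical placeholder names.
set -- nextflow run fmalmeida/mpgap \
    --input samplesheet.yml \
    --output results \
    --hifiasm_additional_parameters "${HIFIASM_OPTS}"

# Record the argument count and print the command for review.
# Replace the echo line with "$@" to actually launch the pipeline.
NARGS=$#
echo "$@"
```

Quoting `"${HIFIASM_OPTS}"` keeps the extra hifiasm flags together as one argument. To leave hifiasm out of a run entirely, `--skip_hifiasm` would be passed instead.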
1 change: 1 addition & 0 deletions docs/non_bacteria.md
@@ -38,6 +38,7 @@ nextflow run fmalmeida/mpgap \
--skip_unicycler \
--flye_additional_parameters ' --keep-haplotypes ' \
--quast_additional_parameters ' --eukaryote ' \
--skip_hifiasm \
--max_cpus 20 \
--max_memory '40.GB'
```
2 changes: 1 addition & 1 deletion environment.yml
@@ -55,7 +55,7 @@ dependencies:
- bioconda::csvtk=0.23.0
- bioconda::wtdbg=2.5
- bioconda::medaka=1.11.1

- bioconda::hifiasm=0.19.8
# for medaka > 1.4
- bioconda::samtools>=1.11
- bioconda::tabix>=1.11
3 changes: 2 additions & 1 deletion markdown/CHANGELOG.md
@@ -10,11 +10,12 @@ The tracking for changes started in v2.
* Increase default `--max_memory` value to 20.GB.
* Add a directory called `final_assemblies` in the main output directory holding all the assemblies generated in the pipeline execution.
* Updated documentation as discussed in [[#58](https://github.com/fmalmeida/MpGAP/issues/58)] and [[#57](https://github.com/fmalmeida/MpGAP/issues/57)].
-* [[#61](https://github.com/fmalmeida/MpGAP/issues/61)] - Add a simple parameter to adjust how many CPUs and how much memory the assembly jobs should request on the first attempt, to avoid lack-of-resources errors.
* [[#50](https://github.com/fmalmeida/MpGAP/issues/50)]
* Parameters `--skip_pilon` and `--skip_polypolish` added to the pipeline
* MultiQC report was fixed and enhanced
* The Docker image was also modified to download standalone BUSCO, and the pipeline now runs BUSCO standalone instead of via Quast.
+* [[#53](https://github.com/fmalmeida/MpGAP/issues/53)] - Include the hifiasm assembler in the pipeline, for long-reads-only assemblies and hybrid strategy 2.
+* [[#61](https://github.com/fmalmeida/MpGAP/issues/61)] - Add a simple parameter to adjust how many CPUs and how much memory the assembly jobs should request on the first attempt, to avoid lack-of-resources errors.
* [[#66](https://github.com/fmalmeida/MpGAP/issues/66)] - Include an automated generation of a samplesheet for bacannot pipeline.

## v3.1.4 -- [2022-Sep-03]
2 changes: 1 addition & 1 deletion markdown/list_of_tools.md
@@ -4,6 +4,6 @@ These are the tools that wrapped inside mpgap. **Cite** the tools whenever you u

|| **Source** |
|:- | :- |
-| **Assemblers** | [Canu](https://github.com/marbl/canu), [Flye](https://github.com/fenderglass/Flye), [Raven](https://github.com/lbcb-sci/raven), [Shasta](https://github.com/chanzuckerberg/shasta), [wtdbg2](https://github.com/ruanjue/wtdbg2), [Haslr](https://github.com/vpc-ccg/haslr), [Unicycler](https://github.com/rrwick/Unicycler), [Spades](https://github.com/ablab/spades), [Shovill](https://github.com/tseemann/shovill) |
+| **Assemblers** | [Hifiasm](https://github.com/chhylp123/hifiasm), [Canu](https://github.com/marbl/canu), [Flye](https://github.com/fenderglass/Flye), [Raven](https://github.com/lbcb-sci/raven), [Shasta](https://github.com/chanzuckerberg/shasta), [wtdbg2](https://github.com/ruanjue/wtdbg2), [Haslr](https://github.com/vpc-ccg/haslr), [Unicycler](https://github.com/rrwick/Unicycler), [Spades](https://github.com/ablab/spades), [Shovill](https://github.com/tseemann/shovill) |
| **Polishers** | [Nanopolish](https://github.com/jts/nanopolish), [Medaka](https://github.com/nanoporetech/medaka), [gcpp](https://github.com/PacificBiosciences/gcpp), [Polypolish](https://github.com/rrwick/Polypolish) and [Pilon](https://github.com/broadinstitute/pilon) |
| **Quality check** | [Quast](https://github.com/ablab/quast), [BUSCO](https://busco.ezlab.org/busco_userguide.html) and [MultiQC](https://multiqc.info/) |
29 changes: 29 additions & 0 deletions modules/local/LongReads/hifiasm.nf

Some generated files are not rendered by default. Learn more about how customized files appear on GitHub.

16 changes: 15 additions & 1 deletion nextflow_schema.json
@@ -243,6 +243,13 @@
"hidden": true,
"fa_icon": "fas fa-ban"
},
"skip_hifiasm": {
"type": "boolean",
"description": "Skip Hifiasm assembler",
"help_text": "Hifiasm is a long reads only assembler. Can be used for hybrid assemblies in strategy 2.",
"hidden": true,
"fa_icon": "fas fa-ban"
},
"skip_pilon": {
"type": "boolean",
"description": "Skip pilon polisher",
@@ -340,6 +347,13 @@
"help_text": "Must be given as shown in shasta manual. E.g. \" --Reads.minReadLength 5000 \", inside quotes and separated by spaces",
"hidden": true,
"fa_icon": "fas fa-quote-left"
},
"hifiasm_additional_parameters": {
"type": "string",
"description": "Hifiasm additional parameters",
"help_text": "Must be given as shown in hifiasm manual. E.g. \" --ul ul.fq.gz \", inside quotes and separated by spaces",
"hidden": true,
"fa_icon": "fas fa-quote-left"
}
},
"fa_icon": "fas fa-list-ul"
@@ -463,4 +477,4 @@
"$ref": "#/definitions/institutional_config_options"
}
]
-}
+}
17 changes: 16 additions & 1 deletion workflows/hybrid.nf

Some generated files are not rendered by default. Learn more about how customized files appear on GitHub.

15 changes: 14 additions & 1 deletion workflows/long-reads-only.nf

Some generated files are not rendered by default. Learn more about how customized files appear on GitHub.
