Upgrade to nextclade v3 & update default dataset tags #375

Merged Apr 4, 2024 (24 commits)
Commits
6e34d2d
added new WDL task for nextclade v3. tested w miniwdl. not added to w…
kapsakcj Mar 5, 2024
1507c6e
added common miniwdl output directories to .gitignore
kapsakcj Mar 5, 2024
734700a
update sars-cov-2 nextclade defaults; removed unnecessary nextclade_d…
kapsakcj Mar 6, 2024
c2703bf
updates to nextclade v3 task
kapsakcj Mar 6, 2024
cb84f4c
update theiacov_fasta to use nextclade v3 task. tested successfully w…
kapsakcj Mar 6, 2024
0dac722
update nextclade defaults for non-sc2 organisms. Have not tested at a…
kapsakcj Mar 6, 2024
f93e691
update to nextclade 3.3.1 and implement --verbosity flag for nextclad…
kapsakcj Mar 6, 2024
4c3c21f
updated WDL task for adding samples to nextclade ref tree. tested fin…
kapsakcj Mar 6, 2024
dc08e61
update Sample_to_ref_tree_PHB workflow: removed old inputs and made a…
kapsakcj Mar 6, 2024
cd2afec
updated theiacov_fasta_batch, ilmn pe, ilmn se, and ont to use nextcl…
kapsakcj Mar 7, 2024
477a216
update theiacov_clearlabs to use nextclade_v3. did not test with mini…
kapsakcj Mar 21, 2024
7e8e9ab
Merge remote-tracking branch 'origin/main' into cjk-nextclade-v3
kapsakcj Mar 22, 2024
6982979
fix import path for organism_parameters subwf in theiacov_clearlabs …
kapsakcj Mar 22, 2024
fbf0b49
shellcheck lied to me. reverting last commit
kapsakcj Mar 22, 2024
303a9b4
update theiacov_fasta CI
kapsakcj Mar 22, 2024
f89b0bd
update theiacov_clearlabs CI
kapsakcj Mar 22, 2024
ef1a6ac
update theiacov_ont CI
kapsakcj Mar 22, 2024
e654a22
re-enable theiacov_illumina_pe and se CI workflows; update them for n…
kapsakcj Mar 22, 2024
728504a
Merge remote-tracking branch 'origin/main' into cjk-nextclade-v3
kapsakcj Mar 28, 2024
6324136
update CI
kapsakcj Mar 28, 2024
2b7470a
nextclade_v3 task: removed unused pcr_primers_csv input; added back i…
kapsakcj Apr 4, 2024
0800fa0
nextclade_addToRefTree task and wf change: remove input-pcr-primers o…
kapsakcj Apr 4, 2024
84d506a
Merge remote-tracking branch 'origin/main' into cjk-nextclade-v3
kapsakcj Apr 4, 2024
7381cb6
corrected input file type for input-ref
kapsakcj Apr 4, 2024
added new WDL task for nextclade v3. tested w miniwdl. not added to workflows yet
kapsakcj committed Mar 5, 2024
commit 6e34d2d63a39b7170895102ace9a6db096477d6d
63 changes: 63 additions & 0 deletions tasks/taxon_id/task_nextclade.wdl
kevinlibuit marked this conversation as resolved.
@@ -60,6 +60,69 @@ task nextclade {
}
}

task nextclade_v3 {
meta {
description: "Nextclade classification of one sample. Leaving optional inputs unspecified will use SARS-CoV-2 defaults."
}
input {
File genome_fasta
File? root_sequence
File? auspice_reference_tree_json
File? qc_config_json
File? gene_annotations_gff
File? pcr_primers_csv
File? virus_properties
String docker = "nextstrain/nextclade:3.3.0" # TODO: copy image to GAR; update default to new docker image hosted on our GAR
String dataset_name
#String dataset_reference
String dataset_tag
Int disk_size = 50
Int memory = 4
Int cpu = 2
}
String basename = basename(genome_fasta, ".fasta")
command <<<
# track version & print to log
nextclade --version | tee NEXTCLADE_VERSION

# --reference no longer used in v3. consolidated into --name and --tag
nextclade dataset get --name="~{dataset_name}" --tag="~{dataset_tag}" -o nextclade_dataset_dir --verbose
set -e

# not necessary to include `--jobs <jobs>` in v3. Nextclade will use all available CPU threads by default. It's fast so I don't think we will need to change unless we see errors
nextclade run \
--input-dataset=nextclade_dataset_dir/ \
~{"--input-root-seq " + root_sequence} \
~{"--input-tree " + auspice_reference_tree_json} \
~{"--input-qc-config " + qc_config_json} \
~{"--input-gene-map " + gene_annotations_gff} \
~{"--input-pcr-primers " + pcr_primers_csv} \
~{"--input-virus-properties " + virus_properties} \
--output-json "~{basename}".nextclade.json \
--output-tsv "~{basename}".nextclade.tsv \
--output-tree "~{basename}".nextclade.auspice.json \
--output-all=. \
"~{genome_fasta}"
>>>
runtime {
docker: "~{docker}"
memory: "~{memory} GB"
cpu: cpu
disks: "local-disk " + disk_size + " SSD"
disk: disk_size + " GB" # TES
dx_instance_type: "mem1_ssd1_v2_x2"
maxRetries: 3
}
output {
String nextclade_version = read_string("NEXTCLADE_VERSION")
File nextclade_json = "~{basename}.nextclade.json"
File auspice_json = "~{basename}.nextclade.auspice.json"
File nextclade_tsv = "~{basename}.nextclade.tsv"
String nextclade_docker = docker
String nextclade_dataset_tag = "~{dataset_tag}"
}
}
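
Note on the command block above: each `~{"--flag " + optional_input}` placeholder evaluates to an empty string when the optional input is unset, so only the flags for supplied files reach `nextclade run`, and `basename(genome_fasta, ".fasta")` strips the suffix only when the file actually ends in `.fasta`. The Python sketch below imitates that behavior outside of WDL, purely as an illustration. The helper names (`wdl_basename`, `optional_flag`, `build_nextclade_run`) and the sample path are hypothetical, and the flag names are copied from the task as written in this commit rather than checked against the Nextclade v3 CLI.

```python
import os
import shlex
from typing import List, Optional


def wdl_basename(path: str, suffix: str = ".fasta") -> str:
    """Imitate WDL basename(path, suffix): drop directories, then the suffix if present."""
    name = os.path.basename(path)
    if suffix and name.endswith(suffix):
        return name[: -len(suffix)]
    return name


def optional_flag(flag: str, value: Optional[str]) -> List[str]:
    """Imitate ~{"--flag " + value}: contribute nothing when the input is unset."""
    return [] if value is None else [flag, value]


def build_nextclade_run(
    genome_fasta: str,
    dataset_dir: str = "nextclade_dataset_dir/",
    auspice_reference_tree_json: Optional[str] = None,
    qc_config_json: Optional[str] = None,
) -> List[str]:
    """Assemble an argument list shaped like the task's `nextclade run` call.

    Only two of the task's optional inputs are modeled here to keep the sketch short.
    """
    base = wdl_basename(genome_fasta)
    cmd = ["nextclade", "run", f"--input-dataset={dataset_dir}"]
    cmd += optional_flag("--input-tree", auspice_reference_tree_json)
    cmd += optional_flag("--input-qc-config", qc_config_json)
    cmd += [
        "--output-json", f"{base}.nextclade.json",
        "--output-tsv", f"{base}.nextclade.tsv",
        "--output-tree", f"{base}.nextclade.auspice.json",
        "--output-all=.",
        genome_fasta,
    ]
    return cmd


if __name__ == "__main__":
    # Hypothetical input path; no optional files supplied, so no optional flags appear.
    print(shlex.join(build_nextclade_run("/data/sample01.fasta")))
```

One consequence worth keeping in mind: an input named `sample01.fa` would keep its extension in the output names (`sample01.fa.nextclade.tsv`), because only the literal `.fasta` suffix is removed.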

task nextclade_output_parser {
meta {
description: "Python and bash codeblocks for parsing the output files from Nextclade."