Commit

Merge branch 'dev' of https://github.com/nf-core/airrflow into dev
ggabernet committed Jul 18, 2023
2 parents 128cf0b + a687dc2 commit 4adcc54
Showing 20 changed files with 85 additions and 60 deletions.
22 changes: 21 additions & 1 deletion CHANGELOG.md
@@ -3,7 +3,27 @@
The format is based on [Keep a Changelog](http://keepachangelog.com/en/1.0.0/)
and this project adheres to [Semantic Versioning](http://semver.org/spec/v2.0.0.html).

-## [3.1] - 2023-06-05 "Protego"
+## [3.2.0dev] -
+
+### `Added`
+
+- [#268](https://github.com/nf-core/airrflow/pull/268) Added parameters for FindThreshold in `modules.config`.
+- [#268](https://github.com/nf-core/airrflow/pull/268) Validate samplesheet also for `assembled` samplesheet.
+- [#259](https://github.com/nf-core/airrflow/pull/259) Update to `EnchantR v0.1.3`.
+
+### `Fixed`
+
+- [#268](https://github.com/nf-core/airrflow/pull/268) Allows for uppercase and lowercase locus in samplesheet `pcr_target_locus`.
+- [#259](https://github.com/nf-core/airrflow/pull/259) Samplesheet only allows data from one species.
+- [#259](https://github.com/nf-core/airrflow/pull/259) Introduced fix for a too long command with hundreds of datasets.
+
+### `Dependencies`
+
+| Dependency | Old version | New version |
+| ---------- | ----------- | ----------- |
+| r-enchantr | 0.1.2 | 0.1.3 |
+
+## [3.1.0] - 2023-06-05 "Protego"

### `Added`

3 changes: 2 additions & 1 deletion assets/repertoire_comparison.Rmd
@@ -149,7 +149,8 @@ col_select <- c(
)
df_all <- dplyr::bind_rows(lapply(all_files, read_rearrangement, col_select=col_select))
-# Remove underscores in these columns (only needed if including clonal abundance and diversity)
+# Remove underscores in these columns
df_all$subject_id <- stringr::str_replace_all(df_all$subject_id, "_", "")
df_all$sample_id <- stringr::str_replace_all(df_all$sample_id , "_", "")
2 changes: 1 addition & 1 deletion bin/check_samplesheet.py
@@ -46,7 +46,7 @@ def check_samplesheet(file_in, assembled):
- contains the compulsory fields: sample_id, filename_R1, filename_R2, subject_id, pcr_target_locus, species, single_cell
- sample ids are unique
- samples from the same subject come from the same species
-- pcr_target_locus is "IG" or "TR"
+- pcr_target_locus is "IG"/"ig" or "TR"/"tr"
- species is "human" or "mouse"
"""

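The relaxed `pcr_target_locus` rule in the docstring above could be enforced along these lines. This is a minimal sketch, not the code in `bin/check_samplesheet.py`: the names `VALID_LOCI` and `normalize_locus` are hypothetical, and stripping whitespace and accepting mixed case such as `Ig` are assumptions that go slightly beyond what the docstring states.

```python
# Hypothetical helper; the names are illustrative and not taken from the pipeline.
VALID_LOCI = {"IG", "TR"}  # the two loci named in the docstring


def normalize_locus(value):
    """Return the locus in upper case, or raise ValueError if it is not IG or TR."""
    # Assumption: surrounding whitespace is ignored and any letter case is accepted.
    locus = value.strip().upper()
    if locus not in VALID_LOCI:
        raise ValueError(
            f"pcr_target_locus must be one of {sorted(VALID_LOCI)}, got {value!r}"
        )
    return locus


print(normalize_locus("ig"))  # prints IG
print(normalize_locus("TR"))  # prints TR
```

Normalizing once at validation time would let downstream steps compare against a single spelling, although this commit also makes the comparison in `changeo_parsedb_select.nf` case-insensitive.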
19 changes: 9 additions & 10 deletions bin/fetch_imgt.sh
@@ -68,15 +68,14 @@ do
echo "|---- Ig"
for CHAIN in IGHV IGHD IGHJ IGKV IGKJ IGLV IGLJ
do
-URL="http://www.imgt.org/IMGT_GENE-DB/GENElect?query=7.14+${CHAIN}&species=${VALUE}"
+URL="https://www.imgt.org/genedb/GENElect?query=7.14+${CHAIN}&species=${VALUE}"
FILE_NAME="${FILE_PATH}/${REPERTOIRE}_${KEY}_${CHAIN}.fasta"
TMP_FILE="${FILE_NAME}.tmp"
#echo $URL
wget $URL -O $TMP_FILE -q
awk '/<pre>/{i++}/<\/pre>/{j++}{if(j==2){exit}}{if(i==2 && j==1 && $0!~"^<pre>"){print}}' $TMP_FILE > $FILE_NAME

# Checking once that file exists and is not empty (checks IMGT server is online)
-read file
if [ -s "$FILE_NAME" ]
then
echo "IMGT Fasta file exists and is not empty"
@@ -93,7 +92,7 @@ do
# V amino acid for Ig
for CHAIN in IGHV IGKV IGLV
do
-URL="http://www.imgt.org/IMGT_GENE-DB/GENElect?query=7.3+${CHAIN}&species=${VALUE}"
+URL="https://www.imgt.org/genedb/GENElect?query=7.3+${CHAIN}&species=${VALUE}"
FILE_NAME="${FILE_PATH_AA}/${REPERTOIRE}_aa_${KEY}_${CHAIN}.fasta"
TMP_FILE="${FILE_NAME}.tmp"
#echo $URL
@@ -108,7 +107,7 @@ do
echo "|---- TCR"
for CHAIN in TRAV TRAJ TRBV TRBD TRBJ TRDV TRDD TRDJ TRGV TRGJ
do
-URL="http://www.imgt.org/IMGT_GENE-DB/GENElect?query=7.14+${CHAIN}&species=${VALUE}"
+URL="https://www.imgt.org/genedb/GENElect?query=7.14+${CHAIN}&species=${VALUE}"
FILE_NAME="${FILE_PATH}/${REPERTOIRE}_${KEY}_${CHAIN}.fasta"
TMP_FILE="${FILE_NAME}.tmp"
#echo $URL
@@ -121,7 +120,7 @@ do
# V amino acid for TCR
for CHAIN in TRAV TRBV TRDV TRGV
do
-URL="http://www.imgt.org/IMGT_GENE-DB/GENElect?query=7.3+${CHAIN}&species=${VALUE}"
+URL="https://www.imgt.org/genedb/GENElect?query=7.3+${CHAIN}&species=${VALUE}"
FILE_NAME="${FILE_PATH_AA}/${REPERTOIRE}_aa_${KEY}_${CHAIN}.fasta"
TMP_FILE="${FILE_NAME}.tmp"
#echo $URL
@@ -140,7 +139,7 @@ do
echo "|---- Ig"
for CHAIN in IGH IGK IGL
do
-URL="http://www.imgt.org/IMGT_GENE-DB/GENElect?query=8.1+${CHAIN}V&species=${VALUE}&IMGTlabel=L-PART1+L-PART2"
+URL="https://www.imgt.org/genedb/GENElect?query=8.1+${CHAIN}V&species=${VALUE}&IMGTlabel=L-PART1+L-PART2"
FILE_NAME="${FILE_PATH}/${REPERTOIRE}_${KEY}_${CHAIN}L.fasta"
TMP_FILE="${FILE_NAME}.tmp"
#echo $URL
@@ -154,7 +153,7 @@ do
echo "|---- TCR"
for CHAIN in TRA TRB TRG TRD
do
-URL="http://www.imgt.org/IMGT_GENE-DB/GENElect?query=8.1+${CHAIN}V&species=${VALUE}&IMGTlabel=L-PART1+L-PART2"
+URL="https://www.imgt.org/genedb/GENElect?query=8.1+${CHAIN}V&species=${VALUE}&IMGTlabel=L-PART1+L-PART2"
FILE_NAME="${FILE_PATH}/${REPERTOIRE}_${KEY}_${CHAIN}L.fasta"
TMP_FILE="${FILE_NAME}.tmp"
#echo $URL
@@ -179,7 +178,7 @@ do
QUERY=7.5
fi

-URL="http://www.imgt.org/IMGT_GENE-DB/GENElect?query=${QUERY}+${CHAIN}&species=${VALUE}"
+URL="https://www.imgt.org/genedb/GENElect?query=${QUERY}+${CHAIN}&species=${VALUE}"
FILE_NAME="${FILE_PATH}/${REPERTOIRE}_${KEY}_${CHAIN}.fasta"
TMP_FILE="${FILE_NAME}.tmp"
#echo $URL
@@ -193,7 +192,7 @@ do
echo "|---- TCR"
for CHAIN in TRAC TRBC TRGC TRDC
do
-URL="http://www.imgt.org/IMGT_GENE-DB/GENElect?query=14.1+${CHAIN}&species=${VALUE}"
+URL="https://www.imgt.org/genedb/GENElect?query=14.1+${CHAIN}&species=${VALUE}"
FILE_NAME="${FILE_PATH}/${REPERTOIRE}_${KEY}_${CHAIN}.fasta"
TMP_FILE="${FILE_NAME}.tmp"
#echo $URL
@@ -209,7 +208,7 @@ done

# Write download info
INFO_FILE=${OUTDIR}/IMGT.yaml
-echo -e "source: http://www.imgt.org/IMGT_GENE-DB" > $INFO_FILE
+echo -e "source: https://www.imgt.org/genedb" > $INFO_FILE
echo -e "date: ${DATE}" >> $INFO_FILE
echo -e "species:" >> $INFO_FILE
for Q in ${SPECIES_QUERY[@]}
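
The `awk` filter in the first Ig loop of `bin/fetch_imgt.sh` keeps only the lines between the second `<pre>` line and the second `</pre>` line of the downloaded page. A rough Python equivalent is sketched below to make that logic easier to follow; the function name `extract_second_pre_block` and the sample page are hypothetical, and the script itself continues to use `awk`.

```python
def extract_second_pre_block(lines):
    """Mimic the awk filter: return the lines inside the second <pre>...</pre> block."""
    opened = 0  # number of lines seen so far that contain "<pre>"
    closed = 0  # number of lines seen so far that contain "</pre>"
    kept = []
    for line in lines:
        if "<pre>" in line:
            opened += 1
        if "</pre>" in line:
            closed += 1
        if closed == 2:
            # Same as awk's `exit`: stop at the second closing tag.
            break
        if opened == 2 and closed == 1 and not line.startswith("<pre>"):
            kept.append(line)
    return kept


# Hypothetical page layout: a first <pre> block with a header, a second with FASTA.
page = ["<html>", "<pre>", "header", "</pre>", "<pre>", ">seq1", "ACGT", "</pre>", "tail"]
print(extract_second_pre_block(page))  # prints ['>seq1', 'ACGT']
```

If the server returns a page without a second `<pre>` block, the result is empty, which is the case the `[ -s "$FILE_NAME" ]` check in the script is there to catch.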
7 changes: 6 additions & 1 deletion conf/modules.config
@@ -420,6 +420,11 @@ process {
mode: params.publish_dir_mode,
saveAs: { filename -> filename.equals('versions.yml') ? null : filename }
]
+ext.args = ['findthreshold_method':'gmm',
+'findthreshold_model':'gamma-norm',
+'findthreshold_edge':0.9,
+'findthreshold_cutoff':'user',
+'findthreshold_spc':0.995]
}

withName: REPORT_THRESHOLD {
@@ -450,7 +455,7 @@ process {
]
ext.args = ['outname':'', 'model':'hierarchical',
'method':'nt', 'linkage':'single',
-'skip_convergence':true,
+'skip_convergence':false,
'outputby':'sample_id', 'min_n':30]
}

10 changes: 5 additions & 5 deletions docs/usage.md
@@ -116,14 +116,14 @@ The required input file for processing raw BCR or TCR bulk targeted sequencing d
- `filename` for bulk assembled data: path to `sequences.fasta` file, containing the assembled and error-corrected reads.


-Assembled bulk and single-cell sequencing data can be processed in the same run and can be provided in the same samplesheet as shown below.
+The required input file for processing raw BCR or TCR bulk targeted sequencing data is a sample sheet in TSV format (tab separated). The columns `sample_id`, `filename`, `subject_id`, `species`, `tissue`, `single_cell`, `pcr_target_locus`, `sex`, `age` and `biomaterial_provider` are required.

An example samplesheet is:

-| filename | species | subject_id | sample_id | tissue | sex | age | biomaterial_provider | pcr_target_locus | single_cell |
-| -------------------------------------------------------- | ------- | ---------- | --------------------------------- | ---------- | ---- | --- | -------------------- | ---------------- | ----------- |
-| sc5p_v2_hs_PBMC_1k_b_airr_rearrangement.tsv | human | subject_x | sc5p_v2_hs_PBMC_1k_5fb | PBMC | NA | NA | 10x Genomics | IG | TRUE |
-| bulk-Laserson-2014.fasta | human | PGP1 | PGP1 | PBMC | male | NA | Laserson-2014 | IG | FALSE |
+| filename | species | subject_id | sample_id | tissue | sex | age | biomaterial_provider | pcr_target_locus | single_cell |
+| ------------------------------------------- | ------- | ---------- | ---------------------- | ------ | ---- | --- | -------------------- | ---------------- | ----------- |
+| sc5p_v2_hs_PBMC_1k_b_airr_rearrangement.tsv | human | subject_x | sc5p_v2_hs_PBMC_1k_5fb | PBMC | NA | NA | 10x Genomics | IG | TRUE |
+| bulk-Laserson-2014.fasta | human | PGP1 | PGP1 | PBMC | male | NA | Laserson-2014 | IG | FALSE |
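
A samplesheet like the one above can be checked for the required columns before launching a run. The sketch below is illustrative only and is not the pipeline's own validation: `REQUIRED_COLUMNS` is copied from the sentence in `docs/usage.md`, while the helper `missing_columns` and the in-memory example are hypothetical.

```python
import csv
import io

# Column names as listed in docs/usage.md for the assembled samplesheet.
REQUIRED_COLUMNS = [
    "sample_id", "filename", "subject_id", "species", "tissue",
    "single_cell", "pcr_target_locus", "sex", "age", "biomaterial_provider",
]


def missing_columns(handle):
    """Return the required columns absent from the header of a tab-separated samplesheet."""
    reader = csv.DictReader(handle, delimiter="\t")
    header = reader.fieldnames or []
    return [col for col in REQUIRED_COLUMNS if col not in header]


# In-memory stand-in for a samplesheet file, using one row from the example table.
example = io.StringIO(
    "filename\tspecies\tsubject_id\tsample_id\ttissue\tsex\tage\t"
    "biomaterial_provider\tpcr_target_locus\tsingle_cell\n"
    "bulk-Laserson-2014.fasta\thuman\tPGP1\tPGP1\tPBMC\tmale\tNA\t"
    "Laserson-2014\tIG\tFALSE\n"
)
print(missing_columns(example))  # prints []
```

For a real file, the same function could be called on `open(path, newline="")`; column order does not matter because the lookup is by header name.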

### Supported AIRR metadata fields

6 changes: 3 additions & 3 deletions modules/local/airrflow_report/airrflow_report.nf
@@ -2,10 +2,10 @@ process AIRRFLOW_REPORT {
tag "${meta.id}"
label 'process_high'

-conda "bioconda::r-enchantr=0.1.2"
+conda "bioconda::r-enchantr=0.1.3"
container "${ workflow.containerEngine == 'singularity' && !task.ext.singularity_pull_docker_container ?
-'https://depot.galaxyproject.org/singularity/r-enchantr:0.1.2--r42hdfd78af_0':
-'biocontainers/r-enchantr:0.1.2--r42hdfd78af_0' }"
+'https://depot.galaxyproject.org/singularity/r-enchantr:0.1.3--r42hdfd78af_0':
+'biocontainers/r-enchantr:0.1.3--r42hdfd78af_0' }"

input:
tuple val(meta), path(tab) // sequence tsv table in AIRR format
4 changes: 2 additions & 2 deletions modules/local/changeo/changeo_parsedb_select.nf
@@ -20,7 +20,7 @@ process CHANGEO_PARSEDB_SELECT {
script:
def args = task.ext.args ?: ''
def args2 = task.ext.args2 ?: ''
-if (meta.locus == 'IG'){
+if (meta.locus.toUpperCase() == 'IG'){
"""
ParseDb.py select -d $tab $args --outname ${meta.id} > ${meta.id}_select_command_log.txt
@@ -30,7 +30,7 @@ process CHANGEO_PARSEDB_SELECT {
changeo: \$( ParseDb.py --version | awk -F' ' '{print \$2}' )
END_VERSIONS
"""
-} else if (meta.locus == 'TR'){
+} else if (meta.locus.toUpperCase() == 'TR'){
"""
ParseDb.py select -d $tab $args2 --outname ${meta.id} > "${meta.id}_command_log.txt"
6 changes: 3 additions & 3 deletions modules/local/enchantr/collapse_duplicates.nf
@@ -4,10 +4,10 @@ process COLLAPSE_DUPLICATES {
label 'process_long_parallelized'
label 'immcantation'

-conda "bioconda::r-enchantr=0.1.2"
+conda "bioconda::r-enchantr=0.1.3"
container "${ workflow.containerEngine == 'singularity' && !task.ext.singularity_pull_docker_container ?
-'https://depot.galaxyproject.org/singularity/r-enchantr:0.1.2--r42hdfd78af_0':
-'biocontainers/r-enchantr:0.1.2--r42hdfd78af_0' }"
+'https://depot.galaxyproject.org/singularity/r-enchantr:0.1.3--r42hdfd78af_0':
+'biocontainers/r-enchantr:0.1.3--r42hdfd78af_0' }"

input:
tuple val(meta), path(tabs) // tuple [val(meta), sequence tsv in AIRR format ]
6 changes: 3 additions & 3 deletions modules/local/enchantr/define_clones.nf
@@ -21,10 +21,10 @@ process DEFINE_CLONES {
label 'process_long_parallelized'
label 'immcantation'

-conda "bioconda::r-enchantr=0.1.2"
+conda "bioconda::r-enchantr=0.1.3"
container "${ workflow.containerEngine == 'singularity' && !task.ext.singularity_pull_docker_container ?
-'https://depot.galaxyproject.org/singularity/r-enchantr:0.1.2--r42hdfd78af_0':
-'biocontainers/r-enchantr:0.1.2--r42hdfd78af_0' }"
+'https://depot.galaxyproject.org/singularity/r-enchantr:0.1.3--r42hdfd78af_0':
+'biocontainers/r-enchantr:0.1.3--r42hdfd78af_0' }"

input:
tuple val(meta), path(tabs) // meta, sequence tsv in AIRR format
6 changes: 3 additions & 3 deletions modules/local/enchantr/detect_contamination.nf
@@ -5,10 +5,10 @@ process DETECT_CONTAMINATION {
label 'immcantation'


-conda "bioconda::r-enchantr=0.1.2"
+conda "bioconda::r-enchantr=0.1.3"
container "${ workflow.containerEngine == 'singularity' && !task.ext.singularity_pull_docker_container ?
-'https://depot.galaxyproject.org/singularity/r-enchantr:0.1.2--r42hdfd78af_0':
-'biocontainers/r-enchantr:0.1.2--r42hdfd78af_0' }"
+'https://depot.galaxyproject.org/singularity/r-enchantr:0.1.3--r42hdfd78af_0':
+'biocontainers/r-enchantr:0.1.3--r42hdfd78af_0' }"

input:
path(tabs)
6 changes: 3 additions & 3 deletions modules/local/enchantr/dowser_lineages.nf
@@ -21,10 +21,10 @@ process DOWSER_LINEAGES {
label 'process_long_parallelized'
label 'immcantation'

-conda "bioconda::r-enchantr=0.1.2"
+conda "bioconda::r-enchantr=0.1.3"
container "${ workflow.containerEngine == 'singularity' && !task.ext.singularity_pull_docker_container ?
-'https://depot.galaxyproject.org/singularity/r-enchantr:0.1.2--r42hdfd78af_0':
-'biocontainers/r-enchantr:0.1.2--r42hdfd78af_0' }"
+'https://depot.galaxyproject.org/singularity/r-enchantr:0.1.3--r42hdfd78af_0':
+'biocontainers/r-enchantr:0.1.3--r42hdfd78af_0' }"

input:
tuple val(meta), path(tabs)
6 changes: 3 additions & 3 deletions modules/local/enchantr/find_threshold.nf
@@ -21,10 +21,10 @@ process FIND_THRESHOLD {
label 'process_long_parallelized'
label 'immcantation'

-conda "bioconda::r-enchantr=0.1.2"
+conda "bioconda::r-enchantr=0.1.3"
container "${ workflow.containerEngine == 'singularity' && !task.ext.singularity_pull_docker_container ?
-'https://depot.galaxyproject.org/singularity/r-enchantr:0.1.2--r42hdfd78af_0':
-'biocontainers/r-enchantr:0.1.2--r42hdfd78af_0' }"
+'https://depot.galaxyproject.org/singularity/r-enchantr:0.1.3--r42hdfd78af_0':
+'biocontainers/r-enchantr:0.1.3--r42hdfd78af_0' }"


input:
6 changes: 3 additions & 3 deletions modules/local/enchantr/remove_chimeric.nf
@@ -5,10 +5,10 @@ process REMOVE_CHIMERIC {
label 'immcantation'


-conda "bioconda::r-enchantr=0.1.2"
+conda "bioconda::r-enchantr=0.1.3"
container "${ workflow.containerEngine == 'singularity' && !task.ext.singularity_pull_docker_container ?
-'https://depot.galaxyproject.org/singularity/r-enchantr:0.1.2--r42hdfd78af_0':
-'biocontainers/r-enchantr:0.1.2--r42hdfd78af_0' }"
+'https://depot.galaxyproject.org/singularity/r-enchantr:0.1.3--r42hdfd78af_0':
+'biocontainers/r-enchantr:0.1.3--r42hdfd78af_0' }"


input:
6 changes: 3 additions & 3 deletions modules/local/enchantr/report_file_size.nf
@@ -6,10 +6,10 @@ process REPORT_FILE_SIZE {
label 'immcantation'
label 'process_single'

-conda "bioconda::r-enchantr=0.1.2"
+conda "bioconda::r-enchantr=0.1.3"
container "${ workflow.containerEngine == 'singularity' && !task.ext.singularity_pull_docker_container ?
-'https://depot.galaxyproject.org/singularity/r-enchantr:0.1.2--r42hdfd78af_0':
-'biocontainers/r-enchantr:0.1.2--r42hdfd78af_0' }"
+'https://depot.galaxyproject.org/singularity/r-enchantr:0.1.3--r42hdfd78af_0':
+'biocontainers/r-enchantr:0.1.3--r42hdfd78af_0' }"

input:
path logs
6 changes: 3 additions & 3 deletions modules/local/enchantr/single_cell_qc.nf
@@ -20,10 +20,10 @@ process SINGLE_CELL_QC {
label 'immcantation'
label 'process_medium'

-conda "bioconda::r-enchantr=0.1.2"
+conda "bioconda::r-enchantr=0.1.3"
container "${ workflow.containerEngine == 'singularity' && !task.ext.singularity_pull_docker_container ?
-'https://depot.galaxyproject.org/singularity/r-enchantr:0.1.2--r42hdfd78af_0':
-'biocontainers/r-enchantr:0.1.2--r42hdfd78af_0' }"
+'https://depot.galaxyproject.org/singularity/r-enchantr:0.1.3--r42hdfd78af_0':
+'biocontainers/r-enchantr:0.1.3--r42hdfd78af_0' }"

input:
path(tabs)
6 changes: 3 additions & 3 deletions modules/local/enchantr/validate_input.nf
@@ -6,10 +6,10 @@ process VALIDATE_INPUT {
label 'immcantation'
label 'process_single'

-conda "bioconda::r-enchantr=0.1.2"
+conda "bioconda::r-enchantr=0.1.3"
container "${ workflow.containerEngine == 'singularity' && !task.ext.singularity_pull_docker_container ?
-'https://depot.galaxyproject.org/singularity/r-enchantr:0.1.2--r42hdfd78af_0':
-'biocontainers/r-enchantr:0.1.2--r42hdfd78af_0' }"
+'https://depot.galaxyproject.org/singularity/r-enchantr:0.1.3--r42hdfd78af_0':
+'biocontainers/r-enchantr:0.1.3--r42hdfd78af_0' }"

input:
file samplesheet
6 changes: 3 additions & 3 deletions modules/local/reveal/add_meta_to_tab.nf
@@ -3,10 +3,10 @@ process ADD_META_TO_TAB {
label 'immcantation'
label 'process_single'

-conda "bioconda::r-enchantr=0.1.2"
+conda "bioconda::r-enchantr=0.1.3"
container "${ workflow.containerEngine == 'singularity' && !task.ext.singularity_pull_docker_container ?
-'https://depot.galaxyproject.org/singularity/r-enchantr:0.1.2--r42hdfd78af_0':
-'biocontainers/r-enchantr:0.1.2--r42hdfd78af_0' }"
+'https://depot.galaxyproject.org/singularity/r-enchantr:0.1.3--r42hdfd78af_0':
+'biocontainers/r-enchantr:0.1.3--r42hdfd78af_0' }"

cache 'deep' // Without 'deep' this process would run when using -resume

6 changes: 3 additions & 3 deletions modules/local/reveal/filter_junction_mod3.nf
@@ -3,10 +3,10 @@ process FILTER_JUNCTION_MOD3 {
label 'immcantation'
label 'process_single'

-conda "bioconda::r-enchantr=0.1.2"
+conda "bioconda::r-enchantr=0.1.3"
container "${ workflow.containerEngine == 'singularity' && !task.ext.singularity_pull_docker_container ?
-'https://depot.galaxyproject.org/singularity/r-enchantr:0.1.2--r42hdfd78af_0':
-'biocontainers/r-enchantr:0.1.2--r42hdfd78af_0' }"
+'https://depot.galaxyproject.org/singularity/r-enchantr:0.1.3--r42hdfd78af_0':
+'biocontainers/r-enchantr:0.1.3--r42hdfd78af_0' }"

input:
tuple val(meta), path(tab) // sequence tsv in AIRR format
6 changes: 3 additions & 3 deletions modules/local/reveal/filter_quality.nf
@@ -3,10 +3,10 @@ process FILTER_QUALITY {
label 'immcantation'
label 'process_single'

-conda "bioconda::r-enchantr=0.1.2"
+conda "bioconda::r-enchantr=0.1.3"
container "${ workflow.containerEngine == 'singularity' && !task.ext.singularity_pull_docker_container ?
-'https://depot.galaxyproject.org/singularity/r-enchantr:0.1.2--r42hdfd78af_0':
-'biocontainers/r-enchantr:0.1.2--r42hdfd78af_0' }"
+'https://depot.galaxyproject.org/singularity/r-enchantr:0.1.3--r42hdfd78af_0':
+'biocontainers/r-enchantr:0.1.3--r42hdfd78af_0' }"

input:
tuple val(meta), path(tab) // sequence tsv in AIRR format