diff --git a/docs/usage.md b/docs/usage.md
index 69b7696b..c854811c 100644
--- a/docs/usage.md
+++ b/docs/usage.md
@@ -42,13 +42,13 @@ nextflow run nf-core/airrflow \
 A typical command to run the pipeline from **single cell raw fastq files** is:
 
 ```bash
-nextflow run nf-core/airrflow -r dev \
+nextflow run nf-core/airrflow \
 -profile \
 --mode fastq \
 --input input_samplesheet.tsv \
 --library_generation_method sc_10x_genomics \
 --reference_10x reference/refdata-cellranger-vdj-GRCh38-alts-ensembl-5.0.0.tar.gz \
---outdir ./results
+--outdir results
 ```
 
 A typical command for running the pipeline departing from **single-cell AIRR rearrangement tables or assembled bulk sequencing fasta** data is:
@@ -123,7 +123,7 @@ If you wish to share such profile (such as upload as supplementary material for
 
 ## Input samplesheet
 
-### Fastq input samplesheet (bulk sequencing)
+### Fastq input samplesheet (bulk AIRR and bulk/sc RNA sequencing)
 
 The required input file for processing raw BCR or TCR bulk targeted sequencing data is a sample sheet in TSV format (tab separated). The columns `sample_id`, `filename_R1`, `filename_R2`, `subject_id`, `species`, `tissue`, `pcr_target_locus`, `single_cell`, `sex`, `age` and `biomaterial_provider` are required. An example samplesheet is:
 
@@ -511,6 +511,50 @@ nextflow run nf-core/airrflow -r dev \
 
 - The 10xGenomics reference can be downloaded from the [download page](https://www.10xgenomics.com/support/software/cell-ranger/downloads)
 - To generate a V(D)J segment fasta file as reference from IMGT one can follow the [cellranger docs](https://support.10xgenomics.com/single-cell-vdj/software/pipelines/latest/advanced/references#imgt).
+
+## Supported unselected RNA-seq based methods
+
+nf-core/airrflow supports unselected bulk or single-cell RNA-seq fastq files as input. [TRUST4](https://github.com/liulab-dfci/TRUST4) is used to extract TCR/BCR sequences from these files. The resulting AIRR tables are then fed into airrflow's Immcantation based workflow.
+To use unselected RNA-seq based input, specify `--library_generation_method trust4`.
+
+### Bulk RNA-seq
+
+A typical command to run the pipeline from **bulk RNA-seq fastq files** is:
+
+```bash
+nextflow run nf-core/airrflow \
+-profile \
+--mode fastq \
+--input input_samplesheet.tsv \
+--library_generation_method trust4 \
+--coord_fasta reference/IMGT+C.fa \
+--outdir results
+```
+
+### Single-cell RNA-seq
+
+A typical command to run the pipeline from **single-cell RNA-seq fastq files** is:
+
+```bash
+nextflow run nf-core/airrflow \
+-profile \
+--mode fastq \
+--input input_samplesheet.tsv \
+--library_generation_method trust4 \
+--umi_position R1 \
+--read_format bc:0:15,um:16:27 \
+--coord_fasta reference/IMGT+C.fa \
+--outdir results
+```
+
+* If UMIs are present, the read containing them must be specified using the `--umi_position` parameter.
+* The `--read_format` parameter can be used to specify the barcode and UMI positions within the reads (see the TRUST4 [docs](https://github.com/liulab-dfci/TRUST4?tab=readme-ov-file#10x-genomics-data-and-barcode-based-single-cell-data)).
+
+#### Reference file
+
+TRUST4 requires a reference. This can be provided using the `--coord_fasta` parameter.
+The reference fasta can be created from IMGT sequences by following the [TRUST4 docs](https://github.com/liulab-dfci/TRUST4?tab=readme-ov-file#build-custom-vjc-gene-database-files-for--f-and---ref).
+
 ## Core Nextflow arguments
 
 :::note
diff --git a/nextflow_schema.json b/nextflow_schema.json
index 776dfdf0..072a532e 100644
--- a/nextflow_schema.json
+++ b/nextflow_schema.json
@@ -487,9 +487,9 @@
             "fa_icon": "fab fa-pagelines"
         },
         "rnaseq_based_analysis_options": {
-            "title": "RNA seq based analysis options",
+            "title": "Unselected RNA-seq based analysis options",
             "type": "object",
-            "description": "Options specific for raw RNA seq input.",
+            "description": "Options specific for raw unselected RNA-seq input.",
             "default": "",
             "properties": {
                 "barcode_read": {