Draft: Add feature for sample demultiplexing followed by immune profiling #365

Draft
wants to merge 22 commits into
base: dev
Changes from 21 commits
5812f72
cp cellrangermulti to cellrangermulti_vdj in SCRNASEQ workflow
Jul 23, 2024
a7a5cd3
cp align_cellrangermulti to align_cellrangermulti_vdj in local subwor…
Jul 23, 2024
06e9932
separate processes that generate reference files from cellranger multi
Jul 30, 2024
9bafa8d
installed nf-core/module bamtofastq10x
Aug 13, 2024
2e32c2e
add null/ to the gitignore list
Aug 15, 2024
6072a4c
add description of demultiplexing combined with immuneprofiling
Aug 15, 2024
27a8c16
add bamtofastq10x module with amendments
Aug 15, 2024
a1d0170
move reference creation outside the cellranger multi to avoid rerunni…
Aug 15, 2024
170ec07
add subworkflow specific for handling sample demultiplexing followed …
Aug 15, 2024
3b71ff8
implement cellranger multi ref and vdj. branch channels to either run…
Aug 15, 2024
0132d9b
update publishDir for the two cellranger multi outputs. add publishDi…
Aug 15, 2024
1434e45
add func to expand feature channels to match demultiplexed gex. modif…
Aug 21, 2024
18b9eae
remove renaming of files
Aug 21, 2024
ecf453f
removed arg for BAMTOFASTQ and updated CELLRANGER_MULTI with regex
Sep 3, 2024
8359254
changed faux channels to value channels to be consumed infinitely and…
Sep 3, 2024
c4356b0
remove unused code
Sep 3, 2024
ced5ddc
renamed file
Sep 3, 2024
8c6e1c0
update output path for fastq
Sep 4, 2024
e22f8fb
update output dir for emptydrops analysis
Sep 4, 2024
eff6158
update filename for generating reference files for cellranger multi
Sep 4, 2024
998f967
remove frna option for immune-profiling
Sep 9, 2024
cef8759
updated bamtofastq10x module
Oct 4, 2024
1 change: 1 addition & 0 deletions .gitignore
@@ -11,3 +11,4 @@ reports/
testme.sh
.nf-test*
.vscode
null/
9 changes: 8 additions & 1 deletion conf/modules.config
@@ -38,6 +38,7 @@ process {
mode: params.publish_dir_mode,
saveAs: { filename ->
if ( params.aligner == 'cellranger' ) "count/${meta.id}/${filename}"
else if ( params.aligner == 'cellrangermulti' ) "emptydrops/${meta.id}/${filename}"
else if ( params.aligner == 'kallisto' ) "${meta.id}.count/${filename}"
else "${meta.id}/${filename}"
}
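
The `saveAs` closure in this hunk picks a publish sub-path from `params.aligner`. A minimal Python sketch of the same branching (not Nextflow code) may make the effect of the new `cellrangermulti` branch easier to see. The function name `publish_subpath`, the sample id, and the file name are hypothetical and used only for illustration.

```python
def publish_subpath(aligner: str, sample_id: str, filename: str) -> str:
    """Mirror the branches of the saveAs closure shown in the hunk above."""
    if aligner == "cellranger":
        return f"count/{sample_id}/{filename}"
    if aligner == "cellrangermulti":
        # Branch added in this PR: results go under an emptydrops/ folder.
        return f"emptydrops/{sample_id}/{filename}"
    if aligner == "kallisto":
        return f"{sample_id}.count/{filename}"
    # Any other aligner falls through to a per-sample folder.
    return f"{sample_id}/{filename}"


if __name__ == "__main__":
    # "sampleA" and "matrix.h5" are made-up values for the demonstration.
    for aligner in ("cellranger", "cellrangermulti", "kallisto", "other"):
        print(aligner, "->", publish_subpath(aligner, "sampleA", "matrix.h5"))
```

Before this change, a `cellrangermulti` run would have fallen through to the final `else` branch and been published directly under the sample id.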
@@ -218,7 +219,7 @@ if (params.aligner == 'kallisto') {
if (params.aligner == 'cellrangermulti') {
process {
withName: FASTQC { ext.prefix = { "${meta.id}_${meta.feature_type}" } } // allow distinguishment of data types after renaming
withName: 'NFCORE_SCRNASEQ:SCRNASEQ:CELLRANGER_MULTI_ALIGN:CELLRANGER_MULTI' {
withName: 'NFCORE_SCRNASEQ:SCRNASEQ:CELLRANGER_MULTI_ALIGN(_VDJ)?:CELLRANGER_MULTI.*' {
ext.prefix = null // force it null, for some reason it was being wrongly read in the module
publishDir = [
path: "${params.outdir}/${params.aligner}/count",
@@ -250,5 +251,11 @@ if (params.aligner == 'cellrangermulti') {
mode: params.publish_dir_mode
]
}
withName: BAMTOFASTQ10X {
publishDir = [
path: "${params.outdir}/${params.aligner}/bam2fastq",
mode: params.publish_dir_mode
]
}
}
}
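
The `withName` selector in this file changes from a literal process name to a regular expression. The Python sketch below checks which fully qualified process names each selector would pick up, assuming the selector has to match the whole name. Only the first candidate name appears verbatim in the diff; the other three are hypothetical names chosen to exercise the pattern, and `selects` is an illustrative helper, not part of Nextflow.

```python
import re

OLD_SELECTOR = r"NFCORE_SCRNASEQ:SCRNASEQ:CELLRANGER_MULTI_ALIGN:CELLRANGER_MULTI"
NEW_SELECTOR = r"NFCORE_SCRNASEQ:SCRNASEQ:CELLRANGER_MULTI_ALIGN(_VDJ)?:CELLRANGER_MULTI.*"

# Only the first name is taken from the diff; the rest are hypothetical.
CANDIDATES = [
    "NFCORE_SCRNASEQ:SCRNASEQ:CELLRANGER_MULTI_ALIGN:CELLRANGER_MULTI",
    "NFCORE_SCRNASEQ:SCRNASEQ:CELLRANGER_MULTI_ALIGN_VDJ:CELLRANGER_MULTI",
    "NFCORE_SCRNASEQ:SCRNASEQ:CELLRANGER_MULTI_ALIGN_VDJ:CELLRANGER_MULTI_VDJ",
    "NFCORE_SCRNASEQ:SCRNASEQ:CELLRANGER_MULTI_ALIGN:BAMTOFASTQ10X",
]


def selects(selector: str, process_name: str) -> bool:
    """Return True if the selector matches the entire process name (assumed semantics)."""
    return re.fullmatch(selector, process_name) is not None


if __name__ == "__main__":
    for name in CANDIDATES:
        print(f"old={selects(OLD_SELECTOR, name)!s:5} "
              f"new={selects(NEW_SELECTOR, name)!s:5} {name}")
```

Under that whole-name assumption, the old selector picks only the first name, while the new one also covers the `_VDJ` subworkflow and any process whose name starts with `CELLRANGER_MULTI`, and still leaves `BAMTOFASTQ10X` to its own `withName` block.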
2 changes: 2 additions & 0 deletions docs/output.md
@@ -122,6 +122,8 @@ for the corresponding documentation.

- Overall same output structure as cellranger. In case of multiplexed samples there will be one ouput folder for
each demultiplexed sample, and one containing all (non-demultiplexed) cells.
- In case sample demultiplexing is to be followed by immune profiling, an extra output is added containing .fastq files
converted from the standard .bam file output.

## UniverSC

5 changes: 5 additions & 0 deletions modules.json
@@ -5,6 +5,11 @@
"https://github.com/nf-core/modules.git": {
"modules": {
"nf-core": {
"bamtofastq10x": {
"branch": "master",
"git_sha": "63d6994f4f85c0628b7f2ac1e7097136c1b4be34",
"installed_by": ["modules"]
},
"cellranger/count": {
"branch": "master",
"git_sha": "90dad5491658049282ceb287a3d7732c1ce39837",
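
The new `modules.json` entry pins `bamtofastq10x` to a specific commit of nf-core/modules. As a rough illustration, the Python sketch below reads a fragment shaped like the lines in this hunk and lists the pinned commit per module. The fragment is wrapped in its own braces so it parses on its own; the real file has outer keys that are not shown in the diff, and `pinned_shas` is a hypothetical helper.

```python
import json

# Standalone fragment built from the lines added in this hunk.
# The enclosing keys of the real modules.json are not shown in the diff.
FRAGMENT = """
{
  "https://github.com/nf-core/modules.git": {
    "modules": {
      "nf-core": {
        "bamtofastq10x": {
          "branch": "master",
          "git_sha": "63d6994f4f85c0628b7f2ac1e7097136c1b4be34",
          "installed_by": ["modules"]
        }
      }
    }
  }
}
"""


def pinned_shas(fragment_text: str) -> dict:
    """Map 'org/module' to the git_sha recorded for it in the fragment."""
    repos = json.loads(fragment_text)
    pins = {}
    for repo in repos.values():
        for org, modules in repo.get("modules", {}).items():
            for name, entry in modules.items():
                pins[f"{org}/{name}"] = entry["git_sha"]
    return pins


if __name__ == "__main__":
    for module, sha in pinned_shas(FRAGMENT).items():
        print(module, sha)
```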
9 changes: 9 additions & 0 deletions modules/nf-core/bamtofastq10x/environment.yml

Some generated files are not rendered by default. Learn more about how customized files appear on GitHub.

46 changes: 46 additions & 0 deletions modules/nf-core/bamtofastq10x/main.nf
Author

I have made several changes to this script, but I'm not sure whether I should open a PR for the module itself.

Member

The changes you made seem pretty specific to your use case, so it doesn't make sense to update the central module.
It is also not allowed to change nf-core modules inside a pipeline (this causes a linting error).

The preferred way would be to make it somehow work with the nf-core module unchanged. If this is not possible, you can make a copy of the module in the "local" folder and adapt it as needed.

Author

@grst the changes to this module are now minimal. The rest I could fix by manipulating the output channel. Should I make a PR for bamtofastq, or should I still just move this to the "local" dir?

Member

> Should I make a PR for bamtofastq

Yes please, the changes look like everyone will benefit from them.

Some generated files are not rendered by default. Learn more about how customized files appear on GitHub.

48 changes: 48 additions & 0 deletions modules/nf-core/bamtofastq10x/meta.yml

Some generated files are not rendered by default. Learn more about how customized files appear on GitHub.

55 changes: 55 additions & 0 deletions modules/nf-core/bamtofastq10x/tests/main.nf.test

Some generated files are not rendered by default. Learn more about how customized files appear on GitHub.

47 changes: 47 additions & 0 deletions modules/nf-core/bamtofastq10x/tests/main.nf.test.snap

Some generated files are not rendered by default. Learn more about how customized files appear on GitHub.

2 changes: 2 additions & 0 deletions modules/nf-core/bamtofastq10x/tests/tags.yml

Some generated files are not rendered by default. Learn more about how customized files appear on GitHub.

61 changes: 2 additions & 59 deletions subworkflows/local/align_cellrangermulti.nf
@@ -1,20 +1,15 @@
//
// Include modules
//
include { CELLRANGER_MKGTF } from "../../modules/nf-core/cellranger/mkgtf/main.nf"
include { CELLRANGER_MKREF } from "../../modules/nf-core/cellranger/mkref/main.nf"
include { CELLRANGER_MKVDJREF } from "../../modules/nf-core/cellranger/mkvdjref/main.nf"
include { CELLRANGER_MULTI } from "../../modules/nf-core/cellranger/multi/main.nf"
include { PARSE_CELLRANGERMULTI_SAMPLESHEET } from "../../modules/local/parse_cellrangermulti_samplesheet.nf"

// Define workflow to subset and index a genome region fasta file
workflow CELLRANGER_MULTI_ALIGN {
take:
ch_fasta
ch_gtf
ch_fastq
cellranger_gex_index
cellranger_vdj_index
ch_cellranger_gex_index
ch_cellranger_vdj_index
ch_multi_samplesheet

main:
@@ -109,58 +104,6 @@ workflow CELLRANGER_MULTI_ALIGN {
ch_frna_sample_csv = []
}

//
// Prepare GTF
//
if ( !cellranger_gex_index || (!cellranger_vdj_index && !params.skip_cellrangermulti_vdjref) ) {

// Filter GTF based on gene biotypes passed in params.modules
CELLRANGER_MKGTF ( ch_gtf )
ch_versions = ch_versions.mix(CELLRANGER_MKGTF.out.versions)

}

//
// Prepare gex reference (Normal Ref)
//
if ( !cellranger_gex_index ) {

// Make reference genome
CELLRANGER_MKREF(
ch_fasta,
CELLRANGER_MKGTF.out.gtf,
"gex_reference"
)
ch_versions = ch_versions.mix(CELLRANGER_MKREF.out.versions)
ch_cellranger_gex_index = CELLRANGER_MKREF.out.reference.ifEmpty { [] }

} else {
ch_cellranger_gex_index = cellranger_gex_index
}

//
// Prepare vdj reference (Special)
//
if ( !cellranger_vdj_index ) {

if ( !params.skip_cellrangermulti_vdjref ) { // if user uses cellranger multi but does not have VDJ data
// Make reference genome
CELLRANGER_MKVDJREF(
ch_fasta,
CELLRANGER_MKGTF.out.gtf,
[], // currently ignoring the 'seqs' option
"vdj_reference"
)
ch_versions = ch_versions.mix(CELLRANGER_MKVDJREF.out.versions)
ch_cellranger_vdj_index = CELLRANGER_MKVDJREF.out.reference.ifEmpty { [] }
} else {
ch_cellranger_vdj_index = []
}

} else {
ch_cellranger_vdj_index = cellranger_vdj_index
}

//
// MODULE: cellranger multi
//
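
The block deleted from this subworkflow decided which reference-building processes had to run before `CELLRANGER_MULTI`; after this change the subworkflow receives `ch_cellranger_gex_index` and `ch_cellranger_vdj_index` as inputs instead. The Python sketch below restates only the conditionals from the deleted lines, to make clear what the caller now has to decide. `ReferencePlan` and `plan_references` are hypothetical names, the paths in the demo are made up, and this is not the code used elsewhere in the PR.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class ReferencePlan:
    steps: List[str]            # reference processes that would need to run
    gex_index: str              # supplied path, or the name of the built reference
    vdj_index: Optional[str]    # None when VDJ reference building is skipped


def plan_references(gex_index: Optional[str],
                    vdj_index: Optional[str],
                    skip_vdjref: bool) -> ReferencePlan:
    """Restate the if/else logic of the deleted 'Prepare ...' sections."""
    steps: List[str] = []

    # The GTF is filtered whenever at least one reference has to be built.
    if not gex_index or (not vdj_index and not skip_vdjref):
        steps.append("CELLRANGER_MKGTF")

    if not gex_index:
        steps.append("CELLRANGER_MKREF")
        gex = "gex_reference"
    else:
        gex = gex_index

    if not vdj_index:
        if not skip_vdjref:
            steps.append("CELLRANGER_MKVDJREF")
            vdj = "vdj_reference"
        else:
            vdj = None  # user runs cellranger multi without VDJ data
    else:
        vdj = vdj_index

    return ReferencePlan(steps, gex, vdj)


if __name__ == "__main__":
    print(plan_references(None, None, skip_vdjref=False))
    print(plan_references("/refs/gex", None, skip_vdjref=True))
```

Keeping this decision outside the subworkflow means the same pair of index channels can be handed to more than one `cellranger multi` invocation.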