Draft: Add feature for sample demultiplexing followed by immune profiling #365
base: dev
…ng. additionally changing input channels
…by immune profiling
… cellranger multi or cellranger multi+vdj
…r to bamtofastq process
I have made several changes to this script, but idk if I should make a PR for the module itself?
The changes you made seem pretty specific to your use-case, so it doesn't make sense to update the central module.
It is also not allowed to change modules in a pipeline (-> linting error).
The preferred way would be to make it somehow work with the nf-core module unchanged. If this is not possible, you can make a copy of the module in the "local" folder and adapt it as needed.
@grst now the changes are minimal to this module. The rest I could fix by manipulating the output channel. Should I make a PR for bamtofastq or should I still just move this to "local" dir?
> Should I make a PR for bamtofastq

Yes please, the changes look like everyone will benefit from them.
bamtofastq \\
$args \\
$bam \\
$prefix
Changed the output path from `${prefix}.fastq.gz`.
bamtofastq generates a directory containing two folders: one for GEX and one for CMO `.fastq` files.
The two folders are prefixed with the `.bam` prefix.
All files are automatically prefixed with `bamtofastq`.
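The directory layout described in this comment (one output directory with a subfolder per library, each holding FASTQ files) could be collected with something like the Python sketch below. It is only an illustration of the grouping logic, not code from this PR: the function name `group_fastqs_by_folder` is hypothetical, and the sketch deliberately does not guess which subfolder is GEX and which is CMO.

```python
from pathlib import Path


def group_fastqs_by_folder(outdir):
    """Map each subfolder of an output directory to the FASTQ files it contains.

    Assumption: FASTQ files sit directly inside the subfolders and end in
    ".fastq.gz". Subfolders without such files are skipped.
    """
    groups = {}
    for folder in sorted(Path(outdir).iterdir()):
        if not folder.is_dir():
            continue
        fastqs = sorted(p.name for p in folder.glob("*.fastq.gz"))
        if fastqs:
            groups[folder.name] = fastqs
    return groups
```

The caller would still need to decide which returned folder corresponds to which library type, for example from the samplesheet.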
include { CELLRANGER_MKVDJREF } from "../../modules/nf-core/cellranger/mkvdjref/main.nf"

// Define workflow to subset and index a genome region fasta file
workflow CELLRANGER_MULTI_REF {
Please make sure the workflow name and filename match.
Do you accept the new name of the file? `cellrangermulti_ref.nf`
You can check how Nextflow stages the files by investigating the process work directory. I haven't worked much with cellranger multi, but cellranger is pretty strict about the filenames. They need to follow the …
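One way to do that inspection is to list the symlinks inside a task's work directory together with their targets. The Python sketch below is a hypothetical helper (`list_staged_inputs` is not part of Nextflow or this pipeline) and assumes inputs are staged as symlinks, which is Nextflow's default staging mode unless configured otherwise.

```python
import os


def list_staged_inputs(work_dir):
    """Return (staged_name, link_target) pairs for symlinks directly in work_dir.

    Regular files such as .command.sh are ignored; only symlinked inputs
    are reported, sorted by their staged name.
    """
    staged = []
    for entry in sorted(os.scandir(work_dir), key=lambda e: e.name):
        if entry.is_symlink():
            staged.append((entry.name, os.readlink(entry.path)))
    return staged
```

Comparing the staged names with the original file names shows whether a file was renamed on the way into the process.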
I realized the workflow renamed the files according to the GEX sample name, so I had to ensure that the IDs of the VDJ and AB channels were consistent with the demultiplexed GEX IDs.
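The ID-matching requirement can be illustrated with a small Python sketch of a keyed join. This is not the channel code from the PR; the record layout (dicts with `id` and `fastqs` keys) and the function name `join_by_sample_id` are assumptions made for the example.

```python
def join_by_sample_id(gex, other):
    """Pair GEX records with records of another library type sharing the same id.

    Returns (matched, unmatched): matched is a list of
    (id, gex_fastqs, other_fastqs) tuples, unmatched lists the GEX ids
    that have no partner record.
    """
    other_by_id = {rec["id"]: rec for rec in other}
    matched, unmatched = [], []
    for rec in gex:
        partner = other_by_id.get(rec["id"])
        if partner is None:
            unmatched.append(rec["id"])
        else:
            matched.append((rec["id"], rec["fastqs"], partner["fastqs"]))
    return matched, unmatched
```

A non-empty `unmatched` list would be the symptom of the inconsistency described above: a demultiplexed GEX sample whose ID does not appear among the VDJ or AB records.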
I have not tested the pipeline with frna data. Further, I had to exclude the sample containing probe barcodes from my test set, ie:

When I tried including another dataset which had been hashed with the same CMO as one of the others, I ran into this error:

When running the pipeline separately on the dataset that failed, I had no issues. I have not spent more time trying to work around it, because I don't expect it to be an issue in our use case and it is probably a rare event, but I thought you'd like to know.

samplesheet.csv
Description of changes
The current nf-core/scrnaseq version (v1.7.0) does not handle the use case where data is both sample multiplexed and requires immune profiling. The current version can however handle either situation separately.
The cellranger software provided by 10x does not handle this situation either, but this tutorial guides the user to follow 3 steps:
1. …
2. … `.bam` files to `.fq` files
3. … `.fq` files

This PR serves to enable that for the workflow. This has required an additional tool (`nf-core/bamtofastq10x`), some new code, and a bit of rearrangement of existing code.

Rearrangement of code
Since cellranger multi is to be run multiple times during a Nextflow run, I moved the code for preparing the reference genome out of `subworkflows/local/align_cellrangermulti.nf` into a new subworkflow, `cellrangermulti_ref.nf`.

Added tool
I added the tool `nf-core/bamtofastq10x`.

New code
I added the subworkflow `align_cellrangermulti_vdj.nf`, based on `align_cellrangermulti.nf`, which contains the above described steps 1, 2, and 3. For steps 1 and 3 the nf-core cellranger multi module is invoked as in `align_cellrangermulti.nf`.

The main changes lie in channel operations in `scrnaseq.nf` and `align_cellrangermulti_vdj.nf`.

PR checklist
- … (`nf-core lint`).
- … (`nextflow run . -profile test,docker --outdir <OUTDIR>`).
- … (`nextflow run . -profile debug,test,docker --outdir <OUTDIR>`).
- … `docs/usage.md` is updated.
- … `docs/output.md` is updated.
- `CHANGELOG.md` is updated.
- `README.md` is updated (including new tool citations and authors/contributors).