Fastq files missing from output / report, 10x single cell #135

Closed
rifius opened this issue Aug 1, 2023 · 3 comments
Labels
bug Something isn't working

rifius commented Aug 1, 2023

Description of the bug

I ran the bclconvert demultiplexer on a 10x single-cell run, and some FASTQ files are not copied or linked to the output folder, their md5sums are not computed, and the falco and fastp processes are not launched for them. As a consequence, they are also missing from the QC reports.

Setup:

  • Single Flow Cell run, using all four lanes.
  • 31 samples, each with four i7 indices from the 10x Chromium Single Cell v3.1 Single Index Kit T Set A, that is 124 demultiplex indices in total.
  • A fragment of the BCL Convert sample sheet (bclcvt-sampleindex.csv):
[Header]
FileFormatVersion,2

[BCLConvert_Settings]
CreateFastqForIndexReads,0

[BCLConvert_Data]
Sample_ID,index
CAM17_RoGr,ACGTCCCT
CAM17_RoGr,CGCATGTG
CAM17_RoGr,GAAGGAAC
CAM17_RoGr,TTTCATGA
CAM22_JoMc,AACCGTAA
CAM22_JoMc,CTAAACGG
CAM22_JoMc,GGTTTACT
CAM22_JoMc,TCGGCGTC
CAM27_ElBr,AACGTCAA
....
  • Nextflow demultiplex sample sheet:
id,samplesheet,lane,flowcell
A00999,/full/path/to/bclcvt-sampleindex.csv,,/full/bcl/data/path

As per the docs, when lane is not given, all lanes will be processed.
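A quick way to confirm that every sample in the sheet really carries four indices is to count the rows per Sample_ID in the [BCLConvert_Data] section. The following is a minimal sketch, assuming the sheet looks like the fragment above; the helper name `indices_per_sample` is made up for illustration, and only the two samples fully shown in the fragment are included.

```python
import csv
import io
from collections import Counter

# Fragment copied from the issue; the real sheet lists 31 samples.
SHEET = """\
[Header]
FileFormatVersion,2

[BCLConvert_Settings]
CreateFastqForIndexReads,0

[BCLConvert_Data]
Sample_ID,index
CAM17_RoGr,ACGTCCCT
CAM17_RoGr,CGCATGTG
CAM17_RoGr,GAAGGAAC
CAM17_RoGr,TTTCATGA
CAM22_JoMc,AACCGTAA
CAM22_JoMc,CTAAACGG
CAM22_JoMc,GGTTTACT
CAM22_JoMc,TCGGCGTC
"""


def indices_per_sample(sheet_text):
    """Count index rows per Sample_ID in the [BCLConvert_Data] section."""
    lines = sheet_text.splitlines()
    start = lines.index("[BCLConvert_Data]") + 1
    data = []
    for line in lines[start:]:
        if line.startswith("["):  # next section begins
            break
        if line.strip():  # skip blank lines
            data.append(line)
    reader = csv.DictReader(io.StringIO("\n".join(data)))
    return Counter(row["Sample_ID"] for row in reader)


counts = indices_per_sample(SHEET)
print(dict(counts))
```

On the full sheet, 31 keys with a count of 4 each would correspond to the 124 indices described in the setup.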

Results

The Nextflow run completes successfully, with no errors listed. However, only 21 of the 31 samples are linked in the output dir and listed in the MultiQC report.

With -dump-channels, the output of the BCLCONVERT module tagged as DEMULTIPLEX::Demultiplexed Fastq contains 84 items (that is: 21 samples by 4 lanes).
The working folder of the BCLCONVERT task contains all 256 expected .fastq.gz files (that is: 31 samples x 4 lanes x R1/R2 + 8 Undetermined files: 4 lanes x R1/R2), which means demultiplexing ran OK.
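The counts quoted above can be checked with a few lines of arithmetic. This is only a sketch of the numbers reported in this issue; the variable names are illustrative.

```python
samples_total = 31      # samples in the sample sheet
samples_published = 21  # samples that reach the output dir / MultiQC
lanes = 4
reads = 2               # R1 and R2

# Files expected in the BCLCONVERT work dir: per-sample files plus Undetermined.
expected_files = samples_total * lanes * reads + lanes * reads

# Items seen in the dumped channel: one per published sample and lane.
channel_items = samples_published * lanes

# FASTQ files that never get published.
missing_files = (samples_total - samples_published) * lanes * reads

print(expected_files, channel_items, missing_files)  # 256 84 80
```

So 10 samples, or 80 FASTQ files, exist in the work dir but never enter the output channel.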

I can't figure out what could be causing this behaviour, or how to quickly troubleshoot. I will manually add the 10 missing sample links to the output folder and continue with my downstream analysis, but reporting of this stage is incomplete until this is solved.

On different runs, it is always the same samples that are missing (for instance, sample CAM17_RoGr above is always missing from output).

Command used and terminal output

$ nextflow run nf-core/demultiplex --input nf-samplesheet.csv --outdir DMUX --demultiplexer bclconvert --trim_fastq false -bg -profile podman -dump-channels

(also tried with -resume)

Cleaned .nextflow.log output included below.

Relevant files

nf.log.gz

System information

Version: 23.04.1 build 5866
Created: 15-04-2023 06:51 UTC (16:51 AEDT)
System: Linux 6.3.12-200.fc38.x86_64
Runtime: Groovy 3.0.16 on OpenJDK 64-Bit Server VM 17.0.6+10
Encoding: UTF-8 (UTF-8)
Process: 2401714@my-machine [10.x.x.x]
CPUs: 32 - Mem: 503.3 GB (6.2 GB) - Swap: 0 (0)

nf-core/demultiplex v1.3.2-g67b8465

Container engine: podman rootless
OS: Fedora Core OS

@rifius rifius added the bug Something isn't working label Aug 1, 2023
@edmundmiller
Collaborator

@matthdsm any thoughts? I'm wondering if it's the publishing and the naming of the sample with an _ that's the issue.

@matthdsm
Collaborator

I think it's the sample naming that's the issue here. The demux modules glob on **[!Undetermined]_S*_R?_00?.fastq.gz to find the output FASTQ files, and the names with _R* cause some kind of collision.

@rifius, could you post an ls of the bclconvert work dir so we can check out the filenames?

https://github.com/nf-core/modules/blob/97b7dc798a002688b6304a453da932b2144727b1/modules/nf-core/bclconvert/main.nf#L11
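One way to test how that glob behaves on these names is to try it offline. The sketch below uses Python's `fnmatch` as a stand-in for Nextflow's glob matching, which is an assumption: the two are not guaranteed to behave identically, although both treat `[!...]` as a negated character class (any single character not in the set) rather than as "not the word Undetermined". The file names are hypothetical, built from the sample IDs above following BCL Convert's usual `<Sample_ID>_S<n>_L00<lane>_R<read>_001.fastq.gz` naming; the S numbers are made up.

```python
from fnmatch import fnmatchcase

# Glob quoted in the comment above.
PATTERN = "**[!Undetermined]_S*_R?_00?.fastq.gz"

# Hypothetical file names; the real work dir listing was not posted.
names = [
    "CAM17_RoGr_S1_L001_R1_001.fastq.gz",
    "CAM22_JoMc_S2_L001_R1_001.fastq.gz",
    "CAM27_ElBr_S3_L001_R1_001.fastq.gz",
    "Undetermined_S0_L001_R1_001.fastq.gz",
]

# [!Undetermined] matches one character that is NOT in {U, n, d, e, t, r, m, i},
# so a name matches only if the character right before "_S" is outside that set.
matched = {name: fnmatchcase(name, PATTERN) for name in names}

for name, ok in matched.items():
    print(f"{'match' if ok else 'SKIP '}  {name}")
```

Under these assumptions, a sample ID ending in one of the letters of "Undetermined" (for example the trailing "r" in CAM17_RoGr) is skipped, while CAM22_JoMc is picked up. That would be consistent with the same samples going missing on every run, but it should be checked against the actual `ls` of the work dir.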

@apeltzer
Member

apeltzer commented Aug 1, 2024

Another option: 10x mkfastq is now available on dev and will soon be in 1.5.0 too.

@grst grst closed this as completed Aug 5, 2024