Pipeline creates fastq.gz and iridanext json files for some samples even if SRA files are not available for other samples #10

mcook19 · 2024-02-12T14:13:34Z

Description of feature

If the input submission sheet contains samples with a correctly formatted insdc_accession, but the corresponding SRA files do not exist or are not publicly available, the pipeline exits without creating any read or iridanext json files in the specified output directory, even for the samples whose SRA files are available. The samples that have no SRA data available must be removed from the sample sheet before the pipeline will run through and create the read and json files. For larger imports, it would be helpful to identify more clearly which samples had no SRA data available, and potentially to allow the pipeline to run through for the samples that did have data available.

emarinier · 2024-04-02T14:57:22Z

It looks like retry_with_backoff.sh (an nf-core script that's run inside nf-core's sratools/prefetch/main.nf) fails to download the files because of a 403 (forbidden) error. Then nextflow errors because it was expecting output from nf-core's sratools/prefetch/main.nf, but there is no output.

Some solutions might be:

- error handling on nextflow's side
- customize sratools/prefetch/main.nf to not fail in such a case

Since other errors might also happen (loss of internet connection, etc.), I think we might want to catch and handle these errors on the nextflow side of things, while leaving sratools/prefetch/main.nf unchanged if possible.
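
As a rough illustration of what handling this on the nextflow side could look like, here is a minimal config sketch. It assumes the nf-core module's process is named SRATOOLS_PREFETCH (not confirmed in this thread) and uses Nextflow's standard `errorStrategy` directive so that a failed download for one sample is ignored rather than ending the run. It is untested against this pipeline, and whether the downstream steps then produce the iridanext json for the remaining samples would still need to be checked.

```groovy
// nextflow.config fragment (sketch only, not tested against fetchdatairidanext)
process {
    // ASSUMPTION: 'SRATOOLS_PREFETCH' is the name of the process defined in
    // nf-core's sratools/prefetch/main.nf; adjust the selector if it differs.
    withName: 'SRATOOLS_PREFETCH' {
        // Ignore a failed task (e.g. a 403 on download) instead of terminating
        // the whole run; the module's main.nf itself stays unchanged.
        errorStrategy = 'ignore'
    }
}
```

With this strategy the failing task emits no output, so downstream processes are simply not run for that sample. It does not by itself report which samples were skipped, so identifying them would still need separate handling.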

Here's some output from when it fails:

rm -rf results/; nextflow run phac-nml/fetchdatairidanext -profile docker --input errorsheet.csv --outdir results

errorsheet.csv
fetchdatairidanext-403.txt

mcook19 added the enhancement New feature or request label Feb 12, 2024
