Pipeline creates fastq.gz and iridanext json files for some samples even if SRA files are not available for other samples #10

Open
mcook19 opened this issue Feb 12, 2024 · 1 comment
Labels
enhancement New feature or request

Comments


mcook19 commented Feb 12, 2024

Description of feature

If the input sample sheet contains samples with a correctly formatted insdc_accession whose SRA files do not exist or are not publicly available, the pipeline exits, and no read or iridanext json files are created in the specified output directory, even for the samples whose SRA files are available. The samples with no SRA data available must be removed from the sample sheet before the pipeline will run to completion and create the read and json files. For larger imports, it would be helpful to clearly identify the samples that had no SRA data available, and potentially to let the pipeline run to completion for the samples that did.

@mcook19 mcook19 added the enhancement New feature or request label Feb 12, 2024
@emarinier
Member

It looks like retry_with_backoff.sh (an nf-core script that's run inside nf-core's sratools/prefetch/main.nf) fails to download the files because of a 403 (Forbidden) error. Nextflow then errors because it expected output from nf-core's sratools/prefetch/main.nf, but none was produced.

Some solutions might be:

  • error handling on nextflow's side
  • customize sratools/prefetch/main.nf to not fail in such a case

Since other errors could also happen (loss of internet connection, etc.), I think we should try catching and handling these errors on the Nextflow side of things, while leaving sratools/prefetch/main.nf unchanged if possible.
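As a rough illustration of what handling this on the Nextflow side could look like, here is a minimal config sketch that uses Nextflow's errorStrategy directive to ignore a failed prefetch task, so the remaining samples keep going. It assumes the process is named SRATOOLS_PREFETCH, as in the nf-core module. The selector may need adjusting for this pipeline, and the file name ignore_prefetch_errors.config is hypothetical. This is untested against fetchdatairidanext, and whether the downstream steps and the iridanext json output cope with the missing samples would still need checking.

```groovy
// ignore_prefetch_errors.config (hypothetical file name)
process {
    // Assumption: this selector matches the nf-core prefetch process in this pipeline.
    withName: 'SRATOOLS_PREFETCH' {
        // A failed task is ignored instead of stopping the run,
        // so it emits no output and that sample is skipped downstream.
        errorStrategy = 'ignore'
    }
}
```

A file like this could be passed to the run with Nextflow's -c option. Ignored tasks are still reported in the run summary, which would help identify the samples that had no SRA data, although it would also hide other failures such as a lost internet connection.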

Here's some output from when it fails:

rm -rf results/; nextflow run phac-nml/fetchdatairidanext -profile docker --input errorsheet.csv --outdir results

errorsheet.csv
fetchdatairidanext-403.txt
