BBDuk: fails on auto-uncompressed fastq with spaces in filenames. #6495

Open
hexylena opened this issue Oct 28, 2024 · 1 comment

Comments

@hexylena
Member

I provided a paired collection to bbduk, which failed with the following error:

            java -ea -Xmx30303m -Xms30303m -cp /srv/galaxy/var/dependencies/_conda/envs/mulled-v1-ace992d4e029847e92000d257e16b237d31f99821592d4d6dbf742d389021c0f/opt/bbmap-39.01-1/current/ jgi.BBDuk in=SRX6855211_SRR10127028_1_fastq uncompressedSRX6855211_SRR10127028_1_fastq uncompressed.fastq in2=SRX6855211_SRR10127028_2_fastq uncompressedSRX6855211_SRR10127028_1_fastq uncompressed.fastq out=/data/galaxy/jobs/020/20180/outputs/dataset_32e47ed3-52bc-4a7d-8cea-e83aa26af423.dat out2=/data/galaxy/jobs/020/20180/outputs/dataset_056b24b6-3e37-4a97-bf79-eb4bf2b92b08.dat outm=/data/galaxy/jobs/020/20180/outputs/dataset_db56ccc6-9174-48e4-9d79-7ab5c7dc493c.dat outm2=/data/galaxy/jobs/020/20180/outputs/dataset_afeb15c0-401b-40f7-a4cc-78fb62885729.dat outs=/data/galaxy/jobs/020/20180/outputs/dataset_0b32aa02-3724-4537-b5b1-ac126faf5f53.dat k=27 rcomp=t maskmiddle=t minkmerhits=1 minkmerfraction=0.0 mincovfraction=0.0 hammingdistance=0 qhdist=0 editdistance=0 forbidn=f trimfailures=f findbestmatch=f skipr1=f skipr2=f t=4
Executing jgi.BBDuk [in=SRX6855211_SRR10127028_1_fastq, uncompressedSRX6855211_SRR10127028_1_fastq, uncompressed.fastq, in2=SRX6855211_SRR10127028_2_fastq, uncompressedSRX6855211_SRR10127028_1_fastq, uncompressed.fastq, out=/data/galaxy/jobs/020/20180/outputs/dataset_32e47ed3-52bc-4a7d-8cea-e83aa26af423.dat, out2=/data/galaxy/jobs/020/20180/outputs/dataset_056b24b6-3e37-4a97-bf79-eb4bf2b92b08.dat, outm=/data/galaxy/jobs/020/20180/outputs/dataset_db56ccc6-9174-48e4-9d79-7ab5c7dc493c.dat, outm2=/data/galaxy/jobs/020/20180/outputs/dataset_afeb15c0-401b-40f7-a4cc-78fb62885729.dat, outs=/data/galaxy/jobs/020/20180/outputs/dataset_0b32aa02-3724-4537-b5b1-ac126faf5f53.dat, k=27, rcomp=t, maskmiddle=t, minkmerhits=1, minkmerfraction=0.0, mincovfraction=0.0, hammingdistance=0, qhdist=0, editdistance=0, forbidn=f, trimfailures=f, findbestmatch=f, skipr1=f, skipr2=f, t=4]
Version 39.01

Exception in thread "main" java.lang.RuntimeException: Unknown parameter uncompressedSRX6855211_SRR10127028_1_fastq
	at jgi.BBDuk.<init>(BBDuk.java:538)
	at jgi.BBDuk.main(BBDuk.java:78)

The dataset names have spaces in them:

ln -s '/data/galaxy/f/e/1/dataset_fe1245a6-287d-4732-af60-37021f7eaab1.dat' 'SRX6855211_SRR10127028_1_fastq uncompressedSRX6855211_SRR10127028_1_fastq uncompressed.fastq' && ln -s '/data/galaxy/5/6/e/dataset_56eedb18-918a-4c26-a5db-f3504dd763c2.dat' 'SRX6855211_SRR10127028_2_fastq uncompressedSRX6855211_SRR10127028_1_fastq uncompressed.fastq' &&   bbduk.sh in='SRX6855211_SRR10127028_1_fastq uncompressedSRX6855211_SRR10127028_1_fastq uncompressed.fastq'  in2='SRX6855211_SRR10127028_2_fastq uncompressedSRX6855211_SRR10127028_1_fastq uncompressed.fastq' out='/data/galaxy/jobs/020/20180/outputs/dataset_32e47ed3-52bc-4a7d-8cea-e83aa26af423.dat' out2='/data/galaxy/jobs/020/20180/outputs/dataset_056b24b6-3e37-4a97-bf79-eb4bf2b92b08.dat' outm='/data/galaxy/jobs/020/20180/outputs/dataset_db56ccc6-9174-48e4-9d79-7ab5c7dc493c.dat' outm2='/data/galaxy/jobs/020/20180/outputs/dataset_afeb15c0-401b-40f7-a4cc-78fb62885729.dat' outs='/data/galaxy/jobs/020/20180/outputs/dataset_0b32aa02-3724-4537-b5b1-ac126faf5f53.dat'   k=27 rcomp='t' maskmiddle='t' minkmerhits='1' minkmerfraction=0.0 mincovfraction=0.0 hammingdistance=0 qhdist=0 editdistance=0 forbidn='f' trimfailures='f' findbestmatch='f' skipr1='f' skipr2='f'  t=${GALAXY_SLOTS:-4}

which I strongly suspect is the cause, given this line from the log file:

 [in=SRX6855211_SRR10127028_1_fastq, uncompressedSRX6855211_SRR10127028_1_fastq, uncompressed.fastq, in2=SRX6855211_SRR10127028_2_fastq, uncompressedSRX6855211_SRR10127028_1_fastq, uncompressed.fastq,

which suggests that each name is being split at its spaces and passed as several separate arguments.
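To illustrate how quoted names can still end up split, here is a minimal sketch. It assumes the splitting happens inside a wrapper script that flattens its arguments into one string and re-evaluates it; whether `bbduk.sh` actually does this is not confirmed in this thread. The functions `count_args`, `wrapper_flatten` and `wrapper_preserve` are hypothetical names used only for this demonstration.

```shell
#!/bin/sh
# Hypothetical stand-in for the final program: reports how many
# arguments it received.
count_args() {
    printf '%s\n' "$#"
}

# Hypothetical wrapper that joins its arguments into a single string
# and re-evaluates it. The caller's quoting is lost, so any argument
# containing spaces is split into several words.
wrapper_flatten() {
    eval "count_args $*"
}

# Hypothetical wrapper that forwards each argument unchanged.
wrapper_preserve() {
    count_args "$@"
}

name='SRX6855211_SRR10127028_1_fastq uncompressedSRX6855211_SRR10127028_1_fastq uncompressed.fastq'

# The name contains two spaces, so the flattened call sees
# in=..., uncompressedSRX..., uncompressed.fastq and k=27.
wrapper_flatten "in=$name" k=27     # prints 4
wrapper_preserve "in=$name" k=27    # prints 2
```

The split produced by the flattened call has the same shape as the argument list in the log above, where `in=` is followed by two stray tokens.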

@bernt-matthias
Contributor

It seems that you are not using the latest version. bbduk has used hardcoded symlink names for a while: https://github.com/galaxyproject/tools-iuc/pull/4329/files

Am I wrong?

I'm also wondering why (auto-)uncompressed files are used. It seems the tool should accept zipped files.
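The hardcoded-symlink idea mentioned above can be sketched roughly as follows. This is only an illustration: the link name `forward.fastq` and the helper `link_input` are hypothetical and are not taken from the linked pull request. The point is that the tool only ever sees a fixed, space-free name, whatever the dataset is called.

```shell
#!/bin/sh
# Hypothetical helper: link a dataset to a fixed name without spaces.
#   $1: path to the real dataset (may contain spaces)
#   $2: fixed link name handed to the tool
link_input() {
    ln -sf "$1" "$2"
}

# Demonstration in a scratch directory with a made-up dataset.
workdir=$(mktemp -d)
src="$workdir/dataset with spaces.dat"
printf '@r1\nACGT\n+\nIIII\n' > "$src"

# The fixed name is an assumption chosen for this sketch.
link_input "$src" "$workdir/forward.fastq"

# A tool could now be given in="$workdir/forward.fastq",
# which contains no spaces even though the source name does.
head -n 1 "$workdir/forward.fastq"
```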
