Too few arguments for '--mm2-opts' #1048

Open
kir1to455 opened this issue Sep 29, 2024 · 5 comments
Labels: bug (Something isn't working), resume (Issues with --resume-from)

Comments

@kir1to455

Issue Report

Please describe the issue:

When running dorado basecaller, I encountered the following error.
image

Steps to reproduce the issue:

My dorado basecaller code:
${DoradoDir}/dorado basecaller -v sup,inosine_m6A,pseU,m5C --min-qscore 10 --verbose --emit-moves -b 64 --chunksize 9216 --mm2-opts "-k 15 -w 10 --secondary=no" --estimate-poly-a --reference ${indexDir}/gencode.v43.normal.transcripts.fa -x cuda:0 ${pod5path}/HEK293T_1.pod5 --resume-from ${ModDir}/PAQ17395_1_sup.pass.m6A_pseU_m5C_inosine.mod.pass.bam > ${ModDir}/PAQ17395_1_sup.pass.m6A_pseU_m5C_inosine.mod.pass.complete.bam

Run environment:

  • Dorado version: V0.8.0
  • Operating system: Linux
  • Hardware (CPUs, Memory, GPUs): GTX 3090Ti
  • Source data type (e.g., pod5 or fast5 - please note we always recommend converting to pod5 for optimal basecalling performance): pod5
  • Source data location (on device or networked drive - NFS, etc.): on device
  • Details about data (flow cell, kit, read lengths, number of reads, total dataset size in MB/GB/TB): SQK-RNA004

Logs

[2024-09-29 10:34:28.139] [info] - BAM format does not support U, so RNA output files will include T instead of U for all file types.
[2024-09-29 10:34:32.453] [debug] TxEncoderStack: use_koi_tiled false.
[2024-09-29 10:34:33.304] [debug] cuda:0 memory available: 16.97GB
[2024-09-29 10:34:33.304] [debug] cuda:0 memory limit 12.65GB
[2024-09-29 10:34:33.304] [debug] cuda:0 maximum safe estimated batch size at chunk size 9216 is 192
[2024-09-29 10:34:33.304] [debug] cuda:0 maximum safe estimated batch size at chunk size 4608 is 416
[2024-09-29 10:34:33.304] [info] cuda:0 using chunk size 9216, batch size 64
[2024-09-29 10:34:33.304] [debug] cuda:0 Model memory 3.43GB
[2024-09-29 10:34:33.304] [debug] cuda:0 Decode memory 0.42GB
[2024-09-29 10:34:33.510] [info] cuda:0 using chunk size 4608, batch size 64
[2024-09-29 10:34:33.510] [debug] cuda:0 Model memory 1.71GB
[2024-09-29 10:34:33.510] [debug] cuda:0 Decode memory 0.21GB
[2024-09-29 10:34:59.199] [debug] Loaded index with 252913 target seqs
[2024-09-29 10:34:59.630] [debug] BasecallerNode chunk size 9216
[2024-09-29 10:34:59.630] [debug] BasecallerNode chunk size 4608
[2024-09-29 10:35:00.269] [info] > Inspecting resume file...
[2024-09-29 10:35:00.603] [error] finalise() not called on a HtsFile.
[2024-09-29 10:35:00.651] [error] Too few arguments for '--mm2-opts'.
[2024-09-29 10:35:00.651] [trace] Deleting temporary model path: /public2/hjliang/ONT_data/script/.temp_dorado_model-4dca3eb0a456c074
[2024-09-29 10:35:00.673] [trace] Deleting temporary model path: /public2/hjliang/ONT_data/script/.temp_dorado_model-5e2608ef88664109

Best wishes,
Kirito

@kir1to455
Author

Addendum: If I omit the --resume-from parameter, dorado basecaller runs normally and does not output the '--mm2-opts' error.

github-staff deleted a comment on Sep 29, 2024
malton-ont added the bug and resume labels on Sep 30, 2024

@HalfPhoton
Collaborator

Can you share the bam header in the --resume-from ${ModDir}/PAQ17395_1_sup.pass.m6A_pseU_m5C_inosine.mod.pass.bam file?

Dorado stores the original command used here and the error may be there.

Best regards,
Rich
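
To check what command was stored, the CL: field of the @PG records can be pulled out of the header. The following is a minimal sketch, not part of dorado: pg_command_lines is a hypothetical helper name, and resume.bam in the usage comment is a placeholder for the --resume-from file.

```shell
# Hypothetical helper: print the CL: (command line) field of every @PG
# record found in a SAM header read from stdin.
pg_command_lines() {
  awk -F'\t' '/^@PG/ {
    for (i = 2; i <= NF; i++)
      if ($i ~ /^CL:/) print substr($i, 4)   # drop the leading "CL:"
  }'
}

# Usage (assumes samtools is installed; resume.bam is a placeholder):
#   samtools view -H resume.bam | pg_command_lines
```

If the option string after --mm2-opts appears without quotes in that output, the header matches the situation described in this issue.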

@kir1to455
Author

kir1to455 commented Sep 30, 2024

Hi @HalfPhoton,
Here is the header.

@HD     VN:1.6  SO:unknown
@PG     ID:basecaller   PN:dorado       VN:0.8.0+acec121        CL:dorado basecaller -v sup,inosine_m6A,pseU,m5C --min-qscore 10 --verbose --emit-moves -b 64 --chunksize 9216 --mm2-opts -k 15 -w 10 --secondary=no --estimate-poly-a --reference /home/hjliang/genomes/hg38/gencode.v43.normal.transcripts.fa -x cuda:0 /public3/guowb/RNA004/WT/pod5/HEK293T_1.pod5  DS:gpu:NVIDIA GeForce RTX 3090
@PG     ID:samtools     PN:samtools     PP:basecaller   VN:1.17 CL:samtools view -H PAQ17395_1_sup.pass.m6A_pseU_m5C_inosine.mod.pass.bam
@RG     ID:68b19cb40ce8cb6e3e194899954bdb9e5586ceba_rna004_130bps_sup@v5.1.0    PU:PAQ17395     PM:PC48A044     DT:2023-11-17T08:27:39.852+00:00        PL:ONT  DS:basecall_model=rna004_130bps_sup@v5.1.0 modbase_models=rna004_130bps_sup@v5.1.0_inosine_m6A@v1,rna004_130bps_sup@v5.1.0_pseU@v1,rna004_130bps_sup@v5.1.0_m5C@v1 runid=68b19cb40ce8cb6e3e194899954bdb9e5586ceba       LB:20231117-NPL2300672-P6-PAQ17395-hac  SM:20231117-NPL2300672-P6-PAQ17395-hac
@SQ     SN:ENST00000456328.2|ENSG00000290825.1  LN:1657
@SQ     SN:ENST00000450305.2|ENSG00000223972.6  LN:632
...

@tramelliwe

tramelliwe commented Oct 3, 2024

I experienced the same bug. Basecalling stopped after a GPU out-of-memory error, and trying to resume from the incomplete BAM gives the same error.
The problem was indeed in the BAM file header, where the argument of --mm2-opts is not quoted (even though it was quoted in the original command).

I solved this by exporting the header (samtools view -H ${incomplete}.bam > header.sam), quoting the --mm2-opts argument in a text editor, and then replacing the header of the incomplete BAM (samtools reheader header.sam ${incomplete}.bam > ${incomplete}_corrected.bam). Hope this solves the issue for you as well @kir1to455.
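
The manual edit described above could also be scripted. This is a rough sketch under stated assumptions: quote_mm2_opts and fix_resume_bam are hypothetical helper names, the option string is passed in literally and must not contain characters special to sed (such as / or &), and only samtools view -H and samtools reheader are used, as in the comment above.

```shell
# Hypothetical helper: in a SAM header read from stdin, wrap one exact,
# known option string in double quotes where it follows --mm2-opts.
# $1 is the literal, unquoted value as it appears in the header.
quote_mm2_opts() {
  sed "s/--mm2-opts $1/--mm2-opts \"$1\"/"
}

# Hypothetical wrapper: export the header, patch it, and write a
# corrected copy of the BAM. Requires samtools on PATH.
#   $1 = incomplete BAM, $2 = corrected BAM to write, $3 = option string
fix_resume_bam() {
  samtools view -H "$1" | quote_mm2_opts "$3" > header.sam
  samtools reheader header.sam "$1" > "$2"
}

# Usage (file names are placeholders):
#   fix_resume_bam incomplete.bam incomplete_corrected.bam '-k 15 -w 10 --secondary=no'
```

A header that is already quoted is left unchanged, because the pattern only matches the unquoted form. The corrected BAM is then the file to pass to --resume-from.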

@HalfPhoton
Collaborator

Yes, @tramelliwe you've identified the issue. Thank you.

@kir1to455, we can see that in the SAM header (which dorado re-uses arguments from when resuming), --mm2-opts -k 15 -w 10 --secondary=no is indeed missing the required quotation marks.
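
Why the missing quotes matter can be illustrated with plain shell word splitting. This is only an illustration: how dorado tokenizes the stored command internally is an assumption here, and count_args is a hypothetical helper.

```shell
# Hypothetical helper: report how many arguments it received.
count_args() { echo "$#"; }

# As typed in the original command: the quotes keep the minimap2
# options together, so --mm2-opts is followed by exactly one value.
count_args --mm2-opts "-k 15 -w 10 --secondary=no"   # prints 2

# As stored in the header: without quotes, splitting on whitespace
# yields six separate tokens, and the one after --mm2-opts is "-k".
stored='--mm2-opts -k 15 -w 10 --secondary=no'
count_args $stored                                   # prints 6 (unquoted on purpose)
```

With the unquoted form, the token that follows --mm2-opts looks like another option rather than a value, which is consistent with the "Too few arguments for '--mm2-opts'" error in the log.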
