[Bug] Percent X and transcripts get stuck with console error #66

Closed · rajeshkumaryadavdotcom opened this issue on Apr 1, 2023 · 14 comments
Labels: bug (Something isn't working)

Comments

@rajeshkumaryadavdotcom


How to reproduce
Transcribing Hindi audio gets stuck at X%.

Expected behavior

It should generate subtitles

Logs/screenshots
Screenshots attached (IMG_1838, IMG_1836); the app is screenrun.app.

Environment

  • Transformers.js version:
  • Browser (if applicable):
  • Operating system (if applicable):
  • Other:


rajeshkumaryadavdotcom added the "bug" label on Apr 1, 2023
@xenova
Collaborator

xenova commented Apr 1, 2023

Can you try with a smaller model? Base is the largest among "tiny", "small" and "base". So, you may be running out of memory.
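
For reference, a minimal sketch of requesting a smaller checkpoint explicitly; the model identifier used here ("openai/whisper-tiny") is an assumption and may differ between releases, and `audio` stands for a 16 kHz mono Float32Array as in the reproduction snippets further down:

// Sketch: pass a model id as the second argument to pick a smaller checkpoint.
// "openai/whisper-tiny" is an assumed identifier; check the release docs for the exact name.
const transcriber = await pipeline('automatic-speech-recognition', 'openai/whisper-tiny');
const output = await transcriber(audio, { return_timestamps: true, chunk_length_s: 30 });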

@rajeshkumaryadavdotcom
Author

Tried tiny as well. For English it worked well last week on a 7-minute video, but a 14-minute Hindi video gets stuck.

It's a MacBook Air, but I also tried on Windows 10 (32 GB RAM, i7, 256 GB SSD). Is that too low-spec for this purpose?

@xenova
Collaborator

xenova commented Apr 1, 2023

Admittedly, I haven't tested on very long non-English videos/audio. If possible, could you share the audio file? If not, could you find a YouTube video I could test with? Thanks!

@rajeshkumaryadavdotcom
Author

This is the video link
https://youtu.be/vTaEc7KN9UM

this is the app I am trying
ScreenRun.app

@xenova
Collaborator

xenova commented Apr 1, 2023

This is the video link https://youtu.be/vTaEc7KN9UM

this is the app I am trying ScreenRun.app

Thanks! I'll run some tests 👍

It's also worth mentioning that the creator of ScreenRun also raised an issue the other day: #54

So it might be fixed in the latest version (i.e., if you can update, that might fix it). If it is running in the browser, you can also try refreshing the cache (since the models were also updated recently).
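
If the model files are cached in browser Cache Storage, something along these lines (run in devtools) should force a re-download; the cache name below is only a guess and may differ between releases:

// List existing caches, then delete the one holding the model files.
const names = await caches.keys();
console.log(names);                        // find the cache transformers.js created
await caches.delete('transformers-cache'); // assumed name; delete the one listed above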

@xenova
Collaborator

xenova commented May 15, 2023

This should be fixed in the latest release (https://www.npmjs.com/package/@xenova/transformers). I will close for now, and if you have the issue again, feel free to reopen or open a new issue.
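
As a sketch of what "latest release" means in practice, assuming an npm-based setup (the local ./src import in the reproduction below would instead point at the published package):

// npm install @xenova/transformers@latest
import { pipeline } from '@xenova/transformers';
const pipe = await pipeline('automatic-speech-recognition');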

xenova closed this as completed on May 15, 2023
@kungfooman
Contributor

I am running into the same issue:

whisper_bug.html

<body></body>
<script>
  // Build a data: URL module on the fly so Node-only imports inside
  // transformers.js can be stubbed out in the browser.
  function importFile(content) {
    return "data:text/javascript;base64," + btoa(content);
  }
  const imports = {
    "transformers": "./src/transformers.js",
    // Stub the Node-specific dependencies with empty modules:
    "fs": importFile("export default {};"),
    "url": importFile("export default {};"),
    "path": importFile("export default {};"),
    "stream/web": importFile("export default {};"),
    "sharp": importFile("export default {};"),
    "onnxruntime-node": importFile("export default {};"),
    // Redirect onnxruntime-web to the CDN build and re-export it:
    "onnxruntime-web": importFile(`
      await import("https://cdnjs.cloudflare.com/ajax/libs/onnxruntime-web/1.14.0/ort.es6.min.js");
      let ONNX = globalThis.ort;
      export default ONNX;
      export {
        ONNX
      };
    `),
  };
  // Inject the import map before the module script below is evaluated.
  const importmap = document.createElement("script");
  importmap.type = "importmap";
  importmap.textContent = JSON.stringify({imports});
  document.body.appendChild(importmap);
</script>
<script type="module">
  // Expose all transformers.js exports (pipeline, ...) on window for use in devtools.
  import * as transformers from "./src/transformers.js";
  Object.assign(window, {...transformers});
</script>
<!--
  <audio id="SPEECH2TEXT_AUDIO" src="./examples/demo-site/assets/audio/jfk.wav" controls="true"></audio>
  <audio id="SPEECH2TEXT_AUDIO" src="./examples/demo-site/assets/audio/minner_fra_krigen_short.wav" controls="true"></audio>
-->
<audio id="SPEECH2TEXT_AUDIO" src="./examples/demo-site/assets/audio/minner_fra_krigen.wav" controls="true"></audio>

Run this in the devtools console (F12):

// Default ASR pipeline (no explicit model id).
const pipe = await pipeline("automatic-speech-recognition");
// Whisper expects 16 kHz mono input, so decode at that sample rate.
const audioCTX = new AudioContext({
  sampleRate: 16000
});
const arrayBuffer = await (await fetch(SPEECH2TEXT_AUDIO.currentSrc)).arrayBuffer();
const decoded = await audioCTX.decodeAudioData(arrayBuffer);
const audio = decoded.getChannelData(0); // first channel as a Float32Array
const result = await pipe(audio, {
    return_timestamps: true,
    chunk_length_s: 30,
    chunk_callback: console.log
});
console.log("result", result);

Error:

ort.es6.min.js:6 D:/a/_work/1/s/onnxruntime/core/providers/cpu/tensor/reshape_helper.h:41 onnxruntime::ReshapeHelper::ReshapeHelper(const TensorShape &, TensorShapeVector &, bool) gsl::narrow_cast<int64_t>(input_shape.Size()) == size was false. The input tensor cannot be reshaped to the requested shape. Input shape:{6,0,64}, requested shape:{1,6,64,64}

An error occurred during model execution: "Error: failed to call OrtRun(). error code = 6.".
sessionRun @ models.js:123
await in sessionRun (async)
seq2seq_forward @ models.js:341
forward @ models.js:1837
seq2seqRunBeam @ models.js:411
runBeam @ models.js:1817
generate @ models.js:835
await in generate (async)
generate @ models.js:1769
_call @ pipelines.js:818
await in _call (async)
closure @ core.js:62
(anonym) @ VM74:8


I haven't tried ONNX Runtime v1.14.1 yet, so I don't know whether the bug is here or there, but I will experiment more. At first I thought the chunk size was wrong, so I tried multiples of 16000 such as 30 * 16000, but it kept failing regardless.
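
For context, a sketch of the chunk-size arithmetic in question, assuming Whisper's 16 kHz input rate and the pipeline's chunk_length_s option:

// Whisper operates on 16 kHz mono audio, so a 30-second chunk is 480000 samples.
const SAMPLE_RATE = 16000;
const chunk_length_s = 30;
const samplesPerChunk = chunk_length_s * SAMPLE_RATE; // 480000
// Feeding exact multiples like this still reproduced the reshape error,
// so the chunk size itself doesn't seem to be the cause.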

@xenova
Collaborator

xenova commented May 16, 2023

Could you post the audio file you're testing with? That error message is usually associated with OOM, but I don't see why it would happen in this case.

Can you try using whisper-tiny.en?

@kungfooman
Contributor

Yep, I will test a bit more now 🧪 🔬

File (zipped up because GitHub doesn't like wav):

minner_fra_krigen_short.wav.zip

@kungfooman
Contributor

Can you try using whisper-tiny.en?

That failed as well, so currently I'm trying to build ONNX Runtime with Emscripten to get debug symbols, and maybe I can figure out a minimal reproducible example.

@kungfooman
Contributor

I converted yesterday's HEAD of ONNX Runtime to ES6 and tried to debug this... but I'm running into a similar problem:

Chrome: [screenshot of the console error]

But because I'm also experimenting with WebGPU support, and Chrome on Linux doesn't support WebGPU on my system yet, I had to install Firefox Nightly; Firefox Nightly on Linux via WebGPU doesn't work either:

The *.wasm binaries I used come from here: microsoft/onnxruntime#15796 (comment)

[screenshot of the Firefox Nightly console error]

It keeps working through the audio chunks for a while, only to crash about two minutes later. So multiple backends crash on the same task. 🤔

@kungfooman
Contributor

This issue is already closed, even though the bug still exists at HEAD. However, this PR fixes it for me:

#133

(very happy about that 😅)

Thank you very much for all the work 🥇 👍

@xenova
Collaborator

xenova commented Jun 5, 2023

This issue is already closed, even though the bug still exists at HEAD. However, this PR fixes it for me:

#133

(very happy about that 😅)

Thank you very much for all the work 🥇 👍

🔥 Nice! I've done a lot more testing, and it seems to happen mostly (or only) on M1/M2 Macs in Safari. What was your testing environment? Once I understand the problem a bit better, I'll open a bug report on the onnxruntime repo.

@kungfooman
Contributor

What was your testing environment?

All testing was on Linux, with the latest Chrome and Firefox Nightly. With the PR, both work fine; without it, both crash with the same error. If the chipset matters: AMD Ryzen 5 3600 6-Core Processor.
