Error when trying to use it in a one hour video #22
Comments
This error is also happening for me. I tried it with a venv using the quick start guide and the example video, and am getting the exact same error messages. I also tried the linked Colab notebook, and got the same error. Here is the full information that gets printed in the Colab doc:
And the resulting text files are empty.
Based on the log messages, I was able to find a quick fix: change chunk_max_new_tokens=512 to chunk_max_new_tokens=446. This stops the above error and allows the transcription to complete successfully.
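The 446 in the workaround above follows from the numbers in the error message: Whisper's max_target_positions is 448 and decoder_input_ids already holds 2 tokens, so max_new_tokens can be at most 448 - 2 = 446. A minimal sketch of that arithmetic (the function and constant names here are illustrative, not from this project's code):

```python
# Whisper's decoder context length, as reported in the error message.
MAX_TARGET_POSITIONS = 448

def safe_max_new_tokens(decoder_input_len: int) -> int:
    """Largest max_new_tokens that keeps decoder_input_ids plus newly
    generated tokens within the model's max_target_positions."""
    return MAX_TARGET_POSITIONS - decoder_input_len

# With the 2 special/start tokens from the error log, the cap is 446,
# matching the chunk_max_new_tokens=446 workaround above.
print(safe_max_new_tokens(2))
```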
Hey, thanks for reporting this and the PR. I'll give it a look over the next few days. It's definitely possible some code got shifted around in transformers, as it's been a while since I updated this. Will report back here and on the PR once I have a chance to look at it!
Error transcribing chunk 25 in video.mp4
The length of `decoder_input_ids`, including special start tokens, prompt tokens, and previous tokens, is 2, and `max_new_tokens` is 512. Thus, the combined length of `decoder_input_ids` and `max_new_tokens` is: 514. This exceeds the `max_target_positions` of the Whisper model: 448. You should either reduce the length of your prompt, or reduce the value of `max_new_tokens`, so that their combined length is less than 448.