Multi languages. #2391
Hello everyone, I'm testing Whisper using an audio file with multiple languages and would like to know if it's possible to have it return the detected language for each segment.
Answered by ryanheise, Oct 24, 2024
Whisper doesn't support audio with multiple languages.
@guilhermeasena32 While it is true that Whisper can sometimes transcribe multiple languages in the same audio, it won't do this reliably. Whisper was trained on monolingual audio files for a range of separate languages, although some multilingual audio files were probably included and incorrectly labelled as single-language audio. As a result, Whisper can sometimes produce a fully multilingual transcript for multilingual audio while still labelling the whole thing as a single language.
Even if you are OK with the incorrect labelling, the problem is that Whisper's training data just didn't include enough examples of multiple languages in the same audio file, which is …
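For anyone who still wants a rough per-chunk language label, here is a minimal sketch of one possible workaround. It is not something Whisper does for you and not something proposed in the replies above: it cuts the audio into fixed 30-second windows and runs Whisper's language detection on each window separately. The helper names (`split_windows`, `detect_per_window`, `make_whisper_detector`) are hypothetical; only `load_model`, `pad_or_trim`, `log_mel_spectrogram` and `model.detect_language` come from the `openai-whisper` package. The windows are arbitrary time slices, not Whisper's own transcript segments, and given the training-data caveats above, the labels should be treated as hints rather than reliable output.

```python
from typing import Callable, Dict, List, Sequence, Tuple

SAMPLE_RATE = 16000  # Whisper works on 16 kHz mono audio


def split_windows(n_samples: int, window_s: float = 30.0,
                  sample_rate: int = SAMPLE_RATE) -> List[Tuple[int, int]]:
    """Return (start, end) sample indices of consecutive fixed-size windows."""
    step = int(window_s * sample_rate)
    return [(start, min(start + step, n_samples))
            for start in range(0, n_samples, step)]


def detect_per_window(audio: Sequence[float],
                      detect: Callable[[Sequence[float]], Dict[str, float]],
                      window_s: float = 30.0,
                      sample_rate: int = SAMPLE_RATE):
    """Run `detect` on each window; return (start_s, end_s, language, probability)."""
    results = []
    for start, end in split_windows(len(audio), window_s, sample_rate):
        probs = detect(audio[start:end])   # dict: language code -> probability
        lang = max(probs, key=probs.get)   # most likely language for this window
        results.append((start / sample_rate, end / sample_rate, lang, probs[lang]))
    return results


def make_whisper_detector(model_name: str = "base"):
    """Hypothetical helper: wrap Whisper's language detection as a `detect` callable."""
    import whisper  # imported lazily so the helpers above work without it

    model = whisper.load_model(model_name)

    def detect(chunk):
        chunk = whisper.pad_or_trim(chunk)  # pad or cut to exactly 30 s
        mel = whisper.log_mel_spectrogram(
            chunk, n_mels=model.dims.n_mels).to(model.device)
        _, probs = model.detect_language(mel)
        return probs

    return detect
```

Usage would look something like `audio = whisper.load_audio("mixed.mp3")` followed by `detect_per_window(audio, make_whisper_detector())` (the file name is a placeholder). Smaller windows give finer time resolution, but language detection on short clips tends to be less stable, so the window length is a trade-off you would need to tune on your own data.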