
Multi languages. #2391

Closed Answered by ryanheise
guilhermeasena32 asked this question in Q&A
Oct 15, 2024 · 1 comments · 5 replies

@guilhermeasena32 While it is true that Whisper can sometimes transcribe multiple languages in the same audio, it won't do this reliably. Whisper was trained on monolingual audio files in a range of separate languages, but some multilingual audio files were probably included and incorrectly labelled as single-language audio. As a result, Whisper can sometimes produce a fully multilingual transcript for multilingual audio while still incorrectly labelling it as a single language.

Even if you are OK with the incorrect labelling, the problem is that Whisper's training data just didn't include enough examples of multiple languages in the same audio file, which is …
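
To make the single-label point concrete, here is a minimal sketch (not something from the answer itself) of checking which language a detector would pick for each fixed-length window of an audio file, instead of relying on one label for the whole file. Everything here is hypothetical: `label_windows`, the `Detector` type and `fake_detect` are illustrative names, and the detector is passed in as a plain callable so the sketch runs without any model. In practice one might wrap the `openai-whisper` package's language detection in such a callable, but that wiring is an assumption and is not shown.

```python
from typing import Callable, Dict, List, Sequence, Tuple

# Hypothetical detector type: takes a window of samples and returns
# a mapping from language code to probability.
Detector = Callable[[Sequence[float]], Dict[str, float]]


def label_windows(
    samples: Sequence[float],
    sample_rate: int,
    detect: Detector,
    window_seconds: float = 30.0,
) -> List[Tuple[float, float, str]]:
    """Return (start_sec, end_sec, language) for each consecutive window."""
    window = int(sample_rate * window_seconds)
    if window <= 0:
        raise ValueError("window_seconds * sample_rate must be at least 1 sample")

    labels = []
    for start in range(0, len(samples), window):
        end = min(start + window, len(samples))
        probs = detect(samples[start:end])
        # Each window still gets exactly one label: the most probable language.
        language = max(probs, key=probs.get)
        labels.append((start / sample_rate, end / sample_rate, language))
    return labels


def fake_detect(chunk: Sequence[float]) -> Dict[str, float]:
    """Stand-in detector for demonstration only; not a real model."""
    if chunk[0] == 0.0:
        return {"en": 0.9, "es": 0.1}
    return {"en": 0.2, "es": 0.8}
```

Calling `label_windows([0.0] * 4 + [1.0] * 4, sample_rate=2, detect=fake_detect, window_seconds=2.0)` labels the first window `en` and the second `es`. This only exposes where the detected language changes between windows. It does not fix the underlying limitation described above, since a window that itself mixes languages still receives a single label.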

Answer selected by guilhermeasena32
