Switching between faster-whisper and openai-whisper via env seems to be broken (edit: most likely not, just not as fast as I hoped for) #115
Comments
Okay, I tried some things. I modified the Docker image and basically cut openai-whisper out, so that only the faster-whisper implementation was running. I then modified core.py and utils.py to also deactivate the converter and mounted a downloaded model from https://huggingface.co/guillaumekln/faster-whisper-small. And yeah, same performance. So maybe it's already as fast as it can be. I'm not a coder, I just poked around. Maybe it was never actually switching engines and only faster-whisper is used either way, which would explain why both runs took the same time. Or maybe openai-whisper runs with lower-quality and faster-whisper with higher-quality settings by default (though from what I understood they were the same). The beam_size setting certainly added VRAM usage when upped.

This one, https://github.com/m-bain/whisperX, with the same whisper-asr-webservice UI would most likely be chef's kiss though 😅 And https://github.com/RomanKlimov/faster-whisper-acceleration might actually not be so hard to integrate, but it's still above my current skill level (@ayancey could you take a quick look, maybe? 😊)
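For reference, loading the mounted CTranslate2 model with faster-whisper directly looks roughly like this (a minimal sketch; the mount path and audio file name are made up):

```python
from faster_whisper import WhisperModel

# Hypothetical mount path for the model downloaded from
# https://huggingface.co/guillaumekln/faster-whisper-small
model = WhisperModel(
    "/app/models/faster-whisper-small",
    device="cuda",
    compute_type="float16",
)

# transcribe() returns a generator of segments plus metadata
segments, info = model.transcribe("audio.mp3", beam_size=5)
for segment in segments:
    print(f"[{segment.start:.2f} -> {segment.end:.2f}] {segment.text}")
```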
I'll take a look, but no promises 😅
I've heard whisperX does a better job with subtitle timestamps, so I'd love to see that get added!
@Deathproof76 if you have input on this, I'd love to hear it: #125
I transcribed a 22-minute mp3 file with `- ASR_ENGINE=openai_whisper` via the web UI, timed it, and it took 2:15 min. I then changed the env to `- ASR_ENGINE=faster_whisper`, recreated the container, and it took approximately the same, 2:16 min. Tried another file with the same outcome. Switched to the :debug tag, also the same outcome. VRAM consumption is also much higher than expected for faster-whisper, but in line with openai-whisper.
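This is roughly how I timed it (a sketch; the port, the /asr endpoint, and the audio_file field are assumptions based on the webservice docs):

```python
import time

import requests

# Assumptions: the container listens on localhost:9000 and exposes
# POST /asr with a multipart "audio_file" field.
with open("22min-test.mp3", "rb") as f:
    start = time.perf_counter()
    response = requests.post(
        "http://localhost:9000/asr",
        params={"task": "transcribe", "output": "json"},
        files={"audio_file": f},
    )
    elapsed = time.perf_counter() - start

print(f"HTTP {response.status_code}, took {elapsed:.0f} s")
```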
I'd be glad if you could take a look at it, @ahmetoner, as it seems to be broken.
Also described by other people in morpheus65535/bazarr#2144.
I don't know much, but it seems possible that the downloaded openai-whisper model isn't converted to the CTranslate2 model format. If that is the case, wouldn't it be possible to just download the model directly from https://huggingface.co/guillaumekln/faster-whisper-small (as an example)?
Or could it be that the wrong model is selected, i.e. the original openai-whisper model is loaded instead of the converted one? A sketch of the direct-download idea follows.
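Something like this (a sketch, not what the webservice currently does; it assumes `huggingface_hub` is available):

```python
from faster_whisper import WhisperModel
from huggingface_hub import snapshot_download

# Fetch the already-converted CTranslate2 weights instead of converting
# the openai-whisper checkpoint locally.
model_path = snapshot_download("guillaumekln/faster-whisper-small")
model = WhisperModel(model_path, device="cuda", compute_type="float16")
```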
Another possibility could be that the default settings differ between openai-whisper and faster-whisper: depending on the GPU, fp16 can be a lot faster than fp32 (openai/whisper#391). beam_size (SYSTRAN/faster-whisper#9, SYSTRAN/faster-whisper#172) and temperature (SYSTRAN/faster-whisper#172) also affect speed.
If these options are the reason for the "underperforming", would it be possible to expose them as env variables for the Docker container? (@ayancey)
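Roughly what I mean, as a sketch (the env variable names `ASR_BEAM_SIZE` and `ASR_TEMPERATURE` are made up; the webservice doesn't read them today):

```python
import os

import whisper  # openai-whisper
from faster_whisper import WhisperModel

# Hypothetical env variables to make the decoding settings configurable
beam_size = int(os.environ.get("ASR_BEAM_SIZE", "5"))
temperature = float(os.environ.get("ASR_TEMPERATURE", "0"))

# openai-whisper: fp16 is passed as a decode option (defaults to True on GPU)
oa_model = whisper.load_model("small")
oa_result = oa_model.transcribe(
    "audio.mp3", beam_size=beam_size, temperature=temperature, fp16=True
)

# faster-whisper: precision is set via compute_type (float16 vs float32),
# so both engines would end up with comparable settings
fw_model = WhisperModel("small", device="cuda", compute_type="float16")
segments, _ = fw_model.transcribe(
    "audio.mp3", beam_size=beam_size, temperature=temperature
)
```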