Switching between faster-whisper or openai-whisper via env seems to be broken #edit: most likely not, just not as fast as I hoped for #115

Closed
Deathproof76 opened this issue Jun 11, 2023 · 4 comments

Deathproof76 commented Jun 11, 2023

I transcribed a 22-minute mp3 file with `- ASR_ENGINE=openai_whisper` via the web UI, timed it, and it took 2:15 min. I then changed the env to `- ASR_ENGINE=faster_whisper`, recreated the container, and it took approximately the same, 2:16 min. I tried another file with the same outcome, and switched to the :debug image, also with the same outcome. VRAM consumption is also much higher than expected for faster-whisper, but in line with openai-whisper.

```yaml
services:
  whisper-asr-webservice:
    #image: onerahmet/openai-whisper-asr-webservice:debug-gpu
    image: onerahmet/openai-whisper-asr-webservice:latest-gpu
    deploy:
      resources:
        reservations:
          devices:
            - driver: nvidia
              capabilities: [gpu]
    container_name: Whisper-ASR
    environment:
      - ASR_ENGINE=openai_whisper
#      - ASR_ENGINE=faster_whisper
      - ASR_MODEL=small
    ports:
      - 9007:9000
    restart: unless-stopped
```

I'd be glad if you could take a look at it, @ahmetoner, as it seems to be broken.

This is also described by other people in morpheus65535/bazarr#2144.

I don't know much, but it seems possible that the downloaded openai-whisper model isn't being converted to the CTranslate2 model format. If that is the case, wouldn't it be possible to just download the converted model directly from https://huggingface.co/guillaumekln/faster-whisper-small (as an example)?
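For what it's worth, faster-whisper can already fetch the pre-converted weights itself when given a model size instead of a local path. A minimal sketch (the model size, device, and file name here are just examples, not what the webservice actually does):

```python
from faster_whisper import WhisperModel

# Passing a model size makes faster-whisper download the pre-converted
# CTranslate2 weights (e.g. from guillaumekln/faster-whisper-small on the
# Hugging Face Hub), so no local conversion step should be needed.
model = WhisperModel("small", device="cuda", compute_type="float16")

segments, info = model.transcribe("audio.mp3")
for segment in segments:
    print(f"[{segment.start:.2f} -> {segment.end:.2f}] {segment.text}")
```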

Or could it be that the wrong model is selected, i.e. the openai-whisper model is loaded instead of the converted one?

Another possibility could be that the default settings differ between openai-whisper and faster-whisper: depending on the GPU, fp16 can be a lot faster than fp32 (openai/whisper#391). The same goes for beam_size (SYSTRAN/faster-whisper#9, SYSTRAN/faster-whisper#172) and temperature (SYSTRAN/faster-whisper#172).
If these options are the reason for the "underperforming", would it be possible to expose them as env variables for the docker container? (@ayancey)
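Something like this, perhaps; a rough sketch where `ASR_COMPUTE_TYPE` and `ASR_BEAM_SIZE` are made-up variable names with example fallback values, not options the webservice currently reads:

```python
import os

from faster_whisper import WhisperModel

# Hypothetical env vars; the webservice does not support these today.
compute_type = os.getenv("ASR_COMPUTE_TYPE", "float16")
beam_size = int(os.getenv("ASR_BEAM_SIZE", "5"))  # 5 is the faster-whisper default

model = WhisperModel(os.getenv("ASR_MODEL", "small"), device="cuda",
                     compute_type=compute_type)
segments, info = model.transcribe("audio.mp3", beam_size=beam_size)
```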

Deathproof76 (Author) commented Jun 14, 2023

Okay, I tried some things. I modified the docker image and basically cut openai-whisper out, so that only the faster-whisper implementation was running. I then modified core.py and utils.py to also deactivate the converter, and then mounted a model downloaded from https://huggingface.co/guillaumekln/faster-whisper-small. And yeah, same performance.
I then played a little bit with the settings: I changed the compute type (float32 actually had the best performance for my GPU), added os.environ["OMP_NUM_THREADS"] = "12" (which most likely helps only with CPU), changed beam_size=5 to 1, and added best_of=1:

```python
segment_generator, info = model.transcribe(audio, beam_size=1, best_of=1, **options_dict)
```

All of that brought the time down to 1:30 min from 2:15 for the same 21-minute mp3 on my RTX 3060. The small model's VRAM usage was in line with faster-whisper float32, at 1430 MB. But the quality has most likely degraded (beam_size=5 is the recommendation), though I haven't noticed anything so far.
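Put together, the tweaks look roughly like this (the mount path and thread count are from my setup, not the webservice's actual code):

```python
import os

# Set before importing the library; mostly relevant for CPU inference.
os.environ["OMP_NUM_THREADS"] = "12"

from faster_whisper import WhisperModel

# Load the pre-converted CTranslate2 model mounted into the container
# (downloaded from guillaumekln/faster-whisper-small on the Hugging Face Hub).
model = WhisperModel(
    "/app/models/faster-whisper-small",  # example mount path
    device="cuda",
    compute_type="float32",  # fastest on my RTX 3060; float16 may win on other GPUs
)

# beam_size=1 / best_of=1 trade some accuracy for speed (the default is 5).
segments, info = model.transcribe("audio.mp3", beam_size=1, best_of=1)
```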

So yeah, I don't know, maybe it's as fast as it can be. I'm not a coder, I just poked around. Maybe it's as slow/fast as the openai implementation because the switch is broken the other way around and only faster-whisper is ever used. Or maybe openai-whisper runs with worse (faster) settings and faster-whisper with higher-quality settings by default (though from what I understood they were the same). The beam_size setting certainly added VRAM usage when upped.

This one, https://github.com/m-bain/whisperX, with the same whisper-asr-webservice UI would most likely be chef's kiss though 😅

This one, https://github.com/RomanKlimov/faster-whisper-acceleration, might actually not be so hard to integrate, but it's still above my current skill level (@ayancey, could you take a quick look, maybe? 😊)

@Deathproof76 Deathproof76 changed the title Switching between faster-whisper or openai-whisper via env seems to be broken (docker) Switching between faster-whisper or openai-whisper via env seems to be broken #edit: most likely not, just not as fast as expected Jun 14, 2023
@Deathproof76 Deathproof76 changed the title Switching between faster-whisper or openai-whisper via env seems to be broken #edit: most likely not, just not as fast as expected Switching between faster-whisper or openai-whisper via env seems to be broken #edit: most likely not, just not as fast as I hoped for Jun 14, 2023
ayancey (Collaborator) commented Jun 15, 2023

I'll take a look, but no promises 😅

RedFox134 commented
I've heard whisperX does a better job with subtitle timestamps, so I'd love to see that get added!

ayancey (Collaborator) commented Oct 9, 2023

@Deathproof76 if you have input on this, I'd love to hear it: #125

@ayancey ayancey closed this as completed Nov 27, 2023