Enhance faster_whisper Engine #128
Conversation
The diff is kinda messed up for some Git reason I don't understand. Using this to compare to main after the 1.2 release.
@ayancey hi! Thanks, yeah, it seems the codebase was updated. I'll fix the conflicts.
Here's my attempt at benchmarking and comparing @ahmetoner's latest release (1.2.0) and this PR. I bumped the CUDA version on your fork so it would be an even comparison. Tests were done with a 23-second audio file using the small model and the faster_whisper backend. GPU is an NVIDIA GeForce GTX 1650.
I don't understand it, but it looks like Ahmet fixed the problem you referred to in #127. But I couldn't find any improvements in GPU or CPU memory.
So it sounds like the model conversion is no longer an issue, but we still want to let users choose their quantization? |
I think it's a good idea; in any case we may keep float16 as the default value. For example, I use the int8 version of the large-v2 model for my projects.
Awesome. Once you fix the conflicts I can bug Ahmet to merge this. I'm really excited to start playing around and comparing the different options.
@ayancey hi! I've updated my PR, you may try to add
app/faster_whisper/core.py
model_name = os.getenv("ASR_MODEL", "base")
model_path = os.getenv("ASR_MODEL_PATH", os.path.join(os.path.expanduser("~"), ".cache", "whisper"))
model_path = os.path.join(cache_path, model_name)
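The excerpt above can be sketched as a self-contained snippet. Note that `cache_path` is not defined in the lines shown, so the comment below describing the joined-path change is an assumption about the surrounding code, not the PR's exact wording:

```python
import os

# Model name and cache directory resolved from environment variables,
# as in the diff excerpt.
model_name = os.getenv("ASR_MODEL", "base")
model_path = os.getenv(
    "ASR_MODEL_PATH",
    os.path.join(os.path.expanduser("~"), ".cache", "whisper"),
)

# The change under discussion appended the model name to the cache path
# (via a cache_path variable not shown in the excerpt), roughly:
#   model_path = os.path.join(model_path, model_name)
# This is what the maintainer flagged as a backward-compatibility risk:
# existing installs would look for downloaded models in a new location.
print(model_name, model_path)
```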
This change has the potential to impact backward compatibility and may result in additional model downloads. Could you please revert to the previous lines? Once that's done, I'll proceed with the merge.
Thank you for your contribution.
Okay, I'll fix it.
Done
👏 👏
Hi, developers!
This pull request introduces the following improvements and fixes:
Updated NVIDIA CUDA container: the version of the nvidia/cuda container has been updated from 11.7.0 to 11.7.1. This was necessary because version 11.7.0 is no longer available.
Modifications in the faster_whisper engine: in core.py, the environment variable ASR_QUANTIZATION has been added. This allows users to specify various quantization levels, including float16, int16, int8, int8_float16, and int4. However, it should be noted that there seems to be an issue with the int4 mode which may require further investigation.

Regarding issue #127: if there's an inherent bug in the original codebase, I would prefer to remove the MAPPING. Instead of:

I suggest the following:
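The ASR_QUANTIZATION behavior described above could be wired up roughly as follows. This is a hypothetical sketch, not the PR's actual code: the helper name and validation logic are illustrative, while the set of quantization levels and the float16 default come from this thread. faster-whisper accepts the chosen level via the `compute_type` argument of `WhisperModel`:

```python
import os

# Quantization levels listed in the PR description; int4 is noted
# there as possibly broken, so it may fail at model-load time.
SUPPORTED_QUANTIZATION = {"float16", "int16", "int8", "int8_float16", "int4"}


def resolve_compute_type(default: str = "float16") -> str:
    """Read ASR_QUANTIZATION and validate it against the known levels."""
    value = os.getenv("ASR_QUANTIZATION", default)
    if value not in SUPPORTED_QUANTIZATION:
        raise ValueError(f"Unsupported ASR_QUANTIZATION value: {value!r}")
    return value


# Usage (assumes the faster-whisper package is installed):
#   from faster_whisper import WhisperModel
#   model = WhisperModel("base", device="cuda",
#                        compute_type=resolve_compute_type())
```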
PS. My PR will definitely conflict with #117.