Download Model #63

AshiqAbdulkhader · 2022-09-23T09:35:17Z

AshiqAbdulkhader
Sep 23, 2022

How to download the models and load it offline

Sep 23, 2022

Running the script the first time for a model will download that specific model; it stores (on windows) the model at C:\Users\<username>\.cache\whisper\<model>.

Once downloaded, the model doesn't need to be downloaded again. Considering that the medium model alone is ~1.5 GB, that seems to be the best solution.

View full answer

werelord · 2022-09-23T14:09:51Z

werelord
Sep 23, 2022

Running the script the first time for a model will download that specific model; it stores (on windows) the model at C:\Users\<username>\.cache\whisper\<model>.

Once downloaded, the model doesn't need to be downloaded again. Considering that the medium model alone is ~1.5 GB, that seems to be the best solution.

5 replies

Theo1996 Oct 16, 2022

Can I change the download directory or the dir the model is loaded from? I dont know python...

jongwook Oct 17, 2022
Maintainer

@Theo1996 If you're using the command line, you can specify --model_dir {path to the directory to download models}. In Python whisper.load_model(..., download_root="{path to the directory to download models}") will do the same.

Theo1996 Oct 18, 2022

@jongwook --model_dirdoesnt work
whisper: error: unrecognized arguments: --model_dir when I try to use whisper and no such option: --model_dir when I try to install it.

Theo1996 Oct 19, 2022

@jongwook I needed the latest commit, now it works.

vivekuppal Jun 29, 2023

Can also use the command line option --model_dir to save models to a specific folder

--model_dir MODEL_DIR
the path to save model files; uses ~/.cache/whisper by default (default: None)

nicholasgcotton · 2022-10-04T18:12:19Z

nicholasgcotton
Oct 4, 2022

Download links are in init.py @ lines 17-27

For windows location to download see the above comment, on WSL/Linux they go in
\\wsl$\Ubuntu\home\[username]\.cache\whisper

Note if you are experimenting with both WSL and Windows Native versions you need to put the models in BOTH locations.

As of today those links are:

11 replies

islogged Jan 1, 2023

Well, I think this is kind of expected, it is a neural network modelled after how a brain works. Someone who speaks 5 languages doesn't have a 5 times larger brain compared to someone who speaks only one language. The number of neurons is roughly the same, and thus the number of parameters. They simply have different values.
If you look at this that way, it is also understandable that the larger models have fewer and fewer differences between their English capabilities.

Hum, ok !
Thx for ur answer ...

But more strange enough, on my side i got better results with the multilingual pack for transcribe English than the dedicate English one ...

navakelvin Jan 19, 2023

Download links are in init.py @ lines 17-27

For windows location to download see the above comment, on WSL/Linux they go in \\wsl$\Ubuntu\home\[username]\.cache\whisper

Note if you are experimenting with both WSL and Windows Native versions you need to put the models in BOTH locations.

As of today those links are:

tiny.en

tiny

base.en

base

small.en

small

medium.en

medium

large

Hi sir, i've downloaded this but i get this error:

UserWarning: C:\Users\myuser\.cache\whisper\medium.pt exists, but the SHA256 checksum does not match; re-downloading the file
  warnings.warn(f"{download_target} exists, but the SHA256 checksum does not match; re-downloading the file")

zhuiyi6 Jun 15, 2023

i download large-v2 model,but i use this model.
there are so many small files in this large-v2.zip,how to convert to xx.pt type to use it

cpfy Sep 14, 2023

@navakelvin I used the link from init.py in the main branch and it worked. But I don't know why, the links seem to be exactly the same.

rokayabencheikh Dec 20, 2023

hello i download the model small.en.pt but it didn't calculate the wer metric is there any idea to add it ?

msliczniak · 2022-10-04T18:27:34Z

msliczniak
Oct 4, 2022

to find the URL examine MODELS:

site-packages/whisper/init.py

use that URL in the cache dir:

.cache/whisper % curl -L -O -C - …

0 replies

ghost · 2022-12-16T15:29:05Z

ghost
Dec 16, 2022

It would be nice to have the option to download the model like Spacy with python -m spacy download $SPACY_MODEL to avoid using hard-coded URLs in docker file for instance. Then You have only to use python -m spacy download $WHISPER_MODEL.

5 replies

nezhar Dec 30, 2022

Some code is required, but it can be done:

import sys
from whisper import _download, _MODELS

_download(_MODELS["base"], "/models/", False)

This will also make sure the models are downloaded correctly or already exist in the given model directory in the correct version.

mayeaux Mar 11, 2023

To download all models at once:

import sys
from whisper import _download, _MODELS

models = ["tiny.en", "tiny", "base.en", "base", "small.en", "small", "medium.en", "medium", "large"]

for model in models:
    _download(_MODELS[model], "~/.cache/whisper", False)

TFWol Aug 19, 2023

Tweaked @mayeaux's useful code so that it gets the entire list direct from whisper (more future-proof in case list changes)

import sys
from whisper import _download, _MODELS

models = whisper.available_models()

for model in models:
    _download(_MODELS[model], "~/.cache/whisper", False)

arrij46 Jul 15, 2024

How do I load this downloaded model?

afsara-ben Sep 18, 2024

i wonder why the tiny downlaoded model is 75.6 MB in size and not 151 MB

gaurav21r · 2023-03-06T17:59:54Z

gaurav21r
Mar 6, 2023

For all those wondering for Mac
its at /Users/<UserName>/.cache/whisper

You can see your .cache folder by pressing Cmd+Shift+. That's a dot (period) at the end!

😛 I used this opportunity to clean out some other stuff too from my .cache

0 replies

TomBerton · 2023-04-13T02:38:22Z

TomBerton
Apr 13, 2023

How can I add a model to a docker image? Do I add a new folder with the model, like this; .cache/whisper/<model> in the directory for my docker image, or a folder with the model; whisper/<model>?

5 replies

patchworquill Apr 28, 2023

Also wondering this. I'd like to download my models at build time of the docker container, then reference them in my app.py file from the download directory.

javelintechML May 17, 2023

anyone figure this out ?

zallesov Jun 12, 2023

Came here for the same reason.
tried to run whisper.load_model(model_name) at the build time but since building is happeing on a different machine without cuda this command fails.

nicholasgcotton Jun 12, 2023

You can put the model to wherever you want and specify that path with "--model_dir PATH" for both the .exe and python versions.

alfranz Dec 10, 2023

Just import whisper and download a model during buildtime:

RUN python -c "import whisper; whisper.load_model('base')"

Whisperskies · 2023-05-09T00:29:28Z

Whisperskies
May 9, 2023

I have a question! It says "Once downloaded, the model doesn't need to be downloaded again.". But i have to download the model everytime i transcribe an audio.. is it because i have to run the program as administrator?
My .cache folder keeps deleting everytime i restart the computer, and i need to download again the models everytime, did it happend to someoe as well, and what can i do? thanks

1 reply

ssary Jun 11, 2024

So you can change the download directory to some other directory

magicse · 2023-11-14T17:57:01Z

magicse
Nov 14, 2023

        if whisper_model is None:
            whisper_model = whisper.load_model("medium", download_root=os.path.join(os.getcwd(), "your_custom_dir"))

0 replies

Aikon404 · 2024-01-04T11:00:43Z

Aikon404
Jan 4, 2024

You must change the init.py in "whisper-main\whisper"

Change these segments in code

os.makedirs(os.path.join(root, "your Path"), exist_ok=True)

download_target = os.path.join(root, "your Path", file_name)

    default = os.path.join(os.path.dirname(os.path.abspath(__file__)), "your Path")

in the folowing code is not "your Path" it is "models"

"

import hashlib
import io
import os
import urllib
import warnings
from typing import List, Optional, Union

import torch
from tqdm import tqdm

from .audio import load_audio, log_mel_spectrogram, pad_or_trim
from .decoding import DecodingOptions, DecodingResult, decode, detect_language
from .model import ModelDimensions, Whisper
from .transcribe import transcribe
from .version import version

base85-encoded (n_layers, n_heads) boolean arrays indicating the cross-attention heads that are

highly correlated to the word-level timing, i.e. the alignment between audio and text tokens.

ALIGNMENT_HEADS = {
"tiny.en": b"ABzY8J1N>@0{>%R00Bk>$p{7v037oCl~+#00", "tiny": b"ABzY8bu8Lr0{>%RKn9Fp%m@SkK7Kt=7ytkO", "base.en": b"ABzY8;40c<0{>%RzzG;p*o+Vo09|#PsxSZm00", "base": b"ABzY8KQ!870{>%RzyTQH3Q^yNP!>##QT-<FaQ7m",
"small.en": b"ABzY8>?)10{>%RpeA61k&I|OI3I$65C{;;pbCHh0B{qLQ;+}v00",
"small": b"ABzY8DmU6=0{>%Rpa?J`kvJ6qF(V^F86#Xh7JUGMK}P<N0000",
"medium.en": b"ABzY8usPae0{>%R7<zz_OvQ{)4kMa0BMw6u5rT}kRKX;$NfYBv00Hl@qhsU00",
"medium": b"ABzY8B0Jh+0{>%R7}kK1fFL7w6%<-Pft^=N)Qr&0RR9",
"large-v1": b"ABzY8r9j$a0{>%R7#4sLmoOs{s)o384-RPdcFk!JR<kSfC2yj",
"large-v2": b"ABzY8zd+h!0{>%R7=D0pU<_bnWW*tkYAhobTNnu$jnkEkXqp)j;w1Tzk)UH3X%SZd&fFZ2fC2yj",
"large-v3": b"ABzY8gWO1E0{>%R7(9S+Kn!D%ngiGaR?*L!iJG9p-nab0JQ=-{D1-g00",
"large": b"ABzY8gWO1E0{>%R7(9S+Kn!D~%ngiGaR?*L!iJG9p-nab0JQ=-{D1-g00",
}

def _download(url: str, root: str, in_memory: bool) -> Union[bytes, str]:
os.makedirs(os.path.join(root, "models"), exist_ok=True)

expected_sha256 = url.split("/")[-2]
file_name = os.path.basename(url)
download_target = os.path.join(root, "models", file_name)

if os.path.exists(download_target) and not os.path.isfile(download_target):
    raise RuntimeError(f"{download_target} exists and is not a regular file")

if os.path.isfile(download_target):
    with open(download_target, "rb") as f:
        model_bytes = f.read()
    if hashlib.sha256(model_bytes).hexdigest() == expected_sha256:
        return model_bytes if in_memory else download_target
    else:
        warnings.warn(
            f"{download_target} exists, but the SHA256 checksum does not match; re-downloading the file"
        )

with urllib.request.urlopen(url) as source, open(download_target, "wb") as output:
    with tqdm(
        total=int(source.info().get("Content-Length")),
        ncols=80,
        unit="iB",
        unit_scale=True,
        unit_divisor=1024,
    ) as loop:
        while True:
            buffer = source.read(8192)
            if not buffer:
                break

            output.write(buffer)
            loop.update(len(buffer))

model_bytes = open(download_target, "rb").read()
if hashlib.sha256(model_bytes).hexdigest() != expected_sha256:
    raise RuntimeError(
        "Model has been downloaded but the SHA256 checksum does not match. Please retry loading the model."
    )

return model_bytes if in_memory else download_target

def available_models() -> List[str]:
"""Returns the names of available models"""
return list(_MODELS.keys())

def load_model(
name: str,
device: Optional[Union[str, torch.device]] = None,
download_root: str = None,
in_memory: bool = False,
) -> Whisper:
"""
Load a Whisper ASR model

Parameters
----------
name : str
    one of the official model names listed by `whisper.available_models()`, or
    path to a model checkpoint containing the model dimensions and the model state_dict.
device : Union[str, torch.device]
    the PyTorch device to put the model into
download_root: str
    path to download the model files; by default, it uses "~/.cache/whisper"
in_memory: bool
    whether to preload the model weights into host memory

Returns
-------
model : Whisper
    The Whisper ASR model instance
"""

if device is None:
    device = "cuda" if torch.cuda.is_available() else "cpu"
if download_root is None:
    default = os.path.join(os.path.dirname(os.path.abspath(__file__)), "models")
    download_root = os.path.join(default, "whisper")

if name in _MODELS:
    checkpoint_file = _download(_MODELS[name], download_root, in_memory)
    alignment_heads = _ALIGNMENT_HEADS[name]
elif os.path.isfile(name):
    checkpoint_file = open(name, "rb").read() if in_memory else name
    alignment_heads = None
else:
    raise RuntimeError(
        f"Model {name} not found; available models = {available_models()}"
    )

with (
    io.BytesIO(checkpoint_file) if in_memory else open(checkpoint_file, "rb")
) as fp:
    checkpoint = torch.load(fp, map_location=device)
del checkpoint_file

dims = ModelDimensions(**checkpoint["dims"])
model = Whisper(dims)
model.load_state_dict(checkpoint["model_state_dict"])

if alignment_heads is not None:
    model.set_alignment_heads(alignment_heads)

return model.to(device)

"

0 replies

rokayabencheikh · 2024-01-04T11:30:31Z

rokayabencheikh
Jan 4, 2024

hello thank you for your response, now i need a model.pt can do transcription and calculate wer but all models i download can not give the wer do you have any idea or the links of models you sent can do that ? thank you so much for your help.

…

On Thu, Jan 4, 2024 at 12:00 PM Aikon404 ***@***.***> wrote: You must change the *init*.py in "whisper-main\whisper" Change these segments in code os.makedirs(os.path.join(root, "your Path"), exist_ok=True) download_target = os.path.join(root, "your Path", file_name) default = os.path.join(os.path.dirname(os.path.abspath(__file__)), "your Path") in the folowing code is not "your Path" it is "models" " import hashlib import io import os import urllib import warnings from typing import List, Optional, Union import torch from tqdm import tqdm from .audio import load_audio, log_mel_spectrogram, pad_or_trim from .decoding import DecodingOptions, DecodingResult, decode, detect_language from .model import ModelDimensions, Whisper from .transcribe import transcribe from .version import *version* _MODELS = { "tiny.en": " https://openaipublic.azureedge.net/main/whisper/models/d3dd57d32accea0b295c96e26691aa14d8822fac7d9d27d5dc00b4ca2826dd03/tiny.en.pt ", "tiny": " https://openaipublic.azureedge.net/main/whisper/models/65147644a518d12f04e32d6f3b26facc3f8dd46e5390956a9424a650c0ce22b9/tiny.pt ", "base.en": " https://openaipublic.azureedge.net/main/whisper/models/25a8566e1d0c1e2231d1c762132cd20e0f96a85d16145c3a00adf5d1ac670ead/base.en.pt ", "base": " https://openaipublic.azureedge.net/main/whisper/models/ed3a0b6b1c0edf879ad9b11b1af5a0e6ab5db9205f891f668f8b0e6c6326e34e/base.pt ", "small.en": " https://openaipublic.azureedge.net/main/whisper/models/f953ad0fd29cacd07d5a9eda5624af0f6bcf2258be67c92b79389873d91e0872/small.en.pt ", "small": " https://openaipublic.azureedge.net/main/whisper/models/9ecf779972d90ba49c06d968637d720dd632c55bbf19d441fb42bf17a411e794/small.pt ", "medium.en": " https://openaipublic.azureedge.net/main/whisper/models/d7440d1dc186f76616474e0ff0b3b6b879abc9d1a4926b7adfa41db2d497ab4f/medium.en.pt ", "medium": " https://openaipublic.azureedge.net/main/whisper/models/345ae4da62f9b3d59415adc60127b97c714f32e89e936602e85993674d08dcb1/medium.pt ", "large-v1": " https://openaipublic.azureedge.net/main/whisper/models/e4b87e7e0bf463eb8e6956e646f1e277e901512310def2c24bf0e11bd3c28e9a/large-v1.pt ", "large-v2": " https://openaipublic.azureedge.net/main/whisper/models/81f7c96c852ee8fc832187b0132e569d6c3065a3252ed18e56effd0b6a73e524/large-v2.pt ", "large-v3": " https://openaipublic.azureedge.net/main/whisper/models/e5b1a55b89c1367dacf97e3e19bfd829a01529dbfdeefa8caeb59b3f1b81dadb/large-v3.pt ", "large": " https://openaipublic.azureedge.net/main/whisper/models/e5b1a55b89c1367dacf97e3e19bfd829a01529dbfdeefa8caeb59b3f1b81dadb/large-v3.pt ", } base85-encoded (n_layers, n_heads) boolean arrays indicating the cross-attention heads that are highly correlated to the word-level timing, i.e. the alignment between audio and text tokens. *ALIGNMENT_HEADS = { "tiny.en": ***@***.*** <https://github.com/0>{>%R00Bk>$p{7v037oCl~+#00", "tiny": ***@***.***=7ytkO", "base.en": b"ABzY8;40c<0{>%RzzG;p*o+Vo09|#PsxSZm00", "base": b"ABzY8KQ!870{>%RzyTQH3Q^yNP!>##QT-<FaQ7m", "small.en": b"ABzY8>?* )10{>%RpeA61k&I|OI3I$65C{;;pbCHh0B{qLQ;+}v00", "small": b"ABzY8DmU6=0{>%Rpa?J`kvJ6qF(V^F86#Xh7JUGMK}P<N0000", "medium.en": b"ABzY8usPae0{>%R7<zz_OvQ{)4kMa0BMw6u5rT}kRKX;$NfYBv00 ***@***.***", "medium": b"ABzY8B0Jh+0{>%R7}kK1fFL7w6%<-Pf*t^=N)Qr&0RR9", "large-v1": b"ABzY8r9j$a0{>%R7#4sLmoOs{s)o384-RPdcFk!JR<kSfC2yj", "large-v2": b"ABzY8zd+h!0{>%R7=D0pU<_bnWW*tkYAhobTNnu$jnkEkXqp)j;w1Tzk)UH3X%SZd&fFZ2fC2yj", "large-v3": b"ABzY8gWO1E0{>%R7(9S+Kn!D%ngiGaR?*L!iJG9p-nab0JQ=-{D1-g00", "large": b"ABzY8gWO1E0{>%R7(9S+Kn!D~%ngiGaR?*L!iJG9p-nab0JQ=-{D1-g00", } def _download(url: str, root: str, in_memory: bool) -> Union[bytes, str]: os.makedirs(os.path.join(root, "models"), exist_ok=True) expected_sha256 = url.split("/")[-2] file_name = os.path.basename(url) download_target = os.path.join(root, "models", file_name) if os.path.exists(download_target) and not os.path.isfile(download_target): raise RuntimeError(f"{download_target} exists and is not a regular file") if os.path.isfile(download_target): with open(download_target, "rb") as f: model_bytes = f.read() if hashlib.sha256(model_bytes).hexdigest() == expected_sha256: return model_bytes if in_memory else download_target else: warnings.warn( f"{download_target} exists, but the SHA256 checksum does not match; re-downloading the file" ) with urllib.request.urlopen(url) as source, open(download_target, "wb") as output: with tqdm( total=int(source.info().get("Content-Length")), ncols=80, unit="iB", unit_scale=True, unit_divisor=1024, ) as loop: while True: buffer = source.read(8192) if not buffer: break output.write(buffer) loop.update(len(buffer)) model_bytes = open(download_target, "rb").read() if hashlib.sha256(model_bytes).hexdigest() != expected_sha256: raise RuntimeError( "Model has been downloaded but the SHA256 checksum does not match. Please retry loading the model." ) return model_bytes if in_memory else download_target def available_models() -> List[str]: """Returns the names of available models""" return list(_MODELS.keys()) def load_model( name: str, device: Optional[Union[str, torch.device]] = None, download_root: str = None, in_memory: bool = False, ) -> Whisper: """ Load a Whisper ASR model Parameters ---------- name : str one of the official model names listed by `whisper.available_models()`, or path to a model checkpoint containing the model dimensions and the model state_dict. device : Union[str, torch.device] the PyTorch device to put the model into download_root: str path to download the model files; by default, it uses "~/.cache/whisper" in_memory: bool whether to preload the model weights into host memory Returns ------- model : Whisper The Whisper ASR model instance """ if device is None: device = "cuda" if torch.cuda.is_available() else "cpu" if download_root is None: default = os.path.join(os.path.dirname(os.path.abspath(__file__)), "models") download_root = os.path.join(default, "whisper") if name in _MODELS: checkpoint_file = _download(_MODELS[name], download_root, in_memory) alignment_heads = _ALIGNMENT_HEADS[name] elif os.path.isfile(name): checkpoint_file = open(name, "rb").read() if in_memory else name alignment_heads = None else: raise RuntimeError( f"Model {name} not found; available models = {available_models()}" ) with ( io.BytesIO(checkpoint_file) if in_memory else open(checkpoint_file, "rb") ) as fp: checkpoint = torch.load(fp, map_location=device) del checkpoint_file dims = ModelDimensions(**checkpoint["dims"]) model = Whisper(dims) model.load_state_dict(checkpoint["model_state_dict"]) if alignment_heads is not None: model.set_alignment_heads(alignment_heads) return model.to(device) " — Reply to this email directly, view it on GitHub <#63 (comment)>, or unsubscribe <https://github.com/notifications/unsubscribe-auth/AT3UJLFUILJ2QF5WW2AQVVDYM2DWTAVCNFSM6AAAAAAQTZ4VH2VHI2DSMVQWIX3LMV43SRDJONRXK43TNFXW4Q3PNVWWK3TUHM4DAMJRGU3TM> . You are receiving this because you commented.Message ID: ***@***.***>

1 reply

Aikon404 Jan 4, 2024

no sorry, I only have the models that we got from openai

rokayabencheikh · 2024-01-04T14:57:28Z

rokayabencheikh
Jan 4, 2024

Ok, thanks anyway.

…

On Thu, Jan 4, 2024 at 2:47 PM Aikon404 ***@***.***> wrote: no sorry, I only have the models that we got from openai — Reply to this email directly, view it on GitHub <#63 (reply in thread)>, or unsubscribe <https://github.com/notifications/unsubscribe-auth/AT3UJLBRLWGK34SUCMTNC6TYM2XHRAVCNFSM6AAAAAAQTZ4VH2VHI2DSMVQWIX3LMV43SRDJONRXK43TNFXW4Q3PNVWWK3TUHM4DAMJSHE2DE> . You are receiving this because you commented.Message ID: ***@***.***>

0 replies

aminsalahhossny · 2024-11-06T22:19:02Z

aminsalahhossny
Nov 6, 2024

checkpoint = torch.load(fp, map_location=device)

0 replies

Download Model #63

Replies: 12 comments · 28 replies

jongwook Oct 17, 2022 Maintainer

base85-encoded (n_layers, n_heads) boolean arrays indicating the cross-attention heads that are

highly correlated to the word-level timing, i.e. the alignment between audio and text tokens.

Replies: 12 comments 28 replies

jongwook Oct 17, 2022
Maintainer