From 06c54b5c993697c879c74c773035599a1aaeff8a Mon Sep 17 00:00:00 2001
From: Nikita Savelyev
Date: Thu, 25 Jan 2024 10:51:49 +0100
Subject: [PATCH] Tweaked model loading text

---
 notebooks/267-distil-whisper-asr/267-distil-whisper-asr.ipynb | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/notebooks/267-distil-whisper-asr/267-distil-whisper-asr.ipynb b/notebooks/267-distil-whisper-asr/267-distil-whisper-asr.ipynb
index cb68e5bdedf..d8b003d9d73 100644
--- a/notebooks/267-distil-whisper-asr/267-distil-whisper-asr.ipynb
+++ b/notebooks/267-distil-whisper-asr/267-distil-whisper-asr.ipynb
@@ -70,7 +70,9 @@
     "## Load PyTorch model\n",
     "[back to top ⬆️](#Table-of-contents:)\n",
     "\n",
-    "The `AutoModelForSpeechSeq2Seq.from_pretrained` method is used for the initialization of PyTorch Whisper model using the transformers library. We will use the `distil-whisper/distil-large-v2` model as an example in this tutorial. The model will be downloaded once during first run and this process may require some time. More details about this model can be found in [model_card](https://huggingface.co/distil-whisper/distil-large-v2).\n",
+    "The `AutoModelForSpeechSeq2Seq.from_pretrained` method is used to initialize the PyTorch Whisper model using the `transformers` library. By default, this tutorial uses the `distil-whisper/distil-large-v2` model. The model is downloaded once during the first run, which may take some time.\n",
+    "\n",
+    "You may also choose other models from the [Distil-Whisper Hugging Face collection](https://huggingface.co/collections/distil-whisper/distil-whisper-models-65411987e6727569748d2eb6), such as `distil-whisper/distil-medium.en` or `distil-whisper/distil-small.en`. Models of the original Whisper architecture are also available; see [here](https://huggingface.co/openai) for more details.\n",
     "\n",
     "Preprocessing and post-processing are important in this model use. `AutoProcessor` class used for initialization `WhisperProcessor` is responsible for preparing audio input data for the model, converting it to Mel-spectrogram and decoding predicted output token_ids into string using tokenizer."
    ]
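
For context, a minimal sketch of the loading step the patched text describes, assuming the `transformers` library is installed; `model_id` can be swapped for any of the other checkpoints mentioned above:

```python
# Minimal sketch of the model-loading step described in the notebook text.
# Assumes the `transformers` library is installed; the checkpoint is fetched
# from the Hugging Face Hub and cached locally on the first run.
from transformers import AutoModelForSpeechSeq2Seq, AutoProcessor

model_id = "distil-whisper/distil-large-v2"  # or e.g. "distil-whisper/distil-small.en"

# Initialize the PyTorch Whisper model (the first run triggers the download,
# which may take some time).
model = AutoModelForSpeechSeq2Seq.from_pretrained(model_id)

# The processor bundles the feature extractor (raw audio -> Mel-spectrogram)
# with the tokenizer (predicted token ids -> text).
processor = AutoProcessor.from_pretrained(model_id)
```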