Add secret setup step to WhisperX tutorial #29

Merged 2 commits on Sep 6, 2023
50 changes: 46 additions & 4 deletions advanced/whisperx/README.md
Similar to the Whisper JAX example, if you are running locally, we recommend you ...

Usually, when you run different AI models, they require specific dependencies that sometimes conflict with each other. This is particularly true in the whisper case: from `requirements.txt`, you may notice that there are quite a few specific version requirements.

This is where having a separate service like Lepton becomes super useful: we can create a python environment (using e.g. conda or virtualenv), install the required dependencies, run the photon as a web service, and then, from our regular python environment, simply call the web service as if we were using a regular python function (sketched after the list below). Compared to some obvious alternatives:

- unlike a single python environment, we don't need to resolve version conflicts between different algorithms;
- unlike packing everything into a separate opaque container image, we stay much more lightweight: only a python environment and its dependencies are needed.
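
To make the calling pattern concrete, here is a minimal sketch of what the caller's side looks like, assuming a photon is already serving locally (the full walkthrough follows below):

```python
# Your everyday python environment: none of whisperx's pinned dependencies
# are needed here. The heavy model runs behind the photon web service,
# and we call it as if it were a regular python function.
from leptonai.client import Client, local

c = Client(local())  # connect to the photon served on localhost
segments = c.run(filename="assets/thequickbrownfox.wav")
```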

## Prerequisite

Note that one of the dependencies relies on three Hugging Face Hub models that require you to accept their terms of use beforehand; otherwise it will throw an error. Simply proceed to the pages for [Segmentation](https://huggingface.co/pyannote/segmentation), [Voice Activity Detection (VAD)](https://huggingface.co/pyannote/voice-activity-detection), and [Speaker Diarization](https://huggingface.co/pyannote/speaker-diarization) and accept the terms.

![Pyannote Model Term Agreement](assets/pyannote.png)

You will also need a Hugging Face access token at hand. Simply follow the steps in the [official guide](https://huggingface.co/docs/hub/security-tokens).
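
If you want to sanity-check the token before going further, you can query the Hub with it; a small optional sketch, assuming the `huggingface_hub` package (typically pulled in by this demo's dependencies) is available:

```python
from huggingface_hub import HfApi

# whoami() raises an error for an invalid token; otherwise it returns
# information about the account the token belongs to.
info = HfApi(token="replace-with-your-own-token").whoami()
print(f"Token OK, authenticated as: {info['name']}")
```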

## Running with a custom environment

We recommend you use conda or virtualenv to start a whisper-specific environment. For example, if you use conda, it's easy to do:

```shell
# pick a python version of your favorite
conda create -n whisperx python=3.10
conda activate whisperx
```

After that, install lepton [per the installation instruction](https://www.lepton.ai/docs/overview/quickstart#1-installation), and install the required dependencies of this demo via:

```shell
pip install -r requirements.txt
```

After this, you can launch whisperx like:

```shell
# Set your huggingface token. This is required to obtain the respective models.
export HUGGING_FACE_HUB_TOKEN="replace-with-your-own-token"
python main.py
```
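
Optionally, if you launch the server from a script, you may want to fail fast when the token is missing, since the pyannote models above cannot be fetched without it; a minimal sketch (not part of `main.py`):

```python
import os
import sys

# Check the token before starting the server, so a missing token produces
# a clear message instead of a download error deep inside model loading.
if not os.environ.get("HUGGING_FACE_HUB_TOKEN"):
    sys.exit("HUGGING_FACE_HUB_TOKEN is not set; export it before running main.py")
```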

Running `main.py` will download the model parameters and start the server. After that, use the regular python client to access the model:

```python
from leptonai.client import Client, local
c = Client(local())
```

and invoke transcription or translation as follows:

```python
>> c.run(filename="assets/thequickbrownfox.wav")
[{'start': 0.028,
  ...
```
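
Each returned segment is a dict of timestamps and text. If you just want a plain transcript, you can join the segments; a short sketch, assuming each segment carries a `'text'` key as in typical WhisperX output (the output above is truncated):

```python
from leptonai.client import Client, local

c = Client(local())
segments = c.run(filename="assets/thequickbrownfox.wav")

# Join the per-segment texts into one transcript string.
# The 'text' key is assumed from typical WhisperX segment output.
transcript = " ".join(seg["text"].strip() for seg in segments)
print(transcript)
```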

## Running with Lepton

The above example runs on the local machine. If your machine does not have a public-facing IP, or, more commonly, you want a stable server environment to host your model, then running on the Lepton cloud platform is the best option. To run it on Lepton, you can simply create a photon and push it to the cloud.

For the Hugging Face Hub API access to function properly, we also need the token available as an environment variable in the cloud. To do so, simply run the following command to store it as a [secret](https://www.lepton.ai/docs/advanced/env_n_secrets):

```shell
lep secret create -n HUGGING_FACE_HUB_TOKEN -v VALUE_OF_YOUR_TOKEN
```

You can run the following command to confirm that the secret is stored properly:

```shell
lep secret list
```

which should return something like the following:

```txt
Secrets
┏━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━┓
┃ ID ┃ Value ┃
┡━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━┩
│ HUGGING_FACE_HUB_TOKEN │ (hidden) │
└────────────────────────┴──────────┘
```

Now you can proceed to photon creation and deployment by running the following commands:

```shell
lep login
lep photon create -n whisperx -m main.py
lep photon push -n whisperx
# An A10 machine is usually big enough to run the large-v2 model.
# note you need to specify the secret that needs to be available in the run
lep photon run -n whisperx --resource-shape gpu.a10 --secret HUGGING_FACE_HUB_TOKEN
```
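
If you deploy often, you can script these steps; a sketch (a hypothetical helper, not part of this repo) that drives the documented `lep` CLI via subprocess, assuming you have already authenticated with `lep login`:

```python
import subprocess

def deploy_whisperx() -> None:
    """Create, push, and run the whisperx photon via the lep CLI."""
    steps = [
        ["lep", "photon", "create", "-n", "whisperx", "-m", "main.py"],
        ["lep", "photon", "push", "-n", "whisperx"],
        ["lep", "photon", "run", "-n", "whisperx",
         "--resource-shape", "gpu.a10",
         "--secret", "HUGGING_FACE_HUB_TOKEN"],
    ]
    for cmd in steps:
        subprocess.run(cmd, check=True)  # raise if any step fails

if __name__ == "__main__":
    deploy_whisperx()
```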

After that, you can use `lep deployment status` to obtain the public address of the photon, and then connect to it with the python client as described below:

```shell
>> lep deployment status -n whisperx
Created at: 2023-08-09 20:24:48
...
Replicas List:
...
```

To access the model, we can create a client similar to the local case, simply replacing `local()` with the workspace, deployment name, and token. Also, since we are now running remotely, we will need to upload the audio files. This is done by calling the `run_upload` path:

```python
>> from leptonai.client import Client
>> from leptonai.photon import FileParam
...
```
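
Since the diff view truncates the remote example above, here is a minimal end-to-end sketch. The workspace id, deployment name, and token are placeholders, and the keyword accepted by the `run_upload` path (assumed to be `upload_file` here) should be checked against the handler defined in `main.py`:

```python
from leptonai.client import Client
from leptonai.photon import FileParam

# Replace with your own workspace id, deployment name, and token.
c = Client("my-workspace", "whisperx", token="my-lepton-token")

# Upload the local audio file with the request; the keyword name is an
# assumption - match it to the run_upload signature in main.py.
with open("assets/thequickbrownfox.wav", "rb") as f:
    segments = c.run_upload(upload_file=FileParam(f))
print(segments)
```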

Unlike local deployment, running on the Lepton cloud platform comes with a series of advantages, especially in the whisperx case:

- You do not need to worry about a reproducible software environment. The photon is guaranteed to run in the same environment as the one in which you created it.
- Scaling is easier - you can simply increase the number of replicas if you need more capacity.
- Automatic fault tolerance - if the photon crashes, it will be automatically restarted.
Binary file added advanced/whisperx/assets/pyannote.png