GitHub - Nyralei/whisperx-api-server: FastAPI server for WhisperX transcription library

Overview

WhisperX API Server is a FastAPI-based server that transcribes audio files with the Whisper ASR (Automatic Speech Recognition) model through the WhisperX (https://github.com/m-bain/WhisperX) Python library. It exposes an OpenAI-like API: clients upload an audio file and receive the transcription in one of several output formats. Requests can be customized with options such as the model, language, and temperature.

Features

- Audio Transcription: Transcribe audio files using the Whisper ASR model.
- Model Caching: Load and cache models for reuse and faster performance.
- OpenAI-like API: Modeled on https://platform.openai.com/docs/api-reference/audio/createTranscription

API Endpoints

`POST /v1/audio/transcriptions`

This is the main endpoint for uploading audio files and receiving transcriptions.

Parameters:

- `file`: The audio file to transcribe.
- `model` (str): The Whisper model to use. Default is `config.whisper.model`.
- `language` (str): The language for transcription. Default is `config.default_language`.
- `prompt` (str): Optional transcription prompt.
- `response_format` (str): The format of the transcription output. Default is `json`.
- `temperature` (float): Temperature setting for transcription. Default is `0.0`.
- `timestamp_granularities` (list): Granularity of timestamps, either `segment` or `word`. Default is `["segment"]`. Currently doesn't work with OpenAI client libraries.
- `stream` (bool): Enable streaming mode for real-time transcription. Work in progress.
- `hotwords` (str): Optional hotwords for transcription.
- `suppress_numerals` (bool): Whether to suppress numerals in the transcription. Default is `True`.
- `highlight_words` (bool): Highlight words in the transcription output for formats like VTT and SRT.

Returns: Transcription results in the specified format.
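
As a rough illustration of calling this endpoint, the sketch below builds a multipart/form-data request using only the Python standard library. It assumes the server is reachable at `http://localhost:8000` (the README does not state a host or port) and that the parameters above are sent as ordinary form fields, as in the OpenAI API this endpoint is modeled on. The helper names `build_transcription_request` and `transcribe` are hypothetical.

```python
import os
import urllib.request
import uuid

# Assumption: the README does not state the host or port.
BASE_URL = "http://localhost:8000"


def build_transcription_request(filename, audio_bytes, base_url=BASE_URL, **fields):
    """Build (but do not send) a multipart POST to /v1/audio/transcriptions."""
    boundary = uuid.uuid4().hex
    parts = []
    for name, value in fields.items():
        if isinstance(value, bool):
            # Assumption: booleans are sent as lowercase strings in form fields.
            value = "true" if value else "false"
        parts.append(
            f'--{boundary}\r\nContent-Disposition: form-data; name="{name}"'
            f"\r\n\r\n{value}\r\n".encode()
        )
    # The audio itself goes in the "file" field.
    parts.append(
        (
            f'--{boundary}\r\nContent-Disposition: form-data; name="file"; '
            f'filename="{filename}"\r\n'
            "Content-Type: application/octet-stream\r\n\r\n"
        ).encode()
        + audio_bytes
        + b"\r\n"
    )
    parts.append(f"--{boundary}--\r\n".encode())
    return urllib.request.Request(
        f"{base_url}/v1/audio/transcriptions",
        data=b"".join(parts),
        headers={"Content-Type": f"multipart/form-data; boundary={boundary}"},
        method="POST",
    )


def transcribe(path, **fields):
    """Send the file at `path` and return the raw response body as text."""
    with open(path, "rb") as f:
        req = build_transcription_request(os.path.basename(path), f.read(), **fields)
    with urllib.request.urlopen(req) as resp:
        return resp.read().decode("utf-8")
```

A call such as `transcribe("meeting.wav", language="en", response_format="srt")` would then return the response body as a string, provided a server is running and accepts these field names.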

`GET /healthcheck`

Returns the current health status of the API server.
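
One possible use of this endpoint is waiting for the server to come up before sending audio, since loading models can take a while. The sketch below polls `/healthcheck` until it answers with HTTP 200. The base URL, the retry settings, and the helper names `http_probe` and `wait_until_healthy` are assumptions, and treating status 200 as "healthy" is a guess rather than documented behavior.

```python
import time
import urllib.error
import urllib.request


def http_probe(url, timeout=2.0):
    """Return True if a GET to `url` answers with HTTP 200."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return resp.status == 200
    except (urllib.error.URLError, OSError):
        # Connection refused, timeout, or a non-2xx status all count as "not ready".
        return False


def wait_until_healthy(base_url="http://localhost:8000", attempts=30, delay=1.0,
                       probe=http_probe, sleep=time.sleep):
    """Poll GET /healthcheck up to `attempts` times; return True once it succeeds."""
    for _ in range(attempts):
        if probe(f"{base_url}/healthcheck"):
            return True
        sleep(delay)
    return False
```

The `probe` and `sleep` parameters exist only so the loop can be exercised without a running server.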

`GET /models/list`

Lists all loaded models currently available on the server.
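
A minimal sketch of querying this endpoint from Python follows. It assumes the server listens on `http://localhost:8000` and that the response body is JSON. The README does not document the response schema, so the parsed value is returned unchanged. The function name `list_models` is hypothetical.

```python
import json
import urllib.request


def list_models(base_url="http://localhost:8000", opener=urllib.request.urlopen):
    """GET /models/list and return the decoded body (assumed to be JSON)."""
    with opener(f"{base_url}/models/list") as resp:
        return json.loads(resp.read().decode("utf-8"))
```

The `opener` argument defaults to `urllib.request.urlopen` and can be swapped for a stub when no server is available.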

`POST /models/unload`

Unloads a specific model from the in-memory cache.

`POST /models/load`

Loads a specified model into memory.

Running the API

With Docker:

For CPU:

    docker compose build whisperx-api-server-cpu

    docker compose up whisperx-api-server-cpu

For CUDA (GPU):

    docker compose build whisperx-api-server-cuda

    docker compose up whisperx-api-server-cuda

Contributing

Feel free to submit issues, fork the repository, and send pull requests to contribute to the project.

License

This project is licensed under the GNU General Public License, version 3. See the LICENSE file for details.

