This prototype demonstrates the potential of local AI models for speech-to-text transcription, offering a cost-effective and privacy-friendly solution. Running directly in the browser, it eliminates the need for complicated setups or expensive services. However, transcription can be slow when using larger models.
Transcribe is based on Whisper Web, built with Transformers.js, using ONNX Whisper models from Hugging Face. Whisper is an open-source speech recognition model developed by OpenAI.
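To illustrate how this works under the hood, here is a minimal sketch of running an ONNX Whisper model through the Transformers.js pipeline API. This is not the app's actual code; the package import, the audio URL, and the chunk length are illustrative assumptions.

```ts
import { pipeline } from "@huggingface/transformers";

// Create a speech recognition pipeline backed by an ONNX Whisper model.
// The model files are downloaded from Hugging Face and cached by the browser.
const transcriber = await pipeline(
  "automatic-speech-recognition",
  "onnx-community/whisper-tiny"
);

// Transcribe audio from a URL; longer recordings are processed in 30-second chunks.
const output = await transcriber("https://example.com/audio.wav", {
  chunk_length_s: 30,
});

console.log(output);
```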
Live Demo: https://stekhn.github.io/transcribe/
- Clone the repository: `git clone git@github.com:stekhn/transcribe.git`
- Install dependencies: `npm install`
- Start the development server: `npm run dev`
- Build the website: `npm run build`
The project requires Node.js to run locally. The development server runs on http://localhost:5173/transcribe/.
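The port and the `/transcribe/` sub-path suggest a Vite development server with a custom base path. Below is a sketch of what that configuration typically looks like, assuming the project uses Vite; the actual `vite.config.ts` may differ.

```ts
import { defineConfig } from "vite";

// Serve and build the app under the /transcribe/ sub-path,
// matching the GitHub Pages URL of the repository.
export default defineConfig({
  base: "/transcribe/",
});
```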
Firefox users might need to set `dom.workers.modules.enabled` to `true` in `about:config` to enable Web Workers. Check out this issue for more details.
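That setting controls module Web Workers, so the app presumably loads its worker as an ES module. A minimal sketch of the pattern, with a hypothetical worker file name and message shape:

```ts
// Main thread: spawn the transcription worker as an ES module worker,
// which is what the Firefox setting above enables.
const worker = new Worker(new URL("./worker.ts", import.meta.url), {
  type: "module",
});

// Send raw audio samples to the worker and log the transcript it returns.
worker.postMessage({ audio: new Float32Array(16000) });
worker.onmessage = (event: MessageEvent<{ text: string }>) => {
  console.log(event.data.text);
};
```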
Configure the most important settings in the `./src/config.ts` file.
Update the list of available Whisper models and the default model:
export const DEFAULT_MODEL = "onnx-community/whisper-tiny";
export const MODELS: { [key: string]: number } = {
  "onnx-community/whisper-tiny": 120,
  "onnx-community/whisper-base": 206,
  "onnx-community/whisper-small": 586,
};
The numeric value is the size of the model in megabytes. Models must be provided as ONNX files. You can find suitable ONNX Whisper models on Hugging Face. Optimum is a useful tool for converting models to ONNX, and the ONNX community provides tutorials on creating ONNX models from various machine learning frameworks.
Small warning: Using very large models (> 500 MB) will likely lead to memory issues.
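As an example, registering another converted model only requires extending the record and, optionally, changing the default. The model ID and size below are hypothetical placeholders; check the actual size of the ONNX files on Hugging Face.

```ts
export const DEFAULT_MODEL = "your-namespace/whisper-base-german-onnx";

export const MODELS: { [key: string]: number } = {
  "onnx-community/whisper-tiny": 120,
  "onnx-community/whisper-base": 206,
  "onnx-community/whisper-small": 586,
  // Hypothetical fine-tuned checkpoint, converted to ONNX, ~210 MB download.
  "your-namespace/whisper-base-german-onnx": 210,
};
```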
Update the list of Whisper languages and the default language:
export const DEFAULT_LANGUAGE = "en";
export const LANGUAGES: { [key: string]: string } = {
  en: "english",
  fr: "french",
  de: "german",
  es: "spanish",
};
See the full list of languages supported by Whisper. Note, however, that less widely spoken languages are not well supported by the smaller Whisper models, resulting in poor speech recognition quality. For those languages, or if performance is key, you might want to look into training your own Distil-Whisper model.
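For orientation, the selected language ends up as an option on the pipeline call. A hedged sketch that wires the config constants into the Transformers.js pipeline shown earlier; the import path and audio URL are illustrative:

```ts
import { pipeline } from "@huggingface/transformers";
import { DEFAULT_LANGUAGE, DEFAULT_MODEL, LANGUAGES } from "./config";

const transcriber = await pipeline("automatic-speech-recognition", DEFAULT_MODEL);

// Transformers.js accepts the full language name, i.e. the value stored in LANGUAGES.
const output = await transcriber("https://example.com/audio.wav", {
  language: LANGUAGES[DEFAULT_LANGUAGE], // "english"
  task: "transcribe",
});

console.log(output);
```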
Create a production build of the web application: `npm run build`

Add the build folder `./dist` to Git: `git add dist -f`

Create a commit: `git commit -m "Add build"`

Push the local changes to GitHub: `git subtree push --prefix dist origin gh-pages`