A somewhat opinionated speech recognition library for the browser using a WebAssembly build of Vosk
This library picks up the work done by Denis Treskunov and packages an updated Vosk WebAssembly build as an easy-to-use browser library.
Note: WebAssembly builds can target NodeJS, the browser's main thread or web workers. This library explicitly compiles Vosk to be used in a WebWorker context. If you want to use Vosk in a NodeJS application it is recommended to use the official node bindings.
Checkout the demo running in-browser speech recognition of microphone input or audio files in 13 languages.
You can install vosk-browser as a module:
$ npm i vosk-browser
You can also use a CDN like jsdelivr to add the library to your page, which will be accessible via the global variable Vosk
:
<script type="application/javascript" src="https://cdn.jsdelivr.net/npm/[email protected]/dist/vosk.js"></script>
See the README in ./lib
for API reference documentation or check out the examples folder for some ways of using the library
One of the simplest examples that assumes vosk-browser
is loaded via a script
tag. It loads the model named model.tar.gz
located in the same path as the script and starts listening to the microphone. Recognition results are logged to the console.
async function init() {
const model = await Vosk.createModel('model.tar.gz');
const recognizer = new model.KaldiRecognizer();
recognizer.on("result", (message) => {
console.log(`Result: ${message.result.text}`);
});
recognizer.on("partialresult", (message) => {
console.log(`Partial result: ${message.result.partial}`);
});
const mediaStream = await navigator.mediaDevices.getUserMedia({
video: false,
audio: {
echoCancellation: true,
noiseSuppression: true,
channelCount: 1,
sampleRate: 16000
},
});
const audioContext = new AudioContext();
const recognizerNode = audioContext.createScriptProcessor(4096, 1, 1)
recognizerNode.onaudioprocess = (event) => {
try {
recognizer.acceptWaveform(event.inputBuffer)
} catch (error) {
console.error('acceptWaveform failed', error)
}
}
const source = audioContext.createMediaStreamSource(mediaStream);
source.connect(recognizerNode);
}
window.onload = init;
- Write tests
- Automate npm publish
- Automate demo publishing
- Example with speaker model
- Better documentation