Previous work done in this space

EyalLavi edited this page Apr 24, 2019 · 16 revisions

Normalisation rules challenge: https://www.kaggle.com/headsortails/watch-your-language-update-feature-engineering/report

TextAV 2018

In one of the TextAV event's problem domains, a group focused on defining specs for an STT benchmarking tool.

Franck-Dernoncourt/ASR_benchmark

The Speech Recognition Benchmark is a program that assesses and compares the performance of automated speech recognition (ASR) APIs. It runs on Mac OS X, Microsoft Windows and Ubuntu. It currently supports the following ASR APIs: Amazon Lex, Google, Google Cloud, Houndify, IBM Watson, Microsoft (a.k.a. Bing), Speechmatics and Wit.

Picovoice/stt-benchmark

Made in Vancouver, Canada by Picovoice

This is a minimalist and extensible framework for benchmarking different speech-to-text engines. It has been developed and tested on Ubuntu with Python3.

Mozilla DeepSpeech

This issue in their repository would suggest the system has a WER component/functionality.

pietrop/STT-Services-comparator

Tried out a few word level diff libraries - as expected they all give slightly different results.

AssemblyAI Benchmarking script

Asked Dylan from AssemblyAI how they compute their WER - he said:

We do a couple of things to compare. When we have the ground truth, we look for the WER using this WER algorithm

We also normalize the text by lowercasing everything, removing all punctuation, and converting numbers to written form (eg, "7" -> "seven") because different engines return numbers differently. Some write them out (like us) and some transform them to symbol format.
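The normalisation steps Dylan describes can be sketched as below. This is an assumption of how such a pipeline might look, not AssemblyAI's actual code; the hand-rolled `number_to_words` only covers integers below 100 and is a stand-in for a proper number-to-words library.

```python
import re

ONES = ["zero", "one", "two", "three", "four", "five", "six", "seven",
        "eight", "nine", "ten", "eleven", "twelve", "thirteen", "fourteen",
        "fifteen", "sixteen", "seventeen", "eighteen", "nineteen"]
TENS = ["", "", "twenty", "thirty", "forty", "fifty", "sixty", "seventy",
        "eighty", "ninety"]

def number_to_words(n: int) -> str:
    """Spell out integers below 100; larger numbers are left as digits."""
    if n < 20:
        return ONES[n]
    if n < 100:
        tens, ones = divmod(n, 10)
        return TENS[tens] + ("-" + ONES[ones] if ones else "")
    return str(n)

def normalise(text: str) -> str:
    """Lowercase, convert digits to words, strip punctuation."""
    text = text.lower()
    # Convert standalone digit runs to words before stripping punctuation.
    text = re.sub(r"\d+", lambda m: number_to_words(int(m.group())), text)
    # Keep only letters, digits, whitespace and hyphens.
    text = re.sub(r"[^a-z0-9\s-]", "", text)
    return " ".join(text.split())

print(normalise("Call me at 7, OK?"))  # -> "call me at seven ok"
```

With both the reference and each engine's hypothesis passed through the same `normalise`, "7" and "seven" no longer count as an error against each other.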

WER algorithm - Word Error Rate Calculation
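For reference, the common textbook formulation of WER is the word-level Levenshtein (edit) distance between reference and hypothesis, divided by the number of reference words. This is a minimal sketch of that formulation, not necessarily the exact algorithm linked above:

```python
def wer(reference: str, hypothesis: str) -> float:
    """Word Error Rate: word-level edit distance / number of reference words."""
    ref = reference.split()
    hyp = hypothesis.split()
    # Dynamic-programming edit-distance table.
    d = [[0] * (len(hyp) + 1) for _ in range(len(ref) + 1)]
    for i in range(len(ref) + 1):
        d[i][0] = i
    for j in range(len(hyp) + 1):
        d[0][j] = j
    for i in range(1, len(ref) + 1):
        for j in range(1, len(hyp) + 1):
            cost = 0 if ref[i - 1] == hyp[j - 1] else 1
            d[i][j] = min(d[i - 1][j] + 1,         # deletion
                          d[i][j - 1] + 1,         # insertion
                          d[i - 1][j - 1] + cost)  # substitution
    return d[len(ref)][len(hyp)] / len(ref)

# One substitution ("sat" -> "sit") and one deletion ("the") over 6 words.
print(wer("the cat sat on the mat", "the cat sit on mat"))  # -> 2/6 ≈ 0.333
```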

BBC R&D STT benchmarking

Blog post about the BBC R&D STT benchmarking tool. There is interest in open-sourcing the scripts and sharing notes.

fast-levenshtein

Levenshtein distance is a string metric for measuring the difference between two sequences. wikipedia

Used in the context of calculating WER.

An efficient JavaScript implementation of the Levenshtein algorithm with locale-specific collator support.
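fast-levenshtein works at the character level; the same metric is easy to sketch in Python (shown here rather than JavaScript, to match the other examples on this page) using two rolling rows instead of a full matrix:

```python
def levenshtein(a: str, b: str) -> int:
    """Character-level Levenshtein distance between two strings."""
    prev = list(range(len(b) + 1))  # distances from "" to each prefix of b
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution
        prev = curr
    return prev[len(b)]

print(levenshtein("kitten", "sitting"))  # -> 3
```

Run at the character level it measures spelling differences; run over lists of words (as in the WER sketch above) it yields word-level error counts.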

difflib

A JavaScript module which provides classes and functions for comparing sequences. It can be used, for example, for comparing files, and can produce difference information in various formats, including context and unified diffs. Ported from Python's difflib module, which additionally produces HTML diffs; for comparing directories and files, see also Python's filecmp module.

google/diff-match-patch

Diff Match Patch is a high-performance library in multiple languages that manipulates plain text.

  • sclite: alignment engine used to "align" errorful hypothesised texts, such as output from an ASR system, to the correct reference texts. After alignment, sclite generates a variety of summary and detailed scoring reports. Bundled with the CMU-Cambridge Statistical Language Modeling Toolkit v2, which is used to compute word weights based on an N-gram language model.
  • sc_stats: compares performance between two or more systems. Inter-system comparisons are made by running paired-comparison statistical significance tests.
  • Rover: combines ASR system outputs into a composite Word Transition Network, which is then searched and scored to retrieve the best-scoring word sequence.

The Multi-Genre Broadcast (MGB) Challenge is an evaluation of speech recognition, speaker diarization, dialect detection and lightly supervised alignment using TV recordings in English and Arabic.

@bbc/react-transcript-editor

A React component to make correcting automated transcriptions of audio and video easier and faster. By BBC News Labs.
