Use cases

UC1: Get WER for big 4 providers using CLI and a built-in dataset.

A command line user views Word Error Rate scores for transcriptions produced by the big 4 STT providers using a built-in dataset. This can be used to compare providers over time.

Preconditions:

Access to big 4 APIs
Normalisation rules for ground truth per language
Normalised ground truth per language
Normalisation rules for transcripts
WER algorithm per language
UI access control (e.g. login page)
Good diffing tool (build or source)

Primary flow: Select language > Submit audio test files to 4 APIs > Normalise results > Calculate WER for each result > Output results by provider.
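
The "Calculate WER for each result > Output results by provider" steps could be sketched as below. This is a minimal illustration, assuming WER is the word-level edit distance divided by the number of reference words and that both texts are already normalised. The function names and the provider keys are hypothetical, not part of the framework.

```python
from typing import Dict


def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level Levenshtein distance divided by the reference word count."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")
    # prev[j] is the edit distance between the reference words seen so far and hyp[:j].
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr.append(min(
                prev[j] + 1,         # reference word missing from hypothesis (deletion)
                curr[j - 1] + 1,     # extra word in hypothesis (insertion)
                prev[j - 1] + cost,  # match or substitution
            ))
        prev = curr
    return prev[-1] / len(ref)


def wer_by_provider(reference: str, transcripts: Dict[str, str]) -> Dict[str, float]:
    """Score each provider's normalised transcript against the same ground truth."""
    return {name: word_error_rate(reference, text) for name, text in transcripts.items()}
```

For example, wer_by_provider("the cat sat", {"provider_a": "the cat sat"}) returns a score of 0.0 for the hypothetical key provider_a.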

UC2: Get WER for big 4 providers using web UI and a built-in dataset.

A web user views Word Error Rate scores for transcriptions produced by the big 4 STT providers using a built-in dataset. Same as UC1 but accessed through a web UI so that users don't have to configure their own access to the providers' APIs.

UC3: Get WER for any provider using CLI and a built-in dataset.

A developer creates a component for their chosen STT provider and is able to connect it to the framework using a documented interface. Same as UC1 with the addition of this provider.
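
One possible shape for the documented interface is an abstract base class that each provider component implements. The sketch below is an assumption for illustration only: SttProvider, transcribe, register, REGISTRY and EchoProvider are hypothetical names, and the real interface may differ.

```python
from abc import ABC, abstractmethod
from pathlib import Path
from typing import Dict


class SttProvider(ABC):
    """Hypothetical contract a provider component would implement."""

    name: str

    @abstractmethod
    def transcribe(self, audio_path: Path, language: str) -> str:
        """Return the raw (not yet normalised) transcript for one audio file."""


# Hypothetical registry the framework would iterate over when running a benchmark.
REGISTRY: Dict[str, SttProvider] = {}


def register(provider: SttProvider) -> None:
    """Make a provider available to the framework under its name."""
    REGISTRY[provider.name] = provider


class EchoProvider(SttProvider):
    """Stand-in provider for demonstration; returns a canned transcript."""

    name = "echo"

    def __init__(self, canned: str) -> None:
        self._canned = canned

    def transcribe(self, audio_path: Path, language: str) -> str:
        return self._canned
```

A developer would then call register(EchoProvider("hello world")) once, and the framework could treat the new provider like any of the built-in ones.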

UC4: Get WER for streams using CLI and a built-in dataset.

A user views WER for streams. Same as UC1-3, but the audio is sent to the providers as a stream. Results are converted to transcript files for WER analysis.

UC5: Get processing speed for files.

A user views how long the provider took to return the transcript for an audio file, expressed as the ratio of audio duration to processing time.
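
The ratio could be computed as in the following sketch. It assumes the audio duration is already known in seconds and that processing time is measured as wall-clock time around the provider call. The names timed_call and speed_factor are hypothetical.

```python
import time
from typing import Callable, Tuple


def timed_call(transcribe: Callable[[str], str], audio_path: str) -> Tuple[str, float]:
    """Run a transcription callable and return (transcript, elapsed seconds)."""
    start = time.perf_counter()
    transcript = transcribe(audio_path)
    return transcript, time.perf_counter() - start


def speed_factor(audio_duration_s: float, processing_time_s: float) -> float:
    """Ratio of audio duration to processing time; above 1 means faster than real time."""
    if processing_time_s <= 0:
        raise ValueError("processing time must be positive")
    return audio_duration_s / processing_time_s
```

Under this definition, a 60-second file transcribed in 15 seconds gives a ratio of 4.0.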

UC6: Get latency for streams.

A user views the average time the provider took to return each correct word (TBC: providers that support configurable latency).
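
Since the latency definition is still TBC, the following is only one possible reading. It assumes per-word latency is the gap between the moment a word ends in the streamed audio and the moment the provider returns that word, both measured in seconds from the start of the stream. The function name and its inputs are hypothetical.

```python
from statistics import mean
from typing import Sequence


def average_word_latency(word_end_times_s: Sequence[float],
                         word_received_times_s: Sequence[float]) -> float:
    """Mean delay between a word ending in the audio and the provider returning it."""
    if len(word_end_times_s) != len(word_received_times_s):
        raise ValueError("both sequences must describe the same words")
    if not word_end_times_s:
        raise ValueError("at least one word is required")
    return mean(received - ended
                for ended, received in zip(word_end_times_s, word_received_times_s))
```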

UC: Make test data for big 4

A user submits their own test data to benchmark the big 4 STT providers. Audio files are optimised for the providers and ground truth text files are normalised.

Preconditions:

Ground truth preparation guidelines
Audio file preparation guidelines
Normalisation rules for ground truth, per language

Primary flow: Submit audio test files to 4 APIs > Normalise results > Calculate WER for each result > Output results by provider.
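
Normalising a ground truth text file could look like the sketch below. The rules shown (Unicode NFKC, lowercasing, replacing punctuation other than apostrophes with spaces, collapsing whitespace) are assumptions for illustration. The actual per-language normalisation rules are a precondition of this use case and are not defined here.

```python
import re
import unicodedata


def normalise(text: str) -> str:
    """Apply a simple, language-agnostic normalisation (illustrative rules only)."""
    text = unicodedata.normalize("NFKC", text).lower()
    # Replace anything that is not a word character, whitespace or apostrophe with a space.
    text = re.sub(r"[^\w\s']", " ", text)
    # Collapse runs of whitespace into single spaces and trim the ends.
    return " ".join(text.split())
```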
