Training and development

Col·lectivaT edited this page Nov 28, 2018 · 8 revisions

In this section we explain how to install the training and development tools for CMUSphinx. Unlike casual use of pocketsphinx, advanced tasks such as training and adapting acoustic models require downloading and compiling the CMUSphinx code base.

Installation

For the following steps, we need to download and compile three CMUSphinx tools: sphinxbase, sphinxtrain and pocketsphinx. The most up-to-date versions are on GitHub, so simply clone the three repositories.

git clone https://github.com/cmusphinx/sphinxbase
git clone https://github.com/cmusphinx/sphinxtrain
git clone https://github.com/cmusphinx/pocketsphinx

Debian Linux

First, make sure that the requirements are installed:

sudo apt-get install pkg-config autoconf make automake libtool bison python3-dev

Then we compile the tools step by step, in the order sphinxbase, sphinxtrain, pocketsphinx. The order only matters for sphinxbase, which must come first.

The compilation commands below perform a local installation, hence the use of prefixes. Otherwise, omit the prefix and do a system-wide install as a superuser. Run each group of commands from inside the corresponding cloned directory.

sphinxbase

./autogen.sh --prefix=$HOME/sphinx/local
make
make check
make install

sphinxtrain

./autogen.sh --prefix=$HOME/sphinx/local --with-sphinxbase=<your-sphinxbase-dir>
make
make check
make install

pocketsphinx

./autogen.sh --prefix=$HOME/sphinx/local --with-sphinxbase=<your-sphinxbase-dir>
make
make check
make install
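The three builds above can also be chained in one script. The following is only a sketch: the names SRC, PREFIX, DRY_RUN, run, build and build_all are made up for this example, and it assumes that the three repositories were cloned side by side into one directory. By default it only prints the commands it would run.

```shell
#!/bin/sh
# Sketch: build the three clones in dependency order into one local prefix.
# SRC, PREFIX and DRY_RUN are names chosen for this example only.
SRC="${SRC:-$PWD}"                      # assumed: directory containing the three clones
PREFIX="${PREFIX:-$HOME/sphinx/local}"  # same prefix as in the commands above
DRY_RUN="${DRY_RUN:-1}"                 # 1 = only print the commands, 0 = really build

run() {
    # Print the command, and execute it unless this is a dry run.
    echo "+ $*"
    if [ "$DRY_RUN" = "0" ]; then "$@"; fi
}

build() {
    # $1 = clone directory name, remaining arguments = extra autogen.sh options
    name="$1"; shift
    run cd "$SRC/$name" &&
    run ./autogen.sh --prefix="$PREFIX" "$@" &&
    run make &&
    run make check &&
    run make install
}

build_all() {
    # sphinxbase must come first; the other two are pointed at its source tree.
    build sphinxbase &&
    build sphinxtrain  --with-sphinxbase="$SRC/sphinxbase" &&
    build pocketsphinx --with-sphinxbase="$SRC/sphinxbase"
}

build_all
```

Run it once as is to review the printed commands, then again with DRY_RUN=0 to perform the build. Because the steps are joined with &&, a failing step stops everything after it.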

If you compiled them with a prefix, then as a final step you need to add these paths to your ~/.profile. Use a text editor to add the following lines to ~/.profile, and don't forget to run source ~/.profile in your active terminal session.

export PATH=$HOME/sphinx/local/bin:$PATH
export LD_LIBRARY_PATH=$HOME/sphinx/local/lib:$LD_LIBRARY_PATH
export PKG_CONFIG_PATH=$HOME/sphinx/local/lib/pkgconfig:$PKG_CONFIG_PATH
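Since ~/.profile may be sourced more than once, a small helper can keep these variables free of duplicate entries and of a trailing colon when a variable was empty to begin with. This is a minimal sketch, not part of CMUSphinx: path_prepend and SPHINX_PREFIX are names invented for this example, and the prefix is assumed to be $HOME/sphinx/local.

```shell
# Sketch: prepend a directory to a colon-separated variable, but only once.
# path_prepend and SPHINX_PREFIX are names made up for this example.
path_prepend() {
    # $1 = variable name (e.g. PATH), $2 = directory to put in front
    eval "current=\${$1:-}"
    case ":$current:" in
        *":$2:"*) ;;   # already listed: leave the variable unchanged
        *) eval "export $1=\"\$2\${current:+:\$current}\"" ;;
    esac
}

SPHINX_PREFIX="$HOME/sphinx/local"   # assumed install prefix
path_prepend PATH            "$SPHINX_PREFIX/bin"
path_prepend LD_LIBRARY_PATH "$SPHINX_PREFIX/lib"
path_prepend PKG_CONFIG_PATH "$SPHINX_PREFIX/lib/pkgconfig"
```

Sourcing the file a second time leaves all three variables as they were.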

Mac OSX

There is an easy way to download and compile the code base on Mac OSX: all of the compilation tasks explained above are encapsulated in a custom brew. For an in-depth explanation and the scripts themselves, please go to the repository of the user watsonbox.


Adaptation

One of the more interesting capabilities of speech tools is the ability to create models for specific situations or speakers. Starting from a generic acoustic model, we can adapt it to a specific user's voice, a sound condition (such as a consistently noisy background), or specific recording hardware. The CMUSphinx website has a very in-depth tutorial, and you can follow its steps easily using our models and additional resources, which include audio files for training, their transcripts, and the path reference files, ready for execution.

To follow the steps of the tutorial you will need these executables from your CMUSphinx installation: sphinx_fe for feature extraction; pocketsphinx_mdef_convert in case your mdef (model definition) file is in binary format; and bw to collect statistics from the adaptation data.
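Before starting, it can help to check that these executables can actually be found. The sketch below looks on the PATH first and then in libexec/sphinxtrain under the install prefix. That second location is an assumption about where sphinxtrain places helper programs such as bw, so adjust it if your installation differs. find_tool and SPHINX_PREFIX are names invented for this example.

```shell
# Sketch: check that the executables needed for adaptation can be found.
# find_tool and SPHINX_PREFIX are illustrative names; libexec/sphinxtrain
# is an assumed location for the sphinxtrain helper programs.
SPHINX_PREFIX="${SPHINX_PREFIX:-$HOME/sphinx/local}"

find_tool() {
    # Print the full path of tool $1, or return 1 if it cannot be found.
    if command -v "$1" >/dev/null 2>&1; then
        command -v "$1"
    elif [ -x "$SPHINX_PREFIX/libexec/sphinxtrain/$1" ]; then
        echo "$SPHINX_PREFIX/libexec/sphinxtrain/$1"
    else
        return 1
    fi
}

for tool in sphinx_fe pocketsphinx_mdef_convert bw; do
    if location=$(find_tool "$tool"); then
        echo "found    $tool -> $location"
    else
        echo "MISSING  $tool"
    fi
done
```

A tool reported as MISSING usually means that the corresponding package was not installed, or that the PATH changes were not sourced in the current terminal session.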

The actual adaptation tools are mllr_solve and map_adapt, named after the methods they implement: Maximum Likelihood Linear Regression (MLLR) and Maximum a Posteriori (MAP) estimation. MLLR creates a transformation file that is applied to the model during decoding at run time, whereas MAP actually modifies the parameters in the acoustic model. Quoting the CMUSphinx tutorial:

"MLLR is a cheap adaptation method that is suitable when the amount of data is limited. It’s a good idea to use MLLR for online adaptation. MLLR works best for a continuous model. Its effect for semi-continuous models is very limited since semi-continuous models mostly rely on mixture weights. If you want the best accuracy you can combine MLLR adaptation with MAP adaptation below. On the other hand, because MAP requires a lot of adaptation data it is not really practical to use it for continuous models. For continuous models MLLR is more reasonable."
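The difference between the two methods shows up in what each tool writes. The sketch below prints one possible invocation of each, to be run after bw has collected its statistics. The option names are given as we recall them from the CMUSphinx adaptation tutorial, so check them against the tutorial before relying on them. MODEL, ADAPTED, ACCUM, TS2CB and DRY_RUN are placeholders invented for this example, and the directory names are hypothetical.

```shell
# Sketch: the two adaptation routes, once bw has written its statistics.
# All variable names and directory names here are placeholders.
MODEL="${MODEL:-model}"              # directory of the generic acoustic model
ADAPTED="${ADAPTED:-model-adapt}"    # copy of MODEL that MAP will overwrite
ACCUM="${ACCUM:-.}"                  # directory holding the bw output
TS2CB="${TS2CB:-.cont.}"             # .cont., .semi. or .ptm., depending on the model type
DRY_RUN="${DRY_RUN:-1}"              # 1 = only print the commands, 0 = run them

run() {
    # Print the command, and execute it unless this is a dry run.
    echo "+ $*"
    if [ "$DRY_RUN" = "0" ]; then "$@"; fi
}

adapt_mllr() {
    # MLLR: writes a transformation file and leaves the model untouched.
    run mllr_solve -meanfn "$MODEL/means" -varfn "$MODEL/variances" \
        -outmllrfn mllr_matrix -accumdir "$ACCUM"
}

adapt_map() {
    # MAP: writes updated parameters into the adapted model directory.
    run map_adapt -moddeffn "$MODEL/mdef.txt" -ts2cbfn "$TS2CB" \
        -meanfn "$MODEL/means" -varfn "$MODEL/variances" \
        -mixwfn "$MODEL/mixture_weights" -tmatfn "$MODEL/transition_matrices" \
        -accumdir "$ACCUM" \
        -mapmeanfn "$ADAPTED/means" -mapvarfn "$ADAPTED/variances" \
        -mapmixwfn "$ADAPTED/mixture_weights" -maptmatfn "$ADAPTED/transition_matrices"
}

adapt_mllr
adapt_map
```

With MLLR, the decoder keeps using the original model and is given the resulting file through its -mllr option. With MAP, the decoder is pointed at the adapted model directory instead.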