Training of hidden Markov models as an instance of the expectation maximization algorithm
This repository contains my bachelor thesis. All content is available under CC BY-ND 4.0 International, except for the CC BY-ND license mark, which is licensed under CC BY 4.0 International, and is a trademark of Creative Commons.
The PDF of the submitted version is here. There are also some slides from a presentation of my work, but they are not nearly as self-explanatory since I'm supposed to be talking over them.
In Natural Language Processing (NLP), speech and text are parsed and generated with language models and parser models, and translated with translation models. Each model contains a set of numerical parameters which are found by applying a suitable training algorithm to a set of training data.
Many such training algorithms are instances of the Expectation-Maximization (EM) algorithm. In [BSV15], a generic EM algorithm for NLP is described. In this work, I present a particular speech model, the Hidden Markov model, and its standard training algorithm, the Baum-Welch algorithm. I show that the Baum-Welch algorithm is an instance of the generic EM algorithm introduced by [BSV15], from which it follows that all statements about the generic EM algorithm also apply to the Baum-Welch algorithm, especially its correctness and convergence properties.
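To give a concrete feel for what the thesis formalizes, here is a minimal, self-contained sketch (my own illustration, not code from the thesis) of one Baum-Welch iteration for a discrete HMM with two hidden states: the E-step computes expected state and transition counts via the forward-backward recursions, and the M-step re-estimates the parameters from those counts. The EM convergence property discussed above shows up as a guarantee that the observation likelihood never decreases.

```python
def forward(pi, A, B, obs):
    # alpha[t][i] = P(o_1..o_t, q_t = i) under the current parameters
    n = len(pi)
    alpha = [[pi[i] * B[i][obs[0]] for i in range(n)]]
    for t in range(1, len(obs)):
        alpha.append([sum(alpha[t - 1][j] * A[j][i] for j in range(n))
                      * B[i][obs[t]] for i in range(n)])
    return alpha

def backward(A, B, obs):
    # beta[t][i] = P(o_{t+1}..o_T | q_t = i)
    n, T = len(A), len(obs)
    beta = [[1.0] * n for _ in range(T)]
    for t in range(T - 2, -1, -1):
        for i in range(n):
            beta[t][i] = sum(A[i][j] * B[j][obs[t + 1]] * beta[t + 1][j]
                             for j in range(n))
    return beta

def likelihood(pi, A, B, obs):
    return sum(forward(pi, A, B, obs)[-1])

def baum_welch_step(pi, A, B, obs):
    # E-step: posterior state occupancies (gamma) and transitions (xi);
    # M-step: re-estimate pi, A, B as normalized expected counts.
    n, T, M = len(pi), len(obs), len(B[0])
    alpha, beta = forward(pi, A, B, obs), backward(A, B, obs)
    P = sum(alpha[-1])
    gamma = [[alpha[t][i] * beta[t][i] / P for i in range(n)]
             for t in range(T)]
    xi = [[[alpha[t][i] * A[i][j] * B[j][obs[t + 1]] * beta[t + 1][j] / P
            for j in range(n)] for i in range(n)] for t in range(T - 1)]
    new_pi = gamma[0][:]
    new_A = [[sum(xi[t][i][j] for t in range(T - 1))
              / sum(gamma[t][i] for t in range(T - 1))
              for j in range(n)] for i in range(n)]
    new_B = [[sum(gamma[t][i] for t in range(T) if obs[t] == k)
              / sum(gamma[t][i] for t in range(T))
              for k in range(M)] for i in range(n)]
    return new_pi, new_A, new_B

# Toy example: two states, two observation symbols (values chosen arbitrarily).
pi = [0.6, 0.4]
A = [[0.7, 0.3], [0.4, 0.6]]
B = [[0.5, 0.5], [0.1, 0.9]]
obs = [0, 1, 1, 0, 1]
before = likelihood(pi, A, B, obs)
pi, A, B = baum_welch_step(pi, A, B, obs)
after = likelihood(pi, A, B, obs)
# EM guarantee: after >= before
```

This sketch omits the scaling tricks needed for long observation sequences (the raw alpha and beta values underflow), which a real implementation would include.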