Project for the course DT2112, VT24: "Detecting and correcting speech errors produced on LibriVox audiobooks".
We used a wav2vec2 model combined with a 5-gram model provided to us by our supervisor. The Kaggle notebooks for the project:
- wav2vec2 model: https://www.kaggle.com/code/julwan/dubliners-wav2vec2-xls-r-300m-timit-phoneme
- Word error rate (WER): https://www.kaggle.com/code/julwan/wer-dubliners
- Distance, alignment and confusion matrix over characters and phones between the two different accents: https://www.kaggle.com/code/julwan/distance-alignment-confusion-matrix-dubliners
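
For readers unfamiliar with the metrics named in the notebooks, here is a minimal sketch of how an edit-distance alignment can yield both an error rate and a substitution confusion count. It is not the code from the notebooks: the function names `align`, `error_rate` and `confusion` and the example sentences are hypothetical, and it assumes plain Levenshtein alignment with unit costs. The same functions work on words, characters or phones, since they only compare list items.

```python
from collections import Counter


def align(ref, hyp):
    """Levenshtein-align two token lists (assumed unit costs).

    Returns (ref_token, hyp_token) pairs; None marks a deletion or insertion.
    """
    n, m = len(ref), len(hyp)
    # dist[i][j] = edit distance between ref[:i] and hyp[:j]
    dist = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        dist[i][0] = i
    for j in range(1, m + 1):
        dist[0][j] = j
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            cost = 0 if ref[i - 1] == hyp[j - 1] else 1
            dist[i][j] = min(
                dist[i - 1][j - 1] + cost,  # match or substitution
                dist[i - 1][j] + 1,         # deletion from ref
                dist[i][j - 1] + 1,         # insertion into hyp
            )
    # Trace back from the bottom-right corner to recover one optimal alignment.
    pairs = []
    i, j = n, m
    while i > 0 or j > 0:
        if i > 0 and j > 0 and dist[i][j] == dist[i - 1][j - 1] + (ref[i - 1] != hyp[j - 1]):
            pairs.append((ref[i - 1], hyp[j - 1]))
            i, j = i - 1, j - 1
        elif i > 0 and dist[i][j] == dist[i - 1][j] + 1:
            pairs.append((ref[i - 1], None))
            i -= 1
        else:
            pairs.append((None, hyp[j - 1]))
            j -= 1
    pairs.reverse()
    return pairs


def error_rate(ref, hyp):
    """(substitutions + deletions + insertions) / len(ref); WER when tokens are words."""
    if not ref:
        raise ValueError("reference must not be empty")
    errors = sum(1 for r, h in align(ref, hyp) if r != h)
    return errors / len(ref)


def confusion(pairs):
    """Count substitutions only: (reference token, hypothesis token) -> frequency."""
    return Counter((r, h) for r, h in pairs if r is not None and h is not None and r != h)


if __name__ == "__main__":
    # Hypothetical example sentences, not taken from the project data.
    ref = "the cat sat on the mat".split()
    hyp = "the cat sit on mat".split()
    print(error_rate(ref, hyp))
    print(confusion(align(ref, hyp)))
```

Passing `list("some text")` instead of a word list gives a character-level rate, and passing lists of phone symbols gives a phone-level one.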
The report can be found at https://www.overleaf.com/read/qhtvzkrjthms#83b9f5
Thanks to our supervisor Jim O'Regan for guiding us through the project and providing the materials we needed to work on it.
- Julia Wang
- Kevin Wenström
- Peter Cady