An example showing how to get the CTC (connectionist temporal classification) cost function working with TensorFlow for automatic speech recognition.
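To make the CTC idea concrete, here is a minimal sketch of the collapsing rule CTC uses to turn per-frame predictions into a label sequence: merge consecutive repeats, then drop the blank symbol. The function name `ctc_collapse` and the blank index `0` are assumptions chosen for illustration; they are not taken from this repository's code.

```python
from itertools import groupby

BLANK = 0  # assumed index of the CTC blank symbol (illustrative choice)


def ctc_collapse(frame_labels, blank=BLANK):
    """Collapse a per-frame label sequence the way CTC does.

    Consecutive repeats are merged first, then blank symbols are removed,
    so a blank between two identical labels keeps them as separate labels.
    """
    merged = [label for label, _ in groupby(frame_labels)]
    return [label for label in merged if label != blank]


if __name__ == "__main__":
    # Frames: blank, 3, 3, blank, 3, 5, 5, blank
    print(ctc_collapse([0, 3, 3, 0, 3, 5, 5, 0]))  # prints [3, 3, 5]
```

In TensorFlow 1.x, `tf.nn.ctc_loss` reserves the largest class index (`num_classes - 1`) for the blank, so that value would be passed as `blank` when collapsing its per-frame outputs.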
Requirements:
- Python 2.7+
- TensorFlow 1.0+
- python_speech_features
- numpy
- scipy
- sox (to convert MP3 to WAV)
I'm trying to transcribe recitations of the Quran by various reciters. The verse-by-verse recitations can be downloaded here. Convert them to WAV format using the 2wav.sh script. Some WAV files of surah Al-Fatihah, verse 2, are included in the wav directory to get started.
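The conversion step could also be sketched in Python as follows. This is not the contents of 2wav.sh, which are not shown here; it is a hypothetical equivalent that assumes the MP3 files sit in a directory named `mp3`, that `sox` is on the PATH and was built with MP3 support, and that the output goes to the `wav` directory. The function names and the example file name are made up for illustration.

```python
import os
import subprocess


def build_sox_command(mp3_path, wav_dir):
    """Return the sox command that converts one MP3 file to WAV in wav_dir."""
    base = os.path.splitext(os.path.basename(mp3_path))[0]
    return ["sox", mp3_path, os.path.join(wav_dir, base + ".wav")]


def convert_all(mp3_dir, wav_dir):
    """Run sox on every .mp3 file in mp3_dir (assumes wav_dir already exists)."""
    for name in sorted(os.listdir(mp3_dir)):
        if name.lower().endswith(".mp3"):
            command = build_sox_command(os.path.join(mp3_dir, name), wav_dir)
            subprocess.check_call(command)  # raises if sox exits with an error


if __name__ == "__main__":
    # "mp3" is an assumed input directory name; "wav" matches the directory above.
    convert_all("mp3", "wav")
```

Keeping the command construction in its own function makes it easy to check the generated paths without actually invoking sox.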
Some useful introductory materials to get started:
- Deep Learning for Speech Recognition (video)
- CTC + Tensorflow + TIMIT
- Machine Learning is Fun Part 6
This project is licensed under the terms of the MIT license.
See README for more information.