speechRecognitionTranscriber is a Python module that uses the Google Speech API to perform speech recognition and convert speech to text. It accepts video and audio files for transcription. It uses ffmpeg to convert the input to .wav so that the Google Speech API can recognize it, and it divides the audio into fragments based on silence.
Documentation is available in docs.
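The three stages described above (conversion with ffmpeg, division on silence, recognition per fragment) can be sketched roughly as follows. This is a minimal illustration, not the module's actual code: the function names, the ffmpeg options, the silence thresholds, and the fragment file names are all assumptions chosen for the example. The pydub and SpeechRecognition imports are placed inside the functions so that the command builder can be used on its own.

```python
import subprocess
from pathlib import Path


def build_ffmpeg_command(source, target="convertedFile.wav"):
    """Return an ffmpeg command that converts `source` to a mono 16 kHz WAV."""
    # -y overwrites the target, -vn drops any video stream,
    # -ac 1 downmixes to mono, -ar 16000 resamples to 16 kHz (assumed settings).
    return ["ffmpeg", "-y", "-i", str(source), "-vn",
            "-ac", "1", "-ar", "16000", str(target)]


def split_into_fragments(wav_path, out_dir="fragments"):
    """Split the WAV on silence and write each fragment into `out_dir`."""
    from pydub import AudioSegment
    from pydub.silence import split_on_silence

    audio = AudioSegment.from_wav(str(wav_path))
    # Assumed thresholds: at least 500 ms that is 14 dB quieter than average.
    chunks = split_on_silence(audio, min_silence_len=500,
                              silence_thresh=audio.dBFS - 14, keep_silence=250)
    Path(out_dir).mkdir(exist_ok=True)
    paths = []
    for index, chunk in enumerate(chunks):
        path = Path(out_dir) / f"fragment{index}.wav"  # hypothetical naming
        chunk.export(str(path), format="wav")
        paths.append(path)
    return paths


def recognize_fragments(paths, language="en-US"):
    """Send each fragment to the Google Speech API and join the results."""
    import speech_recognition as sr

    recognizer = sr.Recognizer()
    texts = []
    for path in paths:
        with sr.AudioFile(str(path)) as source:
            audio = recognizer.record(source)
        try:
            texts.append(recognizer.recognize_google(audio, language=language))
        except sr.UnknownValueError:
            # Nothing intelligible in this fragment; skip it.
            continue
    return " ".join(texts)


def transcribe(source):
    """Run the whole pipeline on one video or audio file."""
    subprocess.run(build_ffmpeg_command(source), check=True)
    fragments = split_into_fragments("convertedFile.wav")
    return recognize_fragments(fragments)
```

Calling transcribe("yourfile.extension") would need ffmpeg on the PATH and network access to the recognition service. Splitting on silence keeps each request short, which is one plausible reason for the fragment step.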
speechRecognitionTranscriber requires a video or audio file as input.
To run the program:
- Execute programs/speechRecognitionTranscriber.py to start the program.
python speechRecognitionTranscriber.py
- Enter the path to your file.
yourfile.extension
NOTE:
- Transcribed text is saved in transcribedText.txt.
- Transcribed text is saved in transcribedText.pdf.
- Audio fragments are saved in /fragments.
- Converted source is saved as convertedFile.wav.
Temporary files such as convertedFile.wav and /fragments are deleted when the program ends.
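Writing the two output files and removing the temporary ones could look like the sketch below. It is an illustration under assumptions, not the module's own code: the helper names, the Helvetica font, the page layout, and the relative fragments directory are all chosen for the example. The fpdf import sits inside save_pdf so the other helpers work without it.

```python
import os
import shutil


def save_text(text, path="transcribedText.txt"):
    """Write the transcription to a UTF-8 text file."""
    with open(path, "w", encoding="utf-8") as handle:
        handle.write(text)


def save_pdf(text, path="transcribedText.pdf"):
    """Write the transcription to a single-column PDF using fpdf."""
    from fpdf import FPDF

    pdf = FPDF()
    pdf.add_page()
    pdf.set_font("Helvetica", size=12)
    # The built-in fonts only cover Latin-1, so other characters become "?".
    safe_text = text.encode("latin-1", "replace").decode("latin-1")
    # Width 0 means "extend to the right margin"; 10 is the line height.
    pdf.multi_cell(0, 10, safe_text)
    pdf.output(path)


def cleanup(wav_path="convertedFile.wav", fragments_dir="fragments"):
    """Delete the converted WAV and the fragments directory if they exist."""
    if os.path.exists(wav_path):
        os.remove(wav_path)
    # ignore_errors=True makes this a no-op when the directory is missing.
    shutil.rmtree(fragments_dir, ignore_errors=True)
```

Running cleanup in a try/finally block around the transcription would remove the temporary files even if recognition fails partway through.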
speechRecognitionTranscriber requires:
- Install pip
- Install SpeechRecognition:
pip install SpeechRecognition
- Install fpdf:
pip install fpdf
- Install pydub:
pip install pydub
- Install ffmpeg:
Linux:
sudo apt-get install ffmpeg
Microsoft Windows: download the binaries and add their location to the PATH system variable.
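Before running the program, it can help to confirm that the requirements listed above are in place. The following is a small sketch using only the standard library. The function name is hypothetical, and the import names (speech_recognition, fpdf, pydub) are the module names those packages are normally imported under.

```python
import importlib.util
import shutil


def missing_requirements(modules=("speech_recognition", "fpdf", "pydub"),
                         tools=("ffmpeg",)):
    """Return the names of Python modules and command-line tools not found."""
    missing = []
    for name in modules:
        # find_spec returns None when a top-level module cannot be imported.
        if importlib.util.find_spec(name) is None:
            missing.append(name)
    for tool in tools:
        # shutil.which returns None when the executable is not on the PATH.
        if shutil.which(tool) is None:
            missing.append(tool)
    return missing


if __name__ == "__main__":
    problems = missing_requirements()
    if problems:
        print("Missing requirements: " + ", ".join(problems))
    else:
        print("All requirements found.")
```

On Windows, a missing ffmpeg entry usually means the directory holding the binaries has not been added to the PATH variable yet.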
Tested on: Windows 10, Ubuntu 14.04, Ubuntu 16.04, Ubuntu 18.04, Lubuntu 18.04 and Raspbian.