Real-time Speaker Recognition

This repository contains algorithms for real-time speaker recognition. Recognition is implemented with either a Gaussian Mixture Model (GMM) or a Convolutional Neural Network (CNN). For the GMM, a dynamic threshold can be used to improve recognition efficiency, at the cost of a sharp increase in training time.
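To illustrate the general idea of GMM-based recognition with a threshold, here is a minimal pure-Python sketch. It is not the code in this repository: the function names, the toy models and the fixed `threshold` value are all assumptions made for illustration, and this README does not describe how the dynamic threshold is actually computed. The sketch scores feature frames against one diagonal-covariance GMM per speaker and rejects the best match when its average log-likelihood falls below the threshold.

```python
import math


def log_gauss_diag(x, mean, var):
    """Log density of a diagonal-covariance Gaussian at feature vector x."""
    return sum(
        -0.5 * (math.log(2 * math.pi * v) + (xi - m) ** 2 / v)
        for xi, m, v in zip(x, mean, var)
    )


def gmm_avg_log_likelihood(frames, gmm):
    """Average per-frame log-likelihood under a GMM.

    `gmm` is a list of (weight, mean, var) tuples (hypothetical layout).
    """
    total = 0.0
    for x in frames:
        comps = [math.log(w) + log_gauss_diag(x, m, v) for w, m, v in gmm]
        top = max(comps)
        # log-sum-exp over mixture components for numerical stability
        total += top + math.log(sum(math.exp(c - top) for c in comps))
    return total / len(frames)


def identify(frames, models, threshold):
    """Return the best-scoring speaker, or None if the score is below threshold."""
    scores = {name: gmm_avg_log_likelihood(frames, g) for name, g in models.items()}
    best = max(scores, key=scores.get)
    return best if scores[best] >= threshold else None


# Toy two-dimensional models, invented for this example.
MODELS = {
    "alice": [(0.5, [0.0, 0.0], [1.0, 1.0]), (0.5, [1.0, 1.0], [1.0, 1.0])],
    "bob": [(1.0, [5.0, 5.0], [1.0, 1.0])],
}
```

With these toy models, frames near `[0.5, 0.5]` are attributed to `alice`, while frames far from every model (for example `[20.0, 20.0]`) are rejected when `threshold` is `-10.0`. A dynamic threshold would replace the fixed value with one derived from the data.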

Usage (GMM)

Enroll WAV files into a model file (model.out), then launch the Python script RTSP.py:

cd ./GMM
python3 speaker_recognition.py -t enroll -i ./path/to/wav_files_folder/* -m ./your-output-models/model.out
python3 RTSP.py

Once the model is loaded, a prediction is made every three seconds for 15 seconds in total, which corresponds to five predictions. To change the duration, edit the condition of the while loop at line 103 (tmp < 5).
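The relationship between the loop bound and the total duration can be sketched as follows. This is a hedged illustration rather than the contents of RTSP.py: only the counter name `tmp`, the bound of 5 and the three-second interval come from this README, while `run_predictions`, `total_duration_s` and the `predict` callback are hypothetical names.

```python
import time

PREDICTION_INTERVAL_S = 3  # one prediction every three seconds (from the README)
MAX_PREDICTIONS = 5        # the bound in the `tmp < 5` loop condition


def total_duration_s(max_predictions=MAX_PREDICTIONS,
                     interval_s=PREDICTION_INTERVAL_S):
    """Total run time implied by the loop bound and the prediction interval."""
    return max_predictions * interval_s


def run_predictions(predict, max_predictions=MAX_PREDICTIONS,
                    interval_s=PREDICTION_INTERVAL_S, sleep=time.sleep):
    """Call `predict` once per interval until the loop bound is reached.

    `predict` is a hypothetical zero-argument callable standing in for the
    real recognition step; `sleep` stands in for recording one audio window.
    """
    results = []
    tmp = 0
    while tmp < max_predictions:
        sleep(interval_s)
        results.append(predict())
        tmp += 1
    return results
```

Under these assumptions, raising the bound from 5 to 10 would double the run from 15 to 30 seconds, since `total_duration_s(10)` is 10 times 3.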

Contributing

Pull requests are welcome. For major changes, please open an issue first to discuss what you would like to change.
