Media Lexicometer

This repository contains a Python Django application and some scripts to:

capture the audio streams of DVB-T channels using DVB-T adapters,
perform live speech to text on the channel audio streams,
record the recognized words to a database,
let a user query the database from a Web page and view the results as graphs.
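
As a rough illustration of the step between speech recognition and the database, the sketch below pulls individual words out of one Vosk result. It assumes the recognizer was created with word-level output enabled, in which case a final result is a JSON string carrying a `result` list of word entries. The function name `extract_words` and the `min_conf` threshold are hypothetical and are not taken from liveSpeechToText.py.

```python
import json


def extract_words(result_json, min_conf=0.0):
    """Return (word, start_time) pairs from one Vosk final result.

    Assumed JSON layout (Vosk with word output enabled):
    {"result": [{"word": ..., "start": ..., "end": ..., "conf": ...}], "text": ...}
    """
    data = json.loads(result_json)
    words = []
    # The "result" key is absent when nothing was recognized.
    for entry in data.get("result", []):
        if entry.get("conf", 1.0) >= min_conf:
            words.append((entry["word"].lower(), entry["start"]))
    return words


# Hand-written sample standing in for real recognizer output.
sample = json.dumps({
    "result": [
        {"conf": 1.0, "start": 0.30, "end": 0.72, "word": "bonjour"},
        {"conf": 0.41, "start": 0.75, "end": 1.10, "word": "Paris"},
    ],
    "text": "bonjour Paris",
})
print(extract_words(sample, min_conf=0.5))
```

Each returned pair could then be stored together with the channel and a timestamp; how the application actually models these records is not shown here.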

Setup

Install the dependencies

sudo apt install ffmpeg w-scan dvb-tools virtualenv tmux

Create the channel file

To scan the channels available in your location:

w_scan -f t -c FR -X > channels.conf

To convert the channels.conf file to the v5 format:

dvb-format-convert -I ZAP -O DVBV5 -s dvb-t channels.conf channels_v5.conf
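
Since one dvbv5-zap instance is needed per multiplex, it can help to see which channels share a frequency. The sketch below groups the channels of a DVBv5 channel file by their `FREQUENCY` value. It assumes the usual DVBv5 layout of a `[Channel name]` header followed by `KEY = value` lines; the function name `group_by_multiplex` and the sample channels are made up for illustration.

```python
from collections import defaultdict


def group_by_multiplex(conf_text):
    """Map each FREQUENCY value to the channel names listed under it."""
    multiplexes = defaultdict(list)
    channel = None
    for raw in conf_text.splitlines():
        line = raw.strip()
        if line.startswith("[") and line.endswith("]"):
            # A section header carries the channel name.
            channel = line[1:-1]
        elif "=" in line and channel is not None:
            key, value = (part.strip() for part in line.split("=", 1))
            if key == "FREQUENCY":
                multiplexes[value].append(channel)
    return dict(multiplexes)


# Invented sample in the assumed DVBv5 layout.
sample = """\
[Channel A]
    SERVICE_ID = 257
    FREQUENCY = 586000000
[Channel B]
    SERVICE_ID = 260
    FREQUENCY = 586000000
[Channel C]
    SERVICE_ID = 513
    FREQUENCY = 506000000
"""
for frequency, names in group_by_multiplex(sample).items():
    print(frequency, names)
```

To run it on a real scan, pass the contents of channels_v5.conf instead of `sample`.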

Create and set up your Python virtual environment

virtualenv venv
source venv/bin/activate
pip install -r requirements.txt

Get the models

Download the spaCy model from here and extract it into the main application folder.

Download the Vosk model from here, then extract the contents of the archive's vosk-model-fr-0.6-linto directory into a directory named model in the main application folder.

Add the channels to the database

cd mediaAnalysis
python3 manage.py syncdb
python3 manage.py createsuperuser
python3 manage.py runserver

Go to the Django administration interface and log in with the admin credentials you've just created. Then, create your channel records.

Create your channel scripts

Use the example scripts in the scripts directory as a starting point for your own channel capture and analysis scripts. The examples use two tmux sessions per multiplex: one runs dvbv5-zap and the other runs the liveSpeechToText.py script. Any other way of running these processes in the background works as well.
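
For readers who prefer to drive tmux from Python, here is a minimal sketch of the two-sessions-per-multiplex idea. It only builds the `tmux new-session -d -s NAME COMMAND` invocations and prints them. The session naming scheme and the helper names are assumptions, and the two commands are placeholders: the real dvbv5-zap and liveSpeechToText.py command lines should be copied from the example scripts.

```python
import shlex


def tmux_session_command(session_name, command):
    """Build the argv that starts `command` in a detached tmux session."""
    return ["tmux", "new-session", "-d", "-s", session_name, command]


def multiplex_sessions(multiplex, zap_command, stt_command):
    """Return the two tmux invocations for one multiplex."""
    return [
        tmux_session_command(f"{multiplex}-zap", zap_command),
        tmux_session_command(f"{multiplex}-stt", stt_command),
    ]


# Placeholder commands: substitute the real ones from the example scripts.
commands = multiplex_sessions(
    "mux1",
    zap_command="echo 'dvbv5-zap command goes here'",
    stt_command="echo 'liveSpeechToText.py command goes here'",
)
for argv in commands:
    # To launch for real, pass argv to subprocess.run(argv, check=True).
    print(shlex.join(argv))
```

Keeping the command construction separate from the launch makes it easy to inspect what would run before any session is started.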

Usage

For each multiplex, run one dvbv5-zap instance and one liveSpeechToText.py instance (see the previous section). To start the query Web interface, run:

python3 manage.py runserver

Then, go to http://localhost:8000.
