This repository contains the code and project files for the Marvin project, a trigger word detector that recognizes the word "Marvin" in a stream of audio data. It includes the final model implementations, all code used for training and tuning, evaluation metrics with visualizations, and an interactive demonstration of the final models.
Link to video here.
Link to paper here.
All interactive Jupyter Notebooks were created in Google Colaboratory. Some notebooks use functionality exclusive to this platform, so they must be run on Google Colaboratory to execute properly. The notebooks also clone this Git repository into the remote virtual instance so that scripts and other utilities defined here can be invoked.
Every notebook also uses Colab-exclusive markdown functionality that does not render properly in the GitHub preview. We therefore highly recommend opening individual notebooks through the links below for better readability.
- notebooks/
  - analysis/
    - dataframes.ipynb
    - tensorboard.ipynb
  - training/
    - model_cnn.ipynb
    - model_naive.ipynb
    - model_rnn.ipynb
  - demo.ipynb
  - evaluation.ipynb
  - preprocessing.ipynb
This directory serves as a placeholder for the dataset, which is lazily loaded in parallel from this release by the notebooks prior to training or evaluation. All the different dataset formats are stored in a Google Cloud Storage Bucket, which can be found here.
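The lazy, parallel loading described above can be sketched roughly as follows. This is only an illustration, not the code the notebooks use: it assumes "lazy" means that a file is fetched only when it is missing locally, and the names `ensure_dataset`, `fetch_url`, and every URL passed in are hypothetical.

```python
from concurrent.futures import ThreadPoolExecutor
from pathlib import Path
from urllib.request import urlopen


def fetch_url(url):
    """Download one file and return its bytes (hypothetical default fetcher)."""
    with urlopen(url) as response:
        return response.read()


def ensure_dataset(urls, target_dir, fetch=fetch_url, max_workers=4):
    """Download only the files missing from target_dir, using a thread pool."""
    target = Path(target_dir)
    target.mkdir(parents=True, exist_ok=True)

    # Map each missing local file to its source URL; existing files are skipped.
    missing = {}
    for url in urls:
        destination = target / url.rsplit("/", 1)[-1]
        if not destination.exists():
            missing[destination] = url

    def download(item):
        destination, url = item
        destination.write_bytes(fetch(url))
        return destination

    # Downloads are I/O bound, so threads are enough to overlap them.
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return sorted(pool.map(download, missing.items()))
```

Calling `ensure_dataset` a second time with the same arguments downloads nothing, because every target file already exists. The `fetch` parameter is there so that the download step can be swapped out, for example in tests.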
Contains log files for the majority of tested model configurations. Exported models for all configurations can be found within the GCS bucket here.
Contains all relevant information regarding the final models. We have also exported and converted the models to the following formats for easier deployment:
- Python: SavedModel and Keras H5 format.
- JavaScript: GraphModel and LayersModel format for easier client-side web integration.
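As a rough sketch of how the Python exports might be loaded, assuming TensorFlow 2.x is installed: the helper names `detect_format` and `load_exported_model` and any path passed to them are hypothetical, and the `custom_objects` argument is included only on the assumption that the H5 export may reference custom layers. The exact loader that works can depend on the TensorFlow and Keras versions in use.

```python
from pathlib import Path


def detect_format(path):
    """Guess the export format from the path (hypothetical helper)."""
    p = Path(path)
    if p.suffix in {".h5", ".hdf5"}:
        return "h5"
    # A SavedModel export is a directory containing a saved_model.pb file.
    if p.is_dir() and (p / "saved_model.pb").exists():
        return "saved_model"
    raise ValueError(f"Unrecognized model format: {path}")


def load_exported_model(path, custom_objects=None):
    """Load a Keras H5 file or a SavedModel directory."""
    import tensorflow as tf  # imported lazily; assumes TensorFlow 2.x

    if detect_format(path) == "h5":
        # custom_objects maps class names to custom layer classes, if any.
        return tf.keras.models.load_model(path, custom_objects=custom_objects)
    # Returns a generic trackable object rather than a Keras model.
    return tf.saved_model.load(str(path))
```

Note that `tf.saved_model.load` returns an object exposing the exported signatures, not a `tf.keras.Model`, so inference code for the two formats will differ slightly.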
Contains all interactive Jupyter Notebooks listed above.
Contains metrics and figures from evaluation of the final models.
Utility scripts that define the resampling strategy, custom model layers, and preprocessing steps used by the interactive notebooks.
Various visualizations used in the paper or during analysis of model configurations.
One of the tools we used for analyzing model configurations was TensorBoard. A selection of the model configurations can be found here:
- CNN: https://tensorboard.dev/experiment/NNAWh4qNQpq9Bqo8Sef44g/
- RNN: https://tensorboard.dev/experiment/iP1COCqzRpKdYMRJ1CgL3Q/
- Naïve RNN: https://tensorboard.dev/experiment/cvwXDGRPSGuE4Y9qHoNuEQ/
For the complete list of configurations, run the tensorboard.ipynb notebook, which fetches the logs directly from remote cloud storage and launches TensorBoard.
An interactive demonstration of the final models can be found here.