Audio-LRP

Feature extraction

A spectrogram is a standard sound visualization tool that shows how the energy of a signal is distributed over both time and frequency.
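As a minimal sketch of what a spectrogram computes, the snippet below slices a signal into overlapping frames, applies a Hann window, and takes the squared magnitude of a naive DFT for each frame. The function name `spectrogram` and the frame and hop sizes are illustrative assumptions and are not taken from feature_extraction.py; a real pipeline would normally rely on an FFT-based library routine.

```python
import cmath
import math

def spectrogram(signal, frame_len=8, hop=4):
    """Return one list of power values (per frequency bin) for each frame.

    Illustrative only: uses a naive O(N^2) DFT instead of an FFT.
    """
    # Hann window reduces spectral leakage at the frame edges.
    window = [0.5 - 0.5 * math.cos(2 * math.pi * n / frame_len)
              for n in range(frame_len)]
    n_bins = frame_len // 2 + 1  # keep non-negative frequencies only
    frames = []
    for start in range(0, len(signal) - frame_len + 1, hop):
        frame = [signal[start + n] * window[n] for n in range(frame_len)]
        power = []
        for k in range(n_bins):
            coeff = sum(frame[n] * cmath.exp(-2j * math.pi * k * n / frame_len)
                        for n in range(frame_len))
            power.append(abs(coeff) ** 2)
        frames.append(power)
    return frames

# A sine with 2 cycles per 8-sample frame: its energy should peak in bin 2.
tone = [math.sin(2 * math.pi * 2 * n / 8) for n in range(32)]
spec = spectrogram(tone)
peak_bins = [max(range(len(f)), key=f.__getitem__) for f in spec]
```

Each inner list is one time slice (a column of the image) and each index inside it is a frequency bin (a row).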

Mel-spectrogram

A spectrogram with the Mel scale as its y-axis. The Mel scale is constructed so that sounds that are equally spaced on the scale are also perceived by human listeners as equally spaced in pitch.
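To make the Mel scale concrete, here is a small sketch that converts between Hz and mels and lists band centers that are equally spaced in mels. It assumes the widely used HTK-style formula m = 2595 * log10(1 + f / 700); the repository's feature extraction may use a different variant, and the helper names are hypothetical.

```python
import math

def hz_to_mel(f_hz):
    """Convert a frequency in Hz to mels (HTK-style formula, an assumption)."""
    return 2595.0 * math.log10(1.0 + f_hz / 700.0)

def mel_to_hz(m):
    """Inverse of hz_to_mel."""
    return 700.0 * (10.0 ** (m / 2595.0) - 1.0)

def mel_center_frequencies(n_bands, f_min, f_max):
    """Frequencies in Hz that are equally spaced on the mel scale."""
    m_min, m_max = hz_to_mel(f_min), hz_to_mel(f_max)
    step = (m_max - m_min) / (n_bands - 1)
    return [mel_to_hz(m_min + i * step) for i in range(n_bands)]

centers = mel_center_frequencies(n_bands=5, f_min=0.0, f_max=8000.0)
# Equal steps in mels correspond to growing steps in Hz.
gaps_hz = [b - a for a, b in zip(centers, centers[1:])]
```

The growing gaps in `gaps_hz` reflect that the scale is fine at low frequencies and coarse at high frequencies.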

Constant-Q spectrogram

A spectrogram of the constant-Q transform. Its center frequencies are geometrically spaced, and its time resolution increases towards higher frequencies, much like the human auditory system.
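The two properties above can be sketched numerically. The snippet assumes the usual constant-Q definitions, f_k = f_min * 2^(k / B) for B bins per octave and a window length proportional to Q / f_k; the parameter values (f_min of 32.70 Hz, 12 bins per octave, 22050 Hz sample rate) are illustrative assumptions, not settings read from this repository.

```python
def cqt_center_frequencies(f_min, n_bins, bins_per_octave=12):
    """Geometrically spaced center frequencies: each octave doubles."""
    return [f_min * 2.0 ** (k / bins_per_octave) for k in range(n_bins)]

def cqt_window_lengths(freqs, sample_rate, bins_per_octave=12):
    """Analysis window length in samples for each bin."""
    # Q (center frequency divided by bandwidth) is the same for every bin.
    q = 1.0 / (2.0 ** (1.0 / bins_per_octave) - 1.0)
    return [q * sample_rate / f for f in freqs]

freqs = cqt_center_frequencies(f_min=32.70, n_bins=25)
lengths = cqt_window_lengths(freqs, sample_rate=22050)
# Neighbouring bins always differ by the same frequency ratio.
ratios = [b / a for a, b in zip(freqs, freqs[1:])]
```

Because `lengths` shrinks as the frequency rises, high-frequency bins are analysed with shorter windows, which is where the better time resolution comes from.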

Model training

The model is trained with the following network:

  _________________________________________________________________
  Layer (type)                 Output Shape              Param #
  =================================================================
  conv2d_1 (Conv2D)            (None, 62, 62, 32)        896
  _________________________________________________________________
  conv2d_2 (Conv2D)            (None, 60, 60, 64)        18496
  _________________________________________________________________
  max_pooling2d_1 (MaxPooling2 (None, 30, 30, 64)        0
  _________________________________________________________________
  dropout_1 (Dropout)          (None, 30, 30, 64)        0
  _________________________________________________________________
  conv2d_3 (Conv2D)            (None, 28, 28, 64)        36928
  _________________________________________________________________
  conv2d_4 (Conv2D)            (None, 26, 26, 64)        36928
  _________________________________________________________________
  max_pooling2d_2 (MaxPooling2 (None, 13, 13, 64)        0
  _________________________________________________________________
  dropout_2 (Dropout)          (None, 13, 13, 64)        0
  _________________________________________________________________
  conv2d_5 (Conv2D)            (None, 11, 11, 128)       73856
  _________________________________________________________________
  conv2d_6 (Conv2D)            (None, 9, 9, 128)         147584
  _________________________________________________________________
  max_pooling2d_3 (MaxPooling2 (None, 5, 5, 128)         0
  _________________________________________________________________
  dropout_3 (Dropout)          (None, 5, 5, 128)         0
  _________________________________________________________________
  flatten_1 (Flatten)          (None, 3200)              0
  _________________________________________________________________
  dense_512 (Dense)            (None, 512)               1638912
  _________________________________________________________________
  dropout_4 (Dropout)          (None, 512)               0
  _________________________________________________________________
  dense_out (Dense)            (None, 10)                5130
  =================================================================
  Total params: 1,958,730
  Trainable params: 1,958,730
  Non-trainable params: 0
  _________________________________________________________________
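The shapes and parameter counts in the summary can be cross-checked with a few lines of arithmetic. The sketch assumes a 64x64x3 input, 3x3 convolutions without padding, and 2x2 pooling that rounds up; these are inferred from the table (for example, 9x9 becoming 5x5), not stated in it.

```python
import math

def conv_params(kernel, c_in, c_out):
    """Weights plus one bias per output channel."""
    return kernel * kernel * c_in * c_out + c_out

def dense_params(n_in, n_out):
    """Weights plus one bias per output unit."""
    return n_in * n_out + n_out

# Assumed: 64x64x3 input, 3x3 unpadded convolutions, 2x2 pooling rounding up.
size, channels, total = 64, 3, 0
blocks = [(32, 64), (64, 64), (128, 128)]  # filters of the two convs per block
for filters_a, filters_b in blocks:
    for filters in (filters_a, filters_b):
        total += conv_params(3, channels, filters)
        size -= 2              # a 3x3 unpadded conv trims one pixel per side
        channels = filters
    size = math.ceil(size / 2)  # pooling halves the spatial size

flat = size * size * channels
total += dense_params(flat, 512) + dense_params(512, 10)
```

Under these assumptions `flat` and `total` match the flatten_1 output size and the "Total params" line of the summary.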

Layer-wise relevance propagation

LRP-epsilon
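For orientation, here is a minimal sketch of the epsilon rule for a single dense layer, assuming the common formulation R_j = sum_k a_j * w_jk / (z_k + eps * sign(z_k)) * R_k with z_k = sum_j a_j * w_jk + b_k. The function name and the toy numbers are hypothetical, and this is not the code that analyze.py runs.

```python
def lrp_epsilon_dense(a, w, b, r_out, eps=1e-6):
    """Redistribute output relevance r_out onto the inputs of a dense layer.

    a: input activations (length n)
    w: weights, w[j][k] connects input j to output k (n x m)
    b: biases (length m)
    r_out: relevance of each output (length m)
    """
    n, m = len(a), len(b)
    r_in = [0.0] * n
    for k in range(m):
        z = sum(a[j] * w[j][k] for j in range(n)) + b[k]
        # Stabilizer: push z away from zero while keeping its sign.
        z += eps if z >= 0 else -eps
        for j in range(n):
            r_in[j] += a[j] * w[j][k] / z * r_out[k]
    return r_in

a = [1.0, 2.0]
w = [[1.0, -1.0], [0.5, 1.0]]
b = [0.0, 0.0]
r_in = lrp_epsilon_dense(a, w, b, r_out=[1.0, 0.0])
```

In this toy case both inputs contribute equally to the first output, so they share its relevance almost evenly, and the total relevance is nearly conserved because the biases are zero and eps is small.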

LRP-flat
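As a rough sketch, the flat rule can be read as ignoring activations and weights and spreading each output's relevance uniformly over the inputs connected to it. The version below covers a dense layer only, where every output sees all inputs; the function name is hypothetical and library implementations may differ in details.

```python
def lrp_flat_dense(n_inputs, r_out):
    """Spread the relevance of every output evenly over all inputs."""
    total = sum(r_out)
    return [total / n_inputs] * n_inputs

r_in = lrp_flat_dense(n_inputs=4, r_out=[0.6, 0.3, 0.1])
```

Every input receives the same share, so the result shows which inputs are connected to relevant outputs rather than how strongly each one contributed.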

How to use

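Clone the repository, then build the image and start a container:

cd Audio-LRP
docker build -t lrp-pytf .
docker run -it --rm lrp-pytf

Once inside the new shell, run the scripts.

To transfer the data generated by the scripts, open a new terminal window and find the ID of the running container:

docker ps

CONTAINER-ID

9b81f107614b

Then copy the files out of the container, replacing <CONTAINER-ID> with that ID:

docker cp <CONTAINER-ID>:./src/. ./dst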

Dataset

We are working with the UrbanSound8K dataset. The dataset has been pre-processed so that all clips have the same duration and sampling frequency.
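One simple way to give clips a uniform duration is to truncate long clips and zero-pad short ones. The sketch below shows only that step and is an assumption about how it could be done, not a description of the pre-processing actually applied; the 4-second target and 22050 Hz rate are illustrative values, and resampling is left out.

```python
def target_length(duration_s, sample_rate):
    """Number of samples in a clip of the given duration."""
    return int(round(duration_s * sample_rate))

def fix_length(samples, target_len):
    """Truncate or zero-pad a list of samples to exactly target_len."""
    if len(samples) >= target_len:
        return samples[:target_len]
    return samples + [0.0] * (target_len - len(samples))

n_samples = target_length(duration_s=4.0, sample_rate=22050)  # assumed values
short_clip = fix_length([0.5, -0.5], 4)
long_clip = fix_length([0.1] * 10, 4)
```

With every clip at the same length and rate, each extracted spectrogram has the same shape and can be fed to the network without further resizing of the time axis.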

In this folder the feature extraction step has already been run: the extracted mel-spectrograms are in urban/train. The features are then reorganized into dedicated folders for the analysis step.

