
Project 3 Generative Audio

Parker Addison, [email protected]

Abstract

Which bird songs are we used to hearing? Which may we be hearing less and less of? And which will we never hear again, save in rare recordings? Perhaps we can experience the past, present, and soon-to-be-gone future as a way to reflect on our impact on birds.

The result of this art project is a latent space interpolation between bird songs from species that are Common, Endangered, and now Extinct.

I hand-selected two songs each from common and endangered species, and three songs from extinct species. The chosen birds in each category were as follows:

Common - Mourning Dove, American Robin
Endangered - Snowy Plover, Spectacled Eider
Extinct - Kauai O'o, Huia, ???*

I encoded my training samples as Mel spectrograms, then built and trained a convolutional autoencoder on the spectrograms with an 8x compression factor. After choosing the desired birdsongs, I created three interpolated frames between each pair of encoded bird samples in the latent space, then decoded the interpolated frames and applied a sharpening filter. Finally, I concatenated these frames into a single spectrogram and converted it back to an audio file.
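As a rough sketch of that pipeline (not the exact notebook code, which lives in proj2.ipynb), the example below linearly interpolates between encoded spectrograms, decodes each point, sharpens it with an unsharp mask, and inverts the concatenated log-Mel spectrogram back to audio with librosa's Griffin-Lim-based `mel_to_audio`. The `encoder`/`decoder` models, the sample rate, and the unsharp-mask sharpening are assumptions standing in for the actual implementation.

```python
import numpy as np
import librosa
from scipy.ndimage import gaussian_filter

def interpolate_birdsongs(specs, encoder, decoder, steps=3, sr=22050):
    """Interpolate between chosen birdsong spectrograms in latent space and return audio."""
    # Encode each chosen log-Mel spectrogram (shape: mels x frames) into the latent space.
    codes = [encoder.predict(s[np.newaxis, ..., np.newaxis])[0] for s in specs]

    # Build the latent path: each original code followed by `steps` evenly spaced
    # interpolations toward the next code.
    path = []
    for a, b in zip(codes[:-1], codes[1:]):
        for t in np.linspace(0.0, 1.0, steps + 2)[:-1]:
            path.append((1 - t) * a + t * b)
    path.append(codes[-1])

    # Decode every point on the path and lightly sharpen it (unsharp mask).
    frames = []
    for z in path:
        frame = decoder.predict(z[np.newaxis])[0, ..., 0]
        blurred = gaussian_filter(frame, sigma=1)
        frames.append(frame + 0.5 * (frame - blurred))

    # Concatenate along the time axis and invert the log-Mel spectrogram to audio.
    full_spec = np.concatenate(frames, axis=1)
    return librosa.feature.inverse.mel_to_audio(librosa.db_to_power(full_spec), sr=sr)
```

The returned waveform can then be written to disk, e.g. `soundfile.write("output2.wav", audio, 22050)`.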

Model/Data

  • Training data: the bird songs dataset from Nanni et al. 2016 is not included in this repository. It is a ~10GB collection of 2814 bird song syllables from 46 different species of birds in southern Brazil. Note that it appears all syllables have been looped until the audio is at least 1min 20s long. I converted each clip into 3-second splits, and converted each split into a Mel spectrogram (saved as a numpy array), producing a total of 79,377 spectrogram samples (see the preprocessing sketch after this list).
  • Convolutional autoencoder weights: model*.h5, trained for up to 90 epochs on 5000 random samples drawn from the training set above (an architecture sketch follows this list)
  • Testing/chosen interpolation samples: manually chosen; see References
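Preprocessing sketch referenced in the training-data bullet above. This is a minimal, hedged version of the split-and-spectrogram step; the sample rate, Mel-band count, and dB scaling are assumptions rather than the exact parameters used.

```python
import numpy as np
import librosa

SR = 22050          # assumed sample rate; the actual rate is not stated in this README
CLIP_SECONDS = 3    # 3-second splits, as described above
N_MELS = 128        # assumed number of Mel bands

def clip_to_spectrograms(path):
    """Split one bird song clip into 3-second chunks and return a Mel spectrogram per chunk."""
    y, sr = librosa.load(path, sr=SR)
    samples_per_clip = sr * CLIP_SECONDS
    specs = []
    for start in range(0, len(y) - samples_per_clip + 1, samples_per_clip):
        chunk = y[start:start + samples_per_clip]
        mel = librosa.feature.melspectrogram(y=chunk, sr=sr, n_mels=N_MELS)
        # Work in log (dB) space so the network sees a perceptually sensible dynamic range.
        specs.append(librosa.power_to_db(mel, ref=np.max))
    return specs

# Example: save each split as its own numpy array.
# for i, spec in enumerate(clip_to_spectrograms("some_bird_song.wav")):
#     np.save(f"spectrograms/some_bird_song_{i}.npy", spec)
```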
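Architecture sketch referenced in the weights bullet above. The real architecture lives in proj2.ipynb and model*.h5; this is just one plausible Keras layout that yields the stated 8x compression, assuming spectrogram patches cropped/padded to 128x128 (a 128x128x1 input encoded to a 16x16x8 code, 16,384 values down to 2,048).

```python
from tensorflow.keras import layers, models

INPUT_SHAPE = (128, 128, 1)   # assumed log-Mel patch: Mel bands x time frames x channel
LATENT_SHAPE = (16, 16, 8)    # 2,048 latent values vs. 16,384 inputs -> 8x compression

def build_models():
    # Encoder: three conv + pool stages halve each spatial axis per stage.
    enc_in = layers.Input(shape=INPUT_SHAPE)
    x = layers.Conv2D(16, 3, activation="relu", padding="same")(enc_in)
    x = layers.MaxPooling2D(2)(x)
    x = layers.Conv2D(8, 3, activation="relu", padding="same")(x)
    x = layers.MaxPooling2D(2)(x)
    x = layers.Conv2D(8, 3, activation="relu", padding="same")(x)
    enc_out = layers.MaxPooling2D(2)(x)
    encoder = models.Model(enc_in, enc_out, name="encoder")

    # Decoder: mirror the encoder, upsampling back to the input resolution.
    dec_in = layers.Input(shape=LATENT_SHAPE)
    x = layers.Conv2D(8, 3, activation="relu", padding="same")(dec_in)
    x = layers.UpSampling2D(2)(x)
    x = layers.Conv2D(8, 3, activation="relu", padding="same")(x)
    x = layers.UpSampling2D(2)(x)
    x = layers.Conv2D(16, 3, activation="relu", padding="same")(x)
    x = layers.UpSampling2D(2)(x)
    dec_out = layers.Conv2D(1, 3, activation="linear", padding="same")(x)
    decoder = models.Model(dec_in, dec_out, name="decoder")

    # End-to-end autoencoder trained to reconstruct its input spectrogram.
    autoencoder = models.Model(enc_in, decoder(encoder(enc_in)), name="autoencoder")
    autoencoder.compile(optimizer="adam", loss="mse")
    return autoencoder, encoder, decoder

# autoencoder, encoder, decoder = build_models()
# autoencoder.fit(train_specs, train_specs, epochs=90, batch_size=32)
```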

Code

See proj2.ipynb.

Results

See output2.wav, spectro.png.

Both are representations of the final interpolated audio.

Technical Notes

  • Requires ffmpeg and librosa for working with audio and spectrograms.

References

Papers/Articles:

Datasets:

Other ideas/art projects involving ML and birdsongs that I stumbled upon while working:
