A handy dataset for noise augmentations for ASR / TTS:
- ~20k noise files;
- ~200 distinct categories;
Contact us! Open issues, collaborate, submit a PR, contribute, share your datasets!
Add much more data from the BBC Sound Effects dataset.
Meta data file / 2.0M / 73cb528656a484b20e02d6c5fd05f14c
Noise archive file / 4.7G / 5e069c867a0da891f57616905129b6c3
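Downloads of this size are worth checking before use. The sketch below assumes that the 32-character hex strings listed above are MD5 digests of the two files; the dictionary keys and the helper names `md5_of_file` and `verify` are hypothetical and not part of this repository.

```python
import hashlib

# Digests copied from the file list above (assumed to be MD5).
# The keys are placeholder labels, not the real file names.
EXPECTED_MD5 = {
    "metadata": "73cb528656a484b20e02d6c5fd05f14c",
    "archive": "5e069c867a0da891f57616905129b6c3",
}


def md5_of_file(path, chunk_size=1 << 20):
    """Return the hex MD5 digest of a file, reading it in 1 MiB chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as fh:
        # iter() with a sentinel stops once read() returns empty bytes.
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify(path, expected_hex):
    """True if the file at `path` matches the expected digest."""
    return md5_of_file(path) == expected_hex.lower()
```

For example, `verify(archive_path, EXPECTED_MD5["archive"])` should return `True` for an intact download, where `archive_path` is wherever you saved the noise archive. Reading in chunks keeps memory use flat even for the 4.7G file.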
Open feather file:
import pandas as pd

# file_path: local path to the downloaded feather file
df = pd.read_feather(file_path)
The dataset is compiled from open domain sources. All labels resembling loud human speech were removed (but background noise, e.g. street chatter, was kept). All items are between 0 and 60 seconds long.
All files are normalized as follows:
- Converted to mono, if necessary;
- Converted to 16 kHz sampling rate, if necessary;
- Stored as 16-bit integers;
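To apply the same normalization to your own noise files before mixing them with this dataset, the three steps above can be sketched roughly as follows. This is not the script that was used to build the dataset. It is a dependency-free illustration that assumes the input samples are interleaved and already on a 16-bit scale, and it uses naive linear interpolation with no anti-aliasing filter. All function names are hypothetical; for real work a proper resampler (for example from `scipy`, `librosa` or `sox`) is the better choice.

```python
from array import array

TARGET_RATE = 16_000  # Hz, the sampling rate stated in the list above


def to_mono(samples, channels):
    """Average interleaved channels into a single channel."""
    if channels == 1:
        return list(samples)
    return [
        sum(samples[i:i + channels]) / channels
        for i in range(0, len(samples) - channels + 1, channels)
    ]


def resample_linear(samples, src_rate, dst_rate=TARGET_RATE):
    """Naive linear-interpolation resampler (assumption: no low-pass filter)."""
    if src_rate == dst_rate or not samples:
        return list(samples)
    n_out = int(len(samples) * dst_rate / src_rate)
    out = []
    for k in range(n_out):
        pos = k * src_rate / dst_rate  # fractional index into the source
        i = int(pos)
        frac = pos - i
        nxt = samples[min(i + 1, len(samples) - 1)]
        out.append(samples[i] * (1 - frac) + nxt * frac)
    return out


def to_int16(samples):
    """Round and clip to the signed 16-bit range."""
    return array("h", (max(-32768, min(32767, round(s))) for s in samples))


def normalize(samples, channels, src_rate):
    """Mono, then 16 kHz, then 16-bit integers, in that order."""
    return to_int16(resample_linear(to_mono(samples, channels), src_rate))
```

Calling `normalize(samples, channels=2, src_rate=44100)` on interleaved stereo samples returns an `array("h")` of mono 16 kHz samples. Downmixing first halves the amount of data the resampler has to process.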
Please contact us here or just create a GitHub issue!
cc-by
Links / license
- rnnoise / CC0;
- acoustic events / "if you end up using the dataset, we ask you to cite the following paper";
- urban sounds / cc-by-nc;
- esc-50 / license (cc-by-nc);
- freiburg-106 / ?;
- sound-events / ?;
- BBC Sound Effects (a small part) / license;
- nar dataset / "the data are freely accessible for scientific research purposes and for non-commercial applications";
Paper citations:
- Naoya Takahashi, Michael Gygli, Beat Pfister and Luc Van Gool, "Deep Convolutional Neural Networks and Data Augmentation for Acoustic Event Recognition", Proc. Interspeech 2016, San Francisco;
- J. Salamon, C. Jacoby and J. P. Bello, "A Dataset and Taxonomy for Urban Sound Research", 22nd ACM International Conference on Multimedia, Orlando USA, Nov. 2014;
Donate (each coffee pays for several full downloads) / use our DO referral link to help.