A deep learning project that classifies urban sounds into 10 categories using the UrbanSound8K dataset.
- Python: Programming language.
- NumPy: For numerical computations.
- Pandas: For handling dataset metadata.
- Librosa: For audio signal processing and feature extraction.
- Matplotlib: For visualizations.
- TensorFlow/Keras: For building and training the CNN model.
The dataset used in this project is UrbanSound8K, which contains 8,732 labeled excerpts (≤4 seconds each) of urban sounds belonging to 10 classes:
- air_conditioner
- car_horn
- children_playing
- dog_bark
- drilling
- engine_idling
- gun_shot
- jackhammer
- siren
- street_music
The dataset's metadata file, UrbanSound8K.csv, provides detailed information about each audio file in the dataset, including:
- slice_file_name: The audio file name in the format [fsID]-[classID]-[occurrenceID]-[sliceID].wav
  - [fsID]: Freesound ID of the recording.
  - [classID]: Numeric identifier of the sound class (e.g., 0 = air_conditioner).
  - [occurrenceID]: Distinguishes occurrences of the sound in the original recording.
  - [sliceID]: Distinguishes slices from the same occurrence.
- fsID: Freesound ID of the recording.
- start / end: Start and end time of the slice in the original recording.
- salience: A subjective rating of the sound’s prominence (1 = foreground, 2 = background).
- fold: Fold number (1–10) for cross-validation.
- classID: Numeric identifier of the sound class (e.g., 0 = air_conditioner).
- class: Human-readable class name (e.g., "air_conditioner").
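As a minimal sketch of how these metadata fields can be used, the snippet below parses a slice_file_name into its four IDs and holds out one fold for testing. The helper names (`SliceInfo`, `parse_slice_file_name`, `split_by_fold`) and the three sample rows are invented for illustration and are not taken from the project or the real dataset.

```python
import csv
import io
from typing import NamedTuple


class SliceInfo(NamedTuple):
    fs_id: int
    class_id: int
    occurrence_id: int
    slice_id: int


def parse_slice_file_name(name: str) -> SliceInfo:
    """Split '[fsID]-[classID]-[occurrenceID]-[sliceID].wav' into integers."""
    stem = name.rsplit(".", 1)[0]  # drop the .wav extension
    fs_id, class_id, occurrence_id, slice_id = (int(p) for p in stem.split("-"))
    return SliceInfo(fs_id, class_id, occurrence_id, slice_id)


def split_by_fold(rows, test_fold):
    """Hold out one predefined fold for testing; train on the other folds."""
    train = [r for r in rows if int(r["fold"]) != test_fold]
    test = [r for r in rows if int(r["fold"]) == test_fold]
    return train, test


# Made-up rows that only mimic the column layout described above.
SAMPLE_CSV = """slice_file_name,fsID,start,end,salience,fold,classID,class
111111-0-0-2.wav,111111,1.0,5.0,1,1,0,air_conditioner
222222-3-1-0.wav,222222,0.5,2.5,2,1,3,dog_bark
333333-0-0-7.wav,333333,3.5,7.5,1,10,0,air_conditioner
"""

rows = list(csv.DictReader(io.StringIO(SAMPLE_CSV)))
info = parse_slice_file_name(rows[1]["slice_file_name"])
train_rows, test_rows = split_by_fold(rows, test_fold=10)
```

Splitting on the fold column keeps slices cut from the same recording on one side of the split, which is why the predefined folds are usually preferred over a random shuffle.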
This study aims to classify urban sounds into the aforementioned categories using a Convolutional Neural Network (CNN) model.
- preprocessing.ipynb
  - Performs preprocessing steps such as:
    - Audio file loading.
    - Feature extraction (e.g., Mel-spectrograms).
    - Dataset preparation for model training.
- model_preparation_and_training.ipynb
  - Builds and trains a CNN model on the processed data.
  - Evaluates model performance using metrics such as loss and accuracy.
| Model | Loss | Accuracy |
|---|---|---|
| Model 1 | 0.4307 | 86.87% |
| Model 2 | 0.3629 | 89.27% |
| Model 3 | 0.4355 | 88.01% |
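For readers unfamiliar with the two metrics in the table, the sketch below shows how a loss and an accuracy value can be computed from predicted class probabilities. It assumes the loss is categorical cross-entropy, which the source does not state, and the probabilities and labels are toy values unrelated to the results above.

```python
import numpy as np


def categorical_cross_entropy(probs: np.ndarray, labels: np.ndarray,
                              eps: float = 1e-12) -> float:
    """Mean negative log-probability assigned to the true class."""
    probs = np.clip(probs, eps, 1.0)  # avoid log(0)
    true_class_probs = probs[np.arange(len(labels)), labels]
    return float(-np.mean(np.log(true_class_probs)))


def accuracy(probs: np.ndarray, labels: np.ndarray) -> float:
    """Fraction of samples whose highest-probability class is correct."""
    return float(np.mean(np.argmax(probs, axis=1) == labels))


# Toy predictions for three samples over three classes.
probs = np.array([
    [0.7, 0.2, 0.1],
    [0.1, 0.8, 0.1],
    [0.3, 0.4, 0.3],
])
labels = np.array([0, 1, 2])

loss_value = categorical_cross_entropy(probs, labels)
accuracy_value = accuracy(probs, labels)
```

Loss reflects how confident the model is in the correct class, while accuracy only counts whether the top prediction is right, so the two can rank models differently.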
git clone https://github.com/your-username/audio-classification-project.git
cd audio-classification-project
Ensure you have Python and the necessary libraries installed. You can install the dependencies, along with the Jupyter Notebook package needed to open the notebooks, using:
pip install numpy pandas librosa matplotlib tensorflow keras notebook
Run the preprocessing.ipynb notebook to preprocess the dataset and extract features.
jupyter notebook preprocessing.ipynb
Run the model_preparation_and_training.ipynb notebook to train and evaluate the CNN model on the processed data.
jupyter notebook model_preparation_and_training.ipynb
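For orientation, a small CNN of the general kind this notebook trains might be defined as in the sketch below. This is not the project's architecture: the layer sizes, the dropout rate, the optimizer, and the input shape of (128, 173, 1) (128 Mel bands by 173 frames, one channel) are all assumptions, and `build_cnn` is a hypothetical helper.

```python
import tensorflow as tf
from tensorflow.keras import layers

NUM_CLASSES = 10              # the 10 UrbanSound8K classes
INPUT_SHAPE = (128, 173, 1)   # assumed log-Mel shape: bands x frames x channel


def build_cnn(input_shape=INPUT_SHAPE, num_classes=NUM_CLASSES):
    """Two convolution blocks followed by a small dense classifier."""
    model = tf.keras.Sequential([
        layers.Input(shape=input_shape),
        layers.Conv2D(32, 3, padding="same", activation="relu"),
        layers.MaxPooling2D(2),
        layers.Conv2D(64, 3, padding="same", activation="relu"),
        layers.MaxPooling2D(2),
        layers.GlobalAveragePooling2D(),
        layers.Dense(64, activation="relu"),
        layers.Dropout(0.3),
        layers.Dense(num_classes, activation="softmax"),
    ])
    # Integer class IDs (0-9) work directly with the sparse loss.
    model.compile(
        optimizer="adam",
        loss="sparse_categorical_crossentropy",
        metrics=["accuracy"],
    )
    return model


model = build_cnn()
```

Training would then call `model.fit` on the spectrogram array and the classID labels, and `model.evaluate` on a held-out fold returns the loss and accuracy.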