CS337 (AI-ML) course project | Music Transcription


tones2notes

Automatic Music Transcription (AMT) is the task of transcribing a given audio recording into a symbolic representation (musical notes or MIDI). In this project, the goal is to transcribe musical recordings into note events, each with a pitch, onset, offset, and velocity. The task is challenging due to the high polyphony of musical pieces and requires careful preprocessing of the audio files. We have implemented and evaluated deep learning models for music transcription. The architectural design of the models and the data processing techniques are based on this paper.
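For concreteness, each transcribed note can be thought of as a (pitch, onset, offset, velocity) tuple. The sketch below is our own illustration of such a note event; the class and field names are assumptions, not the repository's actual data structures:

```python
from dataclasses import dataclass

# Illustrative only: a minimal container for the note events an AMT model
# predicts. Names here are our own, not the repository's classes.
@dataclass
class NoteEvent:
    pitch: int      # MIDI pitch number (piano range is 21-108)
    onset: float    # onset time in seconds
    offset: float   # offset time in seconds
    velocity: int   # MIDI velocity, 0-127

    def duration(self) -> float:
        return self.offset - self.onset

# Example: middle C (MIDI 60) held for half a second at moderate loudness.
event = NoteEvent(pitch=60, onset=1.0, offset=1.5, velocity=80)
print(event.duration())  # 0.5
```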

Running Instructions

  • The dataset used is MAPS, which can be downloaded from here. After downloading it, store it in data/MAPS
  • Install the required Python packages:
    pip install -r requirements.txt 
    
  • Load the dataset, split it, and store it as .h5 binaries:
    python3 features.py --dir data/MAPS --workspace $(pwd)
    
  • Train the model (this covers both feature processing and training):
    python3 src/main.py train --model_type='CRNN_Conditioning' --loss_type='regress_onset_offset_frame_velocity_bce' --batch_size=8 --max_note_shift=0 --learning_rate=5e-4 --reduce_iteration=10000 --resume_iteration=0 --early_stop=50000 --workspace=$(pwd) --cuda
    
    We have implemented 3 models; choose model_type from ['CRNN', 'CCNN', 'CRNN_Conditioning']. There are also 2 loss functions available (regressed and non-regressed); refer to the comments in run.sh for more details. The trained model will be stored as checkpoints in the checkpoints folder, with training statistics in the statistics folder.
  • Infer the output probabilities on the test dataset and store them in the probs folder:
    python3 src/results.py infer_prob --model_type='CRNN_Conditioning' --checkpoint_path=$CHECKPOINT_PATH --dataset='maps' --split='test' --post_processor_type='regression'  --workspace=$WORKSPACE --cuda 
    
  • Evaluate on the test dataset:
    python3 src/results.py calculate_metrics --model_type='CRNN_Conditioning' --dataset='maps' --split='test' --post_processor_type='regression' --workspace=$WORKSPACE 
    
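The calculate_metrics step scores predicted notes against the ground truth. A common note-level criterion (used by tools such as mir_eval) counts a predicted note as correct if its pitch matches a reference note and its onset falls within a 50 ms tolerance. The sketch below is our own simplified illustration of this matching, not the repository's evaluation code:

```python
def note_f1(ref, est, onset_tol=0.05):
    """Greedy one-to-one matching of (pitch, onset) pairs.

    ref, est: lists of (midi_pitch, onset_seconds) tuples.
    A predicted note matches a reference note if pitches agree and
    onsets differ by at most onset_tol seconds (50 ms by default).
    """
    matched = 0
    used = [False] * len(ref)
    for pitch_e, onset_e in est:
        for i, (pitch_r, onset_r) in enumerate(ref):
            if not used[i] and pitch_r == pitch_e and abs(onset_r - onset_e) <= onset_tol:
                used[i] = True
                matched += 1
                break
    precision = matched / len(est) if est else 0.0
    recall = matched / len(ref) if ref else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return precision, recall, f1

ref = [(60, 1.00), (64, 1.50), (67, 2.00)]
est = [(60, 1.02), (64, 1.60)]   # second prediction is 100 ms late: no match
print(note_f1(ref, est))         # (0.5, 0.3333..., 0.4)
```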

Also, there are some result plots in notebooks/plots.ipynb, and a piano roll with the MIDI notes of a transcribed audio in transcription_plots.ipynb
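A piano roll is simply a time-by-pitch grid in which active notes are marked. The sketch below is our own minimal illustration of rasterising note events into such a grid (the frame rate and 88-key piano range are illustrative assumptions, not values taken from the repository):

```python
import numpy as np

def to_piano_roll(notes, frames_per_second=100, num_keys=88, lowest_pitch=21):
    """Rasterise (midi_pitch, onset_s, offset_s) note events into a binary
    [num_frames, num_keys] piano roll. Assumes the piano range A0 (21) to C8 (108)."""
    max_offset = max(offset for _, _, offset in notes)
    num_frames = int(np.ceil(max_offset * frames_per_second)) + 1
    roll = np.zeros((num_frames, num_keys), dtype=np.float32)
    for pitch, onset, offset in notes:
        key = pitch - lowest_pitch
        start = int(round(onset * frames_per_second))
        end = int(round(offset * frames_per_second))
        roll[start:end, key] = 1.0
    return roll

# Two overlapping notes: C4 for 0.5 s and E4 for 0.5 s starting 0.25 s later.
notes = [(60, 0.0, 0.5), (64, 0.25, 0.75)]
roll = to_piano_roll(notes)
print(roll.shape)       # (76, 88)
print(int(roll.sum()))  # 100 active frame-cells: 50 + 50
```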

Transcribing a Given Audio

python3 src/transcribe_and_play.py --audio_file <name of audio file>

It transcribes the given audio into MIDI using the best checkpointed model, generates the MIDI file, and also generates a video of the notes being played, corresponding to the MIDI, using the synthviz library. Note that transcription requires an ffmpeg backend and therefore does not work on gpu1.cse.iitb.ac.in unless you install ffmpeg with sudo permissions
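When a MIDI file is written, note times in seconds have to be converted to MIDI ticks at some tempo and resolution. The back-of-the-envelope sketch below is our own illustration of that conversion; the tempo and ticks-per-beat values are illustrative defaults, not the repository's settings:

```python
def seconds_to_ticks(seconds, bpm=120, ticks_per_beat=480):
    """Convert a time in seconds to MIDI ticks at a fixed tempo.

    One beat lasts 60 / bpm seconds and spans ticks_per_beat ticks.
    """
    seconds_per_beat = 60.0 / bpm
    return round(seconds / seconds_per_beat * ticks_per_beat)

# At 120 BPM a beat is 0.5 s, so 1.0 s = 2 beats = 960 ticks.
print(seconds_to_ticks(1.0))   # 960
print(seconds_to_ticks(0.25))  # 240
```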

Transcription Results

  • Piano roll comparison for an audio from the MAPS test dataset

  • Piano roll of the L theme (Death Note)

  • Für Elise. The original music is this

    fur_elise_transcripted.mp4
  • L theme (Death Note). The original music is this

    L_original_transcripted.mp4
  • Nezuko Theme (Demon Slayer). The original music is this

    nezuko_transcripted.mp4
  • A musical piece from Aajkal tere mere pyar ke charche, played on the accordion. The original audio is this

    aajkal_transcripted.mp4
  • Nagin. Notice the considerable noise caused by multiple instruments playing together (polyphonic music)

    Nagin_transcripted.mp4

References

  • Qiuqiang Kong, Bochen Li, Xuchen Song, Yuan Wan, and Yuxuan Wang. "High-resolution Piano Transcription with Pedals by Regressing Onset and Offset Times." arXiv preprint arXiv:2010.01815 (2020).
  • The bytedance and kong repositories, for data processing techniques and model architecture
  • Valentin Emiya, Nancy Bertin, Bertrand David, Roland Badeau. MAPS - A piano database for multipitch estimation and automatic transcription of music
  • This repository, for information about the datasets and for understanding the transcription pipeline
