This is a reimplementation of the NSynth model as described in Neural Audio Synthesis of Musical Notes with WaveNet Autoencoders [arXiv:1704.01279].
The original TensorFlow v1 code can be found in github:tensorflow/magenta.
Python package requirements for this code are torch>=1.3.1 and librosa>=0.7.1.
Additionally, loading audio with librosa requires libsndfile, which most systems already provide.
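As a quick check that the audio stack works, a single note can be loaded like this (a minimal sketch; the filename is a placeholder, but the 16 kHz sample rate matches the dataset):

```python
import librosa

# NSynth notes are 4-second, 16 kHz mono wav files.
# The path below is illustrative; point it at any file from the dataset.
audio, sr = librosa.load("SOME_PATH/nsynth/nsynth-test/audio/some_note.wav",
                         sr=16000, mono=True)
print(audio.shape)  # expected: (64000,) for a full 4-second note
```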
To replicate the original experiment, download the NSynth dataset in json/wav format from https://magenta.tensorflow.org/datasets/nsynth#files.
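The json/wav download stores per-note metadata in examples.json. A minimal sketch of inspecting it (the pitch and instrument_family_str fields are taken from the NSynth dataset documentation):

```python
import json

with open("SOME_PATH/nsynth/nsynth-test/examples.json") as f:
    examples = json.load(f)

# Keys are note ids matching the wav filenames in the audio/ folder;
# values hold metadata such as the MIDI pitch and instrument family.
note_id, meta = next(iter(examples.items()))
print(note_id, meta["pitch"], meta["instrument_family_str"])
```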
A training script can be found in train.py, which will train the model with the same parameters as in the original paper:

```sh
python ./train.py --datadir SOME_PATH/nsynth/ --device=cuda:0 --nbatch=32
```
The required argument datadir should be the full path to the NSynth dataset. (The path should contain the folders nsynth-test, nsynth-train and nsynth-valid, each of which must contain an audio/ folder and an examples.json file.)
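For orientation, here is a minimal sketch of a PyTorch dataset reading this layout; the class name and return format are illustrative, not necessarily what train.py uses:

```python
import json
import os

import librosa
import torch
from torch.utils.data import Dataset


class NSynthFolder(Dataset):
    """Illustrative reader for one split (e.g. nsynth-train) of the dataset."""

    def __init__(self, split_dir):
        with open(os.path.join(split_dir, "examples.json")) as f:
            self.examples = json.load(f)
        self.names = sorted(self.examples)
        self.audio_dir = os.path.join(split_dir, "audio")

    def __len__(self):
        return len(self.names)

    def __getitem__(self, idx):
        name = self.names[idx]
        # Each note is a 4-second, 16 kHz mono wav named <note_id>.wav.
        path = os.path.join(self.audio_dir, name + ".wav")
        audio, _ = librosa.load(path, sr=16000, mono=True)
        return torch.from_numpy(audio), self.examples[name]["pitch"]
```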
The remaining arguments default to the settings used in the original paper.