Hello!
I have been following the system described in this paper by Y. Jia et al.: Link. So far, I have trained the synthesizer module using the ESPnet Tacotron 2 multi-speaker TTS scripts provided here: Link. The trained model produces intelligible, albeit robotic, speech when Griffin-Lim is used for waveform generation.
Now, to improve the synthesized output, I decided to train a WaveNet vocoder on the synthesized mel-spectrograms (the mel-spectrograms the synthesizer produced for the training set), as described in the paper. I trained the model for 1000k steps, but the output was garbled speech. I then extended training to 1600k steps without changing the hparams, but saw no improvement. Sample synthesized audio files and the hparams file can be found here: Link.
Any help or insights on how I could continue would be very much appreciated. Thanks!