Tricks to prepare the training dataset #5
Hi, I think I can try to guess some answers to your queries:
Anyway, it seems a bit too ambitious to me to train this model on a multi-instrument dataset. Also, considering that this model is quite small (around 1M parameters), it would be hard for it to generate such diverse music. Did you get any interesting results?
@eloimoliner
2."It would be possible to extend the length of the segments, but 1-2 minutes is too much, you will run out of memory. What you should I guess is process small chunks and then concatenate them somehow. with overlap and add for example" . |
@adagio715
@Seungwoo0326 @eloimoliner @psp0001060
Hi, and here is the wandb report, where you can also listen to the audio examples. NUWave2 seemed to learn well where to add energy, but it failed to generate harmonics. I must say this is not a completely fair comparison, as I was not using the same diffusion parameterization as in the paper; my diffusion model is based on this: https://arxiv.org/abs/2206.00364, which I know works quite well for music generation (I will publish the results in a couple of weeks). My experiment was only about the NUWave2 architecture (and the conditioning). I want to believe that I did something wrong, but I cannot find any mistake. Anyway, it is a pity I can't include this model as a baseline. Also, I was working at a 22,050 Hz sampling rate.
@adagio715 I don't think our checkpoint is wrong. Can I know the command you used for inference? I think you may have made a mistake in the inference command.
@eloimoliner I would be glad if you continue to share your results so we can keep discussing them together!
@Seungwoo0326
The other arguments were left at their defaults, as in the
@adagio715
I think the case of 16 kHz -> 48 kHz can still be confusing, but I hope this explanation helps you understand. I will correct the README soon.
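For concreteness, here is a rough sketch of what preparing the 16 kHz -> 48 kHz input could look like; the band-mask construction is an illustrative guess on my part, not the repository's exact implementation:

```python
import torch
import torchaudio

def prepare_input_16k_to_48k(wav_16k: torch.Tensor):
    """Illustrative preprocessing for the 16 kHz -> 48 kHz case.

    Resamples the low-rate signal to the target rate, so the model sees a
    48 kHz waveform whose spectrum is empty above the original Nyquist
    frequency (8 kHz), plus a mask marking which band is already filled.
    """
    wav_48k = torchaudio.functional.resample(wav_16k, orig_freq=16000, new_freq=48000)

    # Fraction of the 48 kHz band that is valid: 8 kHz out of a 24 kHz Nyquist.
    n_fft = 1024
    n_bins = n_fft // 2 + 1
    cutoff_bin = int(n_bins * (8000 / 24000))
    band_mask = torch.zeros(n_bins)
    band_mask[:cutoff_bin] = 1.0  # 1 = present in the input, 0 = to be generated
    return wav_48k, band_mask
```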
@Seungwoo0326
@adagio715
Hi, the other day I tried experimenting with models and training. The generated music may have poor quality if the segment length is too short, as it may not capture the full musical structure. However, it can still be useful for research purposes.
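On segment length, a minimal sketch of cropping fixed-length training segments from full tracks; the one-second default here is only an illustrative choice, not a recommendation from the authors:

```python
import random
import torch
import torch.nn.functional as F

def random_segment(wav: torch.Tensor, sr: int = 48000, seconds: float = 1.0) -> torch.Tensor:
    """Crop a random fixed-length training segment from a full track.

    Segment length trades memory for musical context: short crops fit on
    the GPU but capture little long-range structure.
    """
    seg_len = int(sr * seconds)
    if wav.shape[-1] <= seg_len:
        # Zero-pad tracks shorter than the segment length.
        return F.pad(wav, (0, seg_len - wav.shape[-1]))
    start = random.randint(0, wav.shape[-1] - seg_len)
    return wav[..., start : start + seg_len]
```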
The inference schedule in hparameters.yaml is set based on the authors' experience and the characteristics of the model. The specific numbers in the example are just a suggestion and can be adjusted based on the specific use case. |
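If you do want to try a different `infer_step`, one simple starting point (an assumption on my part, not the authors' recipe) is to sweep the noise levels log-uniformly between the smallest and largest values of the shipped schedule and then verify quality empirically:

```python
import numpy as np

def make_infer_schedule(n_steps: int, noise_min: float = 1e-6, noise_max: float = 0.9):
    """Build a candidate inference noise schedule for a given step count.

    The endpoint values here are placeholders; in practice you would take
    them from the hand-tuned schedule in the repository's hyperparameter file.
    """
    return np.geomspace(noise_min, noise_max, n_steps).tolist()

# Example: a 16-step schedule to compare against the default 8-step one.
schedule_16 = make_infer_schedule(16)
```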
Hello,
I'm very interested in your great work! I have 3 questions; would you mind helping me with them?
You set `infer_step` as 8 with a specific `infer_schedule`. Is 8 the best parameter in your experiments? If we want to test a different `infer_step`, how should we set the `infer_schedule`?
Thank you very much for your help in advance!