Due diligence
I have done my due diligence in trying to find the answer myself.
Topic
The paper
Question
Thanks for the great work. I'm trying to reproduce Mimi and had the following questions:
1. Does Mimi use a loss balancer such as the one used in Encodec for training? The paper points to the default Encodec configuration in AudioCraft, which uses loss balancing, so I was wondering whether that is the case for Mimi as well (see the sketch below for what I mean).
2. Was Mimi trained in bfloat16? Or was the actual training done in full precision, with the weights exported in bfloat16?
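For context, here is a minimal sketch of the Encodec-style balancer I am asking about, assuming a single shared model output and leaving out the EMA of gradient norms that AudioCraft's Balancer keeps; the function and argument names are mine, not AudioCraft's:

```python
import torch

def balanced_backward(losses, weights, model_output, total_norm=1.0, eps=1e-12):
    # Renormalize each loss's gradient w.r.t. the shared model output so that
    # its contribution to the combined gradient is proportional to its weight,
    # regardless of the raw scale of the loss itself.
    weight_sum = sum(weights.values())
    grad_total = torch.zeros_like(model_output)
    for name, loss in losses.items():
        (grad,) = torch.autograd.grad(loss, model_output, retain_graph=True)
        scale = total_norm * weights[name] / weight_sum
        grad_total += scale * grad / (grad.norm(p=2) + eps)
    # One backward pass through the model with the combined, balanced gradient.
    model_output.backward(grad_total)
```

This would be called once per step with, e.g., `losses = {"recon": l_t, "adv": l_g}` and matching weights, instead of backpropagating a plain weighted sum.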
Thanks!
Hey @SarthakYadav 👋, it seems the training code for Mimi has not yet been released, but they plan to do so in the near future, as mentioned in their FAQ section. However, here's what I could gather from the current resources:
In their README.md, they state:
Finally, and similarly to EBEN, Mimi uses only an adversarial training loss, along with feature matching, showing strong improvements in terms of subjective quality despite its low bitrate.
So I personally don't think they've used a loss balancer.
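To make the quoted sentence concrete, "adversarial training loss, along with feature matching" in neural codecs usually means generator-side losses like the following; this is a generic sketch (hinge formulation, names of my choosing), not Kyutai's actual implementation:

```python
import torch
import torch.nn.functional as F

def generator_losses(fake_logits, real_feats, fake_feats):
    # Hinge adversarial loss, averaged over all discriminators.
    adv = sum(torch.relu(1.0 - logits).mean() for logits in fake_logits)
    # L1 feature matching between discriminator activations on real and
    # generated audio, summed over discriminators and their layers.
    fm = sum(
        F.l1_loss(f, r.detach())
        for r_layers, f_layers in zip(real_feats, fake_feats)
        for r, f in zip(r_layers, f_layers)
    )
    return adv / len(fake_logits), fm / len(real_feats)
```

With only these two generator losses to combine, a balancer is less obviously necessary than in Encodec, which also mixes in time- and frequency-domain reconstruction losses of very different scales.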
As for Q2, I think only the official team can answer that, since the training code hasn't been released yet!
2. Was Mimi trained in bfloat16? Or was the actual training done in full precision, with the weights exported in bfloat16?
Which weights are you referring to? Looking at the model.safetensors file in our Hugging Face repo, the weights should actually be in fp32 rather than bf16 (and the same should be true for our other repos).
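If you want to double-check locally, the stored dtypes can be read straight from the checkpoint with the safetensors library (the path below is illustrative):

```python
from safetensors import safe_open

# Point this at a downloaded copy of the checkpoint.
with safe_open("model.safetensors", framework="pt", device="cpu") as f:
    dtypes = {str(f.get_tensor(key).dtype) for key in f.keys()}

print(dtypes)  # per the reply above, expected: {'torch.float32'}
```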