Add AdEMAMix Optimizer #20258
Comments
Thanks for the suggestion. I see the paper has a total of 0 citations listed on arXiv. As a general rule we wait to see >50 citations before including a technique in Keras. As per API guidelines: "We only add new objects that are already commonly used in the machine learning community."
Now, if you want to build this optimizer, you can do so in your own repo, and then we can share it with the community to see if people adopt it. If eventually the optimizer becomes commonly used, we will add it to the Keras API.
Here I have implemented the optimizer using Keras: https://github.com/IMvision12/AdEMAMix-Optimizer-Keras
If we need to add this optimizer in the future, I'd be eager to integrate it into Keras. |
AdEMAMix augments Adam with an additional, slower-moving exponential moving average (EMA) of past gradients, targeting slow convergence and subpar generalization in large language models and on noisy datasets. It uses three beta parameters (fast EMA, second moment, slow EMA) plus an alpha coefficient that mixes the two momentum terms, combining responsive momentum with adaptive learning rates.
Paper: https://arxiv.org/abs/2409.03137
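To make the update rule concrete, here is a minimal NumPy sketch of a single AdEMAMix step as described in the paper (not the official Keras implementation; the function name `ademamix_step` and the `state` dict layout are my own, and the default hyperparameters follow the paper's suggestions):

```python
import numpy as np

def ademamix_step(theta, grad, state, lr=1e-3,
                  beta1=0.9, beta2=0.999, beta3=0.9999,
                  alpha=5.0, eps=1e-8):
    """One AdEMAMix update (hypothetical sketch, not the Keras API).

    state holds: m1 (fast gradient EMA, decay beta1),
                 m2 (slow gradient EMA, decay beta3),
                 nu (second-moment EMA, decay beta2),
                 t  (step count, for bias correction).
    """
    state["t"] += 1
    t = state["t"]
    state["m1"] = beta1 * state["m1"] + (1 - beta1) * grad
    state["m2"] = beta3 * state["m2"] + (1 - beta3) * grad
    state["nu"] = beta2 * state["nu"] + (1 - beta2) * grad ** 2
    m1_hat = state["m1"] / (1 - beta1 ** t)   # bias-corrected fast EMA
    nu_hat = state["nu"] / (1 - beta2 ** t)   # bias-corrected second moment
    # The slow EMA enters via alpha and is not bias-corrected,
    # matching the paper's stated update rule.
    return theta - lr * (m1_hat + alpha * state["m2"]) / (np.sqrt(nu_hat) + eps)

# Usage: a few steps on f(x) = x^2, whose gradient is 2x.
theta = np.array([2.0])
state = {"m1": np.zeros(1), "m2": np.zeros(1), "nu": np.zeros(1), "t": 0}
for _ in range(200):
    theta = ademamix_step(theta, 2 * theta, state, lr=0.1)
```

Setting `alpha=0` and ignoring `m2` recovers plain Adam, which is why the slow EMA is often described as a drop-in addition.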
I'm interested in adding this optimizer to Keras.
@fchollet