
Add AdEMAMix Optimizer #20258

Closed · IMvision12 opened this issue Sep 14, 2024 · 4 comments

@IMvision12 (Contributor)

AdEMAMix extends Adam with a second, slower exponential moving average (EMA) of past gradients, aiming to address slow convergence and subpar generalization when training large language models and on noisy datasets. It uses three beta parameters (β1, β2, β3) plus a mixing coefficient α, providing flexible momentum alongside Adam's adaptive learning rates.

Paper: https://arxiv.org/abs/2409.03137

I'm interested in adding this optimizer to Keras.
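
For reference, here is a minimal NumPy sketch of the update rule as I understand it from the paper. It omits weight decay and the α/β3 warmup schedulers described there, and the hyperparameter defaults (e.g. `alpha=5.0`, `beta3=0.9999`) are illustrative choices, not fixed values:

```python
import numpy as np

def ademamix_step(theta, grad, state, t, lr=1e-3,
                  beta1=0.9, beta2=0.999, beta3=0.9999,
                  alpha=5.0, eps=1e-8):
    """One AdEMAMix update. `state` holds the fast EMA m1, the slow EMA m2,
    and the second moment v (all zero-initialized); t is the 1-based step."""
    m1 = beta1 * state["m1"] + (1 - beta1) * grad    # fast EMA, as in Adam
    m2 = beta3 * state["m2"] + (1 - beta3) * grad    # slow EMA added by AdEMAMix
    v = beta2 * state["v"] + (1 - beta2) * grad**2   # second-moment EMA
    m1_hat = m1 / (1 - beta1**t)   # bias-correct m1 and v only;
    v_hat = v / (1 - beta2**t)     # the slow EMA m2 is left uncorrected
    theta = theta - lr * (m1_hat + alpha * m2) / (np.sqrt(v_hat) + eps)
    state.update(m1=m1, m2=m2, v=v)
    return theta, state

# Toy usage: minimize f(x) = ||x - 1||^2
theta = np.zeros(4)
state = {"m1": np.zeros_like(theta),
         "m2": np.zeros_like(theta),
         "v": np.zeros_like(theta)}
for t in range(1, 501):
    grad = 2.0 * (theta - 1.0)
    theta, state = ademamix_step(theta, grad, state, t)
```

A proper Keras version would wrap this in an optimizer subclass with per-variable slots for m1, m2, and v, plus the schedulers.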

@fchollet

@fchollet (Member)

Thanks for the suggestion. I see the paper currently has 0 citations listed on arXiv. As a general rule, we wait to see >50 citations before including a technique in Keras, per the API guidelines: "We only add new objects that are already commonly used in the machine learning community."

@fchollet (Member)

Now, if you want to build this optimizer, you can do so in your own repo, and then we can share it with the community to see if people adopt it. If eventually the optimizer becomes commonly used, we will add it to the Keras API.

@IMvision12 (Contributor, Author)

@fchollet I have implemented the optimizer using Keras here: https://github.com/IMvision12/AdEMAMix-Optimizer-Keras

@IMvision12 (Contributor, Author)

If this optimizer is wanted in Keras in the future, I'd be eager to contribute the integration.
