Add model merging callback #2241

Open
lewtun opened this issue Oct 16, 2024 · 2 comments
Labels
✨ enhancement (New feature or request), 🧒 good second issue (Good for contributors with basic project familiarity)

Comments

@lewtun
Member

lewtun commented Oct 16, 2024

Feature request

Add a MergeModelCallback that merges the reference model with the current policy and optionally pushes the merged checkpoint to the Hub. The merge could run at the end of each step or epoch, at the end of training, or both. For the implementation, we could use Arcee's mergekit library and include it as an optional dependency: https://github.com/arcee-ai/mergekit
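
To make the request concrete, here is a minimal sketch of the shape such a callback could take. It is an illustration under stated assumptions, not a proposed implementation: it does not call mergekit or the Trainer callback API, weights are represented as plain lists of floats rather than tensors, the merge is a simple linear interpolation standing in for whatever merge method is eventually chosen, and the names `linear_merge`, `alpha`, `merge_every`, and `push_fn` are hypothetical.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Optional

# Stand-in for a state dict: parameter name -> flattened values (assumption).
Weights = Dict[str, List[float]]


def linear_merge(policy: Weights, reference: Weights, alpha: float = 0.5) -> Weights:
    """Return alpha * policy + (1 - alpha) * reference, parameter by parameter."""
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha must be in [0, 1]")
    if policy.keys() != reference.keys():
        raise ValueError("models must share the same parameter names")
    merged: Weights = {}
    for name, p_vals in policy.items():
        r_vals = reference[name]
        if len(p_vals) != len(r_vals):
            raise ValueError(f"size mismatch for parameter {name!r}")
        merged[name] = [alpha * p + (1 - alpha) * r for p, r in zip(p_vals, r_vals)]
    return merged


@dataclass
class MergeModelCallback:
    """Hypothetical callback: merges policy and reference on selected events."""

    reference: Weights
    alpha: float = 0.5
    merge_every: Optional[int] = None  # merge every N steps; None disables it
    push_fn: Optional[Callable[[Weights, str], None]] = None  # stand-in for a Hub upload
    merged_checkpoints: Dict[str, Weights] = field(default_factory=dict)

    def _merge(self, policy: Weights, tag: str) -> Weights:
        merged = linear_merge(policy, self.reference, self.alpha)
        self.merged_checkpoints[tag] = merged
        if self.push_fn is not None:
            self.push_fn(merged, tag)  # optional push of the merged checkpoint
        return merged

    def on_step_end(self, policy: Weights, step: int) -> None:
        if self.merge_every and step % self.merge_every == 0:
            self._merge(policy, f"step-{step}")

    def on_train_end(self, policy: Weights) -> None:
        self._merge(policy, "final")


# Usage: merge once at the end of training and record the "push".
pushed: List[str] = []
cb = MergeModelCallback(
    reference={"w": [0.0, 2.0]},
    alpha=0.5,
    push_fn=lambda weights, tag: pushed.append(tag),
)
cb.on_train_end({"w": [1.0, 4.0]})
# cb.merged_checkpoints["final"] is {"w": [0.5, 3.0]} and pushed is ["final"]
```

Keeping the merge function separate from the event hooks means the interpolation could later be swapped for a mergekit-backed merge without touching the scheduling logic.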

Motivation

Various papers show that model merging can non-trivially improve performance, especially if the models belong to the same architecture:

Your contribution

Open to the community!

lewtun added the ✨ enhancement label Oct 16, 2024
qgallouedec added the 🧒 good second issue label Oct 16, 2024
@coding-famer

I'm interested in working on this!

@qgallouedec
Member

Nice! Thanks @coding-famer. Feel free to open a PR and ask for help if needed.

3 participants