Mamba update #254
High-level comment: these changes are up to date as of mamba-ssm 1.2.1, where I was able to run the code successfully (although gradient_checkpointing does not work). mamba-ssm>=2.0.0 adds more features and renames some submodules, so further changes are needed to make this compatible.
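A minimal sketch of how code could branch on the installed mamba-ssm version, given the 1.2.1 vs. >=2.0.0 split described above. The helper names (`parse_version`, `needs_renamed_submodules`, `installed_mamba_version`) are hypothetical and not part of OpenLM or mamba-ssm; only the standard library is used.

```python
import re
from importlib import metadata


def parse_version(version: str) -> tuple:
    """Turn a version string such as '1.2.1' into a comparable (major, minor, patch) tuple."""
    parts = []
    for piece in version.split(".")[:3]:
        match = re.match(r"\d+", piece)  # keep only the leading digits, e.g. '0rc1' -> 0
        if match is None:
            break
        parts.append(int(match.group()))
    while len(parts) < 3:  # pad short versions like '2.0' to three components
        parts.append(0)
    return tuple(parts)


def needs_renamed_submodules(version: str) -> bool:
    """Hypothetical check: True for mamba-ssm>=2.0.0, where submodules were renamed."""
    return parse_version(version) >= (2, 0, 0)


def installed_mamba_version():
    """Return the installed mamba-ssm version string, or None if it is not installed."""
    try:
        return metadata.version("mamba-ssm")
    except metadata.PackageNotFoundError:
        return None
```

Calling `needs_renamed_submodules(installed_mamba_version())` once at import time, after checking for `None`, would let the rest of the code pick the matching import path.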
I had the same problem with gradient checkpointing, and it blocked some interesting experiments on long-context Mamba. I haven't had time to look into it yet, but updating to Mamba2 first might make sense.
… dict based on names)
Mamba now uses a class for its input; this PR updates OpenLM accordingly.
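To illustrate the kind of change this implies, here is a hedged sketch of building a config object from a dictionary of parameters while dropping keys the class does not define. `ExampleMambaConfig` is a stand-in dataclass written for this example, not the actual class from mamba-ssm, and `config_from_params` is a hypothetical helper, not the code in this PR.

```python
from dataclasses import dataclass, fields


@dataclass
class ExampleMambaConfig:
    # Stand-in for a model config class; field names and defaults are assumptions.
    d_model: int = 2560
    n_layer: int = 64
    vocab_size: int = 50277


def config_from_params(params: dict) -> ExampleMambaConfig:
    """Build a config from a params dict, ignoring keys the config class does not declare."""
    known = {f.name for f in fields(ExampleMambaConfig)}
    filtered = {key: value for key, value in params.items() if key in known}
    return ExampleMambaConfig(**filtered)


# Usage: 'seq_len' is not a config field here, so it is silently dropped.
cfg = config_from_params({"d_model": 512, "n_layer": 4, "seq_len": 2048})
```

Filtering by field name keeps the caller's parameter dictionary free to carry unrelated options without raising a `TypeError` in the constructor.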