It seems that you use a specific initialization strategy for Mamba:
zoology/zoology/mixers/mamba.py
Lines 75 to 94 in fcc6af7
However, you re-initialize the weights when instantiating the Mamba-based language model:
zoology/zoology/model.py
Line 156 in fcc6af7
And the `_init_weights` function doesn't skip the modules that have `_no_reinit=True`:
zoology/zoology/model.py
Lines 73 to 97 in fcc6af7
If this is true, the fix should be to check `_no_reinit` in the `_init_weights` function, similar to the original Mamba implementation: https://github.com/state-spaces/mamba/blob/442fab4b1fd5226c1b5939b37d91ede430b5d1ae/mamba_ssm/models/mixer_seq_simple.py#L93-L96
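A minimal sketch of the kind of check I mean. This is not the actual zoology code: the function body, the `initializer_range` default, and the choice to read the flag from the parameter (rather than the module) are assumptions for illustration. As far as I can tell, the upstream Mamba code sets `_no_reinit` on the `dt_proj` bias parameter, so the sketch checks it there.

```python
import torch
import torch.nn as nn


def _init_weights(module: nn.Module, initializer_range: float = 0.02) -> None:
    """Hypothetical init hook that leaves parameters flagged `_no_reinit` untouched."""
    if isinstance(module, nn.Linear):
        # Only re-initialize the weight if it was not flagged by the mixer.
        if not getattr(module.weight, "_no_reinit", False):
            nn.init.normal_(module.weight, mean=0.0, std=initializer_range)
        # Same check for the bias, which may not exist.
        if module.bias is not None and not getattr(module.bias, "_no_reinit", False):
            nn.init.zeros_(module.bias)
    elif isinstance(module, nn.Embedding):
        nn.init.normal_(module.weight, mean=0.0, std=initializer_range)


# A layer with a custom bias init that should survive `apply`.
flagged = nn.Linear(4, 4)
with torch.no_grad():
    flagged.bias.fill_(1.0)
flagged.bias._no_reinit = True  # flag set on the parameter itself

# A regular layer with no flag; its bias should be zeroed.
plain = nn.Linear(4, 4)

model = nn.Sequential(flagged, plain)
model.apply(_init_weights)
```

With this check, the flagged bias keeps its custom values after `model.apply(_init_weights)`, while the unflagged layer is re-initialized as before.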