Hi there! 🤗
`FlashLlamaForCausalLM` uses the name `dense` for its MLP submodule, so when a user wants to employ a LoRA adapter, `get_mlp_weights` skips this submodule:
- `text-generation-inference/server/text_generation_server/models/custom_modeling/flash_llama_modeling.py`, line 440 (commit 6e32205)
- `text-generation-inference/server/text_generation_server/utils/adapter.py`, lines 259 to 261 (commit 6e32205)
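For context, the check at the referenced `adapter.py` lines looks roughly like the sketch below (paraphrased, with the body elided; the exact code may differ): it only picks up a submodule that is literally named `mlp`.

```python
def get_mlp_weights(i, layer):
    weights = {}
    if hasattr(layer, "mlp"):   # only a submodule literally named `mlp` matches
        mlp = layer.mlp
        ...                     # gate_proj / up_proj / down_proj weights are gathered here
    return weights              # stays empty for Llama, whose submodule is named `dense`
```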
This causes the error:

```
[rank0]: KeyError: (0, 'gate_proj')
```
This is not the case for `FlashGemma2ForCausalLM`, for example, which works properly. When I renamed `dense` to `mlp`, Llama worked as well.
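Until a fix lands, a possible workaround, equivalent in spirit to the rename described above, would be to alias the submodule before adapters are loaded. This is a hypothetical sketch only: the helper name and the `model.model.layers` attribute path are my assumptions, not a verified part of the TGI API.

```python
def alias_dense_as_mlp(model):
    """Hypothetical workaround: expose each layer's `dense` submodule under the
    name `mlp` so the existing hasattr(layer, "mlp") check in get_mlp_weights
    succeeds. The `model.model.layers` path is an assumption, not verified."""
    for layer in model.model.layers:
        if hasattr(layer, "dense") and not hasattr(layer, "mlp"):
            layer.mlp = layer.dense
```

Calling this once after model construction and before adapter loading should let `get_mlp_weights` collect the Llama MLP weights.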
I faced the same error. It works successfully on release 2.3.0, but fails on 2.3.1 and 2.4.0.