Add support for Google Gemma Model #5562
Comments
I second this. We now have EXL2 quants and GGUF quants, so we should have support for both the llama-cpp-python and ExLlama loaders: https://huggingface.co/models?sort=trending&search=LoneStriker+%2F+gemma
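For anyone wanting to test once llama-cpp-python support lands, here is a minimal sketch of loading a GGUF quant directly; the filename and parameter values are illustrative assumptions, not from this thread:

```python
# Minimal sketch: load a Gemma GGUF quant with llama-cpp-python.
# The model_path filename is a hypothetical example.
from llama_cpp import Llama

llm = Llama(
    model_path="./models/gemma-7b-it.Q4_K_M.gguf",  # assumed local file
    n_ctx=4096,       # context window size
    n_gpu_layers=-1,  # offload every layer to the GPU if one is available
)

out = llm.create_chat_completion(
    messages=[{"role": "user", "content": "Hello, Gemma!"}],
    max_tokens=128,
)
print(out["choices"][0]["message"]["content"])
```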
I'm still getting errors with GGUF quants of Gemma.
Same.
LoneStriker/gemma-2b-GGUF. Update: working on the dev branch.
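If it helps anyone reproduce this on the dev branch, a short sketch of pulling a single quant file from that repo with huggingface_hub; the exact .gguf filename is an assumption, so check the repo's file list:

```python
# Sketch: download one GGUF file from the LoneStriker/gemma-2b-GGUF repo.
# The filename below is hypothetical; pick a real one from the repo listing.
from huggingface_hub import hf_hub_download

path = hf_hub_download(
    repo_id="LoneStriker/gemma-2b-GGUF",
    filename="gemma-2b.Q4_K_M.gguf",  # assumption
)
print(path)  # local cache path, usable as model_path for llama-cpp-python
```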
Now we just need to add support for fine-tuned/merged Gemma models, which aren't working. Follow the linked threads, and check out my model for debugging. Thread links:
Same.
Also needs support for Qwen1.5 models.
Any updates on this?
Wondering the same for Gemma-7B.
CodeGemma is just out. Has anyone tried it yet?
This issue has been closed due to inactivity for 2 months. If you believe it is still relevant, please leave a comment below. You can tag a developer in your comment.
Gemma2-2b-IT is out, and I'd love to try it. Any support for Gemma yet?
Use the Transformers model loader. Gemma 2 27B loads and generates, just slowly. I'm running dual 4090s; roughly 90 seconds to generate and output a response. Also, previous Gemma models load with the ExLlamav2_HF loader, if anyone was curious.
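Roughly, the Transformers loader path boils down to something like the sketch below; `device_map="auto"` is what spreads the 27B weights across both 4090s. The dtype and token counts here are assumptions:

```python
# Sketch: load Gemma 2 27B with plain Transformers, sharded across two GPUs.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "google/gemma-2-27b-it"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    device_map="auto",           # shard layers across all visible GPUs
    torch_dtype=torch.bfloat16,  # assumption: bf16 to fit in 2x24 GB
)

inputs = tokenizer("Why is the sky blue?", return_tensors="pt").to(model.device)
out = model.generate(**inputs, max_new_tokens=128)
print(tokenizer.decode(out[0], skip_special_tokens=True))
```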
Description
There is a new text-generation LLM by Google called Gemma, built from the same research and technology as Gemini.
https://ai.google.dev/gemma
The models are available on Hugging Face: https://huggingface.co/google/gemma-7b-it/tree/main
It would be nice if the tool could be updated to support this new model.
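For reference, the instruction-tuned checkpoints ship a chat template, so a frontend can build Gemma's turn format from the tokenizer rather than hard-coding it. A minimal sketch:

```python
# Sketch: render Gemma's prompt format from the tokenizer's chat template.
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("google/gemma-7b-it")
messages = [{"role": "user", "content": "Write a haiku about GPUs."}]
prompt = tokenizer.apply_chat_template(
    messages, tokenize=False, add_generation_prompt=True
)
print(prompt)  # shows the <start_of_turn>/<end_of_turn> markers
```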