
When testing a QLoRA-finetuned Gemma2 2B on Linux, it just generates a repeated char #146

Open
chenminjun-web opened this issue Sep 21, 2024 · 4 comments

@chenminjun-web

I fine-tuned Gemma2 2B Instruct with BitsAndBytes (int4). It works when tested with transformers.
Then I followed the guide to build mllm and quantize the model for Linux.
But when I test the fine-tuned model with the example demo_gemma, it always outputs a repeated char (a Korean char).
Has anyone tried this?
Or is something wrong on my end?

@chenghuaWang
Contributor

chenghuaWang commented Sep 21, 2024

The outputs from the gemma-q4_k and gemma-q4_0 models provided by the mllm team are correct. You can download the gemma model params from the repository at https://huggingface.co/mllmTeam/gemma-2b-mllm/tree/main to test whether your mllm has been compiled correctly.

Additionally, did you modify the vocabulary of Gemma during fine-tuning? If so, you will need to provide the correct vocabulary file to demo_gemma.
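For reference, a quick way to check whether fine-tuning changed the vocabulary is to compare tokenizer sizes. A minimal sketch; the fine-tuned model path is a placeholder:

```python
from transformers import AutoTokenizer

base = AutoTokenizer.from_pretrained("google/gemma-2-2b-it")
tuned = AutoTokenizer.from_pretrained("path/to/finetuned-model")  # placeholder path

# If the counts differ, the vocabulary was extended and demo_gemma
# needs a vocabulary file exported from the fine-tuned tokenizer.
print(len(base), len(tuned))
```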

@chenghuaWang
Contributor

Have you combined $\Delta W$ and $W_{\text{original}}$ into a new weight matrix $W$? As far as I know, the mllm model does not yet implement a low-rank branch. Therefore, you will need to merge the weights from the low-rank component into those of the original model to create a unified weight matrix $W = W_{\text{original}} + \Delta W$.

IIRC, some LoRA fine-tuning frameworks offer utilities to facilitate this process. For instance, the alpaca-lora framework might have relevant functions. It's also possible that BitsAndBytes provides similar functionality. You can find more information in the alpaca-lora repository, specifically in the file export_hf_checkpoint.py, around line 39: Link to GitHub.
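For instance, the PEFT library used by most QLoRA setups can fold the adapter into the base weights with merge_and_unload(). A minimal sketch, assuming a standard PEFT adapter checkpoint; the adapter and output paths are placeholders, and the base model is loaded in fp16 because the adapter cannot be merged directly into 4-bit quantized weights:

```python
import torch
from peft import PeftModel
from transformers import AutoModelForCausalLM

# Load the base model in fp16; merging into a bnb-int4 base is not supported.
base = AutoModelForCausalLM.from_pretrained(
    "google/gemma-2-2b-it", torch_dtype=torch.float16
)

# Attach the LoRA adapter, then fold delta_W = (alpha / r) * B @ A into W.
model = PeftModel.from_pretrained(base, "path/to/lora-adapter")  # placeholder
merged = model.merge_and_unload()

# Save a plain HF checkpoint that the mllm converter can consume.
merged.save_pretrained("path/to/merged-model")  # placeholder
```

The merged checkpoint can then be converted and quantized with the usual mllm tooling.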

@chenminjun-web
Author

Thanks for your help.
I downloaded gemma-q4_k from https://huggingface.co/mllmTeam/gemma-2b-mllm/tree/main and tested it with the mllm build on my Ubuntu 20.04; it works well.
But when I convert the Gemma2 model to mllm myself, it always outputs a single char (like ? or .) repeatedly.
I also tested the Gemma2-2B-IT model without QLoRA fine-tuning, and it has the same problem.
It seems there may be some problem with the convert process, but I did not see any error log.

@chenghuaWang
Contributor

chenghuaWang commented Sep 24, 2024

The Gemma implementation in mllm is v1.1. Gemma 2 shares a similar architectural foundation with the original Gemma models, but it introduces features such as logit soft-capping. You will need to modify the modeling_gemma.hpp file accordingly. Link to the file
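For reference, logit soft-capping smoothly bounds logits to $(-\text{cap}, \text{cap})$ via $\text{cap} \cdot \tanh(\text{logits} / \text{cap})$. A minimal PyTorch sketch of the operation a Gemma 2 port would need to add (the cap values below are taken from the released Gemma 2 config):

```python
import torch

def soft_cap(logits: torch.Tensor, cap: float) -> torch.Tensor:
    # Smoothly bounds logits to (-cap, cap): cap * tanh(logits / cap).
    return cap * torch.tanh(logits / cap)

# Gemma 2 applies this with cap = 50.0 to the attention logits (before softmax)
# and cap = 30.0 to the final LM-head logits, per the published config.
```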

cc @yirongjie pls add Gemma2 to our todo list
