
Cannot load decoder.lm_head.weight when loading 4 bit quantized model using VisionEncoderDecoder.from_pretrained #1343

Open
AditiJain14 opened this issue Aug 30, 2024 · 1 comment


@AditiJain14

System Info

I am trying to load a Donut model that was fine-tuned and then quantized to 4-bit. While save_pretrained works fine, when I try to load the quantized model (at quant_path) as
model = VisionEncoderDecoderModel.from_pretrained(quant_path, load_in_4bit=True), it loads all of the parameters correctly except decoder.lm_head.weight, which is instead reset. I am unable to find the cause of this issue, and it happens both when (1) I load the quantized model and (2) I load the fine-tuned checkpoint with the load_in_4bit argument.

I have tried the same steps with the 'naver-clova-ix/donut-base' model from Hugging Face and it works fine. Any help would be much appreciated!
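For reference, a minimal sketch of the check (quant_path and finetuned_model are the placeholders used above; the lm_head module is normally left unquantized by the bitsandbytes integration, so a direct comparison should be valid):

import torch
from transformers import VisionEncoderDecoderModel

reloaded = VisionEncoderDecoderModel.from_pretrained(quant_path, load_in_4bit=True)
# Compare the fine-tuned lm_head against the reloaded one
preserved = torch.allclose(
    finetuned_model.decoder.lm_head.weight.detach().float().cpu(),
    reloaded.decoder.lm_head.weight.detach().float().cpu(),
)
print("decoder.lm_head.weight preserved:", preserved)  # observed here: False (weight is reset)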

Reproduction

from transformers import VisionEncoderDecoderModel

# finetuned_model / finetuned_path: the fine-tuned Donut model and the directory it is saved to
finetuned_model.save_pretrained(finetuned_path, safe_serialization=False)  # safe_serialization=True discards lm_head.weight
model = VisionEncoderDecoderModel.from_pretrained(finetuned_path, load_in_4bit=True)
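For completeness, a sketch of an alternative load path (not a confirmed fix): pass an explicit BitsAndBytesConfig instead of the load_in_4bit shortcut, then re-tie the decoder weights, on the assumption that Donut's MBart decoder ties lm_head.weight to its input embeddings (which would also explain why safe serialization drops the tensor). finetuned_path is the same placeholder as above.

import torch
from transformers import BitsAndBytesConfig, VisionEncoderDecoderModel

bnb_config = BitsAndBytesConfig(load_in_4bit=True, bnb_4bit_compute_dtype=torch.float16)
model = VisionEncoderDecoderModel.from_pretrained(finetuned_path, quantization_config=bnb_config)

# If the decoder's config ties word embeddings, re-tying points lm_head at the
# (intact) input embedding weights, which may restore the fine-tuned values
model.decoder.tie_weights()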

Expected behavior

The model is loaded with decoder.lm_head.weight taken from the fine-tuned checkpoint.

@AditiJain14
Author

To add: on inspecting the model.safetensors file saved by the save_pretrained method, the state_dict does not contain "decoder.lm_head.weight".
Somehow, this does not cause a problem when calling from_pretrained without the load_in_4bit argument.
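A quick way to check what actually got serialized (sketch; finetuned_path is the placeholder checkpoint directory from above, and the file name assumes the safetensors format was used):

from safetensors import safe_open

with safe_open(f"{finetuned_path}/model.safetensors", framework="pt") as f:
    keys = list(f.keys())
print("decoder.lm_head.weight" in keys)  # observed here: False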
