System Info
Ubuntu 20.04
CUDA 12.2.2
Python=3.11.9
transformers=4.44.2
bitsandbytes=0.43.3
GPU: A800
Reproduction
Expected behavior
When I load the model without `bitsandbytes`, the model is normal: the architecture is unmodified and the parameter count is as expected.
However, when I use `BitsAndBytesConfig` (with default parameters), the model architecture is modified and the number of parameters is halved. I expect that passing no arguments (i.e. using only the defaults) should behave the same as the normal case: no quantization and no modification of the model architecture.
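The comparison described above could be checked with a sketch like the following. It is not the original reproduction script from this issue: the helper names `count_parameters`, `parameter_ratio`, and `load_pair` are hypothetical, and the use of `AutoModelForCausalLM` is an assumption, since the model class and checkpoint are not stated here.

```python
def count_parameters(model):
    """Sum numel() over every tensor returned by model.parameters()."""
    return sum(p.numel() for p in model.parameters())


def parameter_ratio(baseline, candidate):
    """Return the candidate's parameter count divided by the baseline's."""
    return count_parameters(candidate) / count_parameters(baseline)


def load_pair(model_id):
    """Load one checkpoint without and with a default BitsAndBytesConfig.

    Assumption: a causal LM checkpoint; the issue does not name the model class.
    """
    # Imported lazily so the helpers above work without transformers installed.
    from transformers import AutoModelForCausalLM, BitsAndBytesConfig

    baseline = AutoModelForCausalLM.from_pretrained(model_id)
    with_config = AutoModelForCausalLM.from_pretrained(
        model_id,
        quantization_config=BitsAndBytesConfig(),  # no arguments: all defaults
    )
    return baseline, with_config
```

Calling `load_pair` with a model id of your choice and passing the two results to `parameter_ratio` gives a single number to compare: a ratio of 1.0 would mean the default config left the model untouched, while a ratio near 0.5 would match the behavior reported here.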