How to quantify google/vit-base-patch16-224 pytorch_model.bin to int8 type with neural-compressor #1612
Hi @yingmuying,

Thanks for raising this issue.

You can use dynamic quantization for the model:

```python
from neural_compressor.config import PostTrainingQuantConfig
from neural_compressor import quantization

# Post-training dynamic quantization: weights are quantized ahead of time,
# activations are quantized on the fly at inference, so no calibration
# dataset is required.
config = PostTrainingQuantConfig(device='cpu', approach='dynamic', domain='auto')

# your_model is the FP32 model to be quantized
q_model = quantization.fit(your_model, config)
```

If you want to use other quantization methods, please refer to examples.
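For example, a minimal end-to-end sketch for this particular checkpoint could look as follows (assuming the transformers library is installed; the output directory name is illustrative):

```python
from transformers import ViTForImageClassification
from neural_compressor.config import PostTrainingQuantConfig
from neural_compressor import quantization

# Load the FP32 ViT checkpoint from the Hugging Face hub
model = ViTForImageClassification.from_pretrained("google/vit-base-patch16-224")

# Post-training dynamic quantization to INT8 on CPU
config = PostTrainingQuantConfig(device="cpu", approach="dynamic", domain="auto")
q_model = quantization.fit(model, config)

# Persist the quantized model (directory name is illustrative)
q_model.save("./vit-base-patch16-224-int8")
```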
Hi Kaihui,

First of all, thank you very much for your reply. I have only just started learning to use neural-compressor for quantization, so some of my questions may be rather basic. Following neural-compressor/examples/onnxrt/image_recognition/beit/quantization/ptq_static, I got the default flow running, but as soon as I try any other parameters I get errors. According to https://intel.github.io/neural-compressor/latest/docs/source/quantization.html, ONNX and PyTorch support both symmetric and asymmetric quantization, and the default ptq_static example uses static asymmetric quantization. I don't know how to configure symmetric quantization, and the meaning of many of the parameters is also unclear to me. I would appreciate your guidance. Thank you!

Best regards,
yingmuying
Hi @yingmuying, thanks for your reply. For static symmetric/asymmetric quantization, you can configure the scheme per operator by setting op_type_dict or op_name_dict in PostTrainingQuantConfig, or match all layers with the regular expression ".*" (see the sketch below). You can find more usage in specify-quantization-rules.
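A minimal sketch of what that could look like, assuming the 2.x PostTrainingQuantConfig API (the layer name "conv1", the placeholders your_model and your_dataloader, and forcing the "sym" scheme everywhere are illustrative):

```python
from neural_compressor.config import PostTrainingQuantConfig
from neural_compressor import quantization

# Force the symmetric scheme for one specific (illustrative) layer ...
op_name_dict = {
    "conv1": {
        "weight": {"scheme": ["sym"]},
        "activation": {"scheme": ["sym"]},
    }
}

# ... or match every layer: op_name_dict keys accept regular expressions,
# so ".*" applies the override to all operators.
op_name_dict = {
    ".*": {
        "weight": {"scheme": ["sym"]},
        "activation": {"scheme": ["sym"]},
    }
}

config = PostTrainingQuantConfig(approach="static", op_name_dict=op_name_dict)

# Static quantization needs calibration data; your_dataloader is a placeholder
q_model = quantization.fit(your_model, config, calib_dataloader=your_dataloader)
```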