I have a question about the structure of quantized ONNX models.
When we quantize an ONNX model to int8 or uint8 using static quantization, is it guaranteed that the first layer of the quantized model will always be a QuantizeLinear node? Or does this depend on the specific quantization method or the tool used for the quantization process?
I’m trying to understand whether this is a standardized behavior for models quantized with static quantization or if it varies based on implementation details.
Any insights, explanations, or references to relevant documentation would be greatly appreciated.
Thank you!