
FP16 and FP32 show 30% lower accuracy compared to INT8 for the ViT Example in ONNX_PTQ #106

Open

chjej202 opened this issue Nov 13, 2024 · 0 comments

chjej202 commented Nov 13, 2024

I followed the directions in the README.md file in the onnx_ptq directory.

I successfully obtained the vit_base_patch16_224.quant.onnx file and got the following evaluation accuracy:
The top-1 accuracy of the model is 84.51%.
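
For reference, the quantized model above was produced with the quantization step from the onnx_ptq README, roughly of the following form (the calibration-data path here is a placeholder for whatever the README's preprocessing step generates):

python -m modelopt.onnx.quantization --onnx_path=vit_base_patch16_224.onnx --quantize_mode=int8 --calibration_data=calib.npy --output_path=vit_base_patch16_224.quant.onnx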

To compare the quantized result with the FP16 and FP32 versions of the same network, I ran the following commands to measure their accuracy:

For FP32,

python evaluate.py --onnx_path=vit_base_patch16_224.onnx --imagenet_path=/data/imagenet  --quantize_mode=fp32   --model_name=vit_base_patch16_224

For FP16,

python evaluate.py --onnx_path=vit_base_patch16_224.onnx --imagenet_path=/data/imagenet  --quantize_mode=fp16   --model_name=vit_base_patch16_224

Instead of using the vit_base_patch16_224.quant.onnx file, I used the original ONNX file (vit_base_patch16_224.onnx, which can be downloaded with the download_example_onnx.py script) to build the engine and evaluate accuracy.

For FP32, I got the following accuracy:
The top-1 accuracy of the model is 58.16%.

For FP16, I got the following accuracy:
The top-1 accuracy of the model is 58.29%.

Both the FP16 and FP32 runs of the ViT network show roughly 30% lower accuracy than the INT8-quantized network (84.51%).

Why does this happen?
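
One way to narrow this down might be to run the original FP32 ONNX directly through ONNX Runtime on a few validation images and compare against the evaluate.py numbers, which would separate an engine-build problem from a model/preprocessing problem. Below is a minimal sketch, assuming timm's default preprocessing for vit_base_patch16_224 (224x224 input, mean/std of 0.5) and a hand-built list of (image, label) samples; both are assumptions, not the repo's evaluate.py logic.

```python
# Minimal sketch (not the repo's evaluate.py): run the FP32 ONNX under ONNX Runtime
# on a few validation images. The preprocessing below assumes timm's defaults for
# vit_base_patch16_224 (224x224 input, mean/std = 0.5); the sample list is a placeholder.
import numpy as np
import onnxruntime as ort
from PIL import Image

MEAN = np.array([0.5, 0.5, 0.5], dtype=np.float32)  # assumed timm ViT normalization
STD = np.array([0.5, 0.5, 0.5], dtype=np.float32)

def preprocess(path: str) -> np.ndarray:
    """Load an image and return a 1x3x224x224 float32 tensor."""
    img = Image.open(path).convert("RGB").resize((224, 224))
    x = np.asarray(img, dtype=np.float32) / 255.0
    x = (x - MEAN) / STD
    return x.transpose(2, 0, 1)[None]  # HWC -> NCHW, add batch dim

# Placeholder (path, label) pairs taken from the ImageNet validation set.
samples = [
    ("/data/imagenet/val/some_class/some_image.JPEG", 0),
]

sess = ort.InferenceSession("vit_base_patch16_224.onnx",
                            providers=["CPUExecutionProvider"])
input_name = sess.get_inputs()[0].name

correct = 0
for path, label in samples:
    logits = sess.run(None, {input_name: preprocess(path)})[0]
    correct += int(np.argmax(logits, axis=-1)[0] == label)

print(f"ONNX Runtime FP32 top-1 on this subset: {correct / len(samples):.4f}")
```

If ONNX Runtime reproduces the ~84% accuracy while the TensorRT FP32/FP16 engines do not, the problem would point to the engine build or the evaluation pipeline rather than the ONNX model itself.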
