NNCF Compressed Model Zoo

Ready-to-use compressed LLMs can be found on the OpenVINO Hugging Face page. Each model card lists the NNCF parameters that were used to compress the model.

INT8 Post-Training Quantization (PTQ) results for public Vision, NLP and GenAI models can be found on the OpenVINO Performance Benchmarks page. PTQ results for ONNX models are available in the ONNX section below.
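
For readers who want to reproduce an INT8 PTQ run rather than only read the results, the flow can be sketched roughly as follows. This is a minimal sketch, not the setup used to produce the published numbers: it assumes a data loader that yields `(images, labels)` batches, and the helper names `transform_fn` and `quantize_with_nncf` are illustrative. `nncf.Dataset` and `nncf.quantize` are the NNCF entry points for PTQ.

```python
def transform_fn(batch):
    """Keep only the model input from an (images, labels) batch.

    Assumption: the calibration loader yields 2-tuples; adapt this
    function if your loader yields dicts or single tensors.
    """
    images, _labels = batch
    return images


def quantize_with_nncf(model, data_loader, subset_size=300):
    """Return an INT8 post-training-quantized copy of `model`.

    NNCF is imported lazily so that this module can be imported
    (and `transform_fn` reused) on machines without NNCF installed.
    """
    import nncf

    # Wrap the loader so NNCF knows how to extract model inputs.
    calibration_dataset = nncf.Dataset(data_loader, transform_fn)
    # subset_size limits how many calibration samples are used.
    return nncf.quantize(model, calibration_dataset, subset_size=subset_size)
```

The calibration data does not need labels, which is why `transform_fn` discards them.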

Results for Quantization-Aware Training (QAT), sparsity and filter pruning of public PyTorch and TensorFlow models are listed below.

PyTorch
TensorFlow
ONNX

PyTorch

PyTorch Classification

Model	Compression algorithm	Dataset	Accuracy (drop) %	Configuration	Checkpoint
GoogLeNet	-	ImageNet	69.77	Config	-
GoogLeNet	• Filter pruning: 40%, geometric median criterion	ImageNet	69.47 (0.30)	Config	Download
Inception V3	-	ImageNet	77.33	Config	-
Inception V3	• QAT: INT8	ImageNet	77.45 (-0.12)	Config	Download
Inception V3	• QAT: INT8 • Sparsity: 61% (RB)	ImageNet	76.36 (0.97)	Config	Download
MobileNet V2	-	ImageNet	71.87	Config	-
MobileNet V2	• QAT: INT8	ImageNet	71.07 (0.80)	Config	Download
MobileNet V2	• QAT: INT8 (per-tensor only)	ImageNet	71.24 (0.63)	Config	Download
MobileNet V2	• QAT: Mixed, 58.88% INT8 / 41.12% INT4	ImageNet	70.95 (0.92)	Config	Download
MobileNet V2	• QAT: INT8 • Sparsity: 52% (RB)	ImageNet	71.09 (0.78)	Config	Download
MobileNet V3 (Small)	-	ImageNet	67.66	Config	-
MobileNet V3 (Small)	• QAT: INT8	ImageNet	66.98 (0.68)	Config	Download
ResNet-18	• Filter pruning: 40%, magnitude criterion	ImageNet	69.27 (0.49)	Config	Download
ResNet-18	• Filter pruning: 40%, geometric median criterion	ImageNet	69.31 (0.45)	Config	Download
ResNet-18	• Accuracy-aware compressed training • Filter pruning: 60%, geometric median criterion	ImageNet	69.2 (-0.6)	Config	-
ResNet-34	-	ImageNet	73.30	Config	-
ResNet-34	• Filter pruning: 50%, geometric median criterion • Knowledge distillation	ImageNet	73.11 (0.19)	Config	Download
ResNet-50	-	ImageNet	76.15	Config	-
ResNet-50	• QAT: INT8	ImageNet	76.46 (-0.31)	Config	Download
ResNet-50	• QAT: INT8 (per-tensor only)	ImageNet	76.39 (-0.24)	Config	Download
ResNet-50	• QAT: Mixed, 43.12% INT8 / 56.88% INT4	ImageNet	76.05 (0.10)	Config	Download
ResNet-50	• QAT: INT8 • Sparsity: 61% (RB)	ImageNet	75.42 (0.73)	Config	Download
ResNet-50	• QAT: INT8 • Sparsity: 50% (RB)	ImageNet	75.50 (0.65)	Config	Download
ResNet-50	• Filter pruning: 40%, geometric median criterion	ImageNet	75.57 (0.58)	Config	Download
ResNet-50	• Accuracy-aware compressed training • Filter pruning: 52.5%, geometric median criterion	ImageNet	75.23 (0.93)	Config	-
SqueezeNet V1.1	-	ImageNet	58.19	Config	-
SqueezeNet V1.1	• QAT: INT8	ImageNet	58.22 (-0.03)	Config	Download
SqueezeNet V1.1	• QAT: INT8 (per-tensor only)	ImageNet	58.11 (0.08)	Config	Download
SqueezeNet V1.1	• QAT: Mixed, 52.83% INT8 / 47.17% INT4	ImageNet	57.57 (0.62)	Config	Download
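
The value in parentheses in the accuracy column is consistent with the baseline accuracy minus the compressed accuracy, so a negative value means the compressed model scored higher than its baseline. A minimal Python sketch that checks this against three rows of the table above; the function name `accuracy_drop` is illustrative and not part of NNCF:

```python
def accuracy_drop(baseline: float, compressed: float) -> float:
    """Drop in percentage points: positive means accuracy was lost."""
    return round(baseline - compressed, 2)


# (model, baseline accuracy, compressed accuracy) taken from the table above.
rows = [
    ("Inception V3, QAT INT8", 77.33, 77.45),
    ("MobileNet V2, QAT INT8", 71.87, 71.07),
    ("ResNet-50, QAT mixed INT8/INT4", 76.15, 76.05),
]

for name, baseline, compressed in rows:
    print(f"{name}: {compressed:.2f} ({accuracy_drop(baseline, compressed):.2f})")
```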

PyTorch Object Detection

Model	Compression algorithm	Dataset	mAP (drop) %	Configuration	Checkpoint
SSD300‑MobileNet	-	VOC12+07 train, VOC07 eval	62.23	Config	Download
SSD300‑MobileNet	• QAT: INT8 • Sparsity: 70% (Magnitude)	VOC12+07 train, VOC07 eval	62.95 (-0.72)	Config	Download
SSD300‑VGG‑BN	-	VOC12+07 train, VOC07 eval	78.28	Config	Download
SSD300‑VGG‑BN	• QAT: INT8	VOC12+07 train, VOC07 eval	77.81 (0.47)	Config	Download
SSD300‑VGG‑BN	• QAT: INT8 • Sparsity: 70% (Magnitude)	VOC12+07 train, VOC07 eval	77.66 (0.62)	Config	Download
SSD300‑VGG‑BN	• Filter pruning: 40%, geometric median criterion	VOC12+07 train, VOC07 eval	78.35 (-0.07)	Config	Download
SSD512‑VGG‑BN	-	VOC12+07 train, VOC07 eval	80.26	Config	Download
SSD512‑VGG‑BN	• QAT: INT8	VOC12+07 train, VOC07 eval	80.04 (0.22)	Config	Download
SSD512‑VGG‑BN	• QAT: INT8 • Sparsity: 70% (Magnitude)	VOC12+07 train, VOC07 eval	79.68 (0.58)	Config	Download

PyTorch Semantic Segmentation

Model	Compression algorithm	Dataset	mIoU (drop) %	Configuration	Checkpoint
ICNet	-	CamVid	67.89	Config	Download
ICNet	• QAT: INT8	CamVid	67.89 (0.00)	Config	Download
ICNet	• QAT: INT8 • Sparsity: 60% (Magnitude)	CamVid	67.16 (0.73)	Config	Download
UNet	-	CamVid	71.95	Config	Download
UNet	• QAT: INT8	CamVid	71.89 (0.06)	Config	Download
UNet	• QAT: INT8 • Sparsity: 60% (Magnitude)	CamVid	72.46 (-0.51)	Config	Download
UNet	-	Mapillary	56.24	Config	Download
UNet	• QAT: INT8	Mapillary	56.09 (0.15)	Config	Download
UNet	• QAT: INT8 • Sparsity: 60% (Magnitude)	Mapillary	55.69 (0.55)	Config	Download
UNet	• Filter pruning: 25%, geometric median criterion	Mapillary	55.64 (0.60)	Config	Download

PyTorch NLP (HuggingFace Transformers-powered models)

PyTorch Model	Compression algorithm	Dataset	Accuracy (drop) %
BERT-base-cased	• QAT: INT8	CoNLL2003	99.18 (-0.01)
BERT-base-cased	• QAT: INT8	MRPC	84.8 (-0.24)
BERT-base-chinese	• QAT: INT8	XNLI	77.22 (0.46)
BERT-large (Whole Word Masking)	• QAT: INT8	SQuAD v1.1	F1: 92.68 (0.53)
DistilBERT-base	• QAT: INT8	SST-2	90.3 (0.8)
GPT-2	• QAT: INT8	WikiText-2 (raw)	perplexity: 20.9 (-1.17)
MobileBERT	• QAT: INT8	SQuAD v1.1	F1: 89.4 (0.58)
RoBERTa-large	• QAT: INT8	MNLI	matched: 89.25 (1.35)

TensorFlow

TensorFlow Classification

Model	Compression algorithm	Dataset	Accuracy (drop) %	Configuration	Checkpoint
Inception V3	-	ImageNet	77.91	Config	-
Inception V3	• QAT: INT8 (per-tensor symmetric for weights, per-tensor asymmetric half-range for activations)	ImageNet	78.39 (-0.48)	Config	Download
Inception V3	• QAT: INT8 (per-tensor symmetric for weights, per-tensor asymmetric half-range for activations) • Sparsity: 61% (RB)	ImageNet	77.52 (0.39)	Config	Download
Inception V3	• Sparsity: 54% (Magnitude)	ImageNet	77.86 (0.05)	Config	Download
MobileNet V2	-	ImageNet	71.85	Config	-
MobileNet V2	• QAT: INT8 (per-tensor symmetric for weights, per-tensor asymmetric half-range for activations)	ImageNet	71.63 (0.22)	Config	Download
MobileNet V2	• QAT: INT8 (per-tensor symmetric for weights, per-tensor asymmetric half-range for activations) • Sparsity: 52% (RB)	ImageNet	70.94 (0.91)	Config	Download
MobileNet V2	• Sparsity: 50% (RB)	ImageNet	71.34 (0.51)	Config	Download
MobileNet V2 (TensorFlow Hub MobileNet V2)	• Sparsity: 35% (Magnitude)	ImageNet	71.87 (-0.02)	Config	Download
MobileNet V3 (Large)	-	ImageNet	75.80	Config	-
MobileNet V3 (Large)	• QAT: INT8 (per-channel symmetric for weights, per-tensor asymmetric half-range for activations)	ImageNet	75.04 (0.76)	Config	Download
MobileNet V3 (Large)	• QAT: INT8 (per-channel symmetric for weights, per-tensor asymmetric half-range for activations) • Sparsity: 42% (RB)	ImageNet	75.24 (0.56)	Config	Download
MobileNet V3 (Small)	-	ImageNet	68.38	Config	-
MobileNet V3 (Small)	• QAT: INT8 (per-channel symmetric for weights, per-tensor asymmetric half-range for activations)	ImageNet	67.79 (0.59)	Config	Download
MobileNet V3 (Small)	• QAT: INT8 (per-channel symmetric for weights, per-tensor asymmetric half-range for activations) • Sparsity: 42% (Magnitude)	ImageNet	67.44 (0.94)	Config	Download
ResNet-50	-	ImageNet	75.05	Config	-
ResNet-50	• QAT: INT8	ImageNet	74.99 (0.06)	Config	Download
ResNet-50	• QAT: INT8 (per-tensor symmetric for weights, per-tensor asymmetric half-range for activations) • Sparsity: 65% (RB)	ImageNet	74.36 (0.69)	Config	Download
ResNet-50	• Sparsity: 80% (RB)	ImageNet	74.38 (0.67)	Config	Download
ResNet-50	• Filter pruning: 40%, geometric median criterion	ImageNet	74.96 (0.09)	Config	Download
ResNet-50	• QAT: INT8 (per-tensor symmetric for weights, per-tensor asymmetric half-range for activations) • Filter pruning: 40%, geometric median criterion	ImageNet	75.09 (-0.04)	Config	Download
ResNet-50	• Accuracy-aware compressed training • Sparsity: 65% (Magnitude)	ImageNet	74.37 (0.67)	Config	-

TensorFlow Object Detection

Model	Compression algorithm	Dataset	mAP (drop) %	Configuration	Checkpoint
RetinaNet	-	COCO 2017	33.43	Config	Download
RetinaNet	• QAT: INT8 (per-tensor symmetric for weights, per-tensor asymmetric half-range for activations)	COCO 2017	33.12 (0.31)	Config	Download
RetinaNet	• Sparsity: 50% (Magnitude)	COCO 2017	33.10 (0.33)	Config	Download
RetinaNet	• Filter pruning: 40%	COCO 2017	32.72 (0.71)	Config	Download
RetinaNet	• QAT: INT8 (per-tensor symmetric for weights, per-tensor asymmetric half-range for activations) • Filter pruning: 40%	COCO 2017	32.67 (0.76)	Config	Download
YOLO v4	-	COCO 2017	47.07	Config	Download
YOLO v4	• QAT: INT8 (per-channel symmetric for weights, per-tensor asymmetric half-range for activations)	COCO 2017	46.20 (0.87)	Config	Download
YOLO v4	• Sparsity: 50% (Magnitude)	COCO 2017	46.49 (0.58)	Config	Download

TensorFlow Instance Segmentation

Model	Compression algorithm	Dataset	mAP (drop) %	Configuration	Checkpoint
Mask‑R‑CNN	-	COCO 2017	bbox: 37.33 segm: 33.56	Config	Download
Mask‑R‑CNN	• QAT: INT8 (per-tensor symmetric for weights, per-tensor asymmetric half-range for activations)	COCO 2017	bbox: 37.19 (0.14) segm: 33.54 (0.02)	Config	Download
Mask‑R‑CNN	• Sparsity: 50% (Magnitude)	COCO 2017	bbox: 36.94 (0.39) segm: 33.23 (0.33)	Config	Download

ONNX

ONNX Classification

ONNX Model	Compression algorithm	Dataset	Accuracy (drop) %
DenseNet-121	PTQ	ImageNet	60.16 (0.8)
GoogLeNet	PTQ	ImageNet	66.36 (0.3)
MobileNet V2	PTQ	ImageNet	71.38 (0.49)
ResNet-50	PTQ	ImageNet	74.63 (0.21)
ShuffleNet	PTQ	ImageNet	47.25 (0.18)
SqueezeNet V1.0	PTQ	ImageNet	54.3 (0.54)
VGG‑16	PTQ	ImageNet	72.02 (0.0)
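
The ONNX tables list only the compressed results, with no uncompressed baseline rows. Assuming the value in parentheses follows the same baseline-minus-compressed convention as the PyTorch tables (the source does not state this for ONNX), the implied baseline can be recovered as in the sketch below. The function name `implied_baseline` is illustrative.

```python
def implied_baseline(compressed: float, drop: float) -> float:
    """Baseline accuracy implied by a 'compressed (drop)' table entry.

    Assumption: drop = baseline - compressed, in percentage points.
    """
    return round(compressed + drop, 2)


# (compressed accuracy, drop) copied from three rows of the table above.
onnx_ptq_rows = {
    "DenseNet-121": (60.16, 0.8),
    "MobileNet V2": (71.38, 0.49),
    "ResNet-50": (74.63, 0.21),
}

for name, (compressed, drop) in onnx_ptq_rows.items():
    print(f"{name}: implied baseline {implied_baseline(compressed, drop):.2f}")
```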

ONNX Object Detection

ONNX Model	Compression algorithm	Dataset	mAP (drop) %
SSD1200	PTQ	COCO 2017	20.17 (0.17)
Tiny-YOLOv2	PTQ	VOC12	29.03 (0.23)
