
# Validated Models

## Validated MLPerf Models

| Model | Framework | Support | Example |
|-------|-----------|---------|---------|
| ResNet50 V1.5 | TensorFlow | Yes | Link |
| ResNet50 V1.5 | PyTorch | Yes | Link |
| DLRM | PyTorch | Yes | Link |
| BERT large | TensorFlow | Yes | Link |
| BERT large | PyTorch | Yes | Link |
| SSD ResNet34 | TensorFlow | Yes | Link |
| SSD ResNet34 | PyTorch | Yes | Link |
| RNN-T | PyTorch | Yes | Link |
| 3D-UNet | TensorFlow | Yes | Link |
| 3D-UNet | PyTorch | Yes | Link |

## Validated Quantization Examples

Performance results were tested on 07/26/2022 with an Intel Xeon Platinum 8380 Scalable processor, using 1 socket, 4 cores/instance, 10 instances, and batch size 1.

Performance varies by use, configuration, and other factors. See platform configuration for configuration details. For more complete information about performance and benchmark results, visit www.intel.com/benchmarks.
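
The ratio columns in the tables below are computed exactly as their headers state: accuracy ratio = (INT8-FP32)/FP32 and performance ratio = INT8/FP32. As a quick sanity check, this minimal Python snippet recomputes both ratios from the ResNet50 V1.5 row of the TensorFlow table:

```python
# Recompute the two ratios used throughout the tables below,
# using the ResNet50 V1.5 row of the TensorFlow table as input.
int8_acc, fp32_acc = 76.22, 76.46      # accuracy (%)
int8_thr, fp32_thr = 1355.03, 423.41   # throughput (samples/sec)

acc_ratio = (int8_acc - fp32_acc) / fp32_acc  # (INT8-FP32)/FP32
perf_ratio = int8_thr / fp32_thr              # INT8/FP32

print(f"accuracy ratio: {acc_ratio:.2%}")     # -0.31%
print(f"performance ratio: {perf_ratio:.2f}x")  # 3.20x
```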

### TensorFlow models with Intel TensorFlow 2.9.1

| Model | INT8 Accuracy | FP32 Accuracy | Accuracy Ratio [(INT8-FP32)/FP32] | INT8 Throughput (samples/sec) | FP32 Throughput (samples/sec) | Performance Ratio [INT8/FP32] | Example |
|-------|---------------|---------------|-----------------------------------|-------------------------------|-------------------------------|-------------------------------|---------|
| BERT large SQuAD | 92.43 | 92.99 | -0.60% | 25.37 | 12.55 | 2.02x | pb |
| DenseNet121 | 73.57% | 72.89% | 0.93% | 368.05 | 328.84 | 1.12x | pb |
| DenseNet161 | 76.24% | 76.29% | -0.07% | 219.08 | 179.2 | 1.22x | pb |
| DenseNet169 | 74.40% | 74.65% | -0.33% | 295.26 | 260.27 | 1.13x | pb |
| Faster R-CNN Inception ResNet V2 | 37.98% | 38.33% | -0.91% | 3.97 | 2.34 | 1.70x | pb |
| Faster R-CNN Inception ResNet V2 | 37.84% | 38.33% | -1.28% | 4 | 2.32 | 1.73x | SavedModel |
| Faster R-CNN ResNet101 | 30.28% | 30.39% | -0.36% | 70.32 | 20.19 | 3.48x | pb |
| Faster R-CNN ResNet101 | 30.37% | 30.39% | -0.07% | 70.3 | 17.1 | 4.11x | SavedModel |
| Faster R-CNN ResNet50 | 26.57% | 26.59% | -0.08% | 83.32 | 24.38 | 3.42x | pb |
| Inception ResNet V2 | 80.28% | 80.40% | -0.15% | 287.05 | 136.78 | 2.10x | pb |
| Inception V1 | 70.48% | 69.74% | 1.06% | 2208.41 | 977.76 | 2.26x | pb |
| Inception V2 | 74.36% | 73.97% | 0.53% | 1847.6 | 828.03 | 2.23x | pb |
| Inception V3 | 76.71% | 76.75% | -0.05% | 1036.18 | 373.61 | 2.77x | pb |
| Inception V4 | 80.20% | 80.27% | -0.09% | 592.46 | 206.96 | 2.86x | pb |
| Mask R-CNN Inception V2 | 28.53% | 28.73% | -0.70% | 132.07 | 51 | 2.59x | pb |
| Mask R-CNN Inception V2 | 28.53% | 28.73% | -0.70% | 132.41 | 50.94 | 2.60x | ckpt |
| MobileNet V1 | 71.79% | 70.96% | 1.17% | 3603.94 | 1304.58 | 2.76x | pb |
| MobileNet V2 | 71.89% | 71.76% | 0.18% | 2433.87 | 1446.1 | 1.68x | pb |
| ResNet101 | 77.50% | 76.45% | 1.37% | 874.26 | 356.84 | 2.45x | pb |
| ResNet50 Fashion | 78.06% | 78.12% | -0.08% | 3776.14 | 2160.52 | 1.75x | pb |
| ResNet50 V1.0 | 74.11% | 74.27% | -0.22% | 1511.74 | 459.43 | 3.29x | pb |
| ResNet50 V1.5 | 76.22% | 76.46% | -0.31% | 1355.03 | 423.41 | 3.20x | pb |
| ResNet V2 101 | 72.67% | 71.87% | 1.11% | 436.34 | 323.15 | 1.35x | pb |
| ResNet V2 152 | 73.03% | 72.37% | 0.91% | 311.93 | 222.83 | 1.40x | pb |
| ResNet V2 50 | 70.33% | 69.64% | 0.99% | 766.83 | 574.76 | 1.33x | pb |
| SSD MobileNet V1 | 22.97% | 23.13% | -0.69% | 959.72 | 586.21 | 1.64x | pb |
| SSD MobileNet V1 | 22.99% | 23.13% | -0.61% | 953.4 | 412.06 | 2.31x | ckpt |
| SSD ResNet34 | 21.69% | 22.09% | -1.81% | 44.53 | 11.86 | 3.75x | pb |
| SSD ResNet50 V1 | 37.86% | 38.00% | -0.37% | 69.09 | 25.93 | 2.66x | pb |
| SSD ResNet50 V1 | 37.81% | 38.00% | -0.50% | 69.02 | 21.06 | 3.28x | ckpt |
| VGG16 | 72.66% | 70.89% | 2.50% | 660.05 | 177.23 | 3.72x | pb |
| VGG19 | 72.72% | 71.01% | 2.41% | 560.39 | 147.27 | 3.81x | pb |
| Wide & Deep | 77.62% | 77.67% | -0.07% | 23329.18 | 20930.18 | 1.11x | pb |
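
The Example column records the format of the reference model: a frozen graph (pb), a SavedModel directory, or a checkpoint (ckpt). As an illustration only, with hypothetical file paths, the two most common formats load differently in TF 2.x:

```python
import tensorflow as tf

# SavedModel directory: the TF2-native format (hypothetical path).
saved = tf.saved_model.load("resnet50_v1_5_savedmodel/")

# Frozen graph (.pb): a serialized GraphDef, loaded via the TF1 compat API.
graph_def = tf.compat.v1.GraphDef()
with tf.io.gfile.GFile("resnet50_v1_5_fp32.pb", "rb") as f:
    graph_def.ParseFromString(f.read())
with tf.Graph().as_default() as graph:
    tf.compat.v1.import_graph_def(graph_def, name="")
```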

### PyTorch models with Torch 1.12.0+cpu in PTQ mode

| Model | INT8 Accuracy | FP32 Accuracy | Accuracy Ratio [(INT8-FP32)/FP32] | INT8 Throughput (samples/sec) | FP32 Throughput (samples/sec) | Performance Ratio [INT8/FP32] | Example |
|-------|---------------|---------------|-----------------------------------|-------------------------------|-------------------------------|-------------------------------|---------|
| ALBERT base MRPC | 88.85% | 88.50% | 0.40% | 34.46 | 26.88 | 1.28x | eager |
| Barthez MRPC | 83.92% | 83.81% | 0.14% | 161.06 | 89.61 | 1.80x | eager |
| BERT base COLA | 58.80% | 58.84% | -0.07% | 262.88 | 125.63 | 2.09x | fx |
| BERT base MRPC | 89.90% | 90.69% | -0.88% | 244.27 | 125.28 | 1.95x | fx |
| BERT base RTE | 69.31% | 69.68% | -0.52% | 259.21 | 125.72 | 2.06x | fx |
| BERT base SST2 | 91.06% | 91.86% | -0.87% | 262.73 | 125.69 | 2.09x | fx |
| BERT base STSB | 89.10% | 89.75% | -0.72% | 254.36 | 125.9 | 2.02x | fx |
| BERT large COLA | 64.12% | 62.57% | 2.48% | 89.36 | 36.47 | 2.45x | fx |
| BERT large MRPC | 89.50% | 90.38% | -0.97% | 88.92 | 36.55 | 2.43x | fx |
| BERT large QNLI | 90.90% | 91.82% | -1.00% | 90.39 | 36.63 | 2.47x | fx |
| CamemBERT base MRPC | 86.70% | 86.82% | -0.14% | 236.6 | 121.81 | 1.94x | fx |
| Deberta MRPC | 90.88% | 90.91% | -0.04% | 149.76 | 84.72 | 1.77x | eager |
| DistilBERT base MRPC | 88.23% | 89.16% | -1.05% | 426.4 | 246.13 | 1.73x | eager |
| FlauBERT MRPC | 79.87% | 80.19% | -0.40% | 675.82 | 437.72 | 1.54x | eager |
| Inception V3 | 69.43% | 69.52% | -0.13% | 490.32 | 209.87 | 2.34x | eager |
| Longformer MRPC | 91.01% | 91.46% | -0.49% | 20.36 | 16.65 | 1.22x | eager |
| mBart WNLI | 56.34% | 56.34% | 0.00% | 66.23 | 30.86 | 2.15x | eager |
| lvwerra/pegasus-samsum | 42.39 | 42.67 | -0.67% | 3.86 | 1.14 | 3.38x | eager |
| PeleeNet | 71.64% | 72.10% | -0.64% | 511.56 | 387.9 | 1.32x | eager |
| ResNet18 | 69.57% | 69.76% | -0.27% | 823.22 | 386.93 | 2.13x | eager |
| ResNet18 | 69.57% | 69.76% | -0.28% | 816.8 | 385.23 | 2.12x | fx |
| ResNet50 | 75.98% | 76.15% | -0.21% | 515.14 | 204 | 2.53x | eager |
| ResNeXt101_32x8d | 79.08% | 79.31% | -0.29% | 210.39 | 74.87 | 2.81x | eager |
| RNNT | 92.48 | 92.55 | -0.08% | 74.17 | 20.38 | 3.64x | eager |
| Roberta Base MRPC | 88.25% | 88.18% | 0.08% | 245.05 | 123.53 | 1.98x | eager |
| Se_ResNeXt50_32x4d | 78.98% | 79.08% | -0.13% | 370.11 | 172.45 | 2.15x | eager |
| SqueezeBERT MRPC | 86.87% | 87.65% | -0.89% | 241.25 | 206.03 | 1.17x | eager |
| Transfo-xl MRPC | 81.97% | 81.20% | 0.94% | 11.2 | 8.31 | 1.35x | eager |
| xlm-roberta-base_MRPC | 88.03% | 88.62% | -0.67% | 140.58 | 122.29 | 1.15x | eager |
| YOLOv3 | 24.60% | 24.54% | 0.21% | 110.54 | 39.46 | 2.80x | eager |
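
In the Example column, eager and fx name the two PyTorch quantization workflows (module swapping vs. graph tracing). The rows above come from the repository's example recipes; as a rough orientation only, here is a minimal sketch of eager-mode post-training static quantization in stock PyTorch, with a toy model standing in for the networks above:

```python
import torch
import torch.nn as nn
import torch.quantization as tq

# Toy stand-in for the FP32 models in the table; QuantStub/DeQuantStub
# mark where tensors cross the float<->INT8 boundary in eager mode.
class TinyNet(nn.Module):
    def __init__(self):
        super().__init__()
        self.quant = tq.QuantStub()
        self.fc = nn.Linear(16, 4)
        self.relu = nn.ReLU()
        self.dequant = tq.DeQuantStub()

    def forward(self, x):
        return self.dequant(self.relu(self.fc(self.quant(x))))

model = TinyNet().eval()
model.qconfig = tq.get_default_qconfig("fbgemm")  # x86 backend
prepared = tq.prepare(model)                      # insert observers

# Calibration: run representative data so observers record activation ranges.
with torch.no_grad():
    for _ in range(8):
        prepared(torch.randn(32, 16))

model_int8 = tq.convert(prepared)  # swap observed modules for INT8 kernels
print(model_int8)
```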

### PyTorch models with Torch 1.12.0+cpu in QAT mode

| Model | INT8 Accuracy | FP32 Accuracy | Accuracy Ratio [(INT8-FP32)/FP32] | INT8 Throughput (samples/sec) | FP32 Throughput (samples/sec) | Performance Ratio [INT8/FP32] | Example |
|-------|---------------|---------------|-----------------------------------|-------------------------------|-------------------------------|-------------------------------|---------|
| ResNet18 | 69.84% | 69.76% | 0.11% | 805.76 | 389.14 | 2.07x | eager |
| ResNet18 | 69.74% | 69.76% | -0.03% | 822.99 | 391.82 | 2.10x | fx |
| BERT base MRPC QAT | 89.70% | 89.50% | 0.22% | 173.83 | 107.22 | 1.62x | fx |
| ResNet50 | 76.05% | 76.15% | -0.13% | 500.54 | 195.5 | 2.56x | eager |
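
QAT differs from PTQ by inserting fake-quantization modules and fine-tuning, so the weights learn to tolerate INT8 rounding; this is why QAT accuracy often matches or exceeds the FP32 baseline, as in the ResNet18 rows above. A minimal stock-PyTorch sketch with a toy model and a dummy training loop (placeholders, not the recipes used above):

```python
import torch
import torch.nn as nn
import torch.quantization as tq

# Toy stand-in model; QuantStub/DeQuantStub mark the INT8 boundaries.
model = nn.Sequential(tq.QuantStub(), nn.Linear(16, 4), nn.ReLU(), tq.DeQuantStub())
model.train()
model.qconfig = tq.get_default_qat_qconfig("fbgemm")
prepared = tq.prepare_qat(model)  # insert fake-quant + observer modules

# Placeholder fine-tuning loop: use the real task loss in practice.
opt = torch.optim.SGD(prepared.parameters(), lr=1e-3)
for _ in range(10):
    loss = prepared(torch.randn(32, 16)).square().mean()  # dummy loss
    opt.zero_grad()
    loss.backward()
    opt.step()

model_int8 = tq.convert(prepared.eval())  # finalize to real INT8 modules
```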

### PyTorch models with Torch 1.12.0+cpu IPEX

Throughput (samples/sec) was measured with 1 socket, 10 instances, and 4 cores/instance (1s 10ins 4c/ins), at batch sizes 8 and 1.

| Model | INT8 Accuracy | FP32 Accuracy | Accuracy Ratio [(INT8-FP32)/FP32] | INT8 Throughput (bs=8) | FP32 Throughput (bs=8) | Performance Ratio [INT8/FP32] (bs=8) | INT8 Throughput (bs=1) | FP32 Throughput (bs=1) | Performance Ratio [INT8/FP32] (bs=1) | Example |
|-------|---------------|---------------|-----------------------------------|------------------------|------------------------|--------------------------------------|------------------------|------------------------|--------------------------------------|---------|
| SSD ResNet34 | 19.99% | 20.00% | -0.06% | 11.19 | 8.72 | 1.28x | 11.63 | 9.82 | 1.18x | ipex |

### PyTorch models with Torch 1.11.0+cpu IPEX

Throughput (samples/sec) was measured with 1 socket, 10 instances, and 4 cores/instance (1s 10ins 4c/ins), at batch sizes 64 and 1.

| Model | INT8 Accuracy | FP32 Accuracy | Accuracy Ratio [(INT8-FP32)/FP32] | INT8 Throughput (bs=64) | FP32 Throughput (bs=64) | Performance Ratio [INT8/FP32] (bs=64) | INT8 Throughput (bs=1) | FP32 Throughput (bs=1) | Performance Ratio [INT8/FP32] (bs=1) | Example |
|-------|---------------|---------------|-----------------------------------|-------------------------|-------------------------|---------------------------------------|------------------------|------------------------|--------------------------------------|---------|
| ResNet18 | 69.48% | 69.76% | -0.40% | 3968.05 | 1182.36 | 3.36x | 2995.8 | 1140.63 | 2.63x | ipex |
| ResNeXt101_32x16d_wsl | 84.26% | 84.17% | 0.11% | 195.72 | 51.28 | 3.82x | 178.01 | 59.12 | 3.01x | ipex |
| ResNet50 | 76.07% | 76.15% | -0.10% | 1731.8 | 474.24 | 3.65x | 1347.64 | 518.39 | 2.60x | ipex |

### ONNX models with ONNX Runtime 1.11.0

| Model | INT8 Accuracy | FP32 Accuracy | Accuracy Ratio [(INT8-FP32)/FP32] | INT8 Throughput (samples/sec) | FP32 Throughput (samples/sec) | Performance Ratio [INT8/FP32] | Example |
|-------|---------------|---------------|-----------------------------------|-------------------------------|-------------------------------|-------------------------------|---------|
| AlexNet | 54.74% | 54.79% | -0.09% | 1498.07 | 649.99 | 2.30x | qdq |
| BERT base MRPC DYNAMIC | 85.54% | 86.03% | -0.57% | 381.45 | 156.05 | 2.44x | qlinearops |
| BERT base MRPC STATIC | 85.29% | 86.03% | -0.86% | 766.09 | 316.6 | 2.42x | qlinearops |
| BERT SQuAD | 80.44 | 80.67 | -0.29% | 116.88 | 64.59 | 1.81x | qlinearops |
| BERT SQuAD | 80.44 | 80.67 | -0.29% | 116.93 | 64.64 | 1.81x | qdq |
| BiDAF | 65.92% | 66.08% | -0.24% | 1468.58 | 1406.21 | 1.04x | qlinearops |
| CaffeNet | 56.26% | 56.30% | -0.07% | 2750.7 | 812.73 | 3.38x | qdq |
| DistilBERT base MRPC | 84.56% | 84.56% | 0.00% | 1654.95 | 595.32 | 2.78x | qlinearops |
| EfficientNet | 77.58% | 77.70% | -0.15% | 2066.02 | 1096.86 | 1.88x | qlinearops |
| FCN | 64.66% | 64.98% | -0.49% | 15.13 | 7.2 | 2.10x | qlinearops |
| GoogleNet | 67.67% | 67.79% | -0.18% | 1174.98 | 807.68 | 1.45x | qdq |
| Inception V1 | 67.23% | 67.24% | -0.01% | 1181.71 | 831.01 | 1.42x | qdq |
| MobileBERT MRPC | 86.03% | 86.27% | -0.28% | 774.96 | 678.66 | 1.14x | qlinearops |
| MobileBERT SQuAD MLPerf | 89.84 | 90.03 | -0.20% | 104.51 | 94.88 | 1.10x | qlinearops |
| MobileNet V2 | 65.47% | 66.89% | -2.12% | 5172.04 | 3312.76 | 1.56x | qlinearops |
| MobileNet V3 MLPerf | 75.59% | 75.74% | -0.20% | 4168.8 | 2146.59 | 1.94x | qlinearops |
| ResNet50 V1.5 MLPerf | 76.13% | 76.46% | -0.43% | 1154.73 | 554.69 | 2.08x | qlinearops |
| ResNet50 V1.5 | 72.28% | 72.29% | -0.01% | 1156.05 | 555.72 | 2.08x | qlinearops |
| ResNet50 V1.5 (ONNX Model Zoo) | 74.76% | 74.99% | -0.31% | 1347.89 | 588.84 | 2.29x | qlinearops |
| ResNet50 V1.5 (ONNX Model Zoo) | 74.75% | 74.99% | -0.32% | 840.87 | 588.77 | 1.43x | qdq |
| Roberta Base MRPC | 90.44% | 89.95% | 0.54% | 811.74 | 312.93 | 2.59x | qlinearops |
| Tiny YOLOv3 | 12.08% | 12.43% | -2.82% | 801.46 | 653.42 | 1.23x | qlinearops |
| VGG16 | 66.60% | 66.69% | -0.13% | 312.98 | 128.7 | 2.43x | qlinearops |
| VGG16 (ONNX Model Zoo) | 72.28% | 72.40% | -0.17% | 450.47 | 130.74 | 3.45x | qlinearops |
| YOLOv3 | 26.88% | 28.74% | -6.47% | 157.58 | 66.62 | 2.37x | qlinearops |
| ZFNet | 55.89% | 55.96% | -0.13% | 658.93 | 359.42 | 1.83x | qdq |
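
In the Example column, qlinearops denotes the QOperator format (fused QLinear* operators) and qdq denotes explicit QuantizeLinear/DequantizeLinear node pairs. The stock onnxruntime.quantization API exposes the same format choice; a minimal sketch of static quantization, with hypothetical model paths and a random-data calibration reader as a placeholder:

```python
import numpy as np
from onnxruntime.quantization import (CalibrationDataReader, QuantFormat,
                                      QuantType, quantize_static)

class RandomReader(CalibrationDataReader):
    """Placeholder calibration reader; feed real preprocessed samples instead."""
    def __init__(self, n=8):
        # "input" is a hypothetical input tensor name for the model below.
        self.batches = iter(
            [{"input": np.random.rand(1, 3, 224, 224).astype(np.float32)}
             for _ in range(n)])

    def get_next(self):
        return next(self.batches, None)

# QuantFormat.QOperator -> fused QLinear* ops ("qlinearops" in the table);
# QuantFormat.QDQ       -> explicit Q/DQ node pairs ("qdq" in the table).
quantize_static(
    "resnet50_fp32.onnx",   # hypothetical input model path
    "resnet50_int8.onnx",   # output model path
    calibration_data_reader=RandomReader(),
    quant_format=QuantFormat.QDQ,
    activation_type=QuantType.QUInt8,
    weight_type=QuantType.QInt8,
)
```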

### MXNet models with MXNet 1.7.0

| Model | INT8 Accuracy | FP32 Accuracy | Accuracy Ratio [(INT8-FP32)/FP32] | INT8 Throughput (samples/sec) | FP32 Throughput (samples/sec) | Performance Ratio [INT8/FP32] |
|-------|---------------|---------------|-----------------------------------|-------------------------------|-------------------------------|-------------------------------|
| Inception V3 | 77.80% | 77.65% | 0.20% | 922.38 | 277.59 | 3.32x |
| MobileNet V1 | 71.60% | 72.23% | -0.86% | 6614.69 | 2560.42 | 2.58x |
| MobileNet V3 MLPerf | 70.80% | 70.87% | -0.10% | 5230.58 | 2024.85 | 2.58x |
| ResNet v1 152 | 78.28% | 78.54% | -0.33% | 578.27 | 156.38 | 3.70x |
| ResNet50 V1.0 | 75.91% | 76.33% | -0.55% | 1571.33 | 429.53 | 3.66x |
| SqueezeNet | 56.80% | 56.97% | -0.28% | 4712.15 | 1323.68 | 3.56x |
| SSD MobileNet V1 | 74.94% | 75.54% | -0.79% | 768.59 | 191.55 | 4.01x |
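
MXNet INT8 quantization in this generation is likewise calibration-based. As a loose sketch only, using mxnet.contrib.quantization.quantize_model; the checkpoint prefix and calibration data are placeholders, and the exact keyword set may vary between MXNet 1.x releases:

```python
import mxnet as mx
import numpy as np
from mxnet.contrib.quantization import quantize_model

# Hypothetical checkpoint prefix; returns (symbol, arg_params, aux_params).
sym, arg_params, aux_params = mx.model.load_checkpoint("resnet50_v1", 0)

# Placeholder calibration iterator; use real preprocessed images in practice.
calib_data = mx.io.NDArrayIter(
    data=np.random.rand(64, 3, 224, 224).astype("float32"), batch_size=8)

qsym, qarg_params, qaux_params = quantize_model(
    sym=sym, arg_params=arg_params, aux_params=aux_params,
    ctx=mx.cpu(),
    calib_mode="naive",        # min/max calibration over the calib set
    calib_data=calib_data,
    num_calib_examples=64,
    quantized_dtype="int8")
```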

## Validated Pruning Examples

| Tasks | Framework | Model | FP32 Baseline | Gradient Sensitivity, 20% Sparsity: Accuracy% | Drop | Perf Gain (sample/s) | +ONNX Dynamic Quantization on Pruned Model: Accuracy% | Drop | Perf Gain (sample/s) |
|-------|-----------|-------|---------------|-----------------------------------------------|------|----------------------|-------------------------------------------------------|------|----------------------|
| SST-2 | PyTorch | BERT base | accuracy = 92.32 | accuracy = 91.97 | -0.38 | 1.30x | accuracy = 92.20 | -0.13 | 1.86x |
| QQP | PyTorch | BERT base | [accuracy, f1] = [91.10, 88.05] | [accuracy, f1] = [89.97, 86.54] | [-1.24, -1.71] | 1.32x | [accuracy, f1] = [89.75, 86.60] | [-1.48, -1.65] | 1.81x |

| Tasks | Framework | Model | FP32 Baseline | Pattern Lock, 70% Unstructured Sparsity: Accuracy% | Drop | Pattern Lock, 50% 1:2 Structured Sparsity: Accuracy% | Drop |
|-------|-----------|-------|---------------|----------------------------------------------------|------|------------------------------------------------------|------|
| MNLI | PyTorch | BERT base | [m, mm] = [84.57, 84.79] | [m, mm] = [82.45, 83.27] | [-2.51, -1.80] | [m, mm] = [83.20, 84.11] | [-1.62, -0.80] |
| SST-2 | PyTorch | BERT base | accuracy = 92.32 | accuracy = 91.51 | -0.88 | accuracy = 92.20 | -0.13 |
| QQP | PyTorch | BERT base | [accuracy, f1] = [91.10, 88.05] | [accuracy, f1] = [90.48, 87.06] | [-0.68, -1.12] | [accuracy, f1] = [90.92, 87.78] | [-0.20, -0.31] |
| QNLI | PyTorch | BERT base | accuracy = 91.54 | accuracy = 90.39 | -1.26 | accuracy = 90.87 | -0.73 |
| QnA | PyTorch | BERT base | [em, f1] = [79.34, 87.10] | [em, f1] = [77.27, 85.75] | [-2.61, -1.54] | [em, f1] = [78.03, 86.50] | [-1.65, -0.69] |

| Framework | Model | FP32 Baseline | Compression | Dataset | Accuracy% (Drop) |
|-----------|-------|---------------|-------------|---------|------------------|
| PyTorch | ResNet18 | 69.76 | 30% Sparsity on Magnitude | ImageNet | 69.47 (-0.42) |
| PyTorch | ResNet18 | 69.76 | 30% Sparsity on Gradient Sensitivity | ImageNet | 68.85 (-1.30) |
| PyTorch | ResNet50 | 76.13 | 30% Sparsity on Magnitude | ImageNet | 76.11 (-0.03) |
| PyTorch | ResNet50 | 76.13 | 30% Sparsity on Magnitude and Post Training Quantization | ImageNet | 76.01 (-0.16) |
| PyTorch | ResNet50 | 76.13 | 30% Sparsity on Magnitude and Quantization Aware Training | ImageNet | 75.90 (-0.30) |
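
In the last table, "30% Sparsity on Magnitude" means the 30% smallest-magnitude weights are zeroed. A minimal sketch of that idea using PyTorch's built-in pruning utilities (not the repository's own pruning API; the model is a toy placeholder):

```python
import torch
import torch.nn as nn
import torch.nn.utils.prune as prune

# Toy stand-in for a real network such as ResNet18/ResNet50.
model = nn.Sequential(nn.Linear(64, 64), nn.ReLU(), nn.Linear(64, 10))

# Zero the 30% smallest-magnitude weights in each Linear layer.
for module in model.modules():
    if isinstance(module, nn.Linear):
        prune.l1_unstructured(module, name="weight", amount=0.3)
        prune.remove(module, "weight")  # bake the mask into the weight tensor

# Verify the resulting sparsity is ~30%.
zeros = sum((m.weight == 0).sum().item() for m in model if isinstance(m, nn.Linear))
total = sum(m.weight.numel() for m in model if isinstance(m, nn.Linear))
print(f"sparsity: {zeros / total:.1%}")
```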

## Validated Knowledge Distillation Examples

| Example Name | Dataset | Student (Accuracy) | Teacher (Accuracy) | Student With Distillation (Accuracy Improvement) |
|--------------|---------|--------------------|--------------------|--------------------------------------------------|
| ResNet example | ImageNet | ResNet18 (0.6739) | ResNet50 (0.7399) | 0.6845 (0.0106) |
| BlendCNN example | MRPC | BlendCNN (0.7034) | BERT-Base (0.8382) | 0.7034 (0) |
| BiLSTM example | SST-2 | BiLSTM (0.7913) | RoBERTa-Base (0.9404) | 0.8085 (0.0172) |
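
In these examples the student is trained on a blend of the hard-label loss and the teacher's softened outputs. A minimal sketch of that standard distillation objective; the temperature and weighting below are hypothetical hyperparameters, not the values used in the examples above:

```python
import torch
import torch.nn.functional as F

def distillation_loss(student_logits, teacher_logits, labels, T=4.0, alpha=0.9):
    """Blend of soft-target KL term and hard-label cross-entropy."""
    soft = F.kl_div(
        F.log_softmax(student_logits / T, dim=-1),
        F.softmax(teacher_logits / T, dim=-1),
        reduction="batchmean",
    ) * (T * T)  # rescale by T^2 so gradients keep their magnitude
    hard = F.cross_entropy(student_logits, labels)
    return alpha * soft + (1.0 - alpha) * hard

# Toy usage with random logits for a 10-class problem.
s = torch.randn(8, 10, requires_grad=True)
t = torch.randn(8, 10)
y = torch.randint(0, 10, (8,))
print(distillation_loss(s, t, y))
```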