Model | Framework | Support | Example |
---|---|---|---|
ResNet50 V1.5 | TensorFlow | Yes | Link |
ResNet50 V1.5 | PyTorch | Yes | Link |
DLRM | PyTorch | Yes | Link |
BERT large | TensorFlow | Yes | Link |
BERT large | PyTorch | Yes | Link |
SSD ResNet34 | TensorFlow | Yes | Link |
SSD ResNet34 | PyTorch | Yes | Link |
RNN-T | PyTorch | Yes | Link |
3D-UNet | TensorFlow | Yes | Link |
3D-UNet | PyTorch | Yes | Link |

Performance results were measured on 07/26/2022 with an Intel Xeon Platinum 8380 Scalable processor, using 1 socket, 4 cores/instance, 10 instances, and batch size 1.
Performance varies by use, configuration, and other factors. See platform configuration for configuration details. For more complete information about performance and benchmark results, visit www.intel.com/benchmarks.
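
The stated layout (1 socket, 10 instances, 4 cores per instance) occupies 10 × 4 = 40 cores. As a rough sketch of how such a layout can be written down, the snippet below splits a core range into one group per instance. The function name `instance_core_ranges` and the assumption that cores are numbered contiguously from 0 are illustrative only and are not taken from the benchmark scripts.

```python
def instance_core_ranges(num_instances, cores_per_instance, first_core=0):
    """Return one inclusive (start, end) core range per instance.

    Assumes contiguous core numbering starting at first_core (hypothetical).
    """
    ranges = []
    for i in range(num_instances):
        start = first_core + i * cores_per_instance
        ranges.append((start, start + cores_per_instance - 1))
    return ranges


# 10 instances with 4 cores each, as in the configuration described above.
ranges = instance_core_ranges(10, 4)
for index, (start, end) in enumerate(ranges):
    print(f"instance {index}: cores {start}-{end}")
```

Each range could then be handed to whatever core-pinning mechanism the benchmark harness uses.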

Model | INT8 Accuracy | FP32 Accuracy | Accuracy Ratio [(INT8-FP32)/FP32] | INT8 Throughput (samples/sec) | FP32 Throughput (samples/sec) | Performance Ratio [INT8/FP32] | Example |
---|---|---|---|---|---|---|---|
BERT large SQuAD | 92.43 | 92.99 | -0.60% | 25.37 | 12.55 | 2.02x | pb |
DenseNet121 | 73.57% | 72.89% | 0.93% | 368.05 | 328.84 | 1.12x | pb |
DenseNet161 | 76.24% | 76.29% | -0.07% | 219.08 | 179.2 | 1.22x | pb |
DenseNet169 | 74.40% | 74.65% | -0.33% | 295.26 | 260.27 | 1.13x | pb |
Faster R-CNN Inception ResNet V2 | 37.98% | 38.33% | -0.91% | 3.97 | 2.34 | 1.70x | pb |
Faster R-CNN Inception ResNet V2 | 37.84% | 38.33% | -1.28% | 4 | 2.32 | 1.73x | SavedModel |
Faster R-CNN ResNet101 | 30.28% | 30.39% | -0.36% | 70.32 | 20.19 | 3.48x | pb |
Faster R-CNN ResNet101 | 30.37% | 30.39% | -0.07% | 70.3 | 17.1 | 4.11x | SavedModel |
Faster R-CNN ResNet50 | 26.57% | 26.59% | -0.08% | 83.32 | 24.38 | 3.42x | pb |
Inception ResNet V2 | 80.28% | 80.40% | -0.15% | 287.05 | 136.78 | 2.10x | pb |
Inception V1 | 70.48% | 69.74% | 1.06% | 2208.41 | 977.76 | 2.26x | pb |
Inception V2 | 74.36% | 73.97% | 0.53% | 1847.6 | 828.03 | 2.23x | pb |
Inception V3 | 76.71% | 76.75% | -0.05% | 1036.18 | 373.61 | 2.77x | pb |
Inception V4 | 80.20% | 80.27% | -0.09% | 592.46 | 206.96 | 2.86x | pb |
Mask R-CNN Inception V2 | 28.53% | 28.73% | -0.70% | 132.07 | 51 | 2.59x | pb |
Mask R-CNN Inception V2 | 28.53% | 28.73% | -0.70% | 132.41 | 50.94 | 2.60x | ckpt |
MobileNet V1 | 71.79% | 70.96% | 1.17% | 3603.94 | 1304.58 | 2.76x | pb |
MobileNet V2 | 71.89% | 71.76% | 0.18% | 2433.87 | 1446.1 | 1.68x | pb |
ResNet101 | 77.50% | 76.45% | 1.37% | 874.26 | 356.84 | 2.45x | pb |
ResNet50 Fashion | 78.06% | 78.12% | -0.08% | 3776.14 | 2160.52 | 1.75x | pb |
ResNet50 V1.0 | 74.11% | 74.27% | -0.22% | 1511.74 | 459.43 | 3.29x | pb |
ResNet50 V1.5 | 76.22% | 76.46% | -0.31% | 1355.03 | 423.41 | 3.20x | pb |
ResNet V2 101 | 72.67% | 71.87% | 1.11% | 436.34 | 323.15 | 1.35x | pb |
ResNet V2 152 | 73.03% | 72.37% | 0.91% | 311.93 | 222.83 | 1.40x | pb |
ResNet V2 50 | 70.33% | 69.64% | 0.99% | 766.83 | 574.76 | 1.33x | pb |
SSD MobileNet V1 | 22.97% | 23.13% | -0.69% | 959.72 | 586.21 | 1.64x | pb |
SSD MobileNet V1 | 22.99% | 23.13% | -0.61% | 953.4 | 412.06 | 2.31x | ckpt |
SSD ResNet34 | 21.69% | 22.09% | -1.81% | 44.53 | 11.86 | 3.75x | pb |
SSD ResNet50 V1 | 37.86% | 38.00% | -0.37% | 69.09 | 25.93 | 2.66x | pb |
SSD ResNet50 V1 | 37.81% | 38.00% | -0.50% | 69.02 | 21.06 | 3.28x | ckpt |
VGG16 | 72.66% | 70.89% | 2.50% | 660.05 | 177.23 | 3.72x | pb |
VGG19 | 72.72% | 71.01% | 2.41% | 560.39 | 147.27 | 3.81x | pb |
Wide & Deep | 77.62% | 77.67% | -0.07% | 23329.18 | 20930.18 | 1.11x | pb |
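
The two ratio columns follow directly from the formulas in the column headers. A minimal sketch, using the BERT large SQuAD row above as input; the function names are illustrative, not part of any library:

```python
def accuracy_ratio(int8_acc, fp32_acc):
    """Relative accuracy change: (INT8 - FP32) / FP32."""
    return (int8_acc - fp32_acc) / fp32_acc


def performance_ratio(int8_throughput, fp32_throughput):
    """Throughput speedup of INT8 over FP32: INT8 / FP32."""
    return int8_throughput / fp32_throughput


# BERT large SQuAD row: accuracy 92.43 vs 92.99, throughput 25.37 vs 12.55.
acc = accuracy_ratio(92.43, 92.99)
perf = performance_ratio(25.37, 12.55)
print(f"{acc:.2%}")    # -0.60%
print(f"{perf:.2f}x")  # 2.02x
```

A few published ratios differ from this calculation in the last digit, presumably because they were computed from unrounded measurements.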

Model | INT8 Accuracy | FP32 Accuracy | Accuracy Ratio [(INT8-FP32)/FP32] | INT8 Throughput (samples/sec) | FP32 Throughput (samples/sec) | Performance Ratio [INT8/FP32] | Example |
---|---|---|---|---|---|---|---|
ALBERT base MRPC | 88.85% | 88.50% | 0.40% | 34.46 | 26.88 | 1.28x | eager |
Barthez MRPC | 83.92% | 83.81% | 0.14% | 161.06 | 89.61 | 1.80x | eager |
BERT base COLA | 58.80% | 58.84% | -0.07% | 262.88 | 125.63 | 2.09x | fx |
BERT base MRPC | 89.90% | 90.69% | -0.88% | 244.27 | 125.28 | 1.95x | fx |
BERT base RTE | 69.31% | 69.68% | -0.52% | 259.21 | 125.72 | 2.06x | fx |
BERT base SST2 | 91.06% | 91.86% | -0.87% | 262.73 | 125.69 | 2.09x | fx |
BERT base STSB | 89.10% | 89.75% | -0.72% | 254.36 | 125.9 | 2.02x | fx |
BERT large COLA | 64.12% | 62.57% | 2.48% | 89.36 | 36.47 | 2.45x | fx |
BERT large MRPC | 89.50% | 90.38% | -0.97% | 88.92 | 36.55 | 2.43x | fx |
BERT large QNLI | 90.90% | 91.82% | -1.00% | 90.39 | 36.63 | 2.47x | fx |
CamemBERT base MRPC | 86.70% | 86.82% | -0.14% | 236.6 | 121.81 | 1.94x | fx |
Deberta MRPC | 90.88% | 90.91% | -0.04% | 149.76 | 84.72 | 1.77x | eager |
DistilBERT base MRPC | 88.23% | 89.16% | -1.05% | 426.4 | 246.13 | 1.73x | eager |
FlauBERT MRPC | 79.87% | 80.19% | -0.40% | 675.82 | 437.72 | 1.54x | eager |
Inception V3 | 69.43% | 69.52% | -0.13% | 490.32 | 209.87 | 2.34x | eager |
Longformer MRPC | 91.01% | 91.46% | -0.49% | 20.36 | 16.65 | 1.22x | eager |
mBart WNLI | 56.34% | 56.34% | 0.00% | 66.23 | 30.86 | 2.15x | eager |
lvwerra/pegasus-samsum | 42.39 | 42.67 | -0.67% | 3.86 | 1.14 | 3.38x | eager |
PeleeNet | 71.64% | 72.10% | -0.64% | 511.56 | 387.9 | 1.32x | eager |
ResNet18 | 69.57% | 69.76% | -0.27% | 823.22 | 386.93 | 2.13x | eager |
ResNet18 | 69.57% | 69.76% | -0.28% | 816.8 | 385.23 | 2.12x | fx |
ResNet50 | 75.98% | 76.15% | -0.21% | 515.14 | 204 | 2.53x | eager |
ResNeXt101_32x8d | 79.08% | 79.31% | -0.29% | 210.39 | 74.87 | 2.81x | eager |
RNNT | 92.48 | 92.55 | -0.08% | 74.17 | 20.38 | 3.64x | eager |
Roberta Base MRPC | 88.25% | 88.18% | 0.08% | 245.05 | 123.53 | 1.98x | eager |
Se_ResNeXt50_32x4d | 78.98% | 79.08% | -0.13% | 370.11 | 172.45 | 2.15x | eager |
SqueezeBERT MRPC | 86.87% | 87.65% | -0.89% | 241.25 | 206.03 | 1.17x | eager |
Transfo-xl MRPC | 81.97% | 81.20% | 0.94% | 11.2 | 8.31 | 1.35x | eager |
xlm-roberta-base_MRPC | 88.03% | 88.62% | -0.67% | 140.58 | 122.29 | 1.15x | eager |
YOLOv3 | 24.60% | 24.54% | 0.21% | 110.54 | 39.46 | 2.80x | eager |

Model | INT8 Accuracy | FP32 Accuracy | Accuracy Ratio [(INT8-FP32)/FP32] | INT8 Throughput (samples/sec) | FP32 Throughput (samples/sec) | Performance Ratio [INT8/FP32] | Example |
---|---|---|---|---|---|---|---|
ResNet18 | 69.84% | 69.76% | 0.11% | 805.76 | 389.14 | 2.07x | eager |
ResNet18 | 69.74% | 69.76% | -0.03% | 822.99 | 391.82 | 2.10x | fx |
BERT base MRPC QAT | 89.70% | 89.50% | 0.22% | 173.83 | 107.22 | 1.62x | fx |
ResNet50 | 76.05% | 76.15% | -0.13% | 500.54 | 195.5 | 2.56x | eager |

Model | INT8 Accuracy | FP32 Accuracy | Accuracy Ratio [(INT8-FP32)/FP32] | INT8 Throughput (samples/sec), 1s 10ins 4c/ins bs=8 | FP32 Throughput (samples/sec), 1s 10ins 4c/ins bs=8 | Performance Ratio [INT8/FP32], bs=8 | INT8 Throughput (samples/sec), 1s 10ins 4c/ins bs=1 | FP32 Throughput (samples/sec), 1s 10ins 4c/ins bs=1 | Performance Ratio [INT8/FP32], bs=1 | Example |
---|---|---|---|---|---|---|---|---|---|---|
SSD ResNet34 | 19.99% | 20.00% | -0.06% | 11.19 | 8.72 | 1.28x | 11.63 | 9.82 | 1.18x | ipex |

Model | INT8 Accuracy | FP32 Accuracy | Accuracy Ratio [(INT8-FP32)/FP32] | INT8 Throughput (samples/sec), 1s 10ins 4c/ins bs=64 | FP32 Throughput (samples/sec), 1s 10ins 4c/ins bs=64 | Performance Ratio [INT8/FP32], bs=64 | INT8 Throughput (samples/sec), 1s 10ins 4c/ins bs=1 | FP32 Throughput (samples/sec), 1s 10ins 4c/ins bs=1 | Performance Ratio [INT8/FP32], bs=1 | Example |
---|---|---|---|---|---|---|---|---|---|---|
ResNet18 | 69.48% | 69.76% | -0.40% | 3968.05 | 1182.36 | 3.36x | 2995.8 | 1140.63 | 2.63x | ipex |
ResNeXt101_32x16d_wsl | 84.26% | 84.17% | 0.11% | 195.72 | 51.28 | 3.82x | 178.01 | 59.12 | 3.01x | ipex |
ResNet50 | 76.07% | 76.15% | -0.10% | 1731.8 | 474.24 | 3.65x | 1347.64 | 518.39 | 2.60x | ipex |

Model | INT8 Accuracy | FP32 Accuracy | Accuracy Ratio [(INT8-FP32)/FP32] | INT8 Throughput (samples/sec) | FP32 Throughput (samples/sec) | Performance Ratio [INT8/FP32] | Example |
---|---|---|---|---|---|---|---|
AlexNet | 54.74% | 54.79% | -0.09% | 1498.07 | 649.99 | 2.30x | qdq |
BERT base MRPC DYNAMIC | 85.54% | 86.03% | -0.57% | 381.45 | 156.05 | 2.44x | qlinearops |
BERT base MRPC STATIC | 85.29% | 86.03% | -0.86% | 766.09 | 316.6 | 2.42x | qlinearops |
BERT SQuAD | 80.44 | 80.67 | -0.29% | 116.88 | 64.59 | 1.81x | qlinearops |
BERT SQuAD | 80.44 | 80.67 | -0.29% | 116.93 | 64.64 | 1.81x | qdq |
BiDAF | 65.92% | 66.08% | -0.24% | 1468.58 | 1406.21 | 1.04x | qlinearops |
CaffeNet | 56.26% | 56.30% | -0.07% | 2750.7 | 812.73 | 3.38x | qdq |
DistilBERT base MRPC | 84.56% | 84.56% | 0.00% | 1654.95 | 595.32 | 2.78x | qlinearops |
EfficientNet | 77.58% | 77.70% | -0.15% | 2066.02 | 1096.86 | 1.88x | qlinearops |
FCN | 64.66% | 64.98% | -0.49% | 15.13 | 7.2 | 2.10x | qlinearops |
GoogleNet | 67.67% | 67.79% | -0.18% | 1174.98 | 807.68 | 1.45x | qdq |
Inception V1 | 67.23% | 67.24% | -0.01% | 1181.71 | 831.01 | 1.42x | qdq |
MobileBERT MRPC | 86.03% | 86.27% | -0.28% | 774.96 | 678.66 | 1.14x | qlinearops |
MobileBERT SQuAD MLPerf | 89.84 | 90.03 | -0.20% | 104.51 | 94.88 | 1.10x | qlinearops |
MobileNet V2 | 65.47% | 66.89% | -2.12% | 5172.04 | 3312.76 | 1.56x | qlinearops |
MobileNet V3 MLPerf | 75.59% | 75.74% | -0.20% | 4168.8 | 2146.59 | 1.94x | qlinearops |
ResNet50 v1.5 MLPerf | 76.13% | 76.46% | -0.43% | 1154.73 | 554.69 | 2.08x | qlinearops |
ResNet50 V1.5 | 72.28% | 72.29% | -0.01% | 1156.05 | 555.72 | 2.08x | qlinearops |
ResNet50 V1.5 (ONNX Model Zoo) | 74.76% | 74.99% | -0.31% | 1347.89 | 588.84 | 2.29x | qlinearops |
ResNet50 V1.5 (ONNX Model Zoo) | 74.75% | 74.99% | -0.32% | 840.87 | 588.77 | 1.43x | qdq |
Roberta Base MRPC | 90.44% | 89.95% | 0.54% | 811.74 | 312.93 | 2.59x | qlinearops |
Tiny YOLOv3 | 12.08% | 12.43% | -2.82% | 801.46 | 653.42 | 1.23x | qlinearops |
VGG16 | 66.60% | 66.69% | -0.13% | 312.98 | 128.7 | 2.43x | qlinearops |
VGG16 (ONNX Model Zoo) | 72.28% | 72.40% | -0.17% | 450.47 | 130.74 | 3.45x | qlinearops |
YOLOv3 | 26.88% | 28.74% | -6.47% | 157.58 | 66.62 | 2.37x | qlinearops |
ZFNet | 55.89% | 55.96% | -0.13% | 658.93 | 359.42 | 1.83x | qdq |

Model | INT8 Accuracy | FP32 Accuracy | Accuracy Ratio [(INT8-FP32)/FP32] | INT8 Throughput (samples/sec) | FP32 Throughput (samples/sec) | Performance Ratio [INT8/FP32] |
---|---|---|---|---|---|---|
Inception V3 | 77.80% | 77.65% | 0.20% | 922.38 | 277.59 | 3.32x |
MobileNet V1 | 71.60% | 72.23% | -0.86% | 6614.69 | 2560.42 | 2.58x |
MobileNet V3 MLPerf | 70.80% | 70.87% | -0.10% | 5230.58 | 2024.85 | 2.58x |
ResNet v1 152 | 78.28% | 78.54% | -0.33% | 578.27 | 156.38 | 3.70x |
ResNet50 V1.0 | 75.91% | 76.33% | -0.55% | 1571.33 | 429.53 | 3.66x |
SqueezeNet | 56.80% | 56.97% | -0.28% | 4712.15 | 1323.68 | 3.56x |
SSD MobileNet V1 | 74.94% | 75.54% | -0.79% | 768.59 | 191.55 | 4.01x |
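
Tasks | Framework | Model | FP32 Baseline | Gradient Sensitivity with 20% Sparsity: Accuracy% | Gradient Sensitivity with 20% Sparsity: Drop | Gradient Sensitivity with 20% Sparsity: Perf Gain (sample/s) | +ONNX Dynamic Quantization on Pruned Model: Accuracy% | +ONNX Dynamic Quantization on Pruned Model: Drop | +ONNX Dynamic Quantization on Pruned Model: Perf Gain (sample/s) |
---|---|---|---|---|---|---|---|---|---|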
SST-2 | PyTorch | BERT base | accuracy = 92.32 | accuracy = 91.97 | -0.38 | 1.30x | accuracy = 92.20 | -0.13 | 1.86x |
QQP | PyTorch | BERT base | [accuracy, f1] = [91.10, 88.05] | [accuracy, f1] = [89.97, 86.54] | [-1.24, -1.71] | 1.32x | [accuracy, f1] = [89.75, 86.60] | [-1.48, -1.65] | 1.81x |

Tasks | Framework | Model | FP32 Baseline | Pattern Lock on 70% Unstructured Sparsity: Accuracy% | Pattern Lock on 70% Unstructured Sparsity: Drop | Pattern Lock on 50% 1:2 Structured Sparsity: Accuracy% | Pattern Lock on 50% 1:2 Structured Sparsity: Drop |
---|---|---|---|---|---|---|---|
MNLI | PyTorch | BERT base | [m, mm] = [84.57, 84.79] | [m, mm] = [82.45, 83.27] | [-2.51, -1.80] | [m, mm] = [83.20, 84.11] | [-1.62, -0.80] |
SST-2 | PyTorch | BERT base | accuracy = 92.32 | accuracy = 91.51 | -0.88 | accuracy = 92.20 | -0.13 |
QQP | PyTorch | BERT base | [accuracy, f1] = [91.10, 88.05] | [accuracy, f1] = [90.48, 87.06] | [-0.68, -1.12] | [accuracy, f1] = [90.92, 87.78] | [-0.20, -0.31] |
QNLI | PyTorch | BERT base | accuracy = 91.54 | accuracy = 90.39 | -1.26 | accuracy = 90.87 | -0.73 |
QnA | PyTorch | BERT base | [em, f1] = [79.34, 87.10] | [em, f1] = [77.27, 85.75] | [-2.61, -1.54] | [em, f1] = [78.03, 86.50] | [-1.65, -0.69] |
Framework | Model | FP32 Baseline | Compression | Dataset | Accuracy% (Drop) |
---|---|---|---|---|---|
PyTorch | ResNet18 | 69.76 | 30% Sparsity on Magnitude | ImageNet | 69.47(-0.42) |
PyTorch | ResNet18 | 69.76 | 30% Sparsity on Gradient Sensitivity | ImageNet | 68.85(-1.30) |
PyTorch | ResNet50 | 76.13 | 30% Sparsity on Magnitude | ImageNet | 76.11(-0.03) |
PyTorch | ResNet50 | 76.13 | 30% Sparsity on Magnitude and Post Training Quantization | ImageNet | 76.01(-0.16) |
PyTorch | ResNet50 | 76.13 | 30% Sparsity on Magnitude and Quantization Aware Training | ImageNet | 75.90(-0.30) |

Example Name | Dataset | Student (Accuracy) | Teacher (Accuracy) | Student With Distillation (Accuracy Improvement) |
---|---|---|---|---|
ResNet example | ImageNet | ResNet18 (0.6739) | ResNet50 (0.7399) | 0.6845 (0.0106) |
BlendCNN example | MRPC | BlendCNN (0.7034) | BERT-Base (0.8382) | 0.7034 (0) |
BiLSTM example | SST-2 | BiLSTM (0.7913) | RoBERTa-Base (0.9404) | 0.8085 (0.0172) |
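
The deltas in the pruning and distillation tables appear to follow two different conventions. Checking the published numbers suggests that the pruning "Drop" values are relative changes in percent against the FP32 baseline, while the distillation "Accuracy Improvement" is an absolute difference between the distilled and undistilled student. The sketch below reproduces one value of each kind; the function names are illustrative, and the convention is inferred from the figures rather than stated in the source.

```python
def relative_drop_percent(compressed_acc, baseline_acc):
    """Relative accuracy change in percent, as the pruning tables appear to use."""
    return (compressed_acc - baseline_acc) / baseline_acc * 100


def absolute_improvement(distilled_acc, student_acc):
    """Absolute accuracy difference, as the distillation table appears to use."""
    return distilled_acc - student_acc


# ResNet18 with 30% magnitude sparsity: 69.47 against a 69.76 baseline.
print(f"{relative_drop_percent(69.47, 69.76):.2f}")   # -0.42

# ResNet18 student distilled from ResNet50: 0.6845 against 0.6739.
print(f"{absolute_improvement(0.6845, 0.6739):.4f}")  # 0.0106
```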