Skip to content

v6.1 - TensorRT, TensorFlow Edge TPU and OpenVINO Export and Inference

Compare
Choose a tag to compare
@glenn-jocher glenn-jocher released this 22 Feb 11:35
· 1053 commits to master since this release
3752807

This release incorporates many new features and bug fixes (271 PRs from 48 contributors) since our last release in October 2021. It adds TensorRT, Edge TPU and OpenVINO support, and provides retrained models at --batch-size 128 with new default one-cycle linear LR scheduler. YOLOv5 now officially supports 11 different formats, not just for export but for inference (both detect.py and PyTorch Hub), and validation to profile mAP and speed results after export.

Format export.py --include Model
PyTorch - yolov5s.pt
TorchScript torchscript yolov5s.torchscript
ONNX onnx yolov5s.onnx
OpenVINO openvino yolov5s_openvino_model/
TensorRT engine yolov5s.engine
CoreML coreml yolov5s.mlmodel
TensorFlow SavedModel saved_model yolov5s_saved_model/
TensorFlow GraphDef pb yolov5s.pb
TensorFlow Lite tflite yolov5s.tflite
TensorFlow Edge TPU edgetpu yolov5s_edgetpu.tflite
TensorFlow.js tfjs yolov5s_web_model/

Usage examples (ONNX shown):

Export:          python export.py --weights yolov5s.pt --include onnx
Detect:          python detect.py --weights yolov5s.onnx
PyTorch Hub:     model = torch.hub.load('ultralytics/yolov5', 'custom', 'yolov5s.onnx')
Validate:        python val.py --weights yolov5s.onnx
Visualize:       https://netron.app

Important Updates

  • TensorRT support: TensorFlow, Keras, TFLite, TF.js model export now fully integrated using python export.py --include saved_model pb tflite tfjs (#5699 by @imyhxy)
  • Tensorflow Edge TPU support ⭐ NEW: New smaller YOLOv5n (1.9M params) model below YOLOv5s (7.5M params), exports to 2.1 MB INT8 size, ideal for ultralight mobile solutions. (#3630 by @zldrobit)
  • OpenVINO support: YOLOv5 ONNX models are now compatible with both OpenCV DNN and ONNX Runtime (#6057 by @glenn-jocher).
  • Export Benchmarks: Benchmark (mAP and speed) all YOLOv5 export formats with python utils/benchmarks.py --weights yolov5s.pt. Currently operates on CPU, future updates will implement GPU support. (#6613 by @glenn-jocher).
  • Architecture: no changes
  • Hyperparameters: minor change
  • Training: Default Learning Rate (LR) scheduler updated
    • One-cycle with cosine replace with one-cycle linear for improved results (#6729 by @glenn-jocher).

New Results

All model trainings logged to https://wandb.ai/glenn-jocher/YOLOv5_v61_official

YOLOv5-P5 640 Figure (click to expand)

Figure Notes (click to expand)
  • COCO AP val denotes [email protected]:0.95 metric measured on the 5000-image COCO val2017 dataset over various inference sizes from 256 to 1536.
  • GPU Speed measures average inference time per image on COCO val2017 dataset using a AWS p3.2xlarge V100 instance at batch-size 32.
  • EfficientDet data from google/automl at batch size 8.
  • Reproduce by python val.py --task study --data coco.yaml --iou 0.7 --weights yolov5n6.pt yolov5s6.pt yolov5m6.pt yolov5l6.pt yolov5x6.pt

Example YOLOv5l before and after metrics:

YOLOv5l
Large
size
(pixels)
mAPval
0.5:0.95
mAPval
0.5
Speed
CPU b1
(ms)
Speed
V100 b1
(ms)
Speed
V100 b32
(ms)
params
(M)
FLOPs
@640 (B)
v5.0 640 48.2 66.9 457.9 11.6 2.8 47.0 115.4
v6.0 (previous) 640 48.8 67.2 424.5 10.9 2.7 46.5 109.1
v6.1 (this release) 640 49.0 67.3 430.0 10.1 2.7 46.5 109.1

Pretrained Checkpoints

Model size
(pixels)
mAPval
0.5:0.95
mAPval
0.5
Speed
CPU b1
(ms)
Speed
V100 b1
(ms)
Speed
V100 b32
(ms)
params
(M)
FLOPs
@640 (B)
YOLOv5n 640 28.0 45.7 45 6.3 0.6 1.9 4.5
YOLOv5s 640 37.4 56.8 98 6.4 0.9 7.2 16.5
YOLOv5m 640 45.4 64.1 224 8.2 1.7 21.2 49.0
YOLOv5l 640 49.0 67.3 430 10.1 2.7 46.5 109.1
YOLOv5x 640 50.7 68.9 766 12.1 4.8 86.7 205.7
YOLOv5n6 1280 36.0 54.4 153 8.1 2.1 3.2 4.6
YOLOv5s6 1280 44.8 63.7 385 8.2 3.6 12.6 16.8
YOLOv5m6 1280 51.3 69.3 887 11.1 6.8 35.7 50.0
YOLOv5l6 1280 53.7 71.3 1784 15.8 10.5 76.8 111.4
YOLOv5x6
+ TTA
1280
1536
55.0
55.8
72.7
72.7
3136
-
26.2
-
19.4
-
140.7
-
209.8
-
Table Notes (click to expand)
  • All checkpoints are trained to 300 epochs with default settings. Nano and Small models use hyp.scratch-low.yaml hyps, all others use hyp.scratch-high.yaml.
  • mAPval values are for single-model single-scale on COCO val2017 dataset.
    Reproduce by python val.py --data coco.yaml --img 640 --conf 0.001 --iou 0.65
  • Speed averaged over COCO val images using a AWS p3.2xlarge instance. NMS times (~1 ms/img) not included.
    Reproduce by python val.py --data coco.yaml --img 640 --task speed --batch 1
  • TTA Test Time Augmentation includes reflection and scale augmentations.
    Reproduce by python val.py --data coco.yaml --img 1536 --iou 0.7 --augment

Changelog

Changes between previous release and this release: v6.0...v6.1
Changes since this release: v6.1...HEAD

New Features and Bug Fixes (271)
New Contributors (48)

Full Changelog: v6.0...v6.1