Releases: meituan/YOLOv6
YOLOv6-Segmentation
Features
- Release YOLOv6-Segmentation models at full scales
- Achieve state-of-the-art accuracy in real-time instance segmentation.
Performance of YOLOv6-seg models
Model | Size | mAP box 50-95 | mAP mask 50-95 | Speed T4 TRT FP16 b1 (FPS) | Params (M) | FLOPs (G) |
---|---|---|---|---|---|---|
YOLOv6-N | 640 | 35.3 | 31.2 | 645 | 4.9 | 7.0 |
YOLOv6-S | 640 | 44.0 | 38.0 | 292 | 19.6 | 27.7 |
YOLOv6-M | 640 | 48.2 | 41.3 | 148 | 37.1 | 54.3 |
YOLOv6-L | 640 | 51.1 | 43.7 | 93 | 63.6 | 95.5 |
YOLOv6-X | 640 | 52.2 | 44.8 | 47 | 119.1 | 175.5 |
Table Notes
- All checkpoints are trained from scratch on COCO for 300 epochs without distillation.
- Results of the mAP and speed are evaluated on COCO val2017 dataset with the input resolution of 640×640.
- Speed is tested with TensorRT 8.5 on T4 without post-processing.
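The FPS column can be read as per-image latency, since at batch size 1 the mean latency is the reciprocal of throughput. The snippet below is a minimal sketch of that arithmetic using the batch-1 FPS values from the table above; the helper name `fps_to_latency_ms` is hypothetical and not part of YOLOv6. As the notes state, these speeds exclude post-processing, so the derived latencies do too.

```python
def fps_to_latency_ms(fps: float) -> float:
    """Convert throughput (frames per second) to mean per-frame latency in ms."""
    if fps <= 0:
        raise ValueError("fps must be positive")
    return 1000.0 / fps


# Batch-1 FPS values copied from the YOLOv6-seg table above.
seg_fps_b1 = {
    "YOLOv6-N": 645,
    "YOLOv6-S": 292,
    "YOLOv6-M": 148,
    "YOLOv6-L": 93,
    "YOLOv6-X": 47,
}

# Mean latency per image, rounded to two decimals for display.
latency_ms = {name: round(fps_to_latency_ms(fps), 2) for name, fps in seg_fps_b1.items()}

if __name__ == "__main__":
    for name, ms in latency_ms.items():
        print(f"{name}: {ms} ms per image")
```

This conversion only holds for batch size 1; for batched throughput figures, the per-batch latency would be the batch size divided by the FPS.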
YOLOv6 4.0
v4.0 release
Features
- Release YOLOv6Lite models for mobile or CPU.
- Update MBLABlock in the network structure.
- Update YOLOv6Lite-face models for mobile or CPU.
- Code reconstruction and normalization of convolution operators.
Performance of YOLOv6Lite models
Model | Size | mAP val 0.5:0.95 | sm8350 (ms) | mt6853 (ms) | sdm660 (ms) | Params (M) | FLOPs (G) |
---|---|---|---|---|---|---|---|
YOLOv6Lite-S | 320*320 | 22.4 | 7.99 | 11.99 | 41.86 | 0.55 | 0.56 |
YOLOv6Lite-M | 320*320 | 25.1 | 9.08 | 13.27 | 47.95 | 0.79 | 0.67 |
YOLOv6Lite-L | 320*320 | 28.0 | 11.37 | 16.20 | 61.40 | 1.09 | 0.87 |
YOLOv6Lite-L | 320*192 | 25.0 | 7.02 | 9.66 | 36.13 | 1.09 | 0.52 |
YOLOv6Lite-L | 224*128 | 18.9 | 3.63 | 4.99 | 17.76 | 1.09 | 0.24 |
Table Notes
- We built a series of mobile models that vary in model size and input aspect ratio, to allow flexible use in different scenarios.
- All checkpoints are trained with 400 epochs without distillation.
- Results of the mAP and speed are evaluated on COCO val2017 dataset, and the input resolution is the Size in the table.
- Speed is tested on MNN 2.3.0 AArch64 with 2 threads by arm82 acceleration. The inference warm-up is performed 10 times, and the cycle is performed 100 times.
- Qualcomm 888 (sm8350), Dimensity 720 (mt6853) and Qualcomm 660 (sdm660) represent high-end, mid-range and low-end chips respectively, and can serve as a reference for model performance on different chips.
- Refer to Test NCNN Speed tutorial to reproduce the NCNN speed results of YOLOv6Lite.
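The timing protocol described in the notes above (10 warm-up runs, then 100 timed runs) can be sketched generically as follows. This is not the MNN or NCNN benchmark tool; `benchmark_ms` and `run_once` are hypothetical names, and `run_once` stands in for a single forward pass of whatever runtime is being measured.

```python
import time
from typing import Callable


def benchmark_ms(run_once: Callable[[], object], warmup: int = 10, cycles: int = 100) -> float:
    """Return the mean wall-clock latency of run_once in milliseconds.

    The warm-up calls are executed but not timed, so one-off costs such as
    cache fills or lazy initialization do not skew the average.
    """
    if cycles <= 0:
        raise ValueError("cycles must be positive")
    for _ in range(warmup):
        run_once()
    start = time.perf_counter()
    for _ in range(cycles):
        run_once()
    elapsed_s = time.perf_counter() - start
    return elapsed_s * 1000.0 / cycles


if __name__ == "__main__":
    # Placeholder workload; replace with a real inference call.
    mean_ms = benchmark_ms(lambda: sum(range(10_000)))
    print(f"mean latency: {mean_ms:.3f} ms")
```

Averaging over many cycles after a warm-up is a common way to get stable latency numbers; thread count and CPU frequency scaling still need to be controlled separately on a real device.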
Performance of YOLOv6_MBLA models
Model | Size | mAP val 0.5:0.95 | Speed T4 TRT FP16 b1 (FPS) | Speed T4 TRT FP16 b32 (FPS) | Params (M) | FLOPs (G) |
---|---|---|---|---|---|---|
YOLOv6-S-mbla | 640 | 47.0 (distill) | 300 | 424 | 11.6 | 29.8 |
YOLOv6-M-mbla | 640 | 50.3 (distill) | 168 | 216 | 26.1 | 66.7 |
YOLOv6-L-mbla | 640 | 52.0 (distill) | 129 | 154 | 46.3 | 118.2 |
YOLOv6-X-mbla | 640 | 53.5 (distill) | 78 | 94 | 78.8 | 199.0 |
Table Notes
- Speed is tested with TensorRT 8.4.2.4 on T4.
- The processes of model training, evaluation, and inference are the same as the original ones. For details, please refer to this README.
YOLOv6-Face
Features
- Face detection and landmarks localization
- Repulsion loss
- Same-channel Dehead
Performance on WIDERFACE
Model | Size | Easy | Medium | Hard | Speed T4 TRT FP16 b1 (FPS) | Speed T4 TRT FP16 b32 (FPS) | Params (M) | FLOPs (G) |
---|---|---|---|---|---|---|---|---|
YOLOv6-N | 640 | 95.0 | 92.4 | 80.4 | 797 | 1313 | 4.63 | 11.35 |
YOLOv6-S | 640 | 96.2 | 94.7 | 85.1 | 339 | 484 | 12.41 | 32.45 |
YOLOv6-M | 640 | 97.0 | 95.3 | 86.3 | 188 | 240 | 24.85 | 70.59 |
YOLOv6-L | 640 | 97.2 | 95.9 | 87.5 | 102 | 121 | 56.77 | 159.24 |
- All checkpoints are fine-tuned from COCO pretrained model for 300 epochs without distillation.
- Results of the mAP and speed are evaluated on WIDER FACE dataset with the input resolution of 640×640.
- Speed is tested with TensorRT 8.2 on T4.
- Refer to Test speed tutorial to reproduce the speed results of YOLOv6.
- Params and FLOPs of YOLOv6 are estimated on deployed models.
YOLOv6 3.0
v3.0 release
Features
Release P6 models and update P5 models
- Renew the neck of the detector with a BiC module and SimCSPSPPF Block.
- Propose an anchor-aided training (AAT) strategy.
- Introduce a new self-distillation strategy for small models of YOLOv6.
- Expand YOLOv6 and reach new SOTA performance on the COCO dataset.
Performance
Model | Size | mAP val 0.5:0.95 | Speed T4 TRT FP16 b1 (FPS) | Speed T4 TRT FP16 b32 (FPS) | Params (M) | FLOPs (G) |
---|---|---|---|---|---|---|
YOLOv6-N | 640 | 37.5 | 779 | 1187 | 4.7 | 11.4 |
YOLOv6-S | 640 | 45.0 | 339 | 484 | 18.5 | 45.3 |
YOLOv6-M | 640 | 50.0 | 175 | 226 | 34.9 | 85.8 |
YOLOv6-L | 640 | 52.8 | 98 | 116 | 59.6 | 150.7 |
YOLOv6-N6 | 1280 | 44.9 | 228 | 281 | 10.4 | 49.8 |
YOLOv6-S6 | 1280 | 50.3 | 98 | 108 | 41.4 | 198.0 |
YOLOv6-M6 | 1280 | 55.2 | 47 | 55 | 79.6 | 379.5 |
YOLOv6-L6 | 1280 | 57.2 | 26 | 29 | 140.4 | 673.4 |
Performance of base models
Model | Size | mAP val 0.5:0.95 | Speed T4 TRT FP16 b1 (FPS) | Speed T4 TRT FP16 b32 (FPS) | Speed T4 TRT INT8 b1 (FPS) | Speed T4 TRT INT8 b32 (FPS) | Params (M) | FLOPs (G) |
---|---|---|---|---|---|---|---|---|
YOLOv6-N-base | 640 | 36.6 | 727 | 1302 | 814 | 1805 | 4.65 | 11.46 |
YOLOv6-S-base | 640 | 45.3 | 346 | 525 | 487 | 908 | 13.14 | 30.6 |
YOLOv6-M-base | 640 | 49.4 | 179 | 245 | 284 | 439 | 28.33 | 72.30 |
YOLOv6-L-base | 640 | 51.1 | 116 | 157 | 196 | 288 | 59.61 | 150.89 |
YOLOv6 2.1
v2.1 release
Features
Release base models
- Use only regular convolution and ReLU activation functions.
- Apply CSP (1/2 channel dim) blocks in the network structure, except for the Nano base model.
Advantage:
- Adopt a unified network structure and configuration; the accuracy loss of the PTQ 8-bit quantized model is negligible, about 0.4%.
- Suitable for users who are just getting started or who need to apply, optimize and deploy an 8-bit quantization model quickly and frequently.
Shortcoming:
- The accuracy on COCO is slightly lower than the v2.0 released models.
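To illustrate what 8-bit quantization does to a tensor, and why a small accuracy loss is expected, here is a minimal sketch of symmetric per-tensor int8 quantization in plain Python. It is an assumption for illustration only: the release does not describe its PTQ scheme here, and this is not YOLOv6's quantization pipeline. The function names and the example weights are made up.

```python
from typing import List, Tuple


def quantize_int8(values: List[float]) -> Tuple[List[int], float]:
    """Symmetric per-tensor quantization to the int8 range [-127, 127].

    Returns the integer codes and the scale needed to map them back.
    """
    max_abs = max(abs(v) for v in values)
    # Avoid division by zero for an all-zero tensor.
    scale = max_abs / 127.0 if max_abs > 0 else 1.0
    codes = [max(-127, min(127, round(v / scale))) for v in values]
    return codes, scale


def dequantize(codes: List[int], scale: float) -> List[float]:
    """Map int8 codes back to approximate float values."""
    return [c * scale for c in codes]


# Made-up example weights, not taken from any YOLOv6 checkpoint.
weights = [0.50, -1.27, 0.031, 0.9]
codes, scale = quantize_int8(weights)
restored = dequantize(codes, scale)

# The round-trip error per value is bounded by half a quantization step.
max_error = max(abs(w - r) for w, r in zip(weights, restored))

if __name__ == "__main__":
    print("codes:", codes)
    print("scale:", scale)
    print("max round-trip error:", max_error)
```

Each value is reproduced to within half a quantization step (`scale / 2`), which is the rounding noise a quantized network has to tolerate.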
Performance
Model | Size | mAP val 0.5:0.95 | Speed T4 TRT FP16 b1 (FPS) | Speed T4 TRT FP16 b32 (FPS) | Params (M) | FLOPs (G) |
---|---|---|---|---|---|---|
YOLOv6-N-base | 640 | 35.6 (400e) | 832 | 1249 | 4.3 | 11.1 |
YOLOv6-S-base | 640 | 43.8 (400e) | 373 | 531 | 11.5 | 27.6 |
YOLOv6-M-base | 640 | 48.8 (distill) | 179 | 246 | 27.7 | 68.4 |
YOLOv6-L-base | 640 | 51.0 (distill) | 115 | 153 | 58.5 | 144.0 |
YOLOv6 2.0
v2.0 release
YOLOv6 offers a series of models for various industrial scenarios (nano/tiny/s/m/l), whose architectures vary with model size for a better accuracy-speed trade-off. Several bag-of-freebies methods, such as self-distillation and more training epochs, are introduced to further improve performance. For industrial deployment, we adopt QAT with channel-wise distillation and graph optimization to pursue extreme performance.
New Features
- Release M/L models and update N/T/S models with enhanced performance.⭐️ Benchmark
- 2x faster training time.
- Fix the degradation of performance when evaluating on 640x640 inputs.
- Customized quantization methods. 🚀 Quantization Tutorial
YOLOv6 1.0
v1.0 release
Features
YOLOv6 is a single-stage object detection framework dedicated to industrial application, with hardware-friendly efficient design and high performance, outperforming YOLOv5, YOLOX and PP-YOLOE.
YOLOv6-nano achieves 35.0 mAP on COCO val2017 dataset with 1242 FPS on T4 using TensorRT FP16 for bs32 inference, and YOLOv6-s achieves 43.1 mAP on COCO val2017 dataset with 520 FPS on T4 using TensorRT FP16 for bs32 inference.
- Hardware-friendly Design for Backbone and Neck
- Efficient Decoupled Head with SIoU Loss