The PyTorch implementation is [WongKinYiu/yolov9](https://github.com/WongKinYiu/yolov9). The following models are supported:
- YOLOv9-t
- YOLOv9-t-convert(gelan)
- YOLOv9-s
- YOLOv9-s-convert(gelan)
- YOLOv9-m
- YOLOv9-m-convert(gelan)
- YOLOv9-c
- YOLOv9-c-convert(gelan)
- YOLOv9-e
- YOLOv9-e-convert(gelan)
The following environment is required:
- TensorRT 8.0+
- OpenCV 3.4.0+
The speed test was done on a desktop with an AMD R7-5700G CPU and an RTX 4060 Ti GPU, with a 640x640 input. FP32, FP16 and INT8 models were tested. The reported time covers inference only, excluding pre-processing and post-processing, and is averaged over 1000 runs (a timing sketch follows the table).
| Framework | Model | FP32 | FP16 | INT8 |
|---|---|---|---|---|
| tensorrt | YOLOv5-n | - | 0.58ms | - |
| tensorrt | YOLOv5-s | - | 0.90ms | - |
| tensorrt | YOLOv5-m | - | 1.9ms | - |
| tensorrt | YOLOv5-l | - | 2.8ms | - |
| tensorrt | YOLOv5-x | - | 5.1ms | - |
| tensorrt | YOLOv9-t-convert | - | 1.37ms | - |
| tensorrt | YOLOv9-s | - | 1.78ms | - |
| tensorrt | YOLOv9-s-convert | - | 1.78ms | - |
| tensorrt | YOLOv9-m | - | 3.1ms | - |
| tensorrt | YOLOv9-m-convert | - | 2.8ms | - |
| tensorrt | YOLOv9-c | 13.5ms | 4.6ms | 3.0ms |
| tensorrt | YOLOv9-e | 8.3ms | 3.2ms | 2.15ms |
GELAN results will be added to the table later.
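As a rough illustration of the measurement method described above, a timing loop might look like the sketch below. This is not code from this repo: `context`, `buffers` and `stream` are assumed to be set up by the caller, and only the enqueued inference is timed, averaged over 1000 runs.

```cpp
#include <chrono>

#include <cuda_runtime_api.h>
#include <NvInfer.h>

// Hypothetical sketch: time only the inference itself (no pre-/post-
// processing), averaged over 1000 runs as in the table above.
float averageInferenceMs(nvinfer1::IExecutionContext* context,
                         void** buffers, cudaStream_t stream) {
    const int kWarmupRuns = 10;
    const int kTimedRuns = 1000;
    for (int i = 0; i < kWarmupRuns; ++i) {  // warm-up, not timed
        context->enqueueV2(buffers, stream, nullptr);
    }
    cudaStreamSynchronize(stream);
    auto start = std::chrono::steady_clock::now();
    for (int i = 0; i < kTimedRuns; ++i) {
        context->enqueueV2(buffers, stream, nullptr);
    }
    cudaStreamSynchronize(stream);  // wait for all queued inferences
    auto end = std::chrono::steady_clock::now();
    std::chrono::duration<float, std::milli> total = end - start;
    return total.count() / kTimedRuns;  // average ms per inference
}
```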
YOLOv9-e is faster than YOLOv9-c in TensorRT because YOLOv9-e needs to run fewer layers at inference time.
YOLOv9-c:
```
[[31, 34, 37, 16, 19, 22], 1, DualDDetect, [nc]]  # [A3, A4, A5, P3, P4, P5]
```
YOLOv9-e:
```
[[35, 32, 29, 42, 45, 48], 1, DualDDetect, [nc]]
```
In DualDDetect, A3, A4, A5, P3, P4 and P5 are outputs of the backbone, and only the first three inputs are used to compute the final inference result. YOLOv9-c therefore has to run 37 layers at inference time, while YOLOv9-e only has to run 35.
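Put differently, the deepest layer index among the first three DualDDetect inputs sets the required inference depth. A trivial check of the indices quoted above:

```cpp
#include <algorithm>
#include <cstdio>

int main() {
    // DualDDetect input indices copied from the model configs above.
    int yolov9_c[] = {31, 34, 37, 16, 19, 22};
    int yolov9_e[] = {35, 32, 29, 42, 45, 48};
    // Only the first three inputs feed the final result, so the last
    // layer that must be computed is the max of those three indices.
    std::printf("YOLOv9-c: %d layers\n", *std::max_element(yolov9_c, yolov9_c + 3));  // 37
    std::printf("YOLOv9-e: %d layers\n", *std::max_element(yolov9_e, yolov9_e + 3));  // 35
    return 0;
}
```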
- generate .wts from PyTorch with .pt, or download .wts from model zoo
```
// download https://github.com/WongKinYiu/yolov9
cp {tensorrtx}/yolov9/gen_wts.py {yolov9}/yolov9
cd {yolov9}/yolov9
python gen_wts.py
// a file 'yolov9.wts' will be generated.
```
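The .wts file is plain text: the first line holds the tensor count, then each line has a tensor name, its element count, and the elements as hex-encoded float bit patterns. tensorrtx ships its own loader; the sketch below is only an illustration of that format under those assumptions:

```cpp
#include <cstdint>
#include <cstring>
#include <fstream>
#include <map>
#include <string>
#include <vector>

// Sketch: parse a tensorrtx-style .wts file into name -> float values.
std::map<std::string, std::vector<float>> loadWts(const std::string& path) {
    std::map<std::string, std::vector<float>> weights;
    std::ifstream in(path);
    int count = 0;
    in >> count;  // first line: number of weight tensors
    for (int i = 0; i < count; ++i) {
        std::string name;
        uint32_t size = 0;
        in >> name >> std::dec >> size;
        std::vector<float> values(size);
        for (uint32_t j = 0; j < size; ++j) {
            uint32_t bits = 0;
            in >> std::hex >> bits;  // hex bit pattern of one float
            std::memcpy(&values[j], &bits, sizeof(float));
        }
        weights[name] = std::move(values);
    }
    return weights;
}
```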
- build tensorrtx/yolov9 and run
```
cd {tensorrtx}/yolov9/
// update kNumClass in config.h if your model is trained on custom dataset
mkdir build
cd build
cp {yolov9}/yolov9/yolov9.wts {tensorrtx}/yolov9/build
cmake ..
make

sudo ./yolov9 -s [.wts] [.engine] [c/e]  // serialize model to plan file
sudo ./yolov9 -d [.engine] [image folder]  // deserialize and run inference, the images in [image folder] will be processed

// For example yolov9-c
sudo ./yolov9 -s yolov9-c.wts yolov9-c.engine c
sudo ./yolov9 -d yolov9-c.engine ../images
```
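For reference, the `-d` path boils down to deserializing the plan file back into a runnable engine, roughly like the following sketch (error handling omitted; `logger` is any `ILogger` implementation you provide, and `loadEngine` is a hypothetical helper, not this repo's code):

```cpp
#include <fstream>
#include <iterator>
#include <string>
#include <vector>

#include <NvInfer.h>

// Sketch: rebuild an execution context from a serialized .engine file.
nvinfer1::IExecutionContext* loadEngine(const std::string& path,
                                        nvinfer1::ILogger& logger) {
    std::ifstream file(path, std::ios::binary);
    std::vector<char> blob((std::istreambuf_iterator<char>(file)),
                           std::istreambuf_iterator<char>());
    nvinfer1::IRuntime* runtime = nvinfer1::createInferRuntime(logger);
    nvinfer1::ICudaEngine* engine =
        runtime->deserializeCudaEngine(blob.data(), blob.size());
    return engine->createExecutionContext();
}
```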
- check the images generated, _zidane.jpg and _bus.jpg
- optional, load and run the TensorRT model in Python
```
// install python-tensorrt, pycuda, etc.
// ensure the yolov9.engine and libmyplugins.so have been built
python yolov9_trt.py
```
- Prepare calibration images. You can randomly select about 1000 images from your train set. For COCO, you can also download my calibration images `coco_calib` from GoogleDrive or BaiduPan (pwd: a9wh).
- unzip it in yolov9/build
- set the macro `USE_INT8` in config.h and change the path of the calibration images in config.h, such as `gCalibTablePath = "./coco_calib/";` (see the config.h excerpt below)
- serialize the model and test. See the README on the home page.
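For reference, the INT8 step above corresponds to a config.h excerpt along these lines (the names are taken from the step above; your copy of config.h may differ slightly):

```cpp
// config.h (excerpt) -- pick one precision mode
// #define USE_FP16
#define USE_INT8

// folder with the calibration images used to build the INT8 calibration table
const static char* gCalibTablePath = "./coco_calib/";
```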