TensorRT Custom Layer

I integrated a self-written plugin into a TensorRT engine so that my own kernel is used during inference.
This time the target was depthwiseConvolution.
Use it as a reference when implementing a Custom Layer.
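For readers new to the operation, here is a minimal NumPy reference of what a depthwise convolution computes: every channel is filtered with its own kernel, with no mixing across channels. This is only a sketch and not the plugin's CUDA kernel. The 3x3 kernel, stride 1, zero padding of 1, and the function name `depthwise_conv2d` are assumptions made for illustration.

```python
import numpy as np


def depthwise_conv2d(x, w, pad=1):
    """Reference depthwise convolution with stride 1 (assumed settings).

    x: input of shape (C, H, W)
    w: filters of shape (C, KH, KW), one filter per input channel
    Like deep learning frameworks, this is a cross-correlation (no kernel flip).
    """
    c, h, width = x.shape
    _, kh, kw = w.shape
    # Zero-pad only the spatial dimensions.
    xp = np.pad(x, ((0, 0), (pad, pad), (pad, pad)))
    out_h = h + 2 * pad - kh + 1
    out_w = width + 2 * pad - kw + 1
    out = np.zeros((c, out_h, out_w), dtype=np.float32)
    for ch in range(c):  # each channel uses only its own filter
        for i in range(out_h):
            for j in range(out_w):
                out[ch, i, j] = np.sum(xp[ch, i:i + kh, j:j + kw] * w[ch])
    return out
```

A slow reference like this can be handy for checking a hand-written kernel on small inputs before comparing against TensorRT.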

Network used

How to build the plugin

Change line 46 of depthwiseConvolutionPlugin/CMakeLists.txt to the value that matches your device:
$<$<COMPILE_LANGUAGE:CUDA>:-O3 -gencode arch=compute_xx,code=sm_xx>
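The `xx` placeholders stand for the GPU's compute capability with the dot removed. The small helper below shows how the option string is formed; the function name `gencode_flag` is hypothetical. The example uses compute capability 7.5, which is the value for Turing GPUs such as the RTX 2080 SUPER listed in the execution environment.

```python
def gencode_flag(major, minor):
    """Build the nvcc -gencode option for a given compute capability."""
    cc = f"{major}{minor}"
    return f"-gencode arch=compute_{cc},code=sm_{cc}"


# Turing (e.g. RTX 2080 SUPER) has compute capability 7.5.
print(gencode_flag(7, 5))  # -gencode arch=compute_75,code=sm_75
```

On recent drivers, `nvidia-smi --query-gpu=compute_cap --format=csv` should report the compute capability of the installed GPU.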

cd depthwiseConvolutionPlugin
mkdir build
cd build
cmake ..
make

libdepthwiseConvolutionPlugin.so is created in depthwiseConvolutionPlugin/build.

Creating the ONNX graph

python make_onnx.py

Modifying the ONNX graph

python change_onnx.py

Renaming the node in the ONNX graph to the plugin's name makes TensorRT use the custom plugin.
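The renaming step could look roughly like the sketch below. It is not the content of change_onnx.py. It assumes that the field TensorRT's ONNX parser matches against registered plugins is the node's `op_type`, and the plugin name `DepthwiseConvolutionPlugin` and the file names are hypothetical. The helper is duck-typed so it can be tried without onnx installed; `main` shows how it would be applied to a real model.

```python
from types import SimpleNamespace

PLUGIN_OP = "DepthwiseConvolutionPlugin"  # hypothetical plugin name


def retarget_nodes(model, old_op="Conv", new_op=PLUGIN_OP):
    """Rename every node of type old_op to new_op; return how many changed."""
    changed = 0
    for node in model.graph.node:
        if node.op_type == old_op:
            node.op_type = new_op
            changed += 1
    return changed


def main(src="model.onnx", dst="model_plugin.onnx"):  # hypothetical file names
    import onnx  # imported lazily so the helper above works without onnx

    model = onnx.load(src)
    retarget_nodes(model)
    onnx.save(model, dst)


# Quick check on a stand-in graph (no onnx required).
demo = SimpleNamespace(graph=SimpleNamespace(node=[
    SimpleNamespace(op_type="Conv"),
    SimpleNamespace(op_type="LeakyRelu"),
]))
print(retarget_nodes(demo))  # 1
```

In a network with several convolutions, the filter would need to be narrower (for example, only nodes whose `group` attribute equals the channel count), otherwise ordinary convolutions would also be redirected to the plugin.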

Building the TensorRT engine

bash build_trt.sh

Running the TensorRT engine

bash run_trt.sh
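The contents of build_trt.sh and run_trt.sh are not shown here. As a rough idea of what such scripts typically wrap, the sketch below assembles `trtexec` command lines for building and running an engine with a plugin library loaded. It assumes the scripts use `trtexec`, that FP16 is enabled, and that the plugin is passed with `--plugins` (the option name in TensorRT 8.6); the file names and function names are hypothetical.

```python
import shlex


def trtexec_build_cmd(onnx_path, engine_path, plugin_lib=None, fp16=True):
    """Command line that builds an engine from an ONNX file."""
    cmd = ["trtexec", f"--onnx={onnx_path}", f"--saveEngine={engine_path}"]
    if fp16:
        cmd.append("--fp16")
    if plugin_lib:
        # Load the shared library so the plugin is registered before parsing.
        cmd.append(f"--plugins={plugin_lib}")
    return cmd


def trtexec_run_cmd(engine_path, plugin_lib=None):
    """Command line that loads a serialized engine and runs inference."""
    cmd = ["trtexec", f"--loadEngine={engine_path}"]
    if plugin_lib:
        # The plugin library is also needed when deserializing the engine.
        cmd.append(f"--plugins={plugin_lib}")
    return cmd


lib = "depthwiseConvolutionPlugin/build/libdepthwiseConvolutionPlugin.so"
print(shlex.join(trtexec_build_cmd("model_plugin.onnx", "model_plugin.engine", lib)))
print(shlex.join(trtexec_run_cmd("model_plugin.engine", lib)))
```

The commands are returned as argument lists so they can be passed to `subprocess.run` directly if desired.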

Test

python test.py

Compares the results of the engine that uses the plugin with those of the engine that does not.
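A comparison of this kind can be sketched as follows. This is not the content of test.py. It assumes both engines' outputs are available as arrays, and the tolerances are illustrative values chosen loosely for FP16 results; the function name `compare_outputs` is hypothetical.

```python
import numpy as np


def compare_outputs(ref, out, rtol=1e-2, atol=1e-2):
    """Return (outputs agree within tolerance, maximum absolute error)."""
    ref = np.asarray(ref, dtype=np.float32)
    out = np.asarray(out, dtype=np.float32)
    max_abs_err = float(np.max(np.abs(ref - out)))
    return bool(np.allclose(out, ref, rtol=rtol, atol=atol)), max_abs_err
```

Reporting the maximum absolute error alongside the pass/fail flag makes it easier to judge whether a mismatch is rounding noise or a real bug in the kernel.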

Result

Engine	Kernel used	Average time [us]	Total time [us]
TensorRT with my plugin	dw_conv_fp16	3.77	3.77
TensorRT	sm50_xmma_convolution_depthwiseHMMA_FP16NHWCx8_TR3_TS3_STRIDEH1_STRIDEW1 + generatedNativePointwise	3.50 + 2.66	6.16
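The totals in the table can be turned into a relative speedup with a few lines of arithmetic. The sketch below uses only the numbers from the table and, as the table does, treats the sum of the two TensorRT kernels as the baseline.

```python
plugin_us = 3.77          # dw_conv_fp16
trt_us = 3.50 + 2.66      # depthwiseHMMA kernel + generatedNativePointwise

speedup = trt_us / plugin_us
saved = 1 - plugin_us / trt_us

print(f"{trt_us:.2f} us -> {plugin_us:.2f} us, {speedup:.2f}x, {saved:.1%}")
# 6.16 us -> 3.77 us, 1.63x, 38.8%
```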

For some reason, the leakyrelu part is executed by a kernel named generatedNativePointwise.
I wonder how depthwiseHMMA actually makes use of HMMA...

Execution environment

GPU: NVIDIA GeForce RTX 2080 SUPER
- SMs: 48
- CUDA cores: 3072
GPU driver: NVIDIA 535.104.05
CUDA: 11.8
TensorRT: 8.6.1
cuDNN: 8.9.4
Python: 3.10.12
