
error running the tensorrt optimization #19

Open
omartin2010 opened this issue Dec 28, 2019 · 3 comments

Comments

@omartin2010

Has anyone run into this kind of error? (I've only kept the last lines of the output... it's pretty long):

2019-12-27 23:28:09.856472: I tensorflow/compiler/tf2tensorrt/segment/segment.cc:460] There are 1849 ops of 28 different types in the graph that are not converted to TensorRT: Fill, Merge, Switch, Range, ConcatV2, ZerosLike, Identity, NonMaxSuppressionV3, Squeeze, Mul, ExpandDims, Unpack, TopKV2, Cast, Transpose, Placeholder, Sub, Const, Greater, Shape, Where, Reshape, NoOp, GatherV2, AddV2, Pack, Minimum, StridedSlice, (For more information see https://docs.nvidia.com/deeplearning/frameworks/tf-trt-user-guide/index.html#supported-ops).
2019-12-27 23:28:11.128378: I tensorflow/compiler/tf2tensorrt/convert/convert_graph.cc:633] Number of TensorRT candidate segments: 2
2019-12-27 23:28:11.468622: F tensorflow/core/util/device_name_utils.cc:92] Check failed: IsJobName(job) 
Aborted (core dumped)

I'm running this on a Jetson TX2 with the latest JetPack (4.3 as of now) and tensorflow-gpu 1.15.0+nv19.12.tf1. Could that combination be causing an issue?

Any clue what might be causing this?

@omartin2010 (Author) commented Dec 30, 2019

I've checked, and it happens specifically when running this call:

# trt here is tensorflow.contrib.tensorrt, as imported at the top of the script
trt_graph = trt.create_inference_graph(
    input_graph_def=frozen_graph,      # frozen GraphDef from build_detection_graph
    outputs=output_names,              # output node names of the graph
    max_batch_size=1,
    max_workspace_size_bytes=1 << 25,  # 32 MB TensorRT workspace
    precision_mode='FP16',
    minimum_segment_size=50            # min ops per TensorRT engine
)

in the Python script... if that helps.
Found something that might help:
https://jkjung-avt.github.io/jetpack-4.3/

It looks like a lot of effort to get this thing running properly. Given that I'm on TF 1.15 (the wheel provided by NVIDIA), it seems like too much work for my requirement (a personal project... no need to be ultra fast). I'll see if I can instead use the TF 1.13 installation provided in the 2-day example...

@tekeburak commented

Hi @omartin2010,
Did you solve the problem?
I'm facing the same issue. Your help would be much appreciated, @SteveMacenski.

System info:

Jetpack 4.4.1 [L4T 32.4.4]
tensorflow-1.15.4+nv20.12-cp36-cp36m-linux_aarch64.whl

Error:

Creating Jetson optimized graph...
2020-12-23 14:23:39.825571: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library libnvinfer.so.7
2020-12-23 14:24:06.363306: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:1049] ARM64 does not support NUMA - returning NUMA node zero
2020-12-23 14:24:06.363477: I tensorflow/core/grappler/devices.cc:55] Number of eligible GPUs (core count >= 8, compute capability >= 0.0): 0
2020-12-23 14:24:06.363784: I tensorflow/core/grappler/clusters/single_machine.cc:356] Starting new session
2020-12-23 14:24:06.364686: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:1049] ARM64 does not support NUMA - returning NUMA node zero
2020-12-23 14:24:06.364814: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1665] Found device 0 with properties:
name: NVIDIA Tegra X1 major: 5 minor: 3 memoryClockRate(GHz): 0.9216
pciBusID: 0000:00:00.0
2020-12-23 14:24:06.364889: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library libcudart.so.10.2
2020-12-23 14:24:06.364978: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library libcublas.so.10
2020-12-23 14:24:06.365038: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library libcufft.so.10
2020-12-23 14:24:06.365090: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library libcurand.so.10
2020-12-23 14:24:06.365139: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library libcusolver.so.10
2020-12-23 14:24:06.365186: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library libcusparse.so.10
2020-12-23 14:24:06.365229: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library libcudnn.so.8
2020-12-23 14:24:06.365360: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:1049] ARM64 does not support NUMA - returning NUMA node zero
2020-12-23 14:24:06.365513: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:1049] ARM64 does not support NUMA - returning NUMA node zero
2020-12-23 14:24:06.365579: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1793] Adding visible gpu devices: 0
2020-12-23 14:24:06.365647: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1206] Device interconnect StreamExecutor with strength 1 edge matrix:
2020-12-23 14:24:06.365681: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1212]      0
2020-12-23 14:24:06.365707: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1225] 0:   N
2020-12-23 14:24:06.365854: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:1049] ARM64 does not support NUMA - returning NUMA node zero
2020-12-23 14:24:06.366024: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:1049] ARM64 does not support NUMA - returning NUMA node zero
2020-12-23 14:24:06.366123: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1351] Created TensorFlow device (/job:localhost/replica:0/task:0/device:GPU:0 with 1284 MB memory) -> physical GPU (device: 0, name: NVIDIA Tegra X1, pci bus id: 0000:00:00.0, compute capability: 5.3)
2020-12-23 14:24:24.278440: I tensorflow/compiler/tf2tensorrt/segment/segment.cc:486] There are 1850 ops of 29 different types in the graph that are not converted to TensorRT: Fill, Merge, Switch, Range, ConcatV2, ZerosLike, Identity, NonMaxSuppressionV3, Minimum, StridedSlice, ExpandDims, Unpack, TopKV2, Cast, Transpose, Placeholder, ResizeBilinear, Squeeze, Mul, Sub, Const, Greater, Shape, Where, Reshape, NoOp, GatherV2, AddV2, Pack, (For more information see https://docs.nvidia.com/deeplearning/frameworks/tf-trt-user-guide/index.html#supported-ops).
2020-12-23 14:24:25.465888: I tensorflow/compiler/tf2tensorrt/convert/convert_graph.cc:647] Number of TensorRT candidate segments: 2
2020-12-23 14:24:25.630790: F tensorflow/core/util/device_name_utils.cc:92] Check failed: IsJobName(job)
Aborted (core dumped)
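The fatal `Check failed: IsJobName(job)` comes from TensorFlow's device-name parser (`device_name_utils.cc`): during TF-TRT conversion, node device strings like the `/job:localhost/replica:0/task:0/device:GPU:0` seen in the log above are re-parsed, and the CHECK aborts the process when the `job` field is empty or malformed. A minimal sketch of that validation, as a simplified re-implementation for illustration only (this is not TensorFlow's actual code, and the exact naming rules may differ):

```python
import re

# Simplified stand-in for the CHECK that aborts in device_name_utils.cc.
# A full TF device string looks like:
#     /job:localhost/replica:0/task:0/device:GPU:0
# The sketch assumes a job name must start with a letter and contain only
# lowercase letters, digits, and underscores.
_JOB_NAME = re.compile(r'^[a-z][a-z0-9_]*$')

def is_job_name(job):
    """Rough stand-in for DeviceNameUtils::IsJobName."""
    return bool(_JOB_NAME.match(job))

def parse_device_name(name):
    """Split '/job:x/replica:0/task:0/device:GPU:0' into its fields."""
    fields = {}
    for part in name.strip('/').split('/'):
        key, _, value = part.partition(':')
        fields[key] = value
    if not is_job_name(fields.get('job', '')):
        raise RuntimeError('Check failed: IsJobName(job)')
    return fields
```

Here `parse_device_name('/job:localhost/replica:0/task:0/device:GPU:0')` parses cleanly, while a device string with an empty or malformed job field raises — roughly what happens (fatally, as a CHECK abort) inside TF-TRT conversion when an op carries a bad device assignment.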

@tekeburak commented

I solved the problem. The following changes need to be applied to tf_download_and_trt_model.py:

diff --git a/tf_download_and_trt_model.py b/tf_download_and_trt_model.py
index c5e608c..083f746 100644
--- a/tf_download_and_trt_model.py
+++ b/tf_download_and_trt_model.py
@@ -1,4 +1,4 @@
-import tensorflow.contrib.tensorrt as trt
+from tensorflow.python.compiler.tensorrt import trt_convert as trt
 import sys
 import os
 try:
@@ -19,6 +19,7 @@ print ("Building detection graph from model " + MODEL + "...")
 frozen_graph, input_names, output_names = build_detection_graph(
     config=config_path,
     checkpoint=checkpoint_path,
+    force_nms_cpu=False,
     score_threshold=0.3,
     #iou_threshold=0.5,
     batch_size=1

Please refer to these issues for more details.
tensorflow/tensorrt#197
tensorflow/tensorrt#107
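For reference, here is a sketch of what the conversion section of tf_download_and_trt_model.py looks like with the patch above applied. Assumptions: TF 1.15 (NVIDIA wheel), where `create_inference_graph` still exists in `trt_convert` as a compatibility wrapper; `frozen_graph` and `output_names` come from `build_detection_graph` as in the original script; the import is guarded here only so the snippet stands alone. The second hunk (`force_nms_cpu=False`) keeps NMS ops off the CPU, which appears to avoid the malformed device assignment that trips the CHECK.

```python
# Conversion parameters, unchanged from the original script.
TRT_PARAMS = dict(
    max_batch_size=1,
    max_workspace_size_bytes=1 << 25,  # 32 MB TensorRT workspace
    precision_mode='FP16',
    minimum_segment_size=50,
)

try:
    # Patched import: tensorflow.contrib is gone in TF >= 1.14.
    from tensorflow.python.compiler.tensorrt import trt_convert as trt
except ImportError:
    trt = None  # TensorFlow not installed; sketch only

def optimize(frozen_graph, output_names):
    """Run TF-TRT conversion with the patched import (sketch)."""
    if trt is None:
        raise RuntimeError('TensorFlow with TF-TRT support is required')
    return trt.create_inference_graph(
        input_graph_def=frozen_graph,
        outputs=output_names,
        **TRT_PARAMS,
    )
```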
