Unable to convert tensorrt SAM model to fp16 #160

Alex-mtnkv · 2024-12-07T12:53:08Z

Hi, I can't export the SAM decoder model to TensorRT in fp16. trtexec fails while parsing the ONNX file, with the following error:

[12/07/2024-15:39:23] [V] [TRT] Registering layer: /Slice for ONNX node: /Slice
[12/07/2024-15:39:23] [V] [TRT] Registering tensor: /Slice_output_0 for ONNX tensor: /Slice_output_0
[12/07/2024-15:39:23] [V] [TRT] /Slice [Slice] outputs: [/Slice_output_0 -> (0)[INT64]], 
[12/07/2024-15:39:23] [V] [TRT] Static check for parsing node: /Constant_35 [Constant]
[12/07/2024-15:39:23] [V] [TRT] Parsing node: /Constant_35 [Constant]
[12/07/2024-15:39:23] [V] [TRT] /Constant_35 [Constant] inputs: 
[12/07/2024-15:39:23] [V] [TRT] /Constant_35 [Constant] outputs: [/Constant_35_output_0 -> (1)[INT64]], 
[12/07/2024-15:39:23] [V] [TRT] Static check for parsing node: /Concat_4 [Concat]
[12/07/2024-15:39:23] [V] [TRT] Parsing node: /Concat_4 [Concat]
[12/07/2024-15:39:23] [V] [TRT] Searching for input: /Slice_output_0
[12/07/2024-15:39:23] [V] [TRT] Searching for input: /Constant_35_output_0
[12/07/2024-15:39:23] [V] [TRT] /Concat_4 [Concat] inputs: [/Slice_output_0 -> (0)[INT64]], [/Constant_35_output_0 -> (1)[INT64]], 
[12/07/2024-15:39:23] [V] [TRT] Registering layer: /Constant_35_output_0 required by ONNX-TRT
[12/07/2024-15:39:23] [V] [TRT] Registering layer: /Concat_4 for ONNX node: /Concat_4
[12/07/2024-15:39:23] [V] [TRT] Registering tensor: /Concat_4_output_0 for ONNX tensor: /Concat_4_output_0
[12/07/2024-15:39:23] [V] [TRT] /Concat_4 [Concat] outputs: [/Concat_4_output_0 -> (1)[INT64]], 
[12/07/2024-15:39:23] [V] [TRT] Static check for parsing node: /Reshape_2 [Reshape]
[12/07/2024-15:39:23] [V] [TRT] Parsing node: /Reshape_2 [Reshape]
[12/07/2024-15:39:23] [V] [TRT] Searching for input: /OneHot_output_0
[12/07/2024-15:39:23] [V] [TRT] Searching for input: /Concat_4_output_0
[12/07/2024-15:39:23] [V] [TRT] /Reshape_2 [Reshape] inputs: [/OneHot_output_0 -> (1, 5)[INT64]], [/Concat_4_output_0 -> (1)[INT64]], 
[12/07/2024-15:39:23] [V] [TRT] Registering layer: ONNXTRT_ShapeShuffle_118 required by ONNX-TRT
[12/07/2024-15:39:23] [V] [TRT] Registering layer: /Reshape_2 for ONNX node: /Reshape_2
[12/07/2024-15:39:23] [V] [TRT] Registering tensor: /Reshape_2_output_0 for ONNX tensor: /Reshape_2_output_0
[12/07/2024-15:39:23] [V] [TRT] /Reshape_2 [Reshape] outputs: [/Reshape_2_output_0 -> (5)[INT64]], 
[12/07/2024-15:39:23] [V] [TRT] Static check for parsing node: /Tile [Tile]
[12/07/2024-15:39:23] [V] [TRT] Parsing node: /Tile [Tile]
[12/07/2024-15:39:23] [V] [TRT] Searching for input: /Unsqueeze_3_output_0
[12/07/2024-15:39:23] [V] [TRT] Searching for input: /Reshape_2_output_0
[12/07/2024-15:39:23] [V] [TRT] /Tile [Tile] inputs: [/Unsqueeze_3_output_0 -> (1, 1, 256, 64, 64)[FLOAT]], [/Reshape_2_output_0 -> (5)[INT64]], 
[12/07/2024-15:39:23] [V] [TRT] Registering layer: ONNXTRT_ShapeTensorFromDims_120 required by ONNX-TRT
[12/07/2024-15:39:23] [V] [TRT] Registering layer: ONNXTRT_ShapeElementWise_121 required by ONNX-TRT
[12/07/2024-15:39:23] [V] [TRT] Registering layer: ONNXTRT_ShapeSlice_122 required by ONNX-TRT
[12/07/2024-15:39:23] [V] [TRT] Registering layer: /Tile for ONNX node: /Tile
[12/07/2024-15:39:23] [E] Error[4]: ITensor::getDimensions: Error Code 4: Internal Error (/OneHot: an IIOneHotLayer cannot be used to compute a shape tensor)
[12/07/2024-15:39:23] [E] [TRT] ModelImporter.cpp:946: While parsing node number 108 [Tile -> "/Tile_output_0"]:
[12/07/2024-15:39:23] [E] [TRT] ModelImporter.cpp:947: --- Begin node ---
input: "/Unsqueeze_3_output_0"
input: "/Reshape_2_output_0"
output: "/Tile_output_0"
name: "/Tile"
op_type: "Tile"

[12/07/2024-15:39:23] [E] [TRT] ModelImporter.cpp:948: --- End node ---
[12/07/2024-15:39:23] [E] [TRT] ModelImporter.cpp:951: ERROR: ModelImporter.cpp:197 In function parseNode:
[6] Invalid Node - /Tile
ITensor::getDimensions: Error Code 4: Internal Error (/OneHot: an IIOneHotLayer cannot be used to compute a shape tensor)
[12/07/2024-15:39:23] [E] Failed to parse onnx file
[12/07/2024-15:39:23] [I] Finished parsing network model. Parse time: 0.0644058
[12/07/2024-15:39:23] [E] Parsing model failed
[12/07/2024-15:39:23] [E] Failed to create engine from model or file.
[12/07/2024-15:39:23] [E] Engine set up failed
&&&& FAILED TensorRT.trtexec [TensorRT v100600] [b26] # /usr/src/tensorrt/bin/trtexec --onnx=assets/export_models/efficientvit_sam/onnx/efficientvit_sam_xl1_decoder.onnx --minShapes=point_coords:1x1x2,point_labels:1x1 --optShapes=point_coords:16x2x2,point_labels:16x2 --maxShapes=point_coords:16x2x2,point_labels:16x2 --saveEngine=assets/export_models/efficientvit_sam/tensorrt/efficientvit_sam_xl1_decoder.engine --verbose
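
In a verbose log this long, the two lines that matter are the node that failed to parse and the underlying reason. A minimal sketch for pulling them out, assuming the log lines keep the format shown above. The helper name `summarize_parse_failure` and the regular expressions are illustrative and not part of trtexec or TensorRT:

```python
import re

# Matches e.g.: While parsing node number 108 [Tile -> "/Tile_output_0"]:
NODE_RE = re.compile(r'While parsing node number (\d+) \[(\w+) -> "([^"]+)"\]')
# Matches e.g.: Error Code 4: Internal Error (<reason>)
REASON_RE = re.compile(r"Error Code (\d+): Internal Error \((.+)\)\s*$")


def summarize_parse_failure(log_text):
    """Return the first failing node and the first error reason in a log."""
    summary = {}
    for line in log_text.splitlines():
        node = NODE_RE.search(line)
        if node and "node_index" not in summary:
            summary["node_index"] = int(node.group(1))
            summary["op_type"] = node.group(2)
            summary["output"] = node.group(3)
        reason = REASON_RE.search(line)
        if reason and "reason" not in summary:
            summary["error_code"] = int(reason.group(1))
            summary["reason"] = reason.group(2)
    return summary


# Two lines copied from the log above.
SAMPLE = (
    '[12/07/2024-15:39:23] [E] Error[4]: ITensor::getDimensions: Error Code 4: '
    'Internal Error (/OneHot: an IIOneHotLayer cannot be used to compute a shape tensor)\n'
    '[12/07/2024-15:39:23] [E] [TRT] ModelImporter.cpp:946: While parsing node '
    'number 108 [Tile -> "/Tile_output_0"]:\n'
)

if __name__ == "__main__":
    print(summarize_parse_failure(SAMPLE))
```

For this log the summary points at the `/Tile` node (number 108) and at `/OneHot` as the source of the rejected shape tensor. This matches the chain visible in the parse trace: `/OneHot` -> `/Reshape_2` -> `/Tile`.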

Environment:

Package                         Version
------------------------------- -------------------------------------------------------------
aiofiles                        23.2.1
annotated-types                 0.7.0
antlr4-python3-runtime          4.9.3
anyio                           4.7.0
apturl                          0.5.2
asttokens                       3.0.0
bcrypt                          3.2.0
blinker                         1.4
bokeh                           3.6.2
Brlapi                          0.8.3
certifi                         2020.6.20
cffi                            1.17.1
chardet                         4.0.0
click                           8.0.3
colorama                        0.4.4
colored                         2.2.4
coloredlogs                     15.0.1
command-not-found               0.3
contourpy                       1.3.1
cryptography                    3.4.8
cupshelpers                     1.0
cycler                          0.12.1
Cython                          3.0.11
dbus-python                     1.2.18
decorator                       5.1.1
defer                           1.0.6
diffusers                       0.31.0
distro                          1.7.0
distro-info                     1.1+ubuntu0.2
docker-pycreds                  0.4.0
duplicity                       0.8.21
einops                          0.8.0
exceptiongroup                  1.2.2
executing                       2.1.0
fastapi                         0.115.6
fasteners                       0.14.1
ffmpy                           0.4.0
filelock                        3.16.1
flatbuffers                     24.3.25
fonttools                       4.55.2
fsspec                          2024.10.0
future                          0.18.2
gitdb                           4.0.11
GitPython                       3.1.43
gradio                          4.44.1
gradio_box_promptable_image     0.0.1
gradio_clickable_arrow_dropdown 0.0.3
gradio_client                   1.3.0
gradio_point_promptable_image   0.0.3
gradio_sbmp_promptable_image    0.0.3
h11                             0.14.0
httpcore                        1.0.7
httplib2                        0.20.2
httpx                           0.28.1
huggingface-hub                 0.26.5
humanfriendly                   10.0
idna                            3.3
igraph                          0.11.8
imageio                         2.36.1
imageio-ffmpeg                  0.5.1
importlib-metadata              4.6.4
importlib_resources             6.4.5
ipdb                            0.13.13
ipython                         8.30.0
jedi                            0.19.2
jeepney                         0.7.1
Jinja2                          3.1.4
keyring                         23.5.0
kiwisolver                      1.4.7
language-selector               0.1
launchpadlib                    1.10.16
lazr.restfulclient              0.14.4
lazr.uri                        1.0.6
lazy_loader                     0.4
lightning-utilities             0.11.9
lockfile                        0.12.2
louis                           3.20.0
lvis                            0.5.3
macaroonbakery                  1.3.1
Mako                            1.1.3
Markdown                        3.3.6
markdown-it-py                  3.0.0
MarkupSafe                      2.0.1
matplotlib                      3.9.3
matplotlib-inline               0.1.7
mdurl                           0.1.2
monotonic                       1.6
more-itertools                  8.10.0
moviepy                         2.1.1
mpmath                          1.3.0
netifaces                       0.11.0
networkx                        3.4.2
numpy                           1.26.4
nvidia-cublas-cu12              12.4.5.8
nvidia-cuda-cupti-cu12          12.4.127
nvidia-cuda-nvrtc-cu12          12.4.127
nvidia-cuda-runtime-cu12        12.4.127
nvidia-cudnn-cu12               9.1.0.70
nvidia-cufft-cu12               11.2.1.3
nvidia-curand-cu12              10.3.5.147
nvidia-cusolver-cu12            11.6.1.9
nvidia-cusparse-cu12            12.3.1.170
nvidia-nccl-cu12                2.21.5
nvidia-nvjitlink-cu12           12.4.127
nvidia-nvtx-cu12                12.4.127
oauthlib                        3.2.0
olefile                         0.46
omegaconf                       2.3.0
onnx                            1.17.0
onnx-graphsurgeon               0.5.2
onnxruntime                     1.20.1
onnxruntime-gpu                 1.20.1
onnxsim                         0.4.36
opencv-python                   4.10.0.84
orjson                          3.10.12
packaging                       24.2
pandas                          2.2.3
paramiko                        2.9.3
parso                           0.8.4
pexpect                         4.8.0
pillow                          10.4.0
pip                             24.3.1
platformdirs                    4.3.6
plotly                          5.24.1
polygraphy                      0.49.9
proglog                         0.1.10
prompt_toolkit                  3.0.48
protobuf                        5.29.1
psutil                          6.1.0
ptyprocess                      0.7.0
pure_eval                       0.2.3
pycairo                         1.20.1
pycocotools                     2.0.8
pycparser                       2.22
pycups                          2.0.1
pydantic                        2.10.3
pydantic_core                   2.27.1
pydub                           0.25.1
Pygments                        2.18.0
PyGObject                       3.42.1
PyJWT                           2.3.0
pymacaroons                     0.13.0
PyNaCl                          1.5.0
pyparsing                       2.4.7
pyRFC3339                       1.1
python-apt                      2.4.0+ubuntu4
python-dateutil                 2.9.0.post0
python-debian                   0.1.43+ubuntu1.1
python-dotenv                   1.0.1
python-multipart                0.0.19
pytz                            2022.1
pyxdg                           0.27
PyYAML                          5.4.1
rdkit                           2024.3.6
regex                           2024.11.6
reportlab                       3.6.8
requests                        2.25.1
rich                            13.9.4
ruamel.yaml                     0.18.6
ruamel.yaml.clib                0.2.12
ruff                            0.8.2
safetensors                     0.4.5
scikit-image                    0.24.0
scipy                           1.14.1
screen-resolution-extra         0.0.0
SecretStorage                   3.3.1
segment-anything                1.0
semantic-version                2.10.0
sentry-sdk                      2.19.2
setproctitle                    1.3.4
setuptools                      59.6.0
shellingham                     1.5.4
six                             1.16.0
smmap                           5.0.1
sniffio                         1.3.1
soundfile                       0.12.1
stack-data                      0.6.3
starlette                       0.41.3
sympy                           1.13.1
systemd-python                  234
tenacity                        9.0.0
tensorrt                        10.6.0
tensorrt-dispatch               10.6.0
tensorrt-lean                   10.6.0
texttable                       1.7.0
tifffile                        2024.9.20
timm                            1.0.12
TinyNeuralNetwork               0.1.0.20241202154922+f79b0ccf02a92247c9cae4ac403c33917f8f6f6f
tokenizers                      0.21.0
tomli                           2.2.1
tomlkit                         0.12.0
torch                           2.5.1
torch-fidelity                  0.3.0
torchmetrics                    1.6.0
torchprofile                    0.0.4
torchvision                     0.20.1
tornado                         6.4.2
tqdm                            4.67.1
traitlets                       5.14.3
transformers                    4.47.0
triton                          3.1.0
typer                           0.15.1
typing_extensions               4.12.2
tzdata                          2024.2
ubuntu-drivers-common           0.0.0
ubuntu-pro-client               8001
ufw                             0.36.1
unattended-upgrades             0.1
urllib3                         2.2.3
usb-creator                     0.3.7
uvicorn                         0.32.1
wadllib                         1.3.6
wandb                           0.19.0
wcwidth                         0.2.13
websockets                      12.0
wheel                           0.37.1
xdg                             5
xkit                            0.0.0
xyzservices                     2024.9.0
zipp                            1.0.0

I also tried constant folding with polygraphy, but it didn't help:

> polygraphy surgeon sanitize assets/export_models/efficientvit_sam/onnx/efficientvit_sam_xl1_decoder.onnx --fold-constants -o assets/export_models/efficientvit_sam/onnx/efficientvit_sam_xl1_decoder_folded.onnx --fold-size-threshold 64
[I] RUNNING | Command: /home/mrgreen/.local/bin/polygraphy surgeon sanitize assets/export_models/efficientvit_sam/onnx/efficientvit_sam_xl1_decoder.onnx --fold-constants -o assets/export_models/efficientvit_sam/onnx/efficientvit_sam_xl1_decoder_folded.onnx --fold-size-threshold 64
[I] Loading model: /home/mrgreen/sam_projects/efficientvit/assets/export_models/efficientvit_sam/onnx/efficientvit_sam_xl1_decoder.onnx
[I] Original Model:
    Name: main_graph | ONNX Opset: 17
    
    ---- 3 Graph Input(s) ----
    {image_embeddings [dtype=float32, shape=(1, 256, 64, 64)],
     point_coords [dtype=float32, shape=('batch_size', 'num_points', 2)],
     point_labels [dtype=float32, shape=('batch_size', 'num_points')]}
    
    ---- 2 Graph Output(s) ----
    {masks [dtype=float32, shape=('batch_size', 1, 256, 256)],
     iou_predictions [dtype=float32, shape=('batch_size', 1)]}
    
    ---- 126 Initializer(s) ----
    
    ---- 1107 Node(s) ----
    
[I] Folding Constants | Pass 1
2024-12-07 15:51:37.772682788 [W:onnxruntime:, unsqueeze_elimination.cc:20 Apply] UnsqueezeElimination cannot remove node /Unsqueeze_18
2024-12-07 15:51:37.772701115 [W:onnxruntime:, unsqueeze_elimination.cc:20 Apply] UnsqueezeElimination cannot remove node /Unsqueeze_17
2024-12-07 15:51:37.772705957 [W:onnxruntime:, unsqueeze_elimination.cc:20 Apply] UnsqueezeElimination cannot remove node /Unsqueeze_14
2024-12-07 15:51:37.772710139 [W:onnxruntime:, unsqueeze_elimination.cc:20 Apply] UnsqueezeElimination cannot remove node /Unsqueeze_8
2024-12-07 15:51:37.772713635 [W:onnxruntime:, unsqueeze_elimination.cc:20 Apply] UnsqueezeElimination cannot remove node /Unsqueeze_7
2024-12-07 15:51:37.772717199 [W:onnxruntime:, unsqueeze_elimination.cc:20 Apply] UnsqueezeElimination cannot remove node /Unsqueeze_6
2024-12-07 15:51:37.772724222 [W:onnxruntime:, unsqueeze_elimination.cc:20 Apply] UnsqueezeElimination cannot remove node /transformer/final_attn_token_to_image/Unsqueeze_7
2024-12-07 15:51:37.772729173 [W:onnxruntime:, unsqueeze_elimination.cc:20 Apply] UnsqueezeElimination cannot remove node /transformer/final_attn_token_to_image/Unsqueeze_4
2024-12-07 15:51:37.772734137 [W:onnxruntime:, unsqueeze_elimination.cc:20 Apply] UnsqueezeElimination cannot remove node /transformer/layers.1/cross_attn_image_to_token/Unsqueeze_10
2024-12-07 15:51:37.772740492 [W:onnxruntime:, unsqueeze_elimination.cc:20 Apply] UnsqueezeElimination cannot remove node /transformer/layers.1/cross_attn_image_to_token/Unsqueeze_1
2024-12-07 15:51:37.772745777 [W:onnxruntime:, unsqueeze_elimination.cc:20 Apply] UnsqueezeElimination cannot remove node /transformer/layers.1/cross_attn_token_to_image/Unsqueeze_7
2024-12-07 15:51:37.772750382 [W:onnxruntime:, unsqueeze_elimination.cc:20 Apply] UnsqueezeElimination cannot remove node /transformer/layers.1/cross_attn_token_to_image/Unsqueeze_4
2024-12-07 15:51:37.772757770 [W:onnxruntime:, unsqueeze_elimination.cc:20 Apply] UnsqueezeElimination cannot remove node /transformer/layers.0/cross_attn_image_to_token/Unsqueeze_10
2024-12-07 15:51:37.772764524 [W:onnxruntime:, unsqueeze_elimination.cc:20 Apply] UnsqueezeElimination cannot remove node /transformer/layers.0/cross_attn_image_to_token/Unsqueeze_1
2024-12-07 15:51:37.772769314 [W:onnxruntime:, unsqueeze_elimination.cc:20 Apply] UnsqueezeElimination cannot remove node /transformer/layers.0/cross_attn_token_to_image/Unsqueeze_7
2024-12-07 15:51:37.772773669 [W:onnxruntime:, unsqueeze_elimination.cc:20 Apply] UnsqueezeElimination cannot remove node /transformer/layers.0/cross_attn_token_to_image/Unsqueeze_4
[I]     Total Nodes | Original:  1107, After Folding:   522 |   585 Nodes Folded
[I] Folding Constants | Pass 2
[I]     Total Nodes | Original:   522, After Folding:   522 |     0 Nodes Folded
[I] Saving ONNX model to: assets/export_models/efficientvit_sam/onnx/efficientvit_sam_xl1_decoder_folded.onnx
[I] New Model:
    Name: main_graph | ONNX Opset: 17
    
    ---- 3 Graph Input(s) ----
    {image_embeddings [dtype=float32, shape=(1, 256, 64, 64)],
     point_coords [dtype=float32, shape=('batch_size', 'num_points', 2)],
     point_labels [dtype=float32, shape=('batch_size', 'num_points')]}
    
    ---- 2 Graph Output(s) ----
    {masks [dtype=float32, shape=('batch_size', 1, 256, 256)],
     iou_predictions [dtype=float32, shape=('batch_size', 1)]}
    
    ---- 381 Initializer(s) ----
    
    ---- 522 Node(s) ----
    
[I] PASSED | Runtime: 1.544s | Command: /home/mrgreen/.local/bin/polygraphy surgeon sanitize assets/export_models/efficientvit_sam/onnx/efficientvit_sam_xl1_decoder.onnx --fold-constants -o assets/export_models/efficientvit_sam/onnx/efficientvit_sam_xl1_decoder_folded.onnx --fold-size-threshold 64


Alex-mtnkv · 2024-12-07T14:15:25Z

With TensorRT==10.7.0 the problem does not occur.
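
Since the only data points in this thread are that 10.6.0 fails and 10.7.0 works, an export script could check the installed version before calling trtexec. A minimal sketch under that assumption. The names `FIRST_KNOWN_GOOD`, `parse_version` and `is_known_good` are hypothetical, and versions other than these two are untested here:

```python
import re

# Assumption from this thread only: 10.6.0 fails to parse the decoder,
# 10.7.0 parses it. Nothing is known here about other versions.
FIRST_KNOWN_GOOD = (10, 7, 0)


def parse_version(version):
    """Turn a string such as '10.6.0' into a tuple such as (10, 6, 0)."""
    parts = []
    for piece in version.split(".")[:3]:
        match = re.match(r"\d+", piece)  # keep leading digits only
        parts.append(int(match.group()) if match else 0)
    while len(parts) < 3:  # pad short versions such as '10.7'
        parts.append(0)
    return tuple(parts)


def is_known_good(version):
    """True if the version is at least the first one reported to work."""
    return parse_version(version) >= FIRST_KNOWN_GOOD


if __name__ == "__main__":
    for candidate in ("10.6.0", "10.7.0"):
        print(candidate, is_known_good(candidate))
```

With the Python bindings installed, the string to check would be `tensorrt.__version__`.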
