Unable to convert tensorrt SAM model to fp16 #160

Open
Alex-mtnkv opened this issue Dec 7, 2024 · 1 comment
@Alex-mtnkv

Hi, I can't export the SAM decoder model to TensorRT in fp16. trtexec fails while parsing the ONNX file with the following error:

[12/07/2024-15:39:23] [V] [TRT] Registering layer: /Slice for ONNX node: /Slice
[12/07/2024-15:39:23] [V] [TRT] Registering tensor: /Slice_output_0 for ONNX tensor: /Slice_output_0
[12/07/2024-15:39:23] [V] [TRT] /Slice [Slice] outputs: [/Slice_output_0 -> (0)[INT64]], 
[12/07/2024-15:39:23] [V] [TRT] Static check for parsing node: /Constant_35 [Constant]
[12/07/2024-15:39:23] [V] [TRT] Parsing node: /Constant_35 [Constant]
[12/07/2024-15:39:23] [V] [TRT] /Constant_35 [Constant] inputs: 
[12/07/2024-15:39:23] [V] [TRT] /Constant_35 [Constant] outputs: [/Constant_35_output_0 -> (1)[INT64]], 
[12/07/2024-15:39:23] [V] [TRT] Static check for parsing node: /Concat_4 [Concat]
[12/07/2024-15:39:23] [V] [TRT] Parsing node: /Concat_4 [Concat]
[12/07/2024-15:39:23] [V] [TRT] Searching for input: /Slice_output_0
[12/07/2024-15:39:23] [V] [TRT] Searching for input: /Constant_35_output_0
[12/07/2024-15:39:23] [V] [TRT] /Concat_4 [Concat] inputs: [/Slice_output_0 -> (0)[INT64]], [/Constant_35_output_0 -> (1)[INT64]], 
[12/07/2024-15:39:23] [V] [TRT] Registering layer: /Constant_35_output_0 required by ONNX-TRT
[12/07/2024-15:39:23] [V] [TRT] Registering layer: /Concat_4 for ONNX node: /Concat_4
[12/07/2024-15:39:23] [V] [TRT] Registering tensor: /Concat_4_output_0 for ONNX tensor: /Concat_4_output_0
[12/07/2024-15:39:23] [V] [TRT] /Concat_4 [Concat] outputs: [/Concat_4_output_0 -> (1)[INT64]], 
[12/07/2024-15:39:23] [V] [TRT] Static check for parsing node: /Reshape_2 [Reshape]
[12/07/2024-15:39:23] [V] [TRT] Parsing node: /Reshape_2 [Reshape]
[12/07/2024-15:39:23] [V] [TRT] Searching for input: /OneHot_output_0
[12/07/2024-15:39:23] [V] [TRT] Searching for input: /Concat_4_output_0
[12/07/2024-15:39:23] [V] [TRT] /Reshape_2 [Reshape] inputs: [/OneHot_output_0 -> (1, 5)[INT64]], [/Concat_4_output_0 -> (1)[INT64]], 
[12/07/2024-15:39:23] [V] [TRT] Registering layer: ONNXTRT_ShapeShuffle_118 required by ONNX-TRT
[12/07/2024-15:39:23] [V] [TRT] Registering layer: /Reshape_2 for ONNX node: /Reshape_2
[12/07/2024-15:39:23] [V] [TRT] Registering tensor: /Reshape_2_output_0 for ONNX tensor: /Reshape_2_output_0
[12/07/2024-15:39:23] [V] [TRT] /Reshape_2 [Reshape] outputs: [/Reshape_2_output_0 -> (5)[INT64]], 
[12/07/2024-15:39:23] [V] [TRT] Static check for parsing node: /Tile [Tile]
[12/07/2024-15:39:23] [V] [TRT] Parsing node: /Tile [Tile]
[12/07/2024-15:39:23] [V] [TRT] Searching for input: /Unsqueeze_3_output_0
[12/07/2024-15:39:23] [V] [TRT] Searching for input: /Reshape_2_output_0
[12/07/2024-15:39:23] [V] [TRT] /Tile [Tile] inputs: [/Unsqueeze_3_output_0 -> (1, 1, 256, 64, 64)[FLOAT]], [/Reshape_2_output_0 -> (5)[INT64]], 
[12/07/2024-15:39:23] [V] [TRT] Registering layer: ONNXTRT_ShapeTensorFromDims_120 required by ONNX-TRT
[12/07/2024-15:39:23] [V] [TRT] Registering layer: ONNXTRT_ShapeElementWise_121 required by ONNX-TRT
[12/07/2024-15:39:23] [V] [TRT] Registering layer: ONNXTRT_ShapeSlice_122 required by ONNX-TRT
[12/07/2024-15:39:23] [V] [TRT] Registering layer: /Tile for ONNX node: /Tile
[12/07/2024-15:39:23] [E] Error[4]: ITensor::getDimensions: Error Code 4: Internal Error (/OneHot: an IIOneHotLayer cannot be used to compute a shape tensor)
[12/07/2024-15:39:23] [E] [TRT] ModelImporter.cpp:946: While parsing node number 108 [Tile -> "/Tile_output_0"]:
[12/07/2024-15:39:23] [E] [TRT] ModelImporter.cpp:947: --- Begin node ---
input: "/Unsqueeze_3_output_0"
input: "/Reshape_2_output_0"
output: "/Tile_output_0"
name: "/Tile"
op_type: "Tile"

[12/07/2024-15:39:23] [E] [TRT] ModelImporter.cpp:948: --- End node ---
[12/07/2024-15:39:23] [E] [TRT] ModelImporter.cpp:951: ERROR: ModelImporter.cpp:197 In function parseNode:
[6] Invalid Node - /Tile
ITensor::getDimensions: Error Code 4: Internal Error (/OneHot: an IIOneHotLayer cannot be used to compute a shape tensor)
[12/07/2024-15:39:23] [E] Failed to parse onnx file
[12/07/2024-15:39:23] [I] Finished parsing network model. Parse time: 0.0644058
[12/07/2024-15:39:23] [E] Parsing model failed
[12/07/2024-15:39:23] [E] Failed to create engine from model or file.
[12/07/2024-15:39:23] [E] Engine set up failed
&&&& FAILED TensorRT.trtexec [TensorRT v100600] [b26] # /usr/src/tensorrt/bin/trtexec --onnx=assets/export_models/efficientvit_sam/onnx/efficientvit_sam_xl1_decoder.onnx --minShapes=point_coords:1x1x2,point_labels:1x1 --optShapes=point_coords:16x2x2,point_labels:16x2 --maxShapes=point_coords:16x2x2,point_labels:16x2 --saveEngine=assets/export_models/efficientvit_sam/tensorrt/efficientvit_sam_xl1_decoder.engine --verbose
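
Reading the log above, the failing `/Tile` node takes its second input from `/Reshape_2_output_0`, which is itself reshaped from `/OneHot_output_0`, so the parser apparently needs the OneHot result as a shape tensor. To pull that information out of a long verbose log quickly, a small parser along these lines may help. It is only a sketch: the function name `summarize_trtexec_failure` and the two regular expressions are illustrative and are written against the message formats seen in this particular log, not against any documented TensorRT format.

```python
import re

# Matches the importer line that names the failing node, e.g.
#   ModelImporter.cpp:946: While parsing node number 108 [Tile -> "/Tile_output_0"]:
NODE_RE = re.compile(r'While parsing node number (\d+) \[(\w+) -> "([^"]+)"\]')

# Matches the parenthesised reason at the end of an "Internal Error" line.
REASON_RE = re.compile(r"Internal Error \((.+)\)\s*$")


def summarize_trtexec_failure(log_text):
    """Return the first failing node and the first error reason in a trtexec log.

    Hypothetical helper: the patterns are assumptions based on the log in this issue.
    """
    summary = {"node_index": None, "op_type": None, "output": None, "reason": None}
    for line in log_text.splitlines():
        node = NODE_RE.search(line)
        if node and summary["node_index"] is None:
            summary["node_index"] = int(node.group(1))
            summary["op_type"] = node.group(2)
            summary["output"] = node.group(3)
        reason = REASON_RE.search(line)
        if reason and summary["reason"] is None:
            summary["reason"] = reason.group(1)
    return summary


# Two lines copied from the log in this issue.
sample = '''[12/07/2024-15:39:23] [E] Error[4]: ITensor::getDimensions: Error Code 4: Internal Error (/OneHot: an IIOneHotLayer cannot be used to compute a shape tensor)
[12/07/2024-15:39:23] [E] [TRT] ModelImporter.cpp:946: While parsing node number 108 [Tile -> "/Tile_output_0"]:'''

print(summarize_trtexec_failure(sample))
```

The summary names the node where parsing stopped (`Tile`, node 108) and the layer the error message blames (`/OneHot`), which are the two places to look at in the ONNX graph.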

Environment:

Package                         Version
------------------------------- -------------------------------------------------------------
aiofiles                        23.2.1
annotated-types                 0.7.0
antlr4-python3-runtime          4.9.3
anyio                           4.7.0
apturl                          0.5.2
asttokens                       3.0.0
bcrypt                          3.2.0
blinker                         1.4
bokeh                           3.6.2
Brlapi                          0.8.3
certifi                         2020.6.20
cffi                            1.17.1
chardet                         4.0.0
click                           8.0.3
colorama                        0.4.4
colored                         2.2.4
coloredlogs                     15.0.1
command-not-found               0.3
contourpy                       1.3.1
cryptography                    3.4.8
cupshelpers                     1.0
cycler                          0.12.1
Cython                          3.0.11
dbus-python                     1.2.18
decorator                       5.1.1
defer                           1.0.6
diffusers                       0.31.0
distro                          1.7.0
distro-info                     1.1+ubuntu0.2
docker-pycreds                  0.4.0
duplicity                       0.8.21
einops                          0.8.0
exceptiongroup                  1.2.2
executing                       2.1.0
fastapi                         0.115.6
fasteners                       0.14.1
ffmpy                           0.4.0
filelock                        3.16.1
flatbuffers                     24.3.25
fonttools                       4.55.2
fsspec                          2024.10.0
future                          0.18.2
gitdb                           4.0.11
GitPython                       3.1.43
gradio                          4.44.1
gradio_box_promptable_image     0.0.1
gradio_clickable_arrow_dropdown 0.0.3
gradio_client                   1.3.0
gradio_point_promptable_image   0.0.3
gradio_sbmp_promptable_image    0.0.3
h11                             0.14.0
httpcore                        1.0.7
httplib2                        0.20.2
httpx                           0.28.1
huggingface-hub                 0.26.5
humanfriendly                   10.0
idna                            3.3
igraph                          0.11.8
imageio                         2.36.1
imageio-ffmpeg                  0.5.1
importlib-metadata              4.6.4
importlib_resources             6.4.5
ipdb                            0.13.13
ipython                         8.30.0
jedi                            0.19.2
jeepney                         0.7.1
Jinja2                          3.1.4
keyring                         23.5.0
kiwisolver                      1.4.7
language-selector               0.1
launchpadlib                    1.10.16
lazr.restfulclient              0.14.4
lazr.uri                        1.0.6
lazy_loader                     0.4
lightning-utilities             0.11.9
lockfile                        0.12.2
louis                           3.20.0
lvis                            0.5.3
macaroonbakery                  1.3.1
Mako                            1.1.3
Markdown                        3.3.6
markdown-it-py                  3.0.0
MarkupSafe                      2.0.1
matplotlib                      3.9.3
matplotlib-inline               0.1.7
mdurl                           0.1.2
monotonic                       1.6
more-itertools                  8.10.0
moviepy                         2.1.1
mpmath                          1.3.0
netifaces                       0.11.0
networkx                        3.4.2
numpy                           1.26.4
nvidia-cublas-cu12              12.4.5.8
nvidia-cuda-cupti-cu12          12.4.127
nvidia-cuda-nvrtc-cu12          12.4.127
nvidia-cuda-runtime-cu12        12.4.127
nvidia-cudnn-cu12               9.1.0.70
nvidia-cufft-cu12               11.2.1.3
nvidia-curand-cu12              10.3.5.147
nvidia-cusolver-cu12            11.6.1.9
nvidia-cusparse-cu12            12.3.1.170
nvidia-nccl-cu12                2.21.5
nvidia-nvjitlink-cu12           12.4.127
nvidia-nvtx-cu12                12.4.127
oauthlib                        3.2.0
olefile                         0.46
omegaconf                       2.3.0
onnx                            1.17.0
onnx-graphsurgeon               0.5.2
onnxruntime                     1.20.1
onnxruntime-gpu                 1.20.1
onnxsim                         0.4.36
opencv-python                   4.10.0.84
orjson                          3.10.12
packaging                       24.2
pandas                          2.2.3
paramiko                        2.9.3
parso                           0.8.4
pexpect                         4.8.0
pillow                          10.4.0
pip                             24.3.1
platformdirs                    4.3.6
plotly                          5.24.1
polygraphy                      0.49.9
proglog                         0.1.10
prompt_toolkit                  3.0.48
protobuf                        5.29.1
psutil                          6.1.0
ptyprocess                      0.7.0
pure_eval                       0.2.3
pycairo                         1.20.1
pycocotools                     2.0.8
pycparser                       2.22
pycups                          2.0.1
pydantic                        2.10.3
pydantic_core                   2.27.1
pydub                           0.25.1
Pygments                        2.18.0
PyGObject                       3.42.1
PyJWT                           2.3.0
pymacaroons                     0.13.0
PyNaCl                          1.5.0
pyparsing                       2.4.7
pyRFC3339                       1.1
python-apt                      2.4.0+ubuntu4
python-dateutil                 2.9.0.post0
python-debian                   0.1.43+ubuntu1.1
python-dotenv                   1.0.1
python-multipart                0.0.19
pytz                            2022.1
pyxdg                           0.27
PyYAML                          5.4.1
rdkit                           2024.3.6
regex                           2024.11.6
reportlab                       3.6.8
requests                        2.25.1
rich                            13.9.4
ruamel.yaml                     0.18.6
ruamel.yaml.clib                0.2.12
ruff                            0.8.2
safetensors                     0.4.5
scikit-image                    0.24.0
scipy                           1.14.1
screen-resolution-extra         0.0.0
SecretStorage                   3.3.1
segment-anything                1.0
semantic-version                2.10.0
sentry-sdk                      2.19.2
setproctitle                    1.3.4
setuptools                      59.6.0
shellingham                     1.5.4
six                             1.16.0
smmap                           5.0.1
sniffio                         1.3.1
soundfile                       0.12.1
stack-data                      0.6.3
starlette                       0.41.3
sympy                           1.13.1
systemd-python                  234
tenacity                        9.0.0
tensorrt                        10.6.0
tensorrt-dispatch               10.6.0
tensorrt-lean                   10.6.0
texttable                       1.7.0
tifffile                        2024.9.20
timm                            1.0.12
TinyNeuralNetwork               0.1.0.20241202154922+f79b0ccf02a92247c9cae4ac403c33917f8f6f6f
tokenizers                      0.21.0
tomli                           2.2.1
tomlkit                         0.12.0
torch                           2.5.1
torch-fidelity                  0.3.0
torchmetrics                    1.6.0
torchprofile                    0.0.4
torchvision                     0.20.1
tornado                         6.4.2
tqdm                            4.67.1
traitlets                       5.14.3
transformers                    4.47.0
triton                          3.1.0
typer                           0.15.1
typing_extensions               4.12.2
tzdata                          2024.2
ubuntu-drivers-common           0.0.0
ubuntu-pro-client               8001
ufw                             0.36.1
unattended-upgrades             0.1
urllib3                         2.2.3
usb-creator                     0.3.7
uvicorn                         0.32.1
wadllib                         1.3.6
wandb                           0.19.0
wcwidth                         0.2.13
websockets                      12.0
wheel                           0.37.1
xdg                             5
xkit                            0.0.0
xyzservices                     2024.9.0
zipp                            1.0.0

I also tried folding constants with Polygraphy, but it didn't help.

> polygraphy surgeon sanitize assets/export_models/efficientvit_sam/onnx/efficientvit_sam_xl1_decoder.onnx --fold-constants -o assets/export_models/efficientvit_sam/onnx/efficientvit_sam_xl1_decoder_folded.onnx --fold-size-threshold 64
[I] RUNNING | Command: /home/mrgreen/.local/bin/polygraphy surgeon sanitize assets/export_models/efficientvit_sam/onnx/efficientvit_sam_xl1_decoder.onnx --fold-constants -o assets/export_models/efficientvit_sam/onnx/efficientvit_sam_xl1_decoder_folded.onnx --fold-size-threshold 64
[I] Loading model: /home/mrgreen/sam_projects/efficientvit/assets/export_models/efficientvit_sam/onnx/efficientvit_sam_xl1_decoder.onnx
[I] Original Model:
    Name: main_graph | ONNX Opset: 17
    
    ---- 3 Graph Input(s) ----
    {image_embeddings [dtype=float32, shape=(1, 256, 64, 64)],
     point_coords [dtype=float32, shape=('batch_size', 'num_points', 2)],
     point_labels [dtype=float32, shape=('batch_size', 'num_points')]}
    
    ---- 2 Graph Output(s) ----
    {masks [dtype=float32, shape=('batch_size', 1, 256, 256)],
     iou_predictions [dtype=float32, shape=('batch_size', 1)]}
    
    ---- 126 Initializer(s) ----
    
    ---- 1107 Node(s) ----
    
[I] Folding Constants | Pass 1
2024-12-07 15:51:37.772682788 [W:onnxruntime:, unsqueeze_elimination.cc:20 Apply] UnsqueezeElimination cannot remove node /Unsqueeze_18
2024-12-07 15:51:37.772701115 [W:onnxruntime:, unsqueeze_elimination.cc:20 Apply] UnsqueezeElimination cannot remove node /Unsqueeze_17
2024-12-07 15:51:37.772705957 [W:onnxruntime:, unsqueeze_elimination.cc:20 Apply] UnsqueezeElimination cannot remove node /Unsqueeze_14
2024-12-07 15:51:37.772710139 [W:onnxruntime:, unsqueeze_elimination.cc:20 Apply] UnsqueezeElimination cannot remove node /Unsqueeze_8
2024-12-07 15:51:37.772713635 [W:onnxruntime:, unsqueeze_elimination.cc:20 Apply] UnsqueezeElimination cannot remove node /Unsqueeze_7
2024-12-07 15:51:37.772717199 [W:onnxruntime:, unsqueeze_elimination.cc:20 Apply] UnsqueezeElimination cannot remove node /Unsqueeze_6
2024-12-07 15:51:37.772724222 [W:onnxruntime:, unsqueeze_elimination.cc:20 Apply] UnsqueezeElimination cannot remove node /transformer/final_attn_token_to_image/Unsqueeze_7
2024-12-07 15:51:37.772729173 [W:onnxruntime:, unsqueeze_elimination.cc:20 Apply] UnsqueezeElimination cannot remove node /transformer/final_attn_token_to_image/Unsqueeze_4
2024-12-07 15:51:37.772734137 [W:onnxruntime:, unsqueeze_elimination.cc:20 Apply] UnsqueezeElimination cannot remove node /transformer/layers.1/cross_attn_image_to_token/Unsqueeze_10
2024-12-07 15:51:37.772740492 [W:onnxruntime:, unsqueeze_elimination.cc:20 Apply] UnsqueezeElimination cannot remove node /transformer/layers.1/cross_attn_image_to_token/Unsqueeze_1
2024-12-07 15:51:37.772745777 [W:onnxruntime:, unsqueeze_elimination.cc:20 Apply] UnsqueezeElimination cannot remove node /transformer/layers.1/cross_attn_token_to_image/Unsqueeze_7
2024-12-07 15:51:37.772750382 [W:onnxruntime:, unsqueeze_elimination.cc:20 Apply] UnsqueezeElimination cannot remove node /transformer/layers.1/cross_attn_token_to_image/Unsqueeze_4
2024-12-07 15:51:37.772757770 [W:onnxruntime:, unsqueeze_elimination.cc:20 Apply] UnsqueezeElimination cannot remove node /transformer/layers.0/cross_attn_image_to_token/Unsqueeze_10
2024-12-07 15:51:37.772764524 [W:onnxruntime:, unsqueeze_elimination.cc:20 Apply] UnsqueezeElimination cannot remove node /transformer/layers.0/cross_attn_image_to_token/Unsqueeze_1
2024-12-07 15:51:37.772769314 [W:onnxruntime:, unsqueeze_elimination.cc:20 Apply] UnsqueezeElimination cannot remove node /transformer/layers.0/cross_attn_token_to_image/Unsqueeze_7
2024-12-07 15:51:37.772773669 [W:onnxruntime:, unsqueeze_elimination.cc:20 Apply] UnsqueezeElimination cannot remove node /transformer/layers.0/cross_attn_token_to_image/Unsqueeze_4
[I]     Total Nodes | Original:  1107, After Folding:   522 |   585 Nodes Folded
[I] Folding Constants | Pass 2
[I]     Total Nodes | Original:   522, After Folding:   522 |     0 Nodes Folded
[I] Saving ONNX model to: assets/export_models/efficientvit_sam/onnx/efficientvit_sam_xl1_decoder_folded.onnx
[I] New Model:
    Name: main_graph | ONNX Opset: 17
    
    ---- 3 Graph Input(s) ----
    {image_embeddings [dtype=float32, shape=(1, 256, 64, 64)],
     point_coords [dtype=float32, shape=('batch_size', 'num_points', 2)],
     point_labels [dtype=float32, shape=('batch_size', 'num_points')]}
    
    ---- 2 Graph Output(s) ----
    {masks [dtype=float32, shape=('batch_size', 1, 256, 256)],
     iou_predictions [dtype=float32, shape=('batch_size', 1)]}
    
    ---- 381 Initializer(s) ----
    
    ---- 522 Node(s) ----
    
[I] PASSED | Runtime: 1.544s | Command: /home/mrgreen/.local/bin/polygraphy surgeon sanitize assets/export_models/efficientvit_sam/onnx/efficientvit_sam_xl1_decoder.onnx --fold-constants -o assets/export_models/efficientvit_sam/onnx/efficientvit_sam_xl1_decoder_folded.onnx --fold-size-threshold 64
@Alex-mtnkv

With TensorRT==10.7.0 the problem no longer occurs.
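
For anyone scripting the export, a version guard like the one below could warn before trtexec is run. It is a sketch under a loud assumption: only 10.6.0 (fails) and 10.7.0 (works) were observed in this issue, so treating every version below 10.7.0 as affected is a guess. The names `FIRST_WORKING`, `version_tuple` and `may_hit_onehot_error` are hypothetical.

```python
import re

# Assumption: 10.7.0 is the first release where this report no longer reproduces.
# Only 10.6.0 and 10.7.0 were actually tested in this issue.
FIRST_WORKING = (10, 7, 0)


def version_tuple(text):
    """Turn a version string such as '10.6.0' into a comparable tuple (10, 6, 0)."""
    return tuple(int(n) for n in re.findall(r"\d+", text)[:3])


def may_hit_onehot_error(trt_version):
    """Return True if the given TensorRT version is older than FIRST_WORKING."""
    return version_tuple(trt_version) < FIRST_WORKING


print(may_hit_onehot_error("10.6.0"))
print(may_hit_onehot_error("10.7.0"))
```

In a real environment the version string could come from `tensorrt.__version__` instead of a literal.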
