TRT support for MAISI #8153
base: dev
Signed-off-by: Boris Fomitchev <[email protected]>
Also, I did not verify any results. If any results depend on Meta tensor operations, that part may be lost. Please check!
Dockerfile
Outdated
@@ -11,7 +11,7 @@

 # To build with a different base image
 # please run `docker build` using the `--build-arg PYTORCH_IMAGE=...` flag.
-ARG PYTORCH_IMAGE=nvcr.io/nvidia/pytorch:24.08-py3
+ARG PYTORCH_IMAGE=nvcr.io/nvidia/pytorch:24.09-py3
This base image update may need more testing.
Well, it does not make a real difference (the patch I mentioned in the description is needed for 24.09 anyway), so I may revert this one for now, too. 24.10 (and 2.5.0) won't require the exporter patch.
Does this mean we'll need to update to version 24.10 once it's released, since 24.09 still doesn't meet the requirements and MAISI still lacks TRT support?
I tried updating the base image and triggering more tests in PR #8164, which showed the error below:
#8164 (comment)
Yes, I believe it's better to skip 24.09, as it still requires a patch.
monai/networks/utils.py
Outdated
@@ -693,7 +695,7 @@ def convert_to_onnx(
f = io.BytesIO()
Could you please also modify this part based on the latest API of torch.onnx.export? Thanks!
#8149 (comment)
Hi @binliunls, please also review the TRT-related parts in this PR, thanks.
monai/networks/trt_compiler.py
Outdated
@@ -255,6 +345,7 @@ def __init__(
'torch_trt' may not work for some nets. Also AMP must be turned off for it to work.
input_names: Optional list of input names. If None, will be read from the function signature.
output_names: Optional list of output names. Note: If not None, patched forward() will return a dictionary.
output_lists: Optional list of output lists.
Could you please add more details about this parameter and how it relates to output_names? Right now its meaning is hard to understand.
Thanks,
Bin
@@ -233,13 +321,15 @@ def __init__(
method="onnx",
input_names=None,
output_names=None,
output_lists=None,
Could you please also add a simple test case to the unit tests to showcase how to use this parameter?
Thanks,
Bin
self._build_and_save(model, build_args)
# This will reassign input_names from the engine
build_args = args.copy()
with torch.no_grad():
May I ask the reason for adding torch.no_grad() here? Did the previous version have issues without it?
Thanks,
Bin
Yes, there were some issues with export. As TRT is inference-only, it makes sense to do the whole export under torch.no_grad(); this is the recommended way.
@@ -180,7 +184,8 @@ def try_set_inputs():
raise
self.cur_profile = next_profile
ctx.set_optimization_profile_async(self.cur_profile, stream)

except Exception:
raise
Could you please add more info to explain this exception?
Thanks,
Bin
This would be an exception from trying to set input shapes for which the engine was not built. Previously I had logic there that would try rotating the TRT optimization profile index on such an exception. We do not use multiple profiles with MONAI, so I should probably simplify the code.
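To illustrate the failure mode described here, the following is a minimal pure-Python sketch of checking whether input shapes fall inside the range an engine was built for. The `Profile` mapping and `shapes_fit_profile` helper are hypothetical names invented for this example; they are not TensorRT or MONAI APIs, and the real check happens inside the TRT execution context.

```python
from typing import Dict, Sequence, Tuple

# Hypothetical stand-in for one optimization profile:
# input name -> (min_shape, max_shape). Not a TensorRT data structure.
Profile = Dict[str, Tuple[Sequence[int], Sequence[int]]]


def shapes_fit_profile(shapes: Dict[str, Sequence[int]], profile: Profile) -> bool:
    """Return True if every input shape lies within the profile's min/max range."""
    for name, shape in shapes.items():
        if name not in profile:
            return False  # the engine was not built with this input
        lo, hi = profile[name]
        if len(shape) != len(lo) or len(shape) != len(hi):
            return False  # rank mismatch
        if any(not (l <= s <= h) for s, l, h in zip(shape, lo, hi)):
            return False  # at least one dimension is out of range
    return True


profile = {"x": ((1, 1, 64, 64), (4, 1, 256, 256))}
print(shapes_fit_profile({"x": (2, 1, 128, 128)}, profile))  # True
print(shapes_fit_profile({"x": (8, 1, 128, 128)}, profile))  # False: batch 8 > max 4
```

With a single profile, as in the MONAI case mentioned above, a failed check like this leaves nothing to rotate to, which is why the retry logic can be dropped.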
# Simulate list/tuple unrolling during ONNX export
unrolled_input = {}
for name in input_names:
val = input_example[name]
I think input_example.get(name, None) is a better choice here, in case there are any illegal keys.
Thanks,
Bin
Yes, we can look more into making this robust for the odd cases.
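As a rough illustration of the unrolling discussed in this thread, here is a self-contained sketch that uses the suggested `.get(name, None)` lookup. The function name `unroll_inputs`, the `"<name>_<index>"` naming scheme, and the choice to skip missing names silently are assumptions made for this example, not necessarily what the PR implements.

```python
from typing import Any, Dict, List


def unroll_inputs(input_example: Dict[str, Any], input_names: List[str]) -> Dict[str, Any]:
    """Flatten list/tuple inputs into one entry per element.

    An input "x" holding [a, b] becomes "x_0": a and "x_1": b
    (the suffix scheme is an assumption of this sketch).
    Names missing from input_example are skipped instead of raising KeyError.
    """
    unrolled: Dict[str, Any] = {}
    for name in input_names:
        val = input_example.get(name, None)
        if val is None:
            continue  # tolerate names that were not supplied
        if isinstance(val, (list, tuple)):
            for i, item in enumerate(val):
                unrolled[f"{name}_{i}"] = item
        else:
            unrolled[name] = val
    return unrolled


example = {"x": 1.0, "ctx": [2.0, 3.0]}
print(unroll_inputs(example, ["x", "ctx", "missing"]))
# {'x': 1.0, 'ctx_0': 2.0, 'ctx_1': 3.0}
```

Whether a missing name should be skipped, as here, or reported with a clearer error is a design choice the thread leaves open.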
@@ -41,6 +41,10 @@ RUN cp /tmp/requirements.txt /tmp/req.bak \
COPY LICENSE CHANGELOG.md CODE_OF_CONDUCT.md CONTRIBUTING.md README.md versioneer.py setup.py setup.cfg runtests.sh MANIFEST.in ./
COPY tests ./tests
COPY monai ./monai

# TODO: remove this line and torch.patch for 24.11
RUN patch -R -d /usr/local/lib/python3.10/dist-packages/torch/onnx/ < ./monai/torch.patch
It seems the patch is not included in 24.10, right?
The proper fix is not included with 24.10, yes, so we have to patch.
Description
Added trt_compile() support for lists and tuples in the arguments of forward(), which is needed for MAISI.
Did not add support for grouping return results yet; MAISI worked with an explicit workaround that unrolls the return results.
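The kind of workaround mentioned above (unrolling return results by hand) could look roughly like the sketch below. `flatten_outputs` and `regroup_outputs` are hypothetical helpers written for illustration only; they are not part of MONAI or this PR, and they handle just one level of nesting.

```python
from typing import Any, List, Optional, Sequence, Tuple


def flatten_outputs(outputs: Sequence[Any]) -> Tuple[List[Any], List[Optional[int]]]:
    """Flatten one level of list/tuple nesting.

    Returns the flat list plus, per original output, the group size
    (None marks a plain value that was not a list or tuple).
    """
    flat: List[Any] = []
    sizes: List[Optional[int]] = []
    for out in outputs:
        if isinstance(out, (list, tuple)):
            flat.extend(out)
            sizes.append(len(out))
        else:
            flat.append(out)
            sizes.append(None)
    return flat, sizes


def regroup_outputs(flat: Sequence[Any], sizes: Sequence[Optional[int]]) -> List[Any]:
    """Inverse of flatten_outputs: rebuild the grouped structure (groups become lists)."""
    grouped: List[Any] = []
    pos = 0
    for size in sizes:
        if size is None:
            grouped.append(flat[pos])
            pos += 1
        else:
            grouped.append(list(flat[pos:pos + size]))
            pos += size
    return grouped


flat, sizes = flatten_outputs(("a", ["b", "c"], ("d",)))
print(flat)                          # ['a', 'b', 'c', 'd']
print(sizes)                         # [None, 2, 1]
print(regroup_outputs(flat, sizes))  # ['a', ['b', 'c'], ['d']]
```

In this sketch the flat form is what an exported graph would return, and the caller regroups it afterwards using the recorded sizes.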
Notes
To successfully export MAISI, either the latest Torch nightly is needed, or this patch needs to be applied to a 24.09-based container:
Types of changes