Is it possible to use tensorrt to speed up original tensorflow t5 exported saved_model? #306
Comments
I no longer work in TF.
Sorry, my fault.
Never mind, I found a solution.
The original saved_model took 300 ms with batch_size=32 and sen_length=128, which is too slow for deployment, so I wanted to speed up T5 with TF-TRT. But when I converted the saved_model using the code below, TF-TRT didn't work:
Before using the code, you should add some code in tensorflow/python/compiler/tensorrt/trt_convert.py. The reference is here
I've tried speeding up the Hugging Face T5 model with TRT, but how can we speed up a TensorFlow T5 saved_model?
I want to serve the sped-up T5 saved_model with TF Serving in a production environment.
My environment is:
I followed the TF-TRT user guide, but it didn't work.
I first used this code:
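The snippet itself is missing above; a minimal sketch of a TF2-style TF-TRT conversion using `TrtGraphConverterV2` might look like the following. The paths, precision mode, and workspace size are placeholders, and this assumes a TensorFlow build compiled with TensorRT support:

```python
# Sketch: convert a TF2 SavedModel with TF-TRT (requires a TensorRT-enabled
# TensorFlow build; directory names below are placeholders).
from tensorflow.python.compiler.tensorrt import trt_convert as trt

params = trt.TrtConversionParams(
    precision_mode=trt.TrtPrecisionMode.FP16,  # FP16 assumed; FP32/INT8 also possible
    max_workspace_size_bytes=1 << 30,          # 1 GiB scratch space for TRT engines
)
converter = trt.TrtGraphConverterV2(
    input_saved_model_dir="t5_saved_model",    # placeholder: original SavedModel
    conversion_params=params,
)
converter.convert()                            # replaces supported subgraphs with TRT ops
converter.save("t5_saved_model_trt")           # placeholder: converted SavedModel
```

Note that `TrtGraphConverterV2` expects a TF2-format SavedModel, which is relevant to the failure described next.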
It failed when I used `tf.saved_model.load`; the error message is:
Then I found the T5 saved_model was exported with TF1, so I used tf.compat.v1 to convert, with this code:
Still failed.
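For reference, the compat path for a TF1-exported SavedModel would look roughly like this. This is a sketch assuming the v1 `TrtGraphConverter` API, with placeholder paths and settings matching the batch size mentioned above:

```python
# Sketch: convert a TF1-exported SavedModel with the compat v1 TF-TRT converter.
# Requires a TensorRT-enabled TensorFlow build; directory names are placeholders.
from tensorflow.python.compiler.tensorrt import trt_convert as trt

converter = trt.TrtGraphConverter(
    input_saved_model_dir="t5_saved_model",  # placeholder: TF1-exported SavedModel
    max_batch_size=32,                       # matches the batch size used above
    precision_mode="FP16",                   # FP16 assumed
)
converter.convert()                          # runs in graph mode under the hood
converter.save("t5_saved_model_trt_v1")      # placeholder: converted SavedModel
```

The v1 converter loads the model through a `tf.compat.v1.Session` internally, which is why it is the usual route for TF1-format exports that `tf.saved_model.load` cannot handle in TF2.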
Could someone tell me: can we use TRT to convert a TF T5 saved_model? If so, how?
@DEKHTIARJonathan