
Is it possible to use TensorRT to speed up an original TensorFlow T5 exported saved_model? #306

chenl-chief opened this issue Jul 19, 2022 · 4 comments

chenl-chief commented Jul 19, 2022

I've tried speeding up the Hugging Face T5 model with TRT, but how can we speed up a TensorFlow T5 saved_model? I want to serve the accelerated T5 saved_model with TF Serving in a production environment.
My environment is:

docker image: nvcr.io/nvidia/tensorflow:22.05-tf2-py3
GPU: Tesla V100 * 2

I followed the TF-TRT user guide, but it doesn't work. First I used this code:

from tensorflow.python.compiler.tensorrt import trt_convert as trt
import numpy as np
import tensorflow_text

SAVED_MODEL_DIR = '/path/to/t5/export/saved_model'
output_saved_model_dir = '/path/to/save/trt/saved_model'

conversion_params = trt.DEFAULT_TRT_CONVERSION_PARAMS._replace(
    precision_mode=trt.TrtPrecisionMode.FP16)
converter = trt.TrtGraphConverterV2(
    input_saved_model_dir=SAVED_MODEL_DIR,
    conversion_params=conversion_params)

converter.convert()
converter.save(output_saved_model_dir)
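
For reference, the TF-TRT guide also calls converter.build(input_fn) between convert() and save() to pre-build engines from representative inputs; a minimal sketch, where the int32 [32, 128] token-ID input is only my guess at this export's signature:

def input_fn():
    # one representative batch; shape/dtype assumed, adjust to the real signature
    yield (np.zeros((32, 128), dtype=np.int32),)

converter.build(input_fn=input_fn)  # runs the graph once so TRT engines exist before saving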

It failed when I used tf.saved_model.load; the error message is:

"FAILED_PRECONDITION: Attempting to use uninitialized value"

Then I found the T5 saved_model was exported with TF1, so I used tf.compat.v1 to convert it:

from tensorflow.python.compiler.tensorrt import trt_convert as trt
import numpy as np
import tensorflow_text
import tensorflow as tf

tf.compat.v1.disable_v2_behavior()

input_saved_model_dir = '/path/to/t5/export/saved_model'
output_saved_model_dir = '/path/to/save/trt/saved_model'
converter = trt.TrtGraphConverter(
    input_saved_model_dir=input_saved_model_dir,
    max_workspace_size_bytes=(1 << 32),  # 4 GiB
    precision_mode='FP16',
    maximum_cached_engines=100)

converter.convert()
converter.save(output_saved_model_dir)

It still failed:

ValueError: Input 0 of node decoder/block_011/layer_002/rms_norm/scale_1/parallel_0_1/Assign was passed float from decoder/block_011/layer_002/rms_norm/scale_slice_0:0 incompatible with expected float_ref.
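
This float_ref error points at TF1 reference variables (the Assign ops) still being in the graph. A workaround I haven't verified on this model is to freeze the variables into constants first and pass the frozen graph_def to the converter; a sketch, assuming the export has a 'serving_default' signature under the 'serve' tag:

import tensorflow as tf
import tensorflow_text  # registers the custom text ops in the export
from tensorflow.python.compiler.tensorrt import trt_convert as trt

tf.compat.v1.disable_v2_behavior()

with tf.compat.v1.Session(graph=tf.Graph()) as sess:
    meta_graph = tf.compat.v1.saved_model.load(sess, ['serve'], input_saved_model_dir)
    sig = meta_graph.signature_def['serving_default']
    output_nodes = [t.name.split(':')[0] for t in sig.outputs.values()]
    # fold Variable/Assign ops into constants so no float_ref edges remain
    frozen_graph = tf.compat.v1.graph_util.convert_variables_to_constants(
        sess, sess.graph.as_graph_def(), output_nodes)

converter = trt.TrtGraphConverter(
    input_graph_def=frozen_graph,
    nodes_denylist=output_nodes,  # named nodes_blacklist in older TF releases
    max_workspace_size_bytes=(1 << 32),
    precision_mode='FP16',
    is_dynamic_op=True)
trt_graph = converter.convert()  # returns a frozen GraphDef with TRTEngineOp nodes

I believe converter.save() only works when the converter was fed a saved_model directory, so with input_graph_def the returned GraphDef has to be re-exported as a saved_model by hand.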

Could someone tell me: is it possible to use TRT to convert a TF T5 saved_model? If so, how?
@DEKHTIARJonathan

@mihaimaruseac

I no longer work in TF

@chenl-chief

> I no longer work in TF

Sorry, my mistake.

@chenl-chief

Never mind, I found a solution.

@chenl-chief

The original saved_model takes 300 ms per batch with batch_size=32 and sequence_length=128, which is too slow to deploy. So I wanted to speed T5 up with TF-TRT. But when I convert the saved_model with the code below, TF-TRT doesn't help:

from tensorflow.python.compiler.tensorrt import trt_convert as trt
import numpy as np
import tensorflow_text
import tensorflow as tf

tf.compat.v1.disable_v2_behavior()

input_saved_model_dir = 'exported_model/batch32_length128_0810/1660123651'
output_saved_model_dir = 'trt_saved_model/batch32_length128_0810/1/'
converter = trt.TrtGraphConverter(
    input_saved_model_dir=input_saved_model_dir,
    max_workspace_size_bytes=(1 << 32),  # 4 GiB
    max_batch_size=32,
    minimum_segment_size=50,
    precision_mode='FP32',
    is_dynamic_op=True,
    maximum_cached_engines=1)

converter.convert()
converter.save(output_saved_model_dir)

Before using this code, you have to add some code to tensorflow/python/compiler/tensorrt/trt_convert.py. The reference is here.
After adding that code the model converts, but the inference time doesn't change at all.
Could somebody help me with this?
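
One thing worth checking when the latency doesn't move: whether any TRTEngineOp nodes actually ended up in the converted graph. A quick sketch using the paths from the snippet above:

import tensorflow as tf
import tensorflow_text  # registers the custom text ops in the export

with tf.compat.v1.Session(graph=tf.Graph()) as sess:
    tf.compat.v1.saved_model.load(sess, ['serve'], output_saved_model_dir)
    n_trt = sum(1 for n in sess.graph.as_graph_def().node if n.op == 'TRTEngineOp')
    print(n_trt, 'TRTEngineOp nodes')  # 0 means TF-TRT converted nothing

If it prints 0, nothing was offloaded to TensorRT; with minimum_segment_size=50 the converter only accepts subgraphs of at least 50 nodes (the default is 3), so it may be rejecting every candidate segment.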
