A converted model contains the frozen variables (as Const nodes), the original variables, and the variables serialized into the TRT engine as weights. As a result, the converted model can be up to 3x the size of the original.
Expected behaviour
The original Variables should not be saved in the converted model, only the frozen ones and the TRT weights.
Explanation
By default, TF-TRT freezes the model variables before converting the graph. By definition, freezing the model means that the variables are converted to constants.
In this process, new Const nodes are added to the graph, and they serve as inputs for the nodes that previously took Variables as inputs.
Once the Const nodes are added, the original variables are no longer used by the model, so their values should not be saved in the converted SavedModel.
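To make the explanation above concrete, here is a minimal sketch of what freezing does to a graph. It uses TensorFlow's convert_variables_to_constants_v2 helper (an internal tensorflow.python module, used here only for illustration); the tiny Linear module and its shapes are invented for the example and are not the model from this report.

```python
import tensorflow as tf
from tensorflow.python.framework.convert_to_constants import (
    convert_variables_to_constants_v2,
)


class Linear(tf.Module):
    def __init__(self):
        super().__init__()
        self.w = tf.Variable(tf.random.normal([16, 16]), name="w")

    @tf.function(input_signature=[tf.TensorSpec([None, 16], tf.float32)])
    def __call__(self, x):
        return tf.matmul(x, self.w)


model = Linear()
concrete = model.__call__.get_concrete_function()

# Freezing rewrites the graph: the variable reads (ReadVariableOp) are replaced
# by Const nodes that embed the current variable values in the GraphDef itself.
frozen = convert_variables_to_constants_v2(concrete)

print("ops before freezing:", sorted({n.op for n in concrete.graph.as_graph_def().node}))
print("ops after freezing: ", sorted({n.op for n in frozen.graph.as_graph_def().node}))
```

After this rewrite the Variable objects are no longer referenced by the graph, which is why keeping their checkpoint in the converted SavedModel only duplicates data.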
Steps to reproduce
Save a model whose function (matmul_func) holds 8 MiB of variables, convert it with TF-TRT, and compare the sizes of the original and converted SavedModels on disk (a sketch of such a script is given at the end of this report).
The original function (matmul_func) contains 8 MiB of variables.data. In the converted model, the parameters are stored in the frozen model (saved_model.pb) as well as in the serialized engine (their actual sizes depend on the conversion parameters, the TRT version, and the target GPU). But variables.data is not needed by the converted model, therefore it should not be saved.
This output was produced using the nvcr.io/nvidia/tensorflow:22.06-tf2-py3 Docker image on a T4 GPU.
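The reproduction script and its console output did not survive in this copy of the report, so the following is only a minimal sketch of a script along those lines. The 1024x2048 float32 weight (8 MiB), the default converter parameters, and the names MatmulModule, orig_model, and converted_model are assumptions made for illustration. Comparing the variables/ directories of the two SavedModels should show the duplication described above.

```python
import os
import tensorflow as tf
from tensorflow.python.compiler.tensorrt import trt_convert as trt

DIM_IN, DIM_OUT = 1024, 2048  # 1024 * 2048 * 4 bytes = 8 MiB of float32 weights


class MatmulModule(tf.Module):
    def __init__(self):
        super().__init__()
        self.w = tf.Variable(tf.random.normal([DIM_IN, DIM_OUT]), name="w")

    @tf.function(input_signature=[tf.TensorSpec([None, DIM_IN], tf.float32)])
    def matmul_func(self, x):
        return tf.matmul(x, self.w)


def dir_size_mib(path):
    """Total size of all files under `path`, in MiB."""
    total = 0
    for root, _, files in os.walk(path):
        total += sum(os.path.getsize(os.path.join(root, f)) for f in files)
    return total / 2**20


def build_input_fn():
    yield (tf.random.normal([1, DIM_IN]),)


model = MatmulModule()
tf.saved_model.save(model, "orig_model",
                    signatures={"serving_default": model.matmul_func})

converter = trt.TrtGraphConverterV2(input_saved_model_dir="orig_model")
converter.convert()
# Pre-build the engine so the weights are also serialized into it; this is one
# of the weight copies mentioned in the description above.
converter.build(input_fn=build_input_fn)
converter.save("converted_model")

for d in ("orig_model", "converted_model"):
    print(f"{d}: total {dir_size_mib(d):.1f} MiB, "
          f"variables {dir_size_mib(os.path.join(d, 'variables')):.1f} MiB")
```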