1- Support converting the TransformerNLU model to TFLite format using three conversion modes:
- normal
- fp16_quantization
- hybrid_quantization
Note: depending on your TensorFlow version and environment, conversion and serving may not be supported. Please refer to https://www.tensorflow.org/lite/guide/ops_select#python for more details.
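As a rough illustration of how the three conversion modes could map onto the TensorFlow Lite converter, here is a minimal sketch. It is not taken from this project's code: the function name `convert_to_tflite`, the `saved_model_dir` argument, and the mapping of each mode to converter settings are assumptions. It assumes "hybrid_quantization" means dynamic range quantization and that select TensorFlow ops are enabled as described in the guide linked above.

```python
CONVERSION_MODES = ("normal", "fp16_quantization", "hybrid_quantization")


def convert_to_tflite(saved_model_dir, mode="normal"):
    """Convert a SavedModel to TFLite bytes (hypothetical helper)."""
    if mode not in CONVERSION_MODES:
        raise ValueError(
            f"Unknown conversion mode {mode!r}; expected one of {CONVERSION_MODES}"
        )

    # Imported lazily so the mode check above works without TensorFlow.
    import tensorflow as tf

    converter = tf.lite.TFLiteConverter.from_saved_model(saved_model_dir)

    # Assumption: allow select TensorFlow ops for anything that has no
    # TFLite builtin equivalent (see the ops_select guide).
    converter.target_spec.supported_ops = [
        tf.lite.OpsSet.TFLITE_BUILTINS,
        tf.lite.OpsSet.SELECT_TF_OPS,
    ]

    if mode == "fp16_quantization":
        # Store weights as float16.
        converter.optimizations = [tf.lite.Optimize.DEFAULT]
        converter.target_spec.supported_types = [tf.float16]
    elif mode == "hybrid_quantization":
        # Assumed to mean dynamic range quantization: quantized weights,
        # no representative dataset.
        converter.optimizations = [tf.lite.Optimize.DEFAULT]
    # "normal" keeps the converter defaults (no quantization).

    return converter.convert()
```

The returned bytes can be written to a `.tflite` file. Whether the conversion succeeds still depends on the TensorFlow version and environment, as the note above says.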
2- Support serving TFLite models using Python's multiprocessing module.
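One way multiprocessing-based serving of a TFLite model could look is sketched below. This is not the project's actual serving code: the names `load_tflite_predictor`, `worker_loop`, and `start_worker`, and the request format `(request_id, inputs)`, are hypothetical. The idea shown is that the interpreter is created inside the worker process and requests are exchanged over queues.

```python
import multiprocessing as mp
from functools import partial

_STOP = None  # sentinel: putting this on the request queue stops the worker


def load_tflite_predictor(model_path):
    """Build a predict(inputs) function around a TFLite interpreter."""
    import tensorflow as tf  # imported in the worker, not in the parent

    interpreter = tf.lite.Interpreter(model_path=model_path)
    interpreter.allocate_tensors()
    input_details = interpreter.get_input_details()
    output_details = interpreter.get_output_details()

    def predict(inputs):
        # inputs: one NumPy array per model input, in input_details order.
        for detail, value in zip(input_details, inputs):
            interpreter.set_tensor(detail["index"], value)
        interpreter.invoke()
        return [interpreter.get_tensor(d["index"]) for d in output_details]

    return predict


def worker_loop(load_predictor, requests, responses):
    """Load the model once, then answer requests until the sentinel arrives."""
    predict = load_predictor()
    while True:
        item = requests.get()
        if item is _STOP:
            break
        request_id, inputs = item
        responses.put((request_id, predict(inputs)))


def start_worker(model_path):
    """Start one serving process; returns (process, requests, responses)."""
    requests, responses = mp.Queue(), mp.Queue()
    process = mp.Process(
        target=worker_loop,
        args=(partial(load_tflite_predictor, model_path), requests, responses),
        daemon=True,
    )
    process.start()
    return process, requests, responses
```

A caller would put `(request_id, inputs)` tuples on `requests`, read `(request_id, outputs)` tuples from `responses`, and put `None` on `requests` to shut the worker down. On platforms that spawn rather than fork, `start_worker` should be called from under an `if __name__ == "__main__":` guard.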