This tutorial walks through common tasks in developing ML for edge devices.
Very often, we take a model developed in a standard training environment and then convert it to an edge-specific version. One way to do this is to use the TensorFlow Lite converter.
Practice the following tasks:
- Download existing models, e.g., from TensorFlow Hub, Open Model Zoo, Hugging Face, or https://github.com/EN10/KerasMNIST
- Convert a regular TensorFlow model to TensorFlow Lite, via the CLI below or the Python API sketched after it:
$tflite_convert --keras_model_file=models/mnist-model.h5 --output_file=models/mnist-model.tflite
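If you prefer the Python API over the CLI, the same conversion can be done with tf.lite.TFLiteConverter. A minimal sketch (the file paths mirror the command above):

```python
# Minimal sketch: the same conversion via the TF2 Python API.
import tensorflow as tf

# Load the trained Keras model from disk.
model = tf.keras.models.load_model("models/mnist-model.h5")

# Convert the in-memory Keras model to a TFLite flatbuffer.
converter = tf.lite.TFLiteConverter.from_keras_model(model)
tflite_model = converter.convert()

# Write the serialized model to disk.
with open("models/mnist-model.tflite", "wb") as f:
    f.write(tflite_model)
```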
Then study the conversion by answering the following questions:
- What is the difference w.r.t. the model size?
- Using a visualization tool (e.g., Netron), examine the graphs of the two models. What differences do you see? (A helper sketch follows this list.)
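For a quick start on both questions, the snippet below prints the on-disk file sizes and opens a model in Netron's browser viewer. It assumes the netron pip package is installed (pip install netron); the file names match the conversion above:

```python
# Sketch: compare on-disk sizes and inspect a graph in Netron.
import os
import netron

for path in ("models/mnist-model.h5", "models/mnist-model.tflite"):
    print(path, os.path.getsize(path) / 1024, "KiB")

# Serves an interactive graph viewer in the browser (one model at a time).
netron.start("models/mnist-model.tflite")
```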
Assuming you have obtained "cnn.h5" from https://github.com/EN10/KerasMNIST, try running src/simple-converter.py with and without option 1 turned on (see the code comments; a hypothetical sketch of the script follows the commands below).
With option 1 turned off:
$python src/simple-converter.py --i models/cnn.h5 --o test.tflite
With option 1 turned on:
$python src/simple-converter.py --i models/cnn.h5 --o test16.tflite --s 1
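If you do not have the repository at hand, src/simple-converter.py might look roughly like the sketch below. This is a hypothetical reconstruction, not the actual script: the flags --i, --o, and --s come from the commands above, and the assumption that option 1 enables float16 post-training quantization is suggested only by the test16.tflite output name.

```python
# Hypothetical sketch of src/simple-converter.py; the actual script may differ.
# Assumption: option 1 (--s 1) enables float16 post-training quantization.
import argparse
import tensorflow as tf

parser = argparse.ArgumentParser()
parser.add_argument("--i", required=True, help="input Keras .h5 model")
parser.add_argument("--o", required=True, help="output .tflite model")
parser.add_argument("--s", type=int, default=0, help="1 = enable option 1")
args = parser.parse_args()

model = tf.keras.models.load_model(args.i)
converter = tf.lite.TFLiteConverter.from_keras_model(model)

if args.s == 1:
    # Option 1: post-training float16 quantization of the weights.
    converter.optimizations = [tf.lite.Optimize.DEFAULT]
    converter.target_spec.supported_types = [tf.float16]

with open(args.o, "wb") as f:
    f.write(converter.convert())
```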
Carry out the following activities:
- Check the saved output models. What is the difference w.r.t. the model size?
- Using a visualization tool (e.g., Netron), examine the graphs of the two models. What differences do you see?
- Load and run the two models and compare runtime metrics (e.g., inference latency); see the sketch after this list.
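A minimal sketch for timing inference with the TFLite interpreter; the model file name and the use of random dummy input are illustrative assumptions:

```python
# Sketch: load a .tflite model and measure mean inference latency.
import time
import numpy as np
import tensorflow as tf

interpreter = tf.lite.Interpreter(model_path="test.tflite")
interpreter.allocate_tensors()
inp = interpreter.get_input_details()[0]
out = interpreter.get_output_details()[0]

# Dummy input matching the model's expected shape and dtype.
x = np.random.rand(*inp["shape"]).astype(inp["dtype"])

runs = 100
start = time.perf_counter()
for _ in range(runs):
    interpreter.set_tensor(inp["index"], x)
    interpreter.invoke()
    _ = interpreter.get_tensor(out["index"])
print("mean latency:", (time.perf_counter() - start) / runs * 1e3, "ms")
```

Run it once for test.tflite and once for test16.tflite and compare the numbers.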
Checking and optimization for Arm targets, e.g., using the ML Inference Advisor (MLIA):
$mlia check models/cnn.h5 -t cortex-a
$mlia optimize models/cnn.h5 -t cortex-a
- For which application domains/cases would model inference accuracy be affected when applying model conversion and quantization?
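One way to ground your answer is to measure the accuracy impact directly. Below is a minimal sketch that evaluates a converted model on the MNIST test set; the file name test16.tflite comes from the commands above, and the input preprocessing (scaling to [0, 1], float32) is an assumption about the cnn.h5 model:

```python
# Sketch: measure test accuracy of a converted model on MNIST.
# Preprocessing (scale to [0, 1], float32) is an assumption about cnn.h5.
import numpy as np
import tensorflow as tf

(_, _), (x_test, y_test) = tf.keras.datasets.mnist.load_data()
x_test = (x_test / 255.0).astype(np.float32)

interpreter = tf.lite.Interpreter(model_path="test16.tflite")
interpreter.allocate_tensors()
inp = interpreter.get_input_details()[0]
out = interpreter.get_output_details()[0]

correct = 0
for img, label in zip(x_test, y_test):
    # Reshape one image to the interpreter's expected input shape.
    interpreter.set_tensor(inp["index"], img.reshape(inp["shape"]))
    interpreter.invoke()
    correct += int(np.argmax(interpreter.get_tensor(out["index"])) == label)
print("TFLite accuracy:", correct / len(x_test))
```

Repeat with test.tflite and compare against the original Keras model's test accuracy to see whether quantization changes the result for this task.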