Running ViT on Android #19093
Hello! I'm trying to run a Vision Transformer with ONNX Runtime on an Android device. In Python, the model was fine-tuned to classify 14 classes and converted to ONNX format. After that, the preprocessing steps (resize, conversion to float, normalization) were added to the model graph. The model was then tested on 10k images with 98% accuracy. The model file in ONNX format is attached. At this point, the only external preprocessing needed in Python to run the model correctly is:
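(The original Python snippet did not survive the page extraction. As a point of reference, this is a minimal sketch of the pipeline the Kotlin code has to reproduce, based on the steps described above; the 224×224 input size and ImageNet statistics are assumptions, since ViT checkpoints commonly use them, and the nearest-neighbour resize is only there to keep the sketch dependency-free.)

```python
import numpy as np

# Assumed defaults; the actual values live inside the author's ONNX graph.
IMAGENET_MEAN = np.array([0.485, 0.456, 0.406], dtype=np.float32)
IMAGENET_STD = np.array([0.229, 0.224, 0.225], dtype=np.float32)

def preprocess(img: np.ndarray, size: int = 224) -> np.ndarray:
    """img: HxWx3 uint8 RGB array -> 1x3xSxS float32 NCHW tensor."""
    h, w, _ = img.shape
    # Crude nearest-neighbour resize, for illustration only.
    ys = np.arange(size) * h // size
    xs = np.arange(size) * w // size
    img = img[ys][:, xs]
    x = img.astype(np.float32) / 255.0      # uint8 -> [0, 1] float
    x = (x - IMAGENET_MEAN) / IMAGENET_STD  # per-channel normalization
    x = x.transpose(2, 0, 1)                # HWC -> CHW (channels first)
    return x[np.newaxis, ...]               # add batch dimension
```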
Then I tried to convert this code to Kotlin. My Kotlin code is below:
In Kotlin, the model classifies every input image as the single "trash" class (1 of 14). A similar result can be achieved by feeding actual trash images to the model, so I'm probably preprocessing the images incorrectly. Can you guide me towards which step is missing or incorrect in my Kotlin preprocessing?
Replies: 1 comment 1 reply
Isn't the Kotlin code emitting channels-last data, when the model expects channels-first?
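A quick way to see the mismatch the reply points at: an ONNX vision model typically expects NCHW (channels-first) input, while a loop over Android `Bitmap` pixels naturally produces NHWC (channels-last). The values are identical; only the memory order differs, which is enough to scramble what the model sees. A small NumPy demonstration:

```python
import numpy as np

# A 2x2 RGB "image" in channels-last (HWC) order, as a Bitmap
# pixel loop would naturally emit it.
hwc = np.arange(2 * 2 * 3, dtype=np.float32).reshape(2, 2, 3)

# Channels-first (CHW), the layout ONNX vision models usually expect.
chw = hwc.transpose(2, 0, 1)

# Same values, different memory order: element [c, y, x] in CHW
# equals element [y, x, c] in HWC.
assert chw[1, 0, 1] == hwc[0, 1, 1]
```

In the Kotlin buffer-filling loop the equivalent fix is to change the index from `(y * width + x) * 3 + c` (channels-last) to `c * height * width + y * width + x` (channels-first), assuming a flat `FloatBuffer` and those hypothetical loop variable names.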