From 218dc6830ed487cbaf2c9a950d4110eb19c4e03b Mon Sep 17 00:00:00 2001
From: Joaquin Anton
Date: Wed, 11 Oct 2023 14:46:14 +0200
Subject: [PATCH] Add ResNet preproc version 2 (with image decoding)

Signed-off-by: Joaquin Anton
---
 vision/classification/resnet/README.md         | 18 +++++++++++-------
 .../resnet/preproc/resnet-preproc-v2-20.onnx   |  3 +++
 .../resnet/preproc/resnet-preproc-v2-20.tar.gz |  3 +++
 3 files changed, 17 insertions(+), 7 deletions(-)
 create mode 100644 vision/classification/resnet/preproc/resnet-preproc-v2-20.onnx
 create mode 100644 vision/classification/resnet/preproc/resnet-preproc-v2-20.tar.gz

diff --git a/vision/classification/resnet/README.md b/vision/classification/resnet/README.md
index 10454c05e..d91b040d7 100644
--- a/vision/classification/resnet/README.md
+++ b/vision/classification/resnet/README.md
@@ -70,7 +70,7 @@ The inference was done using jpeg image.
 
 ### Preprocessing
 The image needs to be preprocessed before fed to the network.
-The first step is to extract a 224x224 crop from the center of the image. For this, the image is first scaled to a minimum size of 256x256, while keeping aspect ratio. That is, the shortest side of the image is resized to 256 and the other side is scaled accordingly to maintain the original aspect ratio. After that, the image is normalized with mean = 255*[0.485, 0.456, 0.406] and std = 255*[0.229, 0.224, 0.225]. Last step is to transpose it from HWC to CHW layout.
+The first step is to decode the image. Next, we extract a 224x224 crop from the center of the image. For this, the image is first scaled to a minimum size of 256x256, while keeping aspect ratio. That is, the shortest side of the image is resized to 256 and the other side is scaled accordingly to maintain the original aspect ratio. After that, the image is normalized with mean = 255*[0.485, 0.456, 0.406] and std = 255*[0.229, 0.224, 0.225]. Last step is to transpose it from HWC to CHW layout.
 
 The described preprocessing steps can be represented with an ONNX model:
 ```python
@@ -81,22 +81,23 @@ from onnx import checker
 resnet_preproc = parser.parse_model('''
 <
   ir_version: 8,
-  opset_import: [ "" : 18, "local" : 1 ],
+  opset_import: [ "" : 20, "local" : 1 ],
   metadata_props: [ "preprocessing_fn" : "local.preprocess"]
 >
-resnet_preproc_g (seq(uint8[?, ?, 3]) images) => (float[B, 3, 224, 224] preproc_data)
+resnet_preproc_g (seq(uint8[?]) images) => (float[B, 3, 224, 224] preproc_data)
 {
   preproc_data = local.preprocess(images)
 }
 
 <
-  opset_import: [ "" : 18 ],
+  opset_import: [ "" : 20 ],
   domain: "local",
-  doc_string: "Preprocessing function."
+  doc_string: "Preprocessing function, including image decoding."
 >
 preprocess (input_batch) => (output_tensor) {
   tmp_seq = SequenceMap <
-    body = sample_preprocessing(uint8[?, ?, 3] sample_in) => (float[3, 224, 224] sample_out) {
+    body = sample_preprocessing(uint8[?] sample_in) => (float[3, 224, 224] sample_out) {
+      image = ImageDecoder (sample_in)
       target_size = Constant ()
       image_resized = Resize