Does onnxruntime on macOS with Apple Silicon use ARM-specific instructions when running quantized models? #14916

Rexhaif · 2023-03-05T12:38:36Z

Hi,

I'm doing some research with quantized models and I want to confirm one thing.

Does the onnxruntime Python package, installed on macOS with Apple Silicon, actually use ARM-specific instructions (such as faster int8 matmul) when running inference on quantized ONNX models?

Specifically, I'm using onnxruntime through the optimum library, running a BERT-base model dynamically quantized for arm64 with this config (i.e., u8/s8 quantization). My hardware is a MacBook Pro 13 with an Apple M1, running macOS 13.2.1.
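
For reference, here is a minimal sketch of how the setup described above could be inspected. It reports whether the Python interpreter runs natively as arm64 (rather than as x86_64 under Rosetta 2) and which execution providers the installed onnxruntime build lists. The helper name `describe_runtime` is hypothetical, and the output does not by itself confirm which int8 kernels are selected at inference time.

```python
import importlib
import platform


def describe_runtime():
    """Collect basic facts about the interpreter and the onnxruntime build.

    Hypothetical helper for illustration; it only reports what the
    interpreter and the installed package expose about themselves.
    """
    info = {
        # "arm64" for a native Apple Silicon interpreter,
        # "x86_64" when Python itself runs under Rosetta 2.
        "machine": platform.machine(),
        "system": platform.system(),
        "onnxruntime_version": None,
        "providers": [],
    }
    try:
        ort = importlib.import_module("onnxruntime")
    except ImportError:
        # onnxruntime is not installed; return the platform facts only.
        return info
    info["onnxruntime_version"] = ort.__version__
    info["providers"] = ort.get_available_providers()
    return info


if __name__ == "__main__":
    for key, value in describe_runtime().items():
        print(f"{key}: {value}")
```

If `machine` is reported as `x86_64` on an M1 machine, the interpreter is running under Rosetta 2 and an x86_64 build of onnxruntime would be loaded, so ARM-specific code paths could not apply in that case.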

Replies: 0 comments