[QNN EP] Make QNN EP a shared library #23120
base: main
Conversation
…osed by the provider bridge.
…evert this in favor of doing the transpose manually in QNN EP
…entType(), DataTypeImpl::TensorTypeFromONNXEnum()
…tions not available in the provider bridge.
…hat does not need to add new functionality to the provider bridge
…d bug in qnn_configs_helper
new_tensor_shape_dims.push_back(tensor_shape_dims[p]);
// Internal function to transpose data of rank 5 with the given permutation.
// Example: transpose input from either (N,C,H,W,D) or (C,N,H,W,D) to (H,W,D,C,N).
static Status TransposeDataRank5(const TensorShape& input_shape,
What's the reason to replace the existing TransposeInitializer method? The existing one can handle any rank. Is it because it re-uses the CPU EP implementation?
That's right. We would need to add the existing CPU EP implementation to the provider bridge. Since we only use rank-5 and rank-2 transposes in QNN EP, I don't think it's worth the complexity.
Also, looking ahead to the EP-as-plugins project, we want to minimize the API between ORT and EPs.
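To make that trade-off concrete, here is a rough sketch of a fixed-rank transpose over raw initializer bytes. It is illustrative only: the name, signature, and row-major layout assumption are mine, not the PR's actual TransposeDataRank5 implementation.

#include <array>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <vector>

// Transpose raw rank-5 tensor bytes. Output dimension i is taken from input dimension perm[i].
static void TransposeRank5(const std::array<size_t, 5>& in_dims,
                           const std::array<size_t, 5>& perm,
                           size_t elem_byte_size,
                           const std::vector<uint8_t>& input,
                           std::vector<uint8_t>& output) {
  std::array<size_t, 5> out_dims{};
  for (size_t i = 0; i < 5; ++i) out_dims[i] = in_dims[perm[i]];

  // Row-major strides (in elements) of the input shape.
  std::array<size_t, 5> in_strides{};
  in_strides[4] = 1;
  for (int i = 3; i >= 0; --i) in_strides[i] = in_strides[i + 1] * in_dims[i + 1];

  output.resize(input.size());

  // Walk the output in row-major order and copy the matching input element.
  std::array<size_t, 5> o{};
  for (o[0] = 0; o[0] < out_dims[0]; ++o[0])
    for (o[1] = 0; o[1] < out_dims[1]; ++o[1])
      for (o[2] = 0; o[2] < out_dims[2]; ++o[2])
        for (o[3] = 0; o[3] < out_dims[3]; ++o[3])
          for (o[4] = 0; o[4] < out_dims[4]; ++o[4]) {
            size_t in_offset = 0;
            for (size_t i = 0; i < 5; ++i) in_offset += o[i] * in_strides[perm[i]];
            const size_t out_offset =
                (((o[0] * out_dims[1] + o[1]) * out_dims[2] + o[2]) * out_dims[3] + o[3]) * out_dims[4] + o[4];
            std::memcpy(output.data() + out_offset * elem_byte_size,
                        input.data() + in_offset * elem_byte_size, elem_byte_size);
          }
}

A rank-2 transpose follows the same pattern with two loops, which keeps the provider-bridge surface unchanged.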
// Copy initializer bytes (stored in little-endian order) to vector of int64_t.
// ReadLittleEndian returns a status error if the source and destination spans do not have
// matching byte sizes.
ORT_RETURN_IF_ERROR(onnxruntime::utils::ReadLittleEndian(src_span, dst_span));
I can't remember why we used ReadLittleEndian here. It should be safe at the EP level, right?
I originally added the use of ReadLittleEndian. It is unnecessary because QNN EP only runs on little-endian architectures. We assume little-endian throughout the code.
We should probably add a check for little-endian in the function that creates a QnnProviderFactory (fail if not little-endian).
This should be transparent to EPs. tensorprotoutils in the framework should have covered this already.
Yes, that's right. When we call UnpackInitializer(initializer, buffer), tensorprotoutils correctly handles reading the data from the ONNX initializer and storing it in a little-endian byte buffer. However, when we directly copy this buffer of bytes to a gsl::span<int32_t>, we're implicitly assuming that QNN EP is also running on a little-endian machine. This is why I initially added the call to ReadLittleEndian here. However, there are many places in QNN EP where we just reinterpret_cast initializer bytes (little-endian) into a data type like float, which assumes little-endian. So, I'm thinking we just formalize this and say QNN EP currently only runs on little-endian machines.
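Following up on the earlier suggestion to fail factory creation on big-endian machines, here is a small illustrative sketch; the helper name and where exactly it gets called are assumptions, not code from this PR.

#include <cstdint>
#include <cstring>

// Returns true if this machine stores the least-significant byte first.
static bool IsLittleEndianMachine() {
  const uint32_t probe = 1;
  uint8_t first_byte = 0;
  std::memcpy(&first_byte, &probe, sizeof(first_byte));
  return first_byte == 1;
}

// Hypothetical use during QNN provider factory creation:
//   ORT_RETURN_IF_NOT(IsLittleEndianMachine(),
//                     "QNN EP is only supported on little-endian architectures.");

With a guard like this in place, the reinterpret_cast-style reads of initializer bytes elsewhere in QNN EP are covered by a single explicit assumption.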
std::vector<uint8_t> original_tensor_bytes;
ORT_RETURN_IF_ERROR(qnn_model_wrapper.UnpackInitializerData(*input_info.initializer_tensor, original_tensor_bytes));
unpacked_tensor.resize(original_tensor_bytes.size());
size_t elem_byte_size = qnn::utils::GetElementSizeByType(
need to validate elem_byte_size to make sure it's not 0.
Updated.
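For reference, the guard discussed above only needs one extra line; the argument and error message below are placeholders rather than the PR's exact code.

size_t elem_byte_size = qnn::utils::GetElementSizeByType(/* tensor element data type */);
ORT_RETURN_IF(elem_byte_size == 0, "Unsupported element data type: element byte size is 0.");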
… PR was merged into main
NodeUnit() = delete;
NodeUnit(const NodeUnit&) = delete;
void operator=(const NodeUnit& v) = delete;
Why not use PROVIDER_DISALLOW_ALL here?
We need operator delete to stay defined because another API returns a vector of unique_ptr<NodeUnit>.
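To spell the constraint out: a disallow-all style macro typically also deletes operator delete, and a type whose operator delete is deleted cannot be owned by std::unique_ptr, because std::default_delete performs a plain delete. The snippet below is a generic C++ illustration, not ORT code.

#include <memory>
#include <vector>

struct NoDelete {
  NoDelete() = default;
  static void operator delete(void*) = delete;  // analogous to what a disallow-all macro would add
};

struct Ok {
  Ok() = default;
};

int main() {
  std::vector<std::unique_ptr<Ok>> fine;
  fine.push_back(std::unique_ptr<Ok>(new Ok()));  // compiles: operator delete is available

  // std::unique_ptr<NoDelete> broken(new NoDelete());  // would not compile:
  // std::default_delete<NoDelete> must call `delete p`, which is deleted for NoDelete.
  return 0;
}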
Node_EdgeEnd() = delete;
Node_EdgeEnd(const Node_EdgeEnd&) = delete;
void operator=(const Node_EdgeEnd&) = delete;
Why not use PROVIDER_DISALLOW_ALL here?
Updated to use PROVIDER_DISALLOW_ALL. Thanks.
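For completeness, a sketch of the class body after the switch (members omitted; illustrative, not the exact diff):

struct Node_EdgeEnd {
  // ... existing members unchanged ...
  PROVIDER_DISALLOW_ALL(Node_EdgeEnd)
};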
static ProviderLibrary s_library_qnn(LIBRARY_PREFIX ORT_TSTR("onnxruntime_providers_qnn") LIBRARY_EXTENSION
#ifndef _WIN32
                                     ,
                                     false /* unload - On Linux if we unload the qnn shared provider we crash */
Do we crash or was this just copied from the others?
Good catch. I meant to test this out. I pushed an experimental commit to see if it crashes on any Linux CI pipeline. I'll report back with results.
All Linux pipelines pass, and I checked locally on an Ubuntu VM that unloading does not crash. Thanks again for catching this.
Description
- Build QNN EP as a shared library with --use_qnn or --use_qnn shared_lib. Generates the following build artifacts:
  - onnxruntime_providers_qnn.dll and onnxruntime_providers_shared.dll (Windows)
  - libonnxruntime_providers_qnn.so and libonnxruntime_providers_shared.so (Linux)
- Build QNN EP as a static library with --use_qnn static_lib. This is primarily for the Android QNN AAR package.

Detailed changes
- Adds ort_api.h and ort_api.cc, which encapsulate the API provided by ORT in a manner that allows the EP to be built as either a shared or static library.
- NuGet packaging: build.py currently enforces QNN EP to be built as a shared library (not --use_qnn static_lib). Support for building a QNN NuGet package with a static QNN EP can be added later if deemed necessary.
- Android AAR packaging: build.py enforces QNN EP to be built as a static library. Packaging multiple shared libraries into an Android AAR package is not currently supported due to the added need to also distribute a shared libcpp.so library.

Motivation and Context