src: cpu: aarch64: lowp_matmul_sq: Make weights constant
Setting the weights as constant allows us to avoid redundant pretranspose
operations in Arm Compute Library (ACL) every time execute is called
(they are now run once and cached). This delivers big speedups, especially
for relatively small matmuls.
Note that this is a temporary fix that needs to be handled carefully by primitive
caches in frameworks, since the ACL object now holds more state, i.e.
we want to make sure that the cache maps a layer with a specific set of weights
to the oneDNN primitive storing those weights.
We're currently working on the proper fix for this which involves making
lowp_gemm stateless and fixed-format in ACL and oneDNN.
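
To illustrate the caching concern described above, here is a minimal, self-contained sketch. It is not oneDNN or ACL code: `toy_matmul_t`, `toy_primitive_cache_t` and `weights_id` are hypothetical names invented for this example, and the "pretranspose" is a plain transpose standing in for whatever ACL does internally. It only shows why a cache key should identify the weights once a primitive holds state derived from them.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <map>
#include <memory>
#include <tuple>
#include <vector>

// Hypothetical stand-in for a primitive whose backend caches a
// pretransposed copy of the weights on the first execute call.
struct toy_matmul_t {
    std::size_t M, K, N;
    std::vector<float> wei_t;     // cached transposed weights (N x K)
    bool prepared = false;        // true once wei_t has been filled
    int pretranspose_count = 0;   // how many times the transpose ran

    toy_matmul_t(std::size_t m, std::size_t k, std::size_t n)
        : M(m), K(k), N(n) {}

    // src is M x K, wei is K x N, result is M x N (all row-major).
    std::vector<float> execute(const std::vector<float> &src,
                               const std::vector<float> &wei) {
        assert(src.size() == M * K && wei.size() == K * N);
        if (!prepared) {
            // Runs once: later calls assume the weights never change.
            wei_t.resize(N * K);
            for (std::size_t k = 0; k < K; ++k)
                for (std::size_t n = 0; n < N; ++n)
                    wei_t[n * K + k] = wei[k * N + n];
            prepared = true;
            ++pretranspose_count;
        }
        std::vector<float> dst(M * N, 0.0f);
        for (std::size_t m = 0; m < M; ++m)
            for (std::size_t n = 0; n < N; ++n)
                for (std::size_t k = 0; k < K; ++k)
                    dst[m * N + n] += src[m * K + k] * wei_t[n * K + k];
        return dst;
    }
};

// Assumption: the framework can assign a stable id to each set of weights.
// The key combines the problem shape with that id, so two layers with the
// same shape but different weights never share a stateful primitive.
using cache_key_t
        = std::tuple<std::size_t, std::size_t, std::size_t, std::uint64_t>;

struct toy_primitive_cache_t {
    std::map<cache_key_t, std::shared_ptr<toy_matmul_t>> entries;

    std::shared_ptr<toy_matmul_t> get(std::size_t M, std::size_t K,
                                      std::size_t N, std::uint64_t weights_id) {
        cache_key_t key {M, K, N, weights_id};
        auto it = entries.find(key);
        if (it != entries.end()) return it->second;
        auto prim = std::make_shared<toy_matmul_t>(M, K, N);
        entries.emplace(key, prim);
        return prim;
    }
};
```

If the key held only the shape, a second layer with the same dimensions would receive the first layer's primitive and silently reuse its cached weights.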
fadara01 committed Nov 8, 2024
1 parent efad4f6 commit 0358abf
Showing 1 changed file with 1 addition and 1 deletion.
2 changes: 1 addition & 1 deletion src/cpu/aarch64/matmul/acl_lowp_matmul_sq.hpp
@@ -156,7 +156,7 @@ struct acl_lowp_matmul_sq_t : public primitive_t {
arm_compute::TensorShape(N(), K()), 1,
acl_utils::get_acl_data_t(wei_d.data_type(), true),
arm_compute::QuantizationInfo(1.0, 0, true));
- almc_.wei_tensor_info.set_are_values_constant(false);
+ almc_.wei_tensor_info.set_are_values_constant(true);

almc_.dst_tensor_info = arm_compute::TensorInfo(
arm_compute::TensorShape(N(), M()), 1,