Commit
This commit does not belong to any branch on this repository, and may belong to a fork outside of the repository.
[SME] Utilize predication in fp32 matmul and conv2d schedules
Prior to this commit, the matmul and conv2d schedules required padding of the inputs to some multiple of vscale and a final "unpadding" stage.

Instead, we can leverage predicated operations to avoid the requirement for padding. Both the transpose interleave and outer product fp32 intrinsics are updated to use predication. The `get_active_lane_mask` intrinsic is utilized to generate a variably sized mask of active lanes depending on the global position the tensor intrinsic is operating on.

For now this relies on using `offset_of` and `stride` information from the tensor we're predicating an access on. Likely we will want to build on this in the future with a more intuitive API for determining the current tile location.

Support for batched conv2d was removed since it causes numerical issues, which are suspected to be due to how the current tile is determined (paragraph above).

Change-Id: I79620200c9a94e2ca9d7297c4ed2abf87549cc41
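As an illustration of the predication idea described above, here is a minimal pure-Python sketch. It is not the schedule's actual code. It assumes `get_active_lane_mask` follows the semantics of the LLVM intrinsic of the same name (lane `i` is active iff `base + i < limit`), and that the tile position can be recovered from a flat row-major element offset and a row stride. The helper names `active_lane_mask`, `tile_position` and `tile_masks`, and the fixed `tile_size` standing in for the vscale-dependent lane count, are hypothetical.

```python
def active_lane_mask(base, limit, num_lanes):
    """Lane i is active iff base + i < limit (assumed mask semantics)."""
    return [base + i < limit for i in range(num_lanes)]


def tile_position(offset, row_stride):
    """Recover (row, col) of a tile's first element from a flat
    row-major element offset and the row stride (hypothetical helper)."""
    return divmod(offset, row_stride)


def tile_masks(offset, row_stride, num_rows, num_cols, tile_size):
    """Build row and column predicates for the tile starting at `offset`."""
    row, col = tile_position(offset, row_stride)
    row_mask = active_lane_mask(row, num_rows, tile_size)
    col_mask = active_lane_mask(col, num_cols, tile_size)
    return row_mask, col_mask


# A 5x6 row-major matrix covered by 4x4 tiles. The bottom-right tile
# starts at row 4, col 4, i.e. flat offset 4 * 6 + 4 = 28.
rows, cols = tile_masks(offset=28, row_stride=6, num_rows=5, num_cols=6, tile_size=4)
print(rows)  # [True, False, False, False]
print(cols)  # [True, True, False, False]
```

Only the lanes inside the matrix bounds are active in the edge tile, so no padding of the input and no trailing "unpadding" stage would be needed in this simplified model.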
Showing 7 changed files with 189 additions and 89 deletions.