Add support for a variety of (data tiled) convolution strategies #63
Merged
Conversation
Adds the ability to use the transform dialect strategy builders behind `iree-spirv-enable-transform-dialect-jit`, mirroring the existing flags for LLVMCPU/GPU.
DetachElementwiseFromNamedOps is used to replace pre-filled outputs with a zero-fill + add for contracting ops (gemm, conv). This extends the pattern to the convolution interface to allow non-named cases. Renaming of the pass can happen as a follow-up if/when this is upstreamed.
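As a rough illustration (hypothetical shapes and function names; the pass builds an equivalent elementwise linalg op for the add rather than the tensor-level `arith.addf` shown here for brevity), the detach rewrite looks roughly like this:

```mlir
// Before: the convolution accumulates directly into a pre-filled output.
func.func @conv_before(%input: tensor<1x34x34x16xf32>, %filter: tensor<3x3x16x32xf32>,
                       %prefilled: tensor<1x32x32x32xf32>) -> tensor<1x32x32x32xf32> {
  %conv = linalg.conv_2d_nhwc_hwcf
      {dilations = dense<1> : tensor<2xi64>, strides = dense<1> : tensor<2xi64>}
      ins(%input, %filter : tensor<1x34x34x16xf32>, tensor<3x3x16x32xf32>)
      outs(%prefilled : tensor<1x32x32x32xf32>) -> tensor<1x32x32x32xf32>
  return %conv : tensor<1x32x32x32xf32>
}

// After: the accumulator is detached into a zero fill, and the pre-filled
// values are added back with a trailing elementwise op that can fuse later.
func.func @conv_after(%input: tensor<1x34x34x16xf32>, %filter: tensor<3x3x16x32xf32>,
                      %prefilled: tensor<1x32x32x32xf32>) -> tensor<1x32x32x32xf32> {
  %cst = arith.constant 0.0 : f32
  %empty = tensor.empty() : tensor<1x32x32x32xf32>
  %zero = linalg.fill ins(%cst : f32)
      outs(%empty : tensor<1x32x32x32xf32>) -> tensor<1x32x32x32xf32>
  %conv = linalg.conv_2d_nhwc_hwcf
      {dilations = dense<1> : tensor<2xi64>, strides = dense<1> : tensor<2xi64>}
      ins(%input, %filter : tensor<1x34x34x16xf32>, tensor<3x3x16x32xf32>)
      outs(%zero : tensor<1x32x32x32xf32>) -> tensor<1x32x32x32xf32>
  // Add the original pre-filled values back in.
  %sum = arith.addf %conv, %prefilled : tensor<1x32x32x32xf32>
  return %sum : tensor<1x32x32x32xf32>
}
```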
Towards pad fused convolution strategies.
Removes the named-ops-only restriction on the convolution matcher, matching against the convolution interface instead.
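For example, a convolution written as a plain linalg.generic (a hypothetical 1-D case with made-up shapes is sketched below) can now be matched, since the check is interface/structure based rather than tied to specific named ops:

```mlir
// Hypothetical 1-D convolution written as linalg.generic rather than a named
// op: n, w, f are parallel dims, kw and c are reductions, and the input is
// accessed at w + kw.
#map_in  = affine_map<(n, w, f, kw, c) -> (n, w + kw, c)>
#map_flt = affine_map<(n, w, f, kw, c) -> (kw, c, f)>
#map_out = affine_map<(n, w, f, kw, c) -> (n, w, f)>
func.func @conv_1d_generic(%input: tensor<1x18x8xf32>, %filter: tensor<3x8x16xf32>,
                           %acc: tensor<1x16x16xf32>) -> tensor<1x16x16xf32> {
  %res = linalg.generic {
      indexing_maps = [#map_in, #map_flt, #map_out],
      iterator_types = ["parallel", "parallel", "parallel", "reduction", "reduction"]}
      ins(%input, %filter : tensor<1x18x8xf32>, tensor<3x8x16xf32>)
      outs(%acc : tensor<1x16x16xf32>) {
    ^bb0(%in: f32, %flt: f32, %out: f32):
      %mul = arith.mulf %in, %flt : f32
      %add = arith.addf %out, %mul : f32
      linalg.yield %add : f32
  } -> tensor<1x16x16xf32>
  return %res : tensor<1x16x16xf32>
}
```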
Adds a builder for mapping data tiled convolutions to a direct tensor core approach (mainly targeting wmma for now). This generates a loop over the input channels, promotes the padded input tile to shared memory, and then emits two more inner loops over the convolution filter window.
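A very rough sketch of the intended loop structure (hypothetical shapes and function name; tensor slices stand in for the shared-memory promotion, and a 1x1 convolution stands in for the wmma-sized inner matmuls):

```mlir
func.func @conv_loop_structure(%input: tensor<1x34x34x64xf32>,
                               %filter: tensor<3x3x64x32xf32>,
                               %init: tensor<1x32x32x32xf32>) -> tensor<1x32x32x32xf32> {
  %c0 = arith.constant 0 : index
  %c1 = arith.constant 1 : index
  %c3 = arith.constant 3 : index
  %c16 = arith.constant 16 : index
  %c64 = arith.constant 64 : index
  // Outer loop over input-channel tiles.
  %res = scf.for %c = %c0 to %c64 step %c16
      iter_args(%acc = %init) -> (tensor<1x32x32x32xf32>) {
    // Stand-in for promoting the (padded) input tile to shared memory.
    %in_tile = tensor.extract_slice %input[0, 0, 0, %c] [1, 34, 34, 16] [1, 1, 1, 1]
        : tensor<1x34x34x64xf32> to tensor<1x34x34x16xf32>
    // Two inner loops over the convolution filter window.
    %acc_kh = scf.for %kh = %c0 to %c3 step %c1
        iter_args(%acc0 = %acc) -> (tensor<1x32x32x32xf32>) {
      %acc_kw = scf.for %kw = %c0 to %c3 step %c1
          iter_args(%acc1 = %acc0) -> (tensor<1x32x32x32xf32>) {
        // Each (kh, kw) step contributes a 1x1 convolution over the channel
        // tile; in the real strategy this is the piece executed with wmma.
        %win = tensor.extract_slice %in_tile[0, %kh, %kw, 0] [1, 32, 32, 16] [1, 1, 1, 1]
            : tensor<1x34x34x16xf32> to tensor<1x32x32x16xf32>
        %flt = tensor.extract_slice %filter[%kh, %kw, %c, 0] [1, 1, 16, 32] [1, 1, 1, 1]
            : tensor<3x3x64x32xf32> to tensor<1x1x16x32xf32>
        %next = linalg.conv_2d_nhwc_hwcf
            {dilations = dense<1> : tensor<2xi64>, strides = dense<1> : tensor<2xi64>}
            ins(%win, %flt : tensor<1x32x32x16xf32>, tensor<1x1x16x32xf32>)
            outs(%acc1 : tensor<1x32x32x32xf32>) -> tensor<1x32x32x32xf32>
        scf.yield %next : tensor<1x32x32x32xf32>
      }
      scf.yield %acc_kw : tensor<1x32x32x32xf32>
    }
    scf.yield %acc_kh : tensor<1x32x32x32xf32>
  }
  return %res : tensor<1x32x32x32xf32>
}
```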
Adds a direct SIMT(/fma/dot4) conv approach without shared memory.
Allows matching non-named contraction ops, using the same `MatmulOpCaptures` struct that exists for matmul and batch matmul.
…r strategies
Maps data tiled matmuls to tensor core, assuming no distribution is expected to happen over the inner tile.
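For instance (hypothetical shapes, shown in mmt4d form), a data-tiled matmul carries its tile as the trailing dimensions; the strategy leaves the inner 16x16x16 tile whole and maps it to a single tensor core multiply-accumulate rather than distributing over it:

```mlir
// Data-tiled matmul: the outer 4x4 output tile grid (over 8 reduction steps)
// gets tiled and distributed, while each inner 16x16x16 tile maps to one
// tensor core (wmma) op.
func.func @data_tiled_matmul(%lhs: tensor<4x8x16x16xf16>, %rhs: tensor<4x8x16x16xf16>,
                             %acc: tensor<4x4x16x16xf32>) -> tensor<4x4x16x16xf32> {
  %res = linalg.mmt4d
      ins(%lhs, %rhs : tensor<4x8x16x16xf16>, tensor<4x8x16x16xf16>)
      outs(%acc : tensor<4x4x16x16xf32>) -> tensor<4x4x16x16xf32>
  return %res : tensor<4x4x16x16xf32>
}
```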
Additionally improves distribution of pad copies for the convolution strategy by greedily distributing over the outermost dimensions of the copy.
Currently pad fusion only applies to named convolutions. This allows it to apply based on the interface.
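For example (hypothetical shapes), a producer like the tensor.pad below can now be fused into a consuming convolution even when that convolution is not a named op:

```mlir
// Zero-pad the spatial dims of a conv input; with the interface-based check
// this pad can be fused with a consuming convolution that is not a named op.
func.func @pad_input(%input: tensor<1x32x32x16xf32>) -> tensor<1x34x34x16xf32> {
  %cst = arith.constant 0.0 : f32
  %padded = tensor.pad %input low[0, 1, 1, 0] high[0, 1, 1, 0] {
  ^bb0(%n: index, %h: index, %w: index, %ch: index):
    tensor.yield %cst : f32
  } : tensor<1x32x32x16xf32> to tensor<1x34x34x16xf32>
  return %padded : tensor<1x34x34x16xf32>
}
```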
Sub-32-bit types are handled on the SPIR-V side by introducing bitcasts to and from i32 and bubbling them toward the center of the kernel in the hope that they cancel. This adds a pattern for a bitcast on the result of an scf.if, which arises from the way padding is handled (a transfer_read in the `then` branch, and a yielded splat constant in the `else` branch).
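A rough before/after sketch of the new pattern (hypothetical types, values, and function names):

```mlir
// Before: the padding-aware read yields either a real vector or a splat
// constant, and the bitcast to i32 sits after the scf.if.
func.func @read_before(%src: memref<128xf16>, %i: index, %in_bounds: i1) -> vector<1xi32> {
  %pad = arith.constant 0.0 : f16
  %v = scf.if %in_bounds -> (vector<2xf16>) {
    %r = vector.transfer_read %src[%i], %pad : memref<128xf16>, vector<2xf16>
    scf.yield %r : vector<2xf16>
  } else {
    %s = arith.constant dense<0.0> : vector<2xf16>
    scf.yield %s : vector<2xf16>
  }
  %b = vector.bitcast %v : vector<2xf16> to vector<1xi32>
  return %b : vector<1xi32>
}

// After the new pattern: the bitcast is pushed into both branches so it can
// keep bubbling toward the middle of the kernel and cancel with its inverse.
func.func @read_after(%src: memref<128xf16>, %i: index, %in_bounds: i1) -> vector<1xi32> {
  %pad = arith.constant 0.0 : f16
  %b = scf.if %in_bounds -> (vector<1xi32>) {
    %r = vector.transfer_read %src[%i], %pad : memref<128xf16>, vector<2xf16>
    %rb = vector.bitcast %r : vector<2xf16> to vector<1xi32>
    scf.yield %rb : vector<1xi32>
  } else {
    %s = arith.constant dense<0> : vector<1xi32>
    scf.yield %s : vector<1xi32>
  }
  return %b : vector<1xi32>
}
```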
powderluv pushed a commit that referenced this pull request on Sep 25, 2023
I had built these out while root-causing a regression. Cleaned them up and am mainlining them. They are controlled by compile-time variables for the moment; we can do something smarter later.