use row major when building attributes #307
Conversation
After discussing this with @wsmoses in today's meeting, we reached the conclusion that the column- vs row-major layout is not forced by MLIR but is dialect specific (e.g. a Julia MLIR dialect could still use column-major layout). Furthermore, I just found that column-major layouts can be represented by affine maps: https://mlir.llvm.org/docs/Dialects/Builtin/#affine-map-layout. Do we know if affine maps can be used in tensors? The only examples I found are with memrefs. Also, is StableHLO compatible with them? I still believe that this PR is useful, but I would suggest making the conversion from column- to row-major optional, maybe with a kwarg that defaults to …
I agree, but here this is for building shaped builtin attributes through the C API: row major is needed for the data and shape to be interpreted accordingly. A dialect is then free to reinterpret the attribute as suited within its ops.
I don't really see why one would not want to convert to row major here. We can add it later when/if a potential Julia dialect needs transposed attributes.
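To make the layout question concrete, here is a small illustration (my own, not from the PR) of why the permutation is needed when handing a Julia array, which is column-major, to a row-major consumer:

```julia
# Julia stores arrays column-major: the raw buffer of a 2x3 matrix
# enumerates columns first.
A = [1 2 3; 4 5 6]
vec(A)                               # [1, 4, 2, 5, 3, 6]

# A row-major consumer (e.g. a dense elements attribute built from this
# buffer) expects [1, 2, 3, 4, 5, 6]. Reversing the dimensions with
# permutedims produces a buffer in exactly that order.
vec(permutedims(A, ndims(A):-1:1))   # [1, 2, 3, 4, 5, 6]
```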
I see your point, but I'm not sure of all the consequences. How about we discuss it at the next meeting?
Yeah, okay, I can agree with that logic (since then the builtin attribute conversion is nice, which is distinct from the builtin op semantics). I'm okay with this PR.
```diff
@@ -492,6 +492,9 @@ function Base.fill(::Core.Type{Attribute}, value, shape)
     return Base.fill(value, shaped_type)
 end
+
+to_row_major(x) = permutedims(x, ndims(x):-1:1)
+to_row_major(x::AbstractVector) = x
```
From the error logs it looks like this also needs a 0-dim specialization
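A minimal sketch of what that specialization could look like (hypothetical, mirroring the `AbstractVector` method above): a 0-dimensional array has no axes to permute, so it can be returned unchanged.

```julia
to_row_major(x::AbstractArray{<:Any,0}) = x
```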
Do you know if affine maps can be associated with dense array / dense elements attributes?
Not attributes directly, but as part of a type.
Okay, and can they be associated with a `tensor`?
Yeah, but we pass the values to …
I don't think tensors do, and that would still be an issue here, since having a different type means that you couldn't use a memref where you needed a regular affine map. Still cool to use, but I think this is necessary regardless.
Note that the fix here also affects Reactant.jl/ext/ReactantNNlibExt.jl, lines 111 to 113 (at 45ae14f).
If we want to try not having the transpose in Julia, it should be possible by doing it at the promotion level:

```julia
function promote_to(x)
    cst = stablehlo.constant(reshape(x, :))
    cst = stablehlo.reshape(cst, reverse(size(x)))
    cst = stablehlo.transpose(cst, ndims(x):-1:1)
    return cst
end
```
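As a rough sanity check of that equivalence (my own sketch, assuming `stablehlo.reshape` reads the flat buffer in row-major order), the two ops can be emulated in plain Julia:

```julia
A = [1 2 3; 4 5 6]
buf = reshape(A, :)   # column-major buffer [1, 4, 2, 5, 3, 6]

# Row-major reshape of `buf` to reverse(size(A)) == (3, 2), emulated with
# a column-major reshape to the original size plus a permutedims:
B = permutedims(reshape(buf, size(A)), (2, 1))

# The final transpose with permutation ndims(A):-1:1 recovers A.
permutedims(B, (2, 1)) == A   # true
```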
Yeah, but frankly that's the more intuitive thing anyways.
Okay, then let's merge it like this and revisit in the future if we need it.
```diff
@@ -109,7 +109,7 @@ function NNlib.conv!(
     #! format: on

     padding = Reactant.MLIR.IR.DenseElementsAttribute(
-        reshape(collect(padding), (num_spatial_dims, 2))
+        reshape(collect(padding), (2, num_spatial_dims))'
```
Don't call `'`/`adjoint`, because it will conjugate complex matrices.
Suggested change:

```diff
-        reshape(collect(padding), (2, num_spatial_dims))'
+        transpose(reshape(collect(padding), (2, num_spatial_dims)))
```
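To illustrate the pitfall (my own example, not from the PR): `'` lowers to `adjoint`, which conjugates complex entries, while `transpose` only swaps the layout.

```julia
A = [1+2im 3im; 0 1]
A'                    # entries conjugated: [1-2im 0; -3im 1]
transpose(A)          # entries unchanged:  [1+2im 0; 3im 1]
A' == transpose(A)    # false for complex data
```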
I did not know about that. As you said, it does not apply here but I will be cautious in the future 👍
```diff
@@ -163,7 +163,7 @@ function reduce_window(f, x::AnyTracedRArray{T,N}, pdims; init) where {T,N}
     end

     padding = Reactant.MLIR.IR.DenseElementsAttribute(
-        reshape([padding..., 0, 0, 0, 0], (N, 2))
+        reshape([padding..., 0, 0, 0, 0], (2, N))'
```
Suggested change:

```diff
-        reshape([padding..., 0, 0, 0, 0], (2, N))'
+        transpose(reshape([padding..., 0, 0, 0, 0], (2, N)))
```
Ahh, `NNlib.padding` just returns a tuple of ints... so it's alright to call adjoint here.
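Indeed, a quick check (illustrative) that `'` and `transpose` agree on real data:

```julia
P = reshape(collect(1:4), (2, 2))
P' == transpose(P)   # true: adjoint only differs from transpose on complex entries
```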