Proposal: more general strides and sizes in perf configs, etc. #1140

krzysz00 · 2023-07-05T19:58:11Z

krzysz00
Jul 5, 2023
Maintainer

We'd like to be able to represent more complex size/stride combinations in the perf config without adding too many special cases like the transpose flags or the layout string.

This'll allow us to easily support things like computing on tiles of a larger tensor and NCHWC.

It'll also clean up the code and give us some more generality.

I propose:

Standard layouts for the `rock` operations

I propose that rock.conv2d will always take a GNHWC view of the underlying memory, though that choice is somewhat arbitrary and we could go with NGHWC or GNCHW or what have you.

rock.gemm will take matrix A as M x K and matrix B as K x N .

The {filter,input,output}_layout and transpose{A,B,C} will be removed.

Argument passing

Kernels will be given a 1D memref of size [actual underlying memory size]xT . You could even make it an i8 buffer for extra spicy.

This'll then be passed to an operation I'm going to call rock.interpret_memory {sizes = [l0, l1, ..., lN], strides = [s0, s1, ..., sN]} : tensor<LxT> -> tensor<l0xl1x...xlNxT> (which bufferizes and, early in the kernel pipeline, expands out to rock.transform).

Then, you get to do transposes, reshapes, what have you, to break that apart and recombine it however you like.

Heck, you might not always need interpret_memory - the common cases are just reshape.
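To make the intended semantics concrete, here is a rough Python model of what interpret_memory would mean for addressing: element [i0, ..., iN] of the view lives at offset i0*s0 + ... + iN*sN in the flat buffer. This is only a sketch. The Python function interpret_memory below is a hypothetical stand-in, it copies into nested lists rather than aliasing memory, and it assumes non-negative strides and a zero base offset, none of which the proposal pins down.

```python
def interpret_memory(buffer, sizes, strides):
    """Model of the proposed op: view a flat buffer with explicit sizes/strides.

    Hypothetical helper for illustration; returns nested lists (a copy).
    """
    if len(sizes) != len(strides):
        raise ValueError("sizes and strides must have the same rank")
    if any(size <= 0 for size in sizes):
        raise ValueError("sizes must be positive")
    if any(stride < 0 for stride in strides):
        raise ValueError("this sketch assumes non-negative strides")
    # Largest offset the view can touch; it must stay inside the buffer.
    largest = sum((size - 1) * stride for size, stride in zip(sizes, strides))
    if largest >= len(buffer):
        raise ValueError("view reaches past the end of the buffer")

    def build(dim, base):
        # Past the last dimension: `base` is the flat offset of one element.
        if dim == len(sizes):
            return buffer[base]
        return [build(dim + 1, base + i * strides[dim])
                for i in range(sizes[dim])]

    return build(0, 0)


raw = list(range(20))

# M x K view of memory stored as K x M (4 x 5): the strides do the transpose.
trans_a = interpret_memory(raw, sizes=[5, 4], strides=[1, 5])

# A 2 x 3 tile in the corner of the 4 x 5 row-major tensor.
tile = interpret_memory(raw, sizes=[2, 3], strides=[5, 1])
```

The second call is the "tile of a larger tensor" case: the sizes shrink while the strides stay those of the full tensor.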

What we change in our code later

We add the function rock::sizesAndStridesFor([pile of transforms], SmallVectorImpl<SmallVector<int64_t>> &sizes, SmallVectorImpl<SmallVector<int64_t>> &strides).

The goal of this is to traverse the transform stack and give you the size(s) and stride(s) of the component dimension(s) of each dimension in the input.

For example, if I have

(%rawA : memref<20xf32>, ...) {
  %matA = reshape %rawA : memref<20xf32> -> memref<4x5xf32>  // K x M
  %transA = transpose %matA ([1, 0]) : memref<4x5xf32> -> memref<5x4xf32> // M x K
  rock.gemm ... = %transA * ...
}

then sizesAndStridesFor(%transA, sizes, strides) would set sizes to [[5], [4]] and strides to [[1], [5]].
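The arithmetic behind that result can be checked with a small Python model. It assumes the reshape yields a row-major (innermost dimension contiguous) layout and that a transpose simply permutes sizes and strides together. The helper names row_major_strides and transpose are made up for this sketch and are not part of any existing API.

```python
def row_major_strides(shape):
    """Strides, in elements, of a contiguous row-major buffer of this shape."""
    strides = [1] * len(shape)
    for i in range(len(shape) - 2, -1, -1):
        strides[i] = strides[i + 1] * shape[i + 1]
    return strides


def transpose(sizes, strides, perm):
    """Permute dimensions; output dimension i is input dimension perm[i]."""
    return [sizes[p] for p in perm], [strides[p] for p in perm]


# %matA: memref<4x5xf32>, K x M, row-major, so strides are [5, 1].
mat_sizes = [4, 5]
mat_strides = row_major_strides(mat_sizes)

# %transA: memref<5x4xf32>, M x K.
trans_sizes, trans_strides = transpose(mat_sizes, mat_strides, [1, 0])

# Each output dimension has exactly one component here, hence the
# single-element inner lists.
sizes = [[s] for s in trans_sizes]
strides = [[s] for s in trans_strides]
```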

As a more complex example, if I had

(%rawI : memref<72xf32>, ...) {
  %dimsI = reshape(%rawI) : memref<72xf32> -> memref<1x2x3x3x4xf32> // NCHWC
  %collapsedI = collapse_shape { [[0], [2], [3], [1, 4]] } (%dimsI) : memref<1x2x3x3x4xf32> -> memref<1x3x3x8xf32> // NHWC
  rock.conv2d(..., %collapsedI, ...)
}

we'd have sizesAndStridesFor(%collapsedI, sizes, strides) producing sizes = [[1], [3], [3], [2, 4]] and strides = [[72], [12], [4], [36, 1]], thus expressing the NCHWC layout.
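The same kind of model covers the NCHWC case. The sketch below assumes a row-major reshape and treats a collapsed dimension as the list of its component (size, stride) pairs, outermost first. The helpers collapse and offset are hypothetical names for illustration only. offset shows how a multi-component dimension would be addressed: the index is split into its components, innermost last.

```python
def row_major_strides(shape):
    """Strides, in elements, of a contiguous row-major buffer of this shape."""
    strides = [1] * len(shape)
    for i in range(len(shape) - 2, -1, -1):
        strides[i] = strides[i + 1] * shape[i + 1]
    return strides


def collapse(sizes, strides, groups):
    """Group source dimensions; each output dim keeps its components."""
    out_sizes = [[sizes[d] for d in group] for group in groups]
    out_strides = [[strides[d] for d in group] for group in groups]
    return out_sizes, out_strides


def offset(index, sizes, strides):
    """Flat offset of one element of the collapsed view."""
    total = 0
    for i, comp_sizes, comp_strides in zip(index, sizes, strides):
        # Peel components off from the innermost one outwards.
        for size, stride in zip(reversed(comp_sizes), reversed(comp_strides)):
            total += (i % size) * stride
            i //= size
    return total


# %dimsI: memref<1x2x3x3x4xf32>, laid out as N, C (outer), H, W, C (inner).
dims_sizes = [1, 2, 3, 3, 4]
dims_strides = row_major_strides(dims_sizes)

# %collapsedI: NHWC, with C made of source dimensions 1 and 4.
sizes, strides = collapse(dims_sizes, dims_strides, [[0], [2], [3], [1, 4]])

# Element (n=0, h=1, w=2, c=5): c splits into outer 1 and inner 1.
example = offset((0, 1, 2, 5), sizes, strides)
```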

We would use this information when generating problem config strings (see below) and when making the arbitrary decisions of how to construct gemm{M,N,K} during conv-to-gemm.

Problem configs

We would still support the old form (stuff like -in_layout nchw) but translate it to the new form quickly on contact, and we might decide we want to deprecate it.
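As a sketch of what "translate on contact" could look like, the Python below turns a layout string plus per-dimension sizes into a (size, stride) pair for each dimension. It assumes the layout string lists dimensions outermost first and that the memory is densely packed. The function name layout_to_sizes_and_strides and the dictionary-based interface are invented for this example, and the group dimension is left out to keep it short.

```python
def layout_to_sizes_and_strides(layout, dim_sizes):
    """Map each dimension letter to (size, stride) for a packed layout.

    layout: dimension letters, outermost first, e.g. "nchw".
    dim_sizes: size of each dimension, keyed by letter.
    """
    if len(set(layout)) != len(layout):
        raise ValueError("layout repeats a dimension")
    if set(layout) != set(dim_sizes):
        raise ValueError("layout and dim_sizes name different dimensions")
    result = {}
    stride = 1
    # Walk from the innermost dimension outwards, accumulating the stride.
    for name in reversed(layout):
        result[name] = (dim_sizes[name], stride)
        stride *= dim_sizes[name]
    return result


dims = {"n": 1, "c": 8, "h": 3, "w": 3}
nchw = layout_to_sizes_and_strides("nchw", dims)
nhwc = layout_to_sizes_and_strides("nhwc", dims)
```

Both results describe the same logical tensor; only the strides differ, which is the information the layout string used to carry.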

Instead, in problem keys, we'll have (using gemm as an example) keys such as -a_m_size, -a_m_stride, -b_n_size, -b_n_stride and so on.

In the simple case where there aren't discontinuities, we'd have (using our first gemm example) keys like ... -a_m_size 5 -a_m_stride 1 -a_k_size 4 -a_k_stride 5 ...

For more complex cases, like my second example, we'd have problem key entries like -in_c_size [2, 4] -in_c_stride [36, 1] (or we could drop the brackets).
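Generating those keys from the sizes and strides lists could look roughly like the Python below. The exact spelling is still open, so this sketch makes assumptions: single-component dimensions print as a bare number, multi-component ones as a bracketed, comma-separated list, and the helper names problem_keys and format_components are hypothetical.

```python
def format_components(values):
    """Bare number for one component, bracketed list for several."""
    if len(values) == 1:
        return str(values[0])
    return "[" + ", ".join(str(v) for v in values) + "]"


def problem_keys(operand, dims, sizes, strides):
    """Build the size/stride part of a problem key for one operand.

    operand: key prefix such as "a" or "in".
    dims: dimension names, in the same order as sizes and strides.
    """
    if not (len(dims) == len(sizes) == len(strides)):
        raise ValueError("dims, sizes and strides must have the same length")
    parts = []
    for dim, dim_sizes, dim_strides in zip(dims, sizes, strides):
        parts += [f"-{operand}_{dim}_size", format_components(dim_sizes),
                  f"-{operand}_{dim}_stride", format_components(dim_strides)]
    return " ".join(parts)


# First gemm example: matrix A viewed as M x K.
gemm_a = problem_keys("a", ["m", "k"], [[5], [4]], [[1], [5]])

# Second example: the C dimension of the NCHWC input.
conv_in_c = problem_keys("in", ["c"], [[2, 4]], [[36, 1]])
```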

This'd give us much more generality while simplifying our input format.

This is all a very high-level rough sketch of what I'm thinking, please feel free to lob clarifying questions at it.
