Merge branch 'matcopy_batch_pr' into omatadd_batch_pr
OuadiElfarouki committed Aug 15, 2023
2 parents beb08e9 + f20b991 commit 2278a5f
Showing 2 changed files with 8 additions and 4 deletions.
6 changes: 5 additions & 1 deletion include/operations/extension/transpose.h
@@ -161,7 +161,11 @@ make_transpose(in_t &A, index_t inc_a, index_t &stride_a, out_t &At,
  * while remaining customizable Tiling-size wise.
  *
  * @tparam both_trans Whether both A & B matrices are transposed (or just the
- * first one)
+ * first one A). In fact, this kernel is implemented in such a way that if
+ * only one matrix is transposed in the OmatAdd operator, it should be placed
+ * as the first operand to this kernel (A & alpha). This reduces the original 4
+ * possible combinations of A and B transpose cases into only two cases depicted
+ * by the template parameter both_trans.
  * @tparam Tile_size Tiling size used explicitly in the local memory kernel, and
  * used to compute work-group size in the non-local memory case.
  * @tparam wg_size work group size
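The operand-ordering rule described in the `both_trans` comment can be sketched on the host as follows. This is a minimal illustration, not the library's implementation: `transpose_add_ref`, `omatadd_ref`, and `matrix_t` are hypothetical names, matrices are assumed dense and column-major with no leading-dimension padding, and the no-transpose case is handled by a plain element-wise add because it falls outside the transpose kernel.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

using matrix_t = std::vector<float>;  // dense, column-major storage (assumption)

// Hypothetical stand-in for the kernel. For an m x n result C it computes
//   both_trans == true :  C = alpha * A^T + beta * B^T
//   both_trans == false:  C = alpha * A^T + beta * B
// The first operand is always the transposed one and is stored as n x m.
template <bool both_trans>
void transpose_add_ref(std::size_t m, std::size_t n, float alpha,
                       const matrix_t& a, float beta, const matrix_t& b,
                       matrix_t& c) {
  for (std::size_t j = 0; j < n; ++j) {
    for (std::size_t i = 0; i < m; ++i) {
      const float a_t = a[j + i * n];  // A^T(i, j) == A(j, i)
      const float b_v = both_trans ? b[j + i * n] : b[i + j * m];
      c[i + j * m] = alpha * a_t + beta * b_v;
    }
  }
}

// Hypothetical dispatcher: folds the four (trans_a, trans_b) combinations
// into the two kernel variants by swapping operands when only B is transposed.
void omatadd_ref(bool trans_a, bool trans_b, std::size_t m, std::size_t n,
                 float alpha, const matrix_t& a, float beta, const matrix_t& b,
                 matrix_t& c) {
  assert(a.size() == m * n && b.size() == m * n && c.size() == m * n);
  if (trans_a && trans_b) {
    transpose_add_ref<true>(m, n, alpha, a, beta, b, c);
  } else if (trans_a) {
    transpose_add_ref<false>(m, n, alpha, a, beta, b, c);
  } else if (trans_b) {
    // Only B is transposed: pass (B, beta) first so the transposed matrix
    // is the kernel's first operand.
    transpose_add_ref<false>(m, n, beta, b, alpha, a, c);
  } else {
    // Neither is transposed: no transpose kernel is needed.
    for (std::size_t k = 0; k < m * n; ++k) c[k] = alpha * a[k] + beta * b[k];
  }
}
```

Because the addition is commutative, swapping the `(matrix, scalar)` pairs leaves the result unchanged, which is why two template instantiations are enough to cover every case that involves a transpose.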
6 changes: 3 additions & 3 deletions src/operations/extension/transpose.hpp
@@ -445,9 +445,9 @@ TransposeAdd<both_trans, Tile_size, wg_size, cl_size, local_memory, in1_t,
  cl::sycl::nd_item<1> id) {
  value_t *local = local_mem.localAcc.get_pointer();

- auto A = A_.get_data().get_pointer();
- auto B = B_.get_data().get_pointer();
- auto C = C_.get_data().get_pointer();
+ auto A = A_.get_pointer();
+ auto B = B_.get_pointer();
+ auto C = C_.get_pointer();

  index_t in_a_idx, in_b_idx, in_local_id, out_idx, out_local_id;
  index_t i_block_start, j_block_start;