
tileConsumerAndFuseProducersUsingScf stuck in an infinite loop #18875

Closed
pashu123 opened this issue Oct 23, 2024 · 5 comments
Labels: bug 🐞 Something isn't working

@pashu123 (Contributor)

What happened?

The pass calls tileConsumerAndFuseProducersUsingScf, which gets stuck in the while loop here: https://github.com/llvm/llvm-project/blob/ac5a2010ad35a72de3e75a1883e2495345b92a73/mlir/lib/Dialect/SCF/Transforms/TileUsingInterface.cpp#L1482
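
For context, tileConsumerAndFuseProducersUsingScf drives fusion with a worklist of candidate tensor.extract_slice ops: it pops a candidate, tries to fuse the producer feeding it, and pushes any newly created slice candidates back onto the queue. The self-contained C++ sketch below is not the actual MLIR code (Candidate and tryFuse are hypothetical stand-ins); it only illustrates how such a while loop fails to terminate when fusing a candidate re-enqueues an identical candidate instead of making progress, which is a plausible mechanism for the hang seen with the tensor.unpack producer here, not a confirmed diagnosis.

  // Simplified model of a worklist-driven fusion loop. NOT the real
  // tileConsumerAndFuseProducersUsingScf; all names are illustrative only.
  #include <deque>
  #include <iostream>
  #include <optional>
  #include <string>

  struct Candidate {
    std::string producer; // op that produces the slice being considered
  };

  // Stand-in for the fusion step: on success, returns the new candidates
  // exposed by fusing the producer; on failure, returns std::nullopt.
  std::optional<std::deque<Candidate>> tryFuse(const Candidate &c) {
    // Model a producer whose "fusion" re-materializes the very candidate
    // it came from, so no forward progress is ever made.
    if (c.producer == "tensor.unpack")
      return std::deque<Candidate>{{"tensor.unpack"}};
    return std::nullopt;
  }

  int main() {
    std::deque<Candidate> worklist{{"tensor.unpack"}};
    size_t iterations = 0;
    while (!worklist.empty()) { // the loop that never drains
      Candidate c = worklist.front();
      worklist.pop_front();
      if (auto next = tryFuse(c))
        for (const Candidate &n : *next)
          worklist.push_back(n); // identical candidate goes right back in
      if (++iterations > 10) {   // guard so this demo terminates
        std::cout << "stuck: same candidate re-enqueued forever\n";
        break;
      }
    }
  }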

Steps to reproduce your issue

Input IR:

  func.func @time_out_dispatch_0_unpack_elementwise_1x1x1152_f32() attributes {translation_info = #iree_codegen.translation_info<CPUDoubleTilingExpert>} {
    %c0 = arith.constant 0 : index
    %0 = hal.interface.binding.subspan layout(<bindings = [#hal.pipeline.binding<storage_buffer, "ReadOnly|Indirect">, #hal.pipeline.binding<storage_buffer, "ReadOnly|Indirect">, #hal.pipeline.binding<storage_buffer, Indirect>], flags = Indirect>) binding(0) alignment(64) offset(%c0) flags("ReadOnly|Indirect") : !flow.dispatch.tensor<readonly:tensor<1x1x288x8x4xf32>>
    %1 = hal.interface.binding.subspan layout(<bindings = [#hal.pipeline.binding<storage_buffer, "ReadOnly|Indirect">, #hal.pipeline.binding<storage_buffer, "ReadOnly|Indirect">, #hal.pipeline.binding<storage_buffer, Indirect>], flags = Indirect>) binding(1) alignment(64) offset(%c0) flags("ReadOnly|Indirect") : !flow.dispatch.tensor<readonly:tensor<1x1x1152xf32>>
    %2 = hal.interface.binding.subspan layout(<bindings = [#hal.pipeline.binding<storage_buffer, "ReadOnly|Indirect">, #hal.pipeline.binding<storage_buffer, "ReadOnly|Indirect">, #hal.pipeline.binding<storage_buffer, Indirect>], flags = Indirect>) binding(2) alignment(64) offset(%c0) flags(Indirect) : !flow.dispatch.tensor<writeonly:tensor<1x1x1152xf32>>
    %3 = flow.dispatch.tensor.load %0, offsets = [0, 0, 0, 0, 0], sizes = [1, 1, 288, 8, 4], strides = [1, 1, 1, 1, 1] : !flow.dispatch.tensor<readonly:tensor<1x1x288x8x4xf32>> -> tensor<1x1x288x8x4xf32>
    %4 = flow.dispatch.tensor.load %1, offsets = [0, 0, 0], sizes = [1, 1, 1152], strides = [1, 1, 1] : !flow.dispatch.tensor<readonly:tensor<1x1x1152xf32>> -> tensor<1x1x1152xf32>
    %5 = tensor.empty() : tensor<1x1x1152xf32>
    %unpack = tensor.unpack %3 outer_dims_perm = [0, 1, 2] inner_dims_pos = [1, 2] inner_tiles = [8, 4] into %5 {lowering_config = #iree_codegen.lowering_config<tile_sizes = [[0, 0, 1152], [1, 8, 16], [0, 0, 0], [0, 0, 0]]>} : tensor<1x1x288x8x4xf32> -> tensor<1x1x1152xf32>
    %6 = linalg.generic {indexing_maps = [affine_map<(d0, d1, d2) -> (d0, d1, d2)>, affine_map<(d0, d1, d2) -> (d0, d1, d2)>, affine_map<(d0, d1, d2) -> (d0, d1, d2)>], iterator_types = ["parallel", "parallel", "parallel"]} ins(%4, %unpack : tensor<1x1x1152xf32>, tensor<1x1x1152xf32>) outs(%5 : tensor<1x1x1152xf32>) attrs =  {lowering_config = #iree_codegen.lowering_config<tile_sizes = [[0, 0, 1152], [1, 8, 16], [0, 0, 0], [0, 0, 0]]>} {
    ^bb0(%in: f32, %in_0: f32, %out: f32):
      %7 = arith.addf %in, %in_0 : f32
      linalg.yield %7 : f32
    } -> tensor<1x1x1152xf32>
    flow.dispatch.tensor.store %6, %2, offsets = [0, 0, 0], sizes = [1, 1, 1152], strides = [1, 1, 1] : tensor<1x1x1152xf32> -> !flow.dispatch.tensor<writeonly:tensor<1x1x1152xf32>>
    return
  }

Pass:
iree-opt --pass-pipeline="builtin.module(func.func(iree-codegen-tile-and-distribute-to-workgroups-using-forall-op, cse))" --mlir-print-local-scope --split-input-file input.mlir

What component(s) does this issue relate to?

No response

Version information

No response

Additional context

No response

@pashu123 (Contributor, Author)

With the tensor.unpack rewritten to have explicit slicing semantics (the unpack writes the full 1x8x1152 tile and a separate tensor.extract_slice extracts the 1x1x1152 result), the pipeline completes:

  func.func @time_out(%arg0: tensor<1x1x288x8x4xf32>, %arg1: tensor<1152xf32>) -> tensor<1x1x1152xf32> {
    %0 = tensor.empty() : tensor<1x1x1152xf32>
    %1 = tensor.empty() : tensor<1x8x1152xf32>
    %unpack = tensor.unpack %arg0 outer_dims_perm = [0, 1, 2] inner_dims_pos = [1, 2] inner_tiles = [8, 4] into %1 {lowering_config = #iree_codegen.lowering_config<tile_sizes = [[0, 0, 1152], [1, 8, 16], [0, 0, 0], [0, 0, 0]]>} : tensor<1x1x288x8x4xf32> -> tensor<1x8x1152xf32>
    %extracted_slice = tensor.extract_slice %unpack[0, 0, 0] [1, 1, 1152] [1, 1, 1] : tensor<1x8x1152xf32> to tensor<1x1x1152xf32>
    %2 = linalg.generic {indexing_maps = [affine_map<(d0, d1, d2) -> (d2)>, affine_map<(d0, d1, d2) -> (d0, d1, d2)>, affine_map<(d0, d1, d2) -> (d0, d1, d2)>], iterator_types = ["parallel", "parallel", "parallel"]} ins(%arg1, %extracted_slice : tensor<1152xf32>, tensor<1x1x1152xf32>) outs(%0 : tensor<1x1x1152xf32>) {
    ^bb0(%in: f32, %in_0: f32, %out: f32):
      %3 = arith.addf %in, %in_0 : f32
      linalg.yield %3 : f32
    } -> tensor<1x1x1152xf32>
    return %2 : tensor<1x1x1152xf32>
  }

@Max191 (Contributor) commented Oct 24, 2024

I opened a PR upstream which fixes this: llvm/llvm-project#113571

@pashu123 (Contributor, Author)

> I opened a PR upstream which fixes this: llvm/llvm-project#113571

Thanks, @Max191, for the fix!

@pashu123 (Contributor, Author)

Closing this; the fix is merged.

@Max191 (Contributor) commented Oct 25, 2024

> Closing this; the fix is merged.

Sounds good. Note that it will not be fixed in IREE until #18897 lands.
