
[CPU][ONNX] Onnx test failures after pulling in torch-mlir changes #18961

Open
Max191 opened this issue Oct 31, 2024 · 2 comments
Comments

Max191 (Contributor) commented Oct 31, 2024

There were 3 new test failures after pulling in a torch-mlir patch: llvm/torch-mlir@55ff110

The following tests failed:

// RUN: iree-compile /tmp/test.mlir --iree-hal-target-backends=llvm-cpu --iree-llvmcpu-target-cpu=generic --iree-input-demote-f64-to-f32=false --mlir-disable-threading --mlir-print-ir-after-all -o /tmp/out.vmfb &> /tmp/dump.mlir

module {
  func.func @test_tfidfvectorizer_tf_batch_onlybigrams_skip0(%arg0: !torch.vtensor<[2,6],si32>) -> !torch.vtensor<[2,7],f32> attributes {torch.onnx_meta.ir_version = 4 : si64, torch.onnx_meta.opset_version = 17 : si64, torch.onnx_meta.producer_name = "backend-test", torch.onnx_meta.producer_version = ""} {
    %none = torch.constant.none
    %0 = torch.operator "onnx.TfIdfVectorizer"(%arg0) {torch.onnx.max_gram_length = 2 : si64, torch.onnx.max_skip_count = 0 : si64, torch.onnx.min_gram_length = 2 : si64, torch.onnx.mode = "TF", torch.onnx.ngram_counts = [0 : si64, 4 : si64], torch.onnx.ngram_indexes = [0 : si64, 1 : si64, 2 : si64, 3 : si64, 4 : si64, 5 : si64, 6 : si64], torch.onnx.pool_int64s = [2 : si64, 3 : si64, 5 : si64, 4 : si64, 5 : si64, 6 : si64, 7 : si64, 8 : si64, 6 : si64, 7 : si64]} : (!torch.vtensor<[2,6],si32>) -> !torch.vtensor<[2,7],f32> 
    return %0 : !torch.vtensor<[2,7],f32>
  }
}

// -----

// RUN: iree-compile /tmp/test.mlir --iree-hal-target-backends=llvm-cpu --iree-llvmcpu-target-cpu=generic --iree-input-demote-f64-to-f32=false --mlir-disable-threading --mlir-print-ir-after-all -o /tmp/out.vmfb &> /tmp/dump.mlir

module {
  func.func @test_tfidfvectorizer_tf_batch_onlybigrams_skip5(%arg0: !torch.vtensor<[2,6],si32>) -> !torch.vtensor<[2,7],f32> attributes {torch.onnx_meta.ir_version = 4 : si64, torch.onnx_meta.opset_version = 17 : si64, torch.onnx_meta.producer_name = "backend-test", torch.onnx_meta.producer_version = ""} {
    %none = torch.constant.none
    %0 = torch.operator "onnx.TfIdfVectorizer"(%arg0) {torch.onnx.max_gram_length = 2 : si64, torch.onnx.max_skip_count = 5 : si64, torch.onnx.min_gram_length = 2 : si64, torch.onnx.mode = "TF", torch.onnx.ngram_counts = [0 : si64, 4 : si64], torch.onnx.ngram_indexes = [0 : si64, 1 : si64, 2 : si64, 3 : si64, 4 : si64, 5 : si64, 6 : si64], torch.onnx.pool_int64s = [2 : si64, 3 : si64, 5 : si64, 4 : si64, 5 : si64, 6 : si64, 7 : si64, 8 : si64, 6 : si64, 7 : si64]} : (!torch.vtensor<[2,6],si32>) -> !torch.vtensor<[2,7],f32> 
    return %0 : !torch.vtensor<[2,7],f32>
  }
}

// -----

// RUN: iree-compile /tmp/test.mlir --iree-hal-target-backends=llvm-cpu --iree-llvmcpu-target-cpu=generic --iree-input-demote-f64-to-f32=false --mlir-disable-threading --mlir-print-ir-after-all -o /tmp/out.vmfb &> /tmp/dump.mlir

module {
  func.func @test_tfidfvectorizer_tf_batch_uniandbigrams_skip5(%arg0: !torch.vtensor<[2,6],si32>) -> !torch.vtensor<[2,7],f32> attributes {torch.onnx_meta.ir_version = 4 : si64, torch.onnx_meta.opset_version = 17 : si64, torch.onnx_meta.producer_name = "backend-test", torch.onnx_meta.producer_version = ""} {
    %none = torch.constant.none
    %0 = torch.operator "onnx.TfIdfVectorizer"(%arg0) {torch.onnx.max_gram_length = 2 : si64, torch.onnx.max_skip_count = 5 : si64, torch.onnx.min_gram_length = 1 : si64, torch.onnx.mode = "TF", torch.onnx.ngram_counts = [0 : si64, 4 : si64], torch.onnx.ngram_indexes = [0 : si64, 1 : si64, 2 : si64, 3 : si64, 4 : si64, 5 : si64, 6 : si64], torch.onnx.pool_int64s = [2 : si64, 3 : si64, 5 : si64, 4 : si64, 5 : si64, 6 : si64, 7 : si64, 8 : si64, 6 : si64, 7 : si64]} : (!torch.vtensor<[2,6],si32>) -> !torch.vtensor<[2,7],f32> 
    return %0 : !torch.vtensor<[2,7],f32>
  }
}
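For context on what these tests compute: they all exercise ONNX `TfIdfVectorizer` in `"TF"` mode, where `pool_int64s` holds the n-gram pool (segmented by `ngram_counts`) and `ngram_indexes` maps each pooled n-gram to an output coordinate. Below is a rough Python sketch of the counting semantics based on my reading of the ONNX operator spec; it is not IREE or torch-mlir code, and the function name and signature are made up:

```python
def tfidf_vectorizer_tf(rows, min_gram_length, max_gram_length, max_skip_count,
                        ngram_counts, ngram_indexes, pool_int64s, output_size):
    """Sketch of ONNX TfIdfVectorizer "TF" mode for 2-D integer input."""
    # Map each pooled n-gram (as a tuple) to its output coordinate.
    # ngram_counts[i] is the start offset of the (i+1)-grams in pool_int64s.
    ngram_to_out = {}
    k = 0  # running position into ngram_indexes, in pool order
    for i, start in enumerate(ngram_counts):
        n = i + 1
        end = ngram_counts[i + 1] if i + 1 < len(ngram_counts) else len(pool_int64s)
        for j in range(start, end, n):
            ngram_to_out[tuple(pool_int64s[j:j + n])] = ngram_indexes[k]
            k += 1

    out = [[0.0] * output_size for _ in rows]
    for r, row in enumerate(rows):
        for n in range(min_gram_length, max_gram_length + 1):
            # Unigrams are never skipped; longer n-grams are gathered with
            # strides 1 .. max_skip_count + 1 and every stride's matches count.
            skips = [0] if n == 1 else range(max_skip_count + 1)
            for s in skips:
                stride = s + 1
                for i in range(len(row)):
                    last = i + (n - 1) * stride
                    if last >= len(row):
                        break
                    gram = tuple(row[i + d * stride] for d in range(n))
                    if gram in ngram_to_out:
                        out[r][ngram_to_out[gram]] += 1.0
    return out
```

With the attributes from the tests above (`ngram_counts = [0, 4]`, seven output slots), the pool splits into four unigrams `[2, 3, 5, 4]` and three bigrams `(5,6)`, `(7,8)`, `(6,7)`.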

The patch is reverted in IREE for now, so to reproduce the failures, use this branch that has the patch reapplied: https://github.com/Max191/iree/tree/onnx-cpu-unrolling-fail

Max191 added a commit that referenced this issue Oct 31, 2024

- bump llvm-project to llvm/llvm-project@f1595ec
- revert iree-org/llvm-project@1004865 due to compiler failures in VectorDistribute. Tracked in #18955.
- bump stablehlo to openxla/stablehlo@c32f7c2
- bump torch-mlir to llvm/torch-mlir@8b0bf2e
- revert llvm/torch-mlir@55ff110 due to new onnx failures. Tracked in #18961

---------

Signed-off-by: Max Dawkins <[email protected]>
zjgarvey (Contributor) commented Nov 1, 2024

For context, I actually added this patch as a means to resolve an issue when unrolling loops for other tests; see #18867 (comment).

In my opinion, it seems bad to indiscriminately unroll loops in the IR, so something actually needs to be addressed in the test examples. Why do the batched examples of TFIDF fail, but the unbatched ones pass?

It seems like IREE fails to handle arith.sitofp : i64 -> f64 when the input is the result of an scf.for loop, but not when those loops are unrolled?
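A minimal reduction of that pattern might look like the following (hypothetical IR written for illustration, not extracted from the failing tests): a scalar `scf.for` accumulation whose result feeds `arith.sitofp`.

```mlir
func.func @loop_then_sitofp(%arg0: tensor<6xi64>) -> f64 {
  %c0 = arith.constant 0 : index
  %c1 = arith.constant 1 : index
  %c6 = arith.constant 6 : index
  %init = arith.constant 0 : i64
  // Scalar accumulation loop; per the observation above, the downstream
  // sitofp compiles once this loop is unrolled, but not with the loop intact.
  %sum = scf.for %i = %c0 to %c6 step %c1 iter_args(%acc = %init) -> (i64) {
    %elem = tensor.extract %arg0[%i] : tensor<6xi64>
    %next = arith.addi %acc, %elem : i64
    scf.yield %next : i64
  }
  %res = arith.sitofp %sum : i64 to f64
  return %res : f64
}
```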

For operations like this, I don't expect the result to be performant, but I just don't know what the constraints are from the IREE side. Do we straight up disallow any scalar scf loops?

zjgarvey (Contributor) commented Nov 2, 2024

Maybe this comment from Ben is related: #18268 (comment)
