[InstCombine] Fold zext(X) + C2 pred C -> X + C3 pred C4 #110511

dtcxzyw · 2024-09-30T13:46:43Z

Motivating case from https://github.com/torvalds/linux/blob/9852d85ec9d492ebef56dc5f229416c925758edc/drivers/gpu/drm/drm_edid.c#L5238-L5240:

define i1 @src(i8 noundef %v13) {
entry:
  %conv1 = zext i8 %v13 to i32
  %add = add nsw i32 %conv1, -4
  %cmp = icmp ult i32 %add, 3
  %cmp4 = icmp slt i8 %v13, 4
  %cond = select i1 %cmp4, i1 true, i1 %cmp
  ret i1 %cond
}

define i1 @tgt(i8 noundef %v13) {
entry:
  %cmp4 = icmp slt i8 %v13, 7
  ret i1 %cmp4
}
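
The equivalence of `@src` and `@tgt` can be sanity-checked by brute force over all 256 `i8` values. The following is only a sketch in plain C++, not an Alive2 proof: the helper names (`src_model`, `tgt_model`, `count_src_tgt_mismatches`) are hypothetical, `noundef`/poison semantics are ignored, and the signed reinterpretation via `static_cast<int8_t>` assumes a two's-complement target.

```cpp
#include <cassert>
#include <cstdint>

// Model of @src: zext to i32, add -4, unsigned compare against 3,
// combined with a signed i8 compare through a select.
bool src_model(uint8_t v) {
  uint32_t conv = v;                            // zext i8 %v13 to i32
  uint32_t add = conv - 4u;                     // add i32 %conv1, -4 (modulo 2^32)
  bool cmp = add < 3u;                          // icmp ult i32 %add, 3
  bool cmp4 = static_cast<int8_t>(v) < 4;       // icmp slt i8 %v13, 4
  return cmp4 ? true : cmp;                     // select i1 %cmp4, i1 true, i1 %cmp
}

// Model of @tgt: a single signed i8 compare.
bool tgt_model(uint8_t v) {
  return static_cast<int8_t>(v) < 7;            // icmp slt i8 %v13, 7
}

// Counts the i8 inputs on which the two models disagree.
int count_src_tgt_mismatches() {
  int mismatches = 0;
  for (int i = 0; i < 256; ++i) {
    uint8_t v = static_cast<uint8_t>(i);
    if (src_model(v) != tgt_model(v))
      ++mismatches;
  }
  return mismatches;
}

// Convenience wrapper: aborts if any input disagrees.
void verify_motivating_case() {
  assert(count_src_tgt_mismatches() == 0);
}
```

Calling `verify_motivating_case()` from a test driver should pass silently if the fold is sound for this input pattern.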

goldsteinn · 2024-10-01T02:04:38Z

llvm/lib/Transforms/InstCombine/InstCombineCompares.cpp

@@ -3165,6 +3165,26 @@ Instruction *InstCombinerImpl::foldICmpAddConstant(ICmpInst &Cmp,
                        Builder.CreateAdd(X, ConstantInt::get(Ty, *C2 - C - 1)),
                        ConstantInt::get(Ty, ~C));

+  // zext(V) + C2 <u C -> V + trunc(C2) <u trunc(C) iff C2 s<0 && C s>0


I think the 'iff ...' is pretty misleading, given that you also need 'C2' and 'C' to fit in the new bitwidth.
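
To illustrate the point about the constants needing to fit, here is a small brute-force model of the `u<` form from the comment above, for an `i8` source and an `i32` compare. It is a sketch with hypothetical helper names (`wide_ult`, `narrow_ult_truncated`, `count_truncation_mismatches`), not code from the patch. It simply truncates both constants and counts the inputs where the narrowed compare disagrees with the original.

```cpp
#include <cassert>
#include <cstdint>

// Original compare: zext(x) + c2 u< c, evaluated in 32 bits.
bool wide_ult(uint8_t x, int32_t c2, int32_t c) {
  uint32_t add = static_cast<uint32_t>(x) + static_cast<uint32_t>(c2);
  return add < static_cast<uint32_t>(c);
}

// Naively narrowed compare: x + trunc(c2) u< trunc(c), evaluated in 8 bits.
bool narrow_ult_truncated(uint8_t x, int32_t c2, int32_t c) {
  uint8_t add = static_cast<uint8_t>(x + static_cast<uint8_t>(c2));
  return add < static_cast<uint8_t>(c);
}

// Number of i8 inputs on which the two compares give different answers.
int count_truncation_mismatches(int32_t c2, int32_t c) {
  int mismatches = 0;
  for (int i = 0; i < 256; ++i) {
    uint8_t x = static_cast<uint8_t>(i);
    if (wide_ult(x, c2, c) != narrow_ult_truncated(x, c2, c))
      ++mismatches;
  }
  return mismatches;
}

// Convenience wrapper covering one valid and one invalid choice of constants.
void verify_truncation_examples() {
  assert(count_truncation_mismatches(-4, 3) == 0);    // both constants fit in i8
  assert(count_truncation_mismatches(-300, 3) != 0);  // C2 does not fit in i8
}
```

With `C2 = -4, C = 3` the two forms agree on every input. With `C2 = -300, C = 3` (still `C2 s< 0` and `C s> 0`) the wide compare is never true, but the truncated one is true for `x` in 44..46. With `C2 = -4, C = 259` they disagree on 249 inputs.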

dtcxzyw · 2024-10-06T07:36:23Z

llvm/lib/Transforms/InstCombine/InstCombineCompares.cpp

+    unsigned CmpBW = Ty->getScalarSizeInBits();
+    unsigned NewCmpBW = NewCmpTy->getScalarSizeInBits();
+    if (shouldChangeType(Ty, NewCmpTy)) {
+      if (auto ZExtCR = CR.exactIntersectWith(ConstantRange(


We need a new ConstantRange API to convert ranges for zext(V) to ranges for V.
For example, we can convert zext(i8 X to i32) - 255 u< -4 to X + 1 u< -4, but the current implementation cannot achieve this.
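
The example can be checked exhaustively. The sketch below (hypothetical helper names, plain C++ rather than LLVM APIs) confirms that both forms accept exactly the same `i8` values. The accepted set works out to 0..250 together with 255. Within `i8` that is a single wrapped range, but within the zero-extended range [0, 256) of `i32` it is two disjoint pieces, which is presumably why an exact intersection is not available there.

```cpp
#include <cassert>
#include <cstdint>

// zext(i8 X to i32) - 255 u< -4, evaluated in 32 bits.
bool wide_check(uint8_t x) {
  uint32_t add = static_cast<uint32_t>(x) - 255u;
  return add < static_cast<uint32_t>(-4);   // 0xFFFFFFFC
}

// X + 1 u< -4, evaluated in 8 bits.
bool narrow_check(uint8_t x) {
  uint8_t add = static_cast<uint8_t>(x + 1);
  return add < static_cast<uint8_t>(-4);    // 252
}

// Number of i8 inputs on which the two forms disagree.
int count_wrapped_example_mismatches() {
  int mismatches = 0;
  for (int i = 0; i < 256; ++i) {
    uint8_t x = static_cast<uint8_t>(i);
    if (wide_check(x) != narrow_check(x))
      ++mismatches;
  }
  return mismatches;
}

// Number of i8 inputs accepted by the wide form.
int count_wide_accepted() {
  int accepted = 0;
  for (int i = 0; i < 256; ++i)
    if (wide_check(static_cast<uint8_t>(i)))
      ++accepted;
  return accepted;
}

// Convenience wrapper: aborts if the two forms ever disagree.
void verify_wrapped_example() {
  assert(count_wrapped_example_mismatches() == 0);
}
```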

goldsteinn · 2024-10-07T18:25:33Z

Is this intentionally still in draft form? Or are you ready for review?

llvmbot · 2024-10-08T00:57:13Z

@llvm/pr-subscribers-llvm-transforms

Author: Yingwei Zheng (dtcxzyw)

Changes

Motivating case from https://github.com/torvalds/linux/blob/9852d85ec9d492ebef56dc5f229416c925758edc/drivers/gpu/drm/drm_edid.c#L5238-L5240:

define i1 @src(i8 noundef %v13) {
entry:
  %conv1 = zext i8 %v13 to i32
  %add = add nsw i32 %conv1, -4
  %cmp = icmp ult i32 %add, 3
  %cmp4 = icmp slt i8 %v13, 4
  %cond = select i1 %cmp4, i1 true, i1 %cmp
  ret i1 %cond
}

define i1 @tgt(i8 noundef %v13) {
entry:
  %cmp4 = icmp slt i8 %v13, 7
  ret i1 %cmp4
}

Full diff: https://github.com/llvm/llvm-project/pull/110511.diff

2 Files Affected:

(modified) llvm/lib/Transforms/InstCombine/InstCombineCompares.cpp (+25)
(modified) llvm/test/Transforms/InstCombine/icmp-add.ll (+91)

diff --git a/llvm/lib/Transforms/InstCombine/InstCombineCompares.cpp b/llvm/lib/Transforms/InstCombine/InstCombineCompares.cpp
index 6c3fc987d9add2..0fbf446480d0dc 100644
--- a/llvm/lib/Transforms/InstCombine/InstCombineCompares.cpp
+++ b/llvm/lib/Transforms/InstCombine/InstCombineCompares.cpp
@@ -3168,6 +3168,31 @@ Instruction *InstCombinerImpl::foldICmpAddConstant(ICmpInst &Cmp,
                         Builder.CreateAdd(X, ConstantInt::get(Ty, *C2 - C - 1)),
                         ConstantInt::get(Ty, ~C));
 
+  // zext(V) + C2 pred C -> V + C3 pred' C4
+  Value *V;
+  if (match(X, m_ZExt(m_Value(V)))) {
+    Type *NewCmpTy = V->getType();
+    unsigned CmpBW = Ty->getScalarSizeInBits();
+    unsigned NewCmpBW = NewCmpTy->getScalarSizeInBits();
+    if (shouldChangeType(Ty, NewCmpTy)) {
+      if (auto ZExtCR = CR.exactIntersectWith(ConstantRange(
+              APInt::getZero(CmpBW), APInt::getOneBitSet(CmpBW, NewCmpBW)))) {
+        ConstantRange SrcCR = ZExtCR->truncate(NewCmpBW);
+        CmpInst::Predicate EquivPred;
+        APInt EquivInt;
+        APInt EquivOffset;
+
+        SrcCR.getEquivalentICmp(EquivPred, EquivInt, EquivOffset);
+        return new ICmpInst(
+            EquivPred,
+            EquivOffset.isZero()
+                ? V
+                : Builder.CreateAdd(V, ConstantInt::get(NewCmpTy, EquivOffset)),
+            ConstantInt::get(NewCmpTy, EquivInt));
+      }
+    }
+  }
+
   return nullptr;
 }
 
diff --git a/llvm/test/Transforms/InstCombine/icmp-add.ll b/llvm/test/Transforms/InstCombine/icmp-add.ll
index 0c141d4b8e73aa..2239e48468ee04 100644
--- a/llvm/test/Transforms/InstCombine/icmp-add.ll
+++ b/llvm/test/Transforms/InstCombine/icmp-add.ll
@@ -3183,3 +3183,94 @@ define i1 @icmp_of_ucmp_plus_const_with_const(i32 %x, i32 %y) {
   %cmp2 = icmp ult i8 %add, 2
   ret i1 %cmp2
 }
+
+define i1 @zext_range_check_ult(i8 %x) {
+; CHECK-LABEL: @zext_range_check_ult(
+; CHECK-NEXT:  entry:
+; CHECK-NEXT:    [[TMP0:%.*]] = add i8 [[X:%.*]], -4
+; CHECK-NEXT:    [[CMP:%.*]] = icmp ult i8 [[TMP0]], 3
+; CHECK-NEXT:    ret i1 [[CMP]]
+;
+entry:
+  %conv = zext i8 %x to i32
+  %add = add i32 %conv, -4
+  %cmp = icmp ult i32 %add, 3
+  ret i1 %cmp
+}
+
+; TODO: should be canonicalized to (x - 4) u> 2
+define i1 @zext_range_check_ugt(i8 %x) {
+; CHECK-LABEL: @zext_range_check_ugt(
+; CHECK-NEXT:  entry:
+; CHECK-NEXT:    [[CONV:%.*]] = zext i8 [[X:%.*]] to i32
+; CHECK-NEXT:    [[TMP0:%.*]] = add nsw i32 [[CONV]], -7
+; CHECK-NEXT:    [[CMP:%.*]] = icmp ult i32 [[TMP0]], -3
+; CHECK-NEXT:    ret i1 [[CMP]]
+;
+entry:
+  %conv = zext i8 %x to i32
+  %add = add i32 %conv, -4
+  %cmp = icmp ugt i32 %add, 2
+  ret i1 %cmp
+}
+
+; TODO: should be canonicalized to (x - 4) u> 2
+define i1 @zext_range_check_ult_alter(i8 %x) {
+; CHECK-LABEL: @zext_range_check_ult_alter(
+; CHECK-NEXT:  entry:
+; CHECK-NEXT:    [[CONV:%.*]] = zext i8 [[X:%.*]] to i32
+; CHECK-NEXT:    [[ADD:%.*]] = add nsw i32 [[CONV]], -7
+; CHECK-NEXT:    [[CMP:%.*]] = icmp ult i32 [[ADD]], -3
+; CHECK-NEXT:    ret i1 [[CMP]]
+;
+entry:
+  %conv = zext i8 %x to i32
+  %add = add i32 %conv, -7
+  %cmp = icmp ult i32 %add, -3
+  ret i1 %cmp
+}
+
+define i1 @zext_range_check_mergable(i8 %x) {
+; CHECK-LABEL: @zext_range_check_mergable(
+; CHECK-NEXT:    [[COND:%.*]] = icmp slt i8 [[X:%.*]], 7
+; CHECK-NEXT:    ret i1 [[COND]]
+;
+  %conv = zext i8 %x to i32
+  %add = add nsw i32 %conv, -4
+  %cmp1 = icmp ult i32 %add, 3
+  %cmp2 = icmp slt i8 %x, 4
+  %cond = select i1 %cmp2, i1 true, i1 %cmp1
+  ret i1 %cond
+}
+
+; Negative tests
+
+define i1 @sext_range_check_ult(i8 %x) {
+; CHECK-LABEL: @sext_range_check_ult(
+; CHECK-NEXT:  entry:
+; CHECK-NEXT:    [[CONV:%.*]] = sext i8 [[X:%.*]] to i32
+; CHECK-NEXT:    [[ADD:%.*]] = add nsw i32 [[CONV]], -4
+; CHECK-NEXT:    [[CMP:%.*]] = icmp ult i32 [[ADD]], 3
+; CHECK-NEXT:    ret i1 [[CMP]]
+;
+entry:
+  %conv = sext i8 %x to i32
+  %add = add i32 %conv, -4
+  %cmp = icmp ult i32 %add, 3
+  ret i1 %cmp
+}
+
+define i1 @zext_range_check_ult_illegal_type(i7 %x) {
+; CHECK-LABEL: @zext_range_check_ult_illegal_type(
+; CHECK-NEXT:  entry:
+; CHECK-NEXT:    [[CONV:%.*]] = zext i7 [[X:%.*]] to i32
+; CHECK-NEXT:    [[ADD:%.*]] = add nsw i32 [[CONV]], -4
+; CHECK-NEXT:    [[CMP:%.*]] = icmp ult i32 [[ADD]], 3
+; CHECK-NEXT:    ret i1 [[CMP]]
+;
+entry:
+  %conv = zext i7 %x to i32
+  %add = add i32 %conv, -4
+  %cmp = icmp ult i32 %add, 3
+  ret i1 %cmp
+}
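
The range arithmetic behind the new fold can be modelled without LLVM for the `i8` to `i32`, `u<` case. This is a simplified sketch, not the patch itself: `CR` is defined outside the hunk shown above, and the model assumes it is the set of `zext(V)` values for which the compare holds. `NarrowCheck`, `narrow_ult` and `count_fold_mismatches` are hypothetical names. Unlike `exactIntersectWith`, the model conservatively gives up on every range that wraps in 32 bits.

```cpp
#include <cassert>
#include <cstdint>
#include <optional>

// Narrowed form of the compare: (x + offset) u< width, evaluated in 8 bits.
struct NarrowCheck {
  uint8_t offset;  // constant added to the i8 value
  unsigned width;  // 1..256; 256 means the compare is always true
};

// Tries to narrow zext(i8 x) + c2 u< c (evaluated in 32 bits) to an i8 compare.
std::optional<NarrowCheck> narrow_ult(int32_t c2, uint32_t c) {
  const uint64_t two32 = uint64_t{1} << 32;
  if (c == 0)
    return std::nullopt;                      // empty range: nothing to narrow here
  // zext(x) + c2 u< c holds exactly when zext(x) lies in [-c2, c - c2) modulo 2^32.
  uint64_t lo = static_cast<uint32_t>(0u - static_cast<uint32_t>(c2));
  uint64_t hi = lo + c;
  if (hi > two32)
    return std::nullopt;                      // wraps in 32 bits: simplified model gives up
  if (lo >= 256)
    return std::nullopt;                      // no overlap with the zext range [0, 256)
  if (hi > 256)
    hi = 256;                                 // intersect with [0, 256)
  return NarrowCheck{static_cast<uint8_t>(0u - lo),
                     static_cast<unsigned>(hi - lo)};
}

// Returns -1 if the model does not narrow, else the number of i8 inputs on
// which the narrowed compare disagrees with the original one.
int count_fold_mismatches(int32_t c2, uint32_t c) {
  std::optional<NarrowCheck> n = narrow_ult(c2, c);
  if (!n)
    return -1;
  int mismatches = 0;
  for (int i = 0; i < 256; ++i) {
    uint8_t x = static_cast<uint8_t>(i);
    bool wide = static_cast<uint32_t>(x) + static_cast<uint32_t>(c2) < c;
    bool narrow = static_cast<uint8_t>(x + n->offset) < n->width;
    if (wide != narrow)
      ++mismatches;
  }
  return mismatches;
}

// Convenience wrapper for the @zext_range_check_ult constants.
void verify_zext_range_check_ult() {
  assert(count_fold_mismatches(-4, 3u) == 0);
}
```

For `C2 = -4, C = 3` the model yields offset 252 (that is, -4 as `i8`) and width 3, which lines up with the `add i8 [[X]], -4` / `icmp ult i8 [[TMP0]], 3` CHECK lines of `@zext_range_check_ult`. For `C2 = -7, C = -3` it gives up, consistent with `@zext_range_check_ult_alter` being left unchanged.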

dtcxzyw · 2024-10-08T00:58:25Z

> Is this intentionally still in draft form? Or are you ready for review?

It is ready for review now. I believe I cannot generalize it further without a new ConstantRange API.

dtcxzyw requested a review from goldsteinn September 30, 2024 13:46

This was referenced Sep 30, 2024

Task submission dtcxzyw/llvm-opt-benchmark#1312

Open

pre-commit: PR110511 dtcxzyw/llvm-opt-benchmark#1400

Closed

pre-commit: PR110511 dtcxzyw/llvm-opt-benchmark#1404

Closed

goldsteinn reviewed Oct 1, 2024

This was referenced Oct 1, 2024

pre-commit: PR110511 dtcxzyw/llvm-opt-benchmark#1405

Closed

pre-commit: PR110511 dtcxzyw/llvm-opt-benchmark#1406

Closed

pre-commit: PR110511 dtcxzyw/llvm-opt-benchmark#1407

Closed

dtcxzyw added 3 commits October 6, 2024 14:26

[InstCombine] Add pre-commit tests. NFC.

910923a

[InstCombine] Fold zext(X) + C2 u< C -> X + trunc(C2) u< trunc(C)

e694c48

[InstCombine] Convert to using ConstantRange API

1e8e545

dtcxzyw force-pushed the perf/fold-zext-icmp-offset-c branch from ad6af44 to 1e8e545 Compare October 6, 2024 07:27

dtcxzyw changed the title ~~[InstCombine] Fold zext(X) + C2 u< C -> X + trunc(C2) u< trunc(C)~~ [InstCombine] Fold zext(X) + C2 pred C -> X + C3 pred C4 Oct 6, 2024

dtcxzyw mentioned this pull request Oct 6, 2024

pre-commit: PR110511 dtcxzyw/llvm-opt-benchmark#1450

Closed

dtcxzyw commented Oct 6, 2024

dtcxzyw marked this pull request as ready for review October 8, 2024 00:56

dtcxzyw requested a review from nikic as a code owner October 8, 2024 00:56

llvmbot added the llvm:transforms label Oct 8, 2024
