Skip to content

InstCombine: Fold samesign ult to slt with added constant when the range is known #134556

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
wants to merge 2 commits into
base: main
Choose a base branch
from

Conversation

RAJAGOPALAN-GANGADHARAN
Copy link

@RAJAGOPALAN-GANGADHARAN RAJAGOPALAN-GANGADHARAN commented Apr 6, 2025

Notes

Optimizer at O2 and above can deduct that addition is nsw when the range is known and is partially negative. When this is followed by a icmp samesign ult, it can be folded/reduced into an icmp slt with constant resolved/added during compile time. This optimizes the compare instruction a bit.

Testing

Tested manually in alive2
Wrote unit tests to validate functionality
Test 1 checks when the folding is performed from samesign ult to slt on a partially negative range
Test 2 makes sure existing functionality is not broken by validating a completely positive range.

Closes #134028

Copy link

github-actions bot commented Apr 6, 2025

Thank you for submitting a Pull Request (PR) to the LLVM Project!

This PR will be automatically labeled and the relevant teams will be notified.

If you wish to, you can add reviewers by using the "Reviewers" section on this page.

If this is not working for you, it is probably because you do not have write permissions for the repository. In which case you can instead tag reviewers by name in a comment by using @ followed by their GitHub username.

If you have received no comments on your PR for a week, you can request a review by "ping"ing the PR by adding a comment “Ping”. The common courtesy "ping" rate is once a week. Please remember that you are asking for valuable time from other developers.

If you have further questions, they may be answered by the LLVM GitHub User Guide.

You can also ask questions in a comment on this PR, on the LLVM Discord or on the forums.

@llvmbot
Copy link
Member

llvmbot commented Apr 6, 2025

@llvm/pr-subscribers-llvm-transforms

Author: Rajagopalan Gangadharan (RAJAGOPALAN-GANGADHARAN)

Changes

Notes

Optimizer at O2 and above can deduct that addition is nsw when the range is known and is partially negative. When this is followed by a icmp samesign ult, it can be folded/reduced into an icmp slt with constant resolved/added during compile time. This optimizes the compare instruction a bit.

Testing

Tested manually in alive2
Wrote unit tests to validate functionality
Test 1 checks when the folding is performed from samesign ult to slt on a partially negative range
Test 2 makes sure existing functionality is not broken by validating a completely positive range.

Issue:

This PR fixes issue: #134208


Full diff: https://github.com/llvm/llvm-project/pull/134556.diff

2 Files Affected:

  • (modified) llvm/lib/Transforms/InstCombine/InstCombineCompares.cpp (+9-4)
  • (modified) llvm/test/Transforms/InstCombine/icmp-add.ll (+24)
diff --git a/llvm/lib/Transforms/InstCombine/InstCombineCompares.cpp b/llvm/lib/Transforms/InstCombine/InstCombineCompares.cpp
index e75b4026d5424..863b59454b478 100644
--- a/llvm/lib/Transforms/InstCombine/InstCombineCompares.cpp
+++ b/llvm/lib/Transforms/InstCombine/InstCombineCompares.cpp
@@ -3119,7 +3119,7 @@ Instruction *InstCombinerImpl::foldICmpAddConstant(ICmpInst &Cmp,
 
   Value *Op0, *Op1;
   Instruction *Ext0, *Ext1;
-  const CmpInst::Predicate Pred = Cmp.getPredicate();
+  const CmpPredicate Pred = Cmp.getCmpPredicate();
   if (match(Add,
             m_Add(m_CombineAnd(m_Instruction(Ext0), m_ZExtOrSExt(m_Value(Op0))),
                   m_CombineAnd(m_Instruction(Ext1),
@@ -3156,7 +3156,8 @@ Instruction *InstCombinerImpl::foldICmpAddConstant(ICmpInst &Cmp,
   // the constants. Equality comparisons are handled elsewhere. SGE/SLE/UGE/ULE
   // are canonicalized to SGT/SLT/UGT/ULT.
   if ((Add->hasNoSignedWrap() &&
-       (Pred == ICmpInst::ICMP_SGT || Pred == ICmpInst::ICMP_SLT)) ||
+       (Pred.getPreferredSignedPredicate() == ICmpInst::ICMP_SGT ||
+        Pred.getPreferredSignedPredicate() == ICmpInst::ICMP_SLT)) ||
       (Add->hasNoUnsignedWrap() &&
        (Pred == ICmpInst::ICMP_UGT || Pred == ICmpInst::ICMP_ULT))) {
     bool Overflow;
@@ -3165,9 +3166,13 @@ Instruction *InstCombinerImpl::foldICmpAddConstant(ICmpInst &Cmp,
     // If there is overflow, the result must be true or false.
     // TODO: Can we assert there is no overflow because InstSimplify always
     // handles those cases?
-    if (!Overflow)
+    if (!Overflow) {
+      const CmpInst::Predicate EquivPredicate =
+          Add->hasNoSignedWrap() ? Pred.getPreferredSignedPredicate()
+                                 : Cmp.getPredicate();
       // icmp Pred (add nsw X, C2), C --> icmp Pred X, (C - C2)
-      return new ICmpInst(Pred, X, ConstantInt::get(Ty, NewC));
+      return new ICmpInst(EquivPredicate, X, ConstantInt::get(Ty, NewC));
+    }
   }
 
   if (ICmpInst::isUnsigned(Pred) && Add->hasNoSignedWrap() &&
diff --git a/llvm/test/Transforms/InstCombine/icmp-add.ll b/llvm/test/Transforms/InstCombine/icmp-add.ll
index a8cdf80948a84..eb693244f2057 100644
--- a/llvm/test/Transforms/InstCombine/icmp-add.ll
+++ b/llvm/test/Transforms/InstCombine/icmp-add.ll
@@ -3302,3 +3302,27 @@ entry:
   %cmp = icmp ult i32 %add, 253
   ret i1 %cmp
 }
+
+define i1 @icmp_partial_negative_samesign_ult_folded_to_slt(i8 range(i8 -1, 5) %x) {
+; CHECK-LABEL: @icmp_partial_negative_samesign_ult_folded_to_slt(
+; CHECK-NEXT:  entry:
+; CHECK-NEXT:    [[CMP:%.*]] = icmp slt i8 [[X:%.*]], 2
+; CHECK-NEXT:    ret i1 [[CMP]]
+;
+entry:
+  %add = add nsw i8 %x, -5
+  %cmp = icmp samesign ult i8 %add, -3
+  ret i1 %cmp
+}
+
+define i1 @icmp_positive_samesign_ult_folded_to_ult(i8 range(i8 1, 5) %x) {
+; CHECK-LABEL: @icmp_positive_samesign_ult_folded_to_ult(
+; CHECK-NEXT:  entry:
+; CHECK-NEXT:    [[CMP:%.*]] = icmp samesign ult i8 [[X:%.*]], 2
+; CHECK-NEXT:    ret i1 [[CMP]]
+;
+entry:
+  %add = add nsw i8 %x, 1
+  %cmp = icmp samesign slt i8 %add, 3
+  ret i1 %cmp
+}

@nikic nikic requested a review from dtcxzyw April 6, 2025 19:07
Copy link
Contributor

@nikic nikic left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This PR fixes issue: #134208

This links to an unrelated PR.

Please include generalized alive2 proofs in the PR description, see https://llvm.org/docs/InstCombineContributorGuide.html#proofs.

@@ -3165,9 +3166,13 @@ Instruction *InstCombinerImpl::foldICmpAddConstant(ICmpInst &Cmp,
// If there is overflow, the result must be true or false.
// TODO: Can we assert there is no overflow because InstSimplify always
// handles those cases?
if (!Overflow)
if (!Overflow) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This looks incorrect as implemented, because the isSigned() check above still uses the original predicate. It would be good to add a test for a case where the ssub_ov vs usub_ov distinction is relevant.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The overall implementation approach here is iffy, because for the nuw+nsw+samesign case we could use both the signed and unsigned predicate, so we should try both, with a preference for unsigned.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

  1. You are right, let me fix it. Will also add a test for the overflow cases.
  2. I don't get the second comment, can an instruction have both nsw and nuw at the same time how would that work?

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Interesting just noticed the tests related to nuw and nsw failing, let me check on this.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't get the second comment, can an instruction have both nsw and nuw at the same time how would that work?

Yes, you can have both nsw and nuw on an instruction. It's pretty common. E.g. add i32 1, 1 is both nuw and nsw.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Use range parameter attributes to fold sub+icmp u* into icmp s*
3 participants