-
Notifications
You must be signed in to change notification settings - Fork 13.4k
InstCombine: Fold samesign ult to slt with added constant when the range is known #134556
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
InstCombine: Fold samesign ult to slt with added constant when the range is known #134556
Conversation
Signed-off-by: Rajagopalan-Gangadharan <[email protected]>
Thank you for submitting a Pull Request (PR) to the LLVM Project! This PR will be automatically labeled and the relevant teams will be notified. If you wish to, you can add reviewers by using the "Reviewers" section on this page. If this is not working for you, it is probably because you do not have write permissions for the repository. In which case you can instead tag reviewers by name in a comment by using If you have received no comments on your PR for a week, you can request a review by "ping"ing the PR by adding a comment “Ping”. The common courtesy "ping" rate is once a week. Please remember that you are asking for valuable time from other developers. If you have further questions, they may be answered by the LLVM GitHub User Guide. You can also ask questions in a comment on this PR, on the LLVM Discord or on the forums. |
@llvm/pr-subscribers-llvm-transforms Author: Rajagopalan Gangadharan (RAJAGOPALAN-GANGADHARAN) ChangesNotesOptimizer at O2 and above can deduct that addition is TestingTested manually in alive2 Issue:This PR fixes issue: #134208 Full diff: https://github.com/llvm/llvm-project/pull/134556.diff 2 Files Affected:
diff --git a/llvm/lib/Transforms/InstCombine/InstCombineCompares.cpp b/llvm/lib/Transforms/InstCombine/InstCombineCompares.cpp
index e75b4026d5424..863b59454b478 100644
--- a/llvm/lib/Transforms/InstCombine/InstCombineCompares.cpp
+++ b/llvm/lib/Transforms/InstCombine/InstCombineCompares.cpp
@@ -3119,7 +3119,7 @@ Instruction *InstCombinerImpl::foldICmpAddConstant(ICmpInst &Cmp,
Value *Op0, *Op1;
Instruction *Ext0, *Ext1;
- const CmpInst::Predicate Pred = Cmp.getPredicate();
+ const CmpPredicate Pred = Cmp.getCmpPredicate();
if (match(Add,
m_Add(m_CombineAnd(m_Instruction(Ext0), m_ZExtOrSExt(m_Value(Op0))),
m_CombineAnd(m_Instruction(Ext1),
@@ -3156,7 +3156,8 @@ Instruction *InstCombinerImpl::foldICmpAddConstant(ICmpInst &Cmp,
// the constants. Equality comparisons are handled elsewhere. SGE/SLE/UGE/ULE
// are canonicalized to SGT/SLT/UGT/ULT.
if ((Add->hasNoSignedWrap() &&
- (Pred == ICmpInst::ICMP_SGT || Pred == ICmpInst::ICMP_SLT)) ||
+ (Pred.getPreferredSignedPredicate() == ICmpInst::ICMP_SGT ||
+ Pred.getPreferredSignedPredicate() == ICmpInst::ICMP_SLT)) ||
(Add->hasNoUnsignedWrap() &&
(Pred == ICmpInst::ICMP_UGT || Pred == ICmpInst::ICMP_ULT))) {
bool Overflow;
@@ -3165,9 +3166,13 @@ Instruction *InstCombinerImpl::foldICmpAddConstant(ICmpInst &Cmp,
// If there is overflow, the result must be true or false.
// TODO: Can we assert there is no overflow because InstSimplify always
// handles those cases?
- if (!Overflow)
+ if (!Overflow) {
+ const CmpInst::Predicate EquivPredicate =
+ Add->hasNoSignedWrap() ? Pred.getPreferredSignedPredicate()
+ : Cmp.getPredicate();
// icmp Pred (add nsw X, C2), C --> icmp Pred X, (C - C2)
- return new ICmpInst(Pred, X, ConstantInt::get(Ty, NewC));
+ return new ICmpInst(EquivPredicate, X, ConstantInt::get(Ty, NewC));
+ }
}
if (ICmpInst::isUnsigned(Pred) && Add->hasNoSignedWrap() &&
diff --git a/llvm/test/Transforms/InstCombine/icmp-add.ll b/llvm/test/Transforms/InstCombine/icmp-add.ll
index a8cdf80948a84..eb693244f2057 100644
--- a/llvm/test/Transforms/InstCombine/icmp-add.ll
+++ b/llvm/test/Transforms/InstCombine/icmp-add.ll
@@ -3302,3 +3302,27 @@ entry:
%cmp = icmp ult i32 %add, 253
ret i1 %cmp
}
+
+define i1 @icmp_partial_negative_samesign_ult_folded_to_slt(i8 range(i8 -1, 5) %x) {
+; CHECK-LABEL: @icmp_partial_negative_samesign_ult_folded_to_slt(
+; CHECK-NEXT: entry:
+; CHECK-NEXT: [[CMP:%.*]] = icmp slt i8 [[X:%.*]], 2
+; CHECK-NEXT: ret i1 [[CMP]]
+;
+entry:
+ %add = add nsw i8 %x, -5
+ %cmp = icmp samesign ult i8 %add, -3
+ ret i1 %cmp
+}
+
+define i1 @icmp_positive_samesign_ult_folded_to_ult(i8 range(i8 1, 5) %x) {
+; CHECK-LABEL: @icmp_positive_samesign_ult_folded_to_ult(
+; CHECK-NEXT: entry:
+; CHECK-NEXT: [[CMP:%.*]] = icmp samesign ult i8 [[X:%.*]], 2
+; CHECK-NEXT: ret i1 [[CMP]]
+;
+entry:
+ %add = add nsw i8 %x, 1
+ %cmp = icmp samesign slt i8 %add, 3
+ ret i1 %cmp
+}
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This PR fixes issue: #134208
This links to an unrelated PR.
Please include generalized alive2 proofs in the PR description, see https://llvm.org/docs/InstCombineContributorGuide.html#proofs.
@@ -3165,9 +3166,13 @@ Instruction *InstCombinerImpl::foldICmpAddConstant(ICmpInst &Cmp, | |||
// If there is overflow, the result must be true or false. | |||
// TODO: Can we assert there is no overflow because InstSimplify always | |||
// handles those cases? | |||
if (!Overflow) | |||
if (!Overflow) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This looks incorrect as implemented, because the isSigned() check above still uses the original predicate. It would be good to add a test for a case where the ssub_ov vs usub_ov distinction is relevant.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The overall implementation approach here is iffy, because for the nuw+nsw+samesign case we could use both the signed and unsigned predicate, so we should try both, with a preference for unsigned.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
- You are right, let me fix it. Will also add a test for the overflow cases.
- I don't get the second comment, can an instruction have both nsw and nuw at the same time how would that work?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Interesting just noticed the tests related to nuw and nsw failing, let me check on this.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't get the second comment, can an instruction have both nsw and nuw at the same time how would that work?
Yes, you can have both nsw and nuw on an instruction. It's pretty common. E.g. add i32 1, 1 is both nuw and nsw.
Notes
Optimizer at O2 and above can deduct that addition is
nsw
when the range is known and is partially negative. When this is followed by a icmp samesign ult, it can be folded/reduced into an icmp slt with constant resolved/added during compile time. This optimizes the compare instruction a bit.Testing
Tested manually in alive2
Wrote unit tests to validate functionality
Test 1 checks when the folding is performed from samesign ult to slt on a partially negative range
Test 2 makes sure existing functionality is not broken by validating a completely positive range.
Closes #134028