Use range parameter attributes to fold sub+icmp u* into icmp s* #134028
Comments
Come to think of it, another way to get to the right place would be to look at the
define noundef zeroext i1 @src(i8 noundef range(i8 -1, 5) %x) unnamed_addr #0 {
start:
  %0 = add nsw i8 %x, -5
  %1 = icmp ult i8 %0, -3
  ret i1 %1
}
And use the range information to notice
define noundef zeroext i1 @tgt(i8 noundef range(i8 -1, 5) %x) unnamed_addr #0 {
start:
  %0 = add nsw i8 %x, -5
  %1 = icmp samesign slt i8 %0, -3
  ret i1 %1
}
Which would then |
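A minimal sketch, not part of the original thread, that brute-forces the claim above in Python: for every `%x` allowed by `range(i8 -1, 5)`, the unsigned compare in `@src` and the signed compare in `@tgt` give the same answer. The helper names `to_u8`, `to_s8`, `src`, and `tgt` are illustrative models of the i8 arithmetic, not LLVM APIs.

```python
def to_u8(v: int) -> int:
    """Interpret the low 8 bits of v as an unsigned i8."""
    return v & 0xFF


def to_s8(v: int) -> int:
    """Interpret the low 8 bits of v as a signed (two's complement) i8."""
    v &= 0xFF
    return v - 256 if v >= 128 else v


def src(x: int) -> bool:
    """Model of @src: add i8 %x, -5 followed by icmp ult ..., -3."""
    return to_u8(x - 5) < to_u8(-3)


def tgt(x: int) -> bool:
    """Model of @tgt: add i8 %x, -5 followed by icmp slt ..., -3."""
    return to_s8(x - 5) < -3


# range(i8 -1, 5) is half-open, like Python's range.
RANGE = range(-1, 5)

# Inside the range, both sides of the compare are negative,
# so the unsigned and signed predicates agree.
for x in RANGE:
    assert to_s8(x - 5) < 0
    assert src(x) == tgt(x)

# Outside the range the two predicates can disagree, e.g. x = 10.
print(src(10), tgt(10))
```

The last line shows why the `range` attribute is load-bearing: without it, swapping `ult` for `slt` would not be a valid refinement.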
We can split this into two PRs:
|
Hi! This issue may be a good introductory issue for people new to working on LLVM. If you would like to work on this issue, your first steps are:
If you have any further questions about this issue, don't hesitate to ask via a comment in the thread below. |
@llvm/issue-subscribers-good-first-issue Author: None (scottmcm)
(This example comes from looking at Rust's discriminant code from <https://rust.godbolt.org/z/1138aGqdb>)
Take this input IR:
define noundef zeroext i1 @is_foo(i8 noundef range(i8 -1, 5) %x) unnamed_addr {
start:
  %0 = sub i8 %x, 2
  %1 = zext i8 %0 to i64
  %2 = icmp ule i8 %0, 2
  %3 = add i64 %1, 1
  %_2 = select i1 %2, i64 %3, i64 0
  %_0 = icmp eq i64 %_2, 0
  ret i1 %_0
}
Today, LLVM does simplify it a bunch, getting it down to <https://llvm.godbolt.org/z/G7as7Y6o5>
define noundef zeroext i1 @is_foo(i8 noundef range(i8 -1, 5) %x) unnamed_addr #0 {
  %0 = add nsw i8 %x, -5
  %1 = icmp ult i8 %0, -3
  ret i1 %1
}
It could do better, though. The range information was used to determine the nsw, but that doesn't really help the unsigned icmp.
Specifically, the range restriction is enough that it'd be allowed to be just <https://alive2.llvm.org/ce/z/fkVEwL>
define noundef zeroext i1 @is_foo(i8 noundef range(i8 -1, 5) %x) unnamed_addr #0 {
start:
  %1 = icmp slt i8 %x, 2
  ret i1 %1
}
Eliminating the need for the add altogether. |
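As a rough cross-check of the issue body (a sketch in Python, not a substitute for the Alive2 proof linked there), the original `@is_foo` IR can be modelled instruction by instruction and compared against the proposed `icmp slt i8 %x, 2`. The function names `is_foo_original` and `is_foo_folded` are made up for this illustration.

```python
def is_foo_original(x: int) -> bool:
    """Step-by-step model of the unoptimized @is_foo IR."""
    t0 = (x - 2) & 0xFF                 # %0 = sub i8 %x, 2 (wraps modulo 2^8)
    t1 = t0                             # %1 = zext i8 %0 to i64
    t2 = t0 <= 2                        # %2 = icmp ule i8 %0, 2
    t3 = (t1 + 1) & 0xFFFFFFFFFFFFFFFF  # %3 = add i64 %1, 1
    sel = t3 if t2 else 0               # %_2 = select i1 %2, i64 %3, i64 0
    return sel == 0                     # %_0 = icmp eq i64 %_2, 0


def is_foo_folded(x: int) -> bool:
    """Model of the proposed result: icmp slt i8 %x, 2."""
    return x < 2


# The fold is only claimed for x in range(i8 -1, 5), i.e. -1 <= x < 5.
for x in range(-1, 5):
    assert is_foo_original(x) == is_foo_folded(x)

# Just outside the range the two disagree, so the range attribute is required.
assert is_foo_original(5) != is_foo_folded(5)
```

The disagreement at `x = 5` is the reason the fold has to consult the parameter's `range` attribute rather than fire unconditionally.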
@RAJAGOPALAN-GANGADHARAN Please read https://llvm.org/docs/InstCombineContributorGuide.html before submitting a patch. Good luck :) |
@dtcxzyw Thanks for assigning the issue. I spent a couple of hours trying to understand what's going on, and I have some questions regarding the first part:
I can see that samesign is added only at O2 or higher, while the issue used O1. Is the expectation to make this work at O1? From what I understand, that is not the right expectation, because all the cmp optimizations are done at the O2 level. |
Ah, the That said, if I take the repro and change it to O3, I do see that it gets the ~~So maybe it's an awkward phase ordering problem? :/~~
Oh, I failed to read my own example. Right, InstCombine does fix it when its |
@scottmcm, if I swap the ult to slt while we add the samesign, I can see the optimizer folds it to icmp slt automatically. This seems like a good fix to me: essentially, I'm checking whether both operands are negative and, if so, swapping the "u*" predicate for "s*". Let me know what you think. Thanks! |
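The property this proposal relies on can be checked exhaustively for i8. The following Python sketch (illustrative only; `same_sign`, `ult`, and `slt` are hypothetical helper names) confirms that whenever both operands have the same sign bit, the unsigned and signed less-than predicates agree, and shows one pair where they differ because the signs differ.

```python
def same_sign(a: int, b: int) -> bool:
    """True when the signed i8 values a and b have equal sign bits."""
    return (a < 0) == (b < 0)


def ult(a: int, b: int) -> bool:
    """Unsigned less-than on the 8-bit patterns of a and b."""
    return (a & 0xFF) < (b & 0xFF)


def slt(a: int, b: int) -> bool:
    """Signed less-than on i8 values already given as signed integers."""
    return a < b


I8 = range(-128, 128)

# Collect every same-sign pair on which the two predicates disagree.
mismatches = [(a, b) for a in I8 for b in I8
              if same_sign(a, b) and ult(a, b) != slt(a, b)]
assert mismatches == []

# With mixed signs they can disagree: -1 is 0xFF when read as unsigned.
assert ult(-1, 1) is False
assert slt(-1, 1) is True
```

Both operands being negative, as in the comment above, is one instance of the same-sign condition; both being non-negative works equally well.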
For any specifics about how to handle things, definitely trust what @dtcxzyw has to say over me, because I do not understand the LLVM internals at any meaningful level. I just look at IR and run Alive2 :P It does sound at least plausible to me to flip to a signed operation if we know that the sign bit is set. That said, I do not know if the usual canonicalization is to unsigned ops or signed ones when both work. It could potentially also be that the pattern should look for the It looks like #134028 (comment) suggested that it be a fold on the |
InstCombine pass canonicalizes |
@dtcxzyw If |
Generally, we should never invert the canonicalization in IR. But we can temporarily refine In this case, the expected transformation happens at: llvm-project/llvm/lib/Transforms/InstCombine/InstCombineCompares.cpp Lines 3155 to 3171 in 53e1c8b
The current implementation only expects a slt/sgt if the add has NSW. To make it accept
|
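To illustrate why a fold of this shape cares about NSW, here is a small Python sketch under stated assumptions. It models rewriting `icmp slt (add i8 %x, C1), C2` as `icmp slt %x, C2 - C1` with `C1 = -5` and `C2 = -3` from this issue. It is not the code in InstCombineCompares.cpp, and the names `wraps`, `before`, and `after` are invented for the example.

```python
C1, C2 = -5, -3  # constants taken from the IR in this issue


def to_s8(v: int) -> int:
    """Interpret the low 8 bits of v as a signed i8."""
    v &= 0xFF
    return v - 256 if v >= 128 else v


def wraps(x: int) -> bool:
    """True when x + C1 leaves the signed i8 range, i.e. nsw would not hold."""
    return not (-128 <= x + C1 <= 127)


def before(x: int) -> bool:
    """icmp slt (add i8 x, C1), C2 with wrapping i8 arithmetic."""
    return to_s8(x + C1) < C2


def after(x: int) -> bool:
    """icmp slt x, C2 - C1, i.e. x < 2 for these constants."""
    return x < C2 - C1


I8 = range(-128, 128)

# When the add does not wrap, the rewritten compare is equivalent.
assert all(before(x) == after(x) for x in I8 if not wraps(x))

# When it does wrap, the rewrite changes the result for these inputs.
counterexamples = [x for x in I8 if wraps(x) and before(x) != after(x)]
print(counterexamples)
```

Every input in `range(i8 -1, 5)` falls in the non-wrapping case, which matches the `nsw` that LLVM already infers on the `add` in this issue.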
@dtcxzyw Thanks for the pointers :) I did find this earlier. I was more interested in which cases would denote the switch from slt to samesign ult and vice versa. Now I think I have some idea of why these optimizations are done: essentially, in this case there is a comparison between a partially negative range and an always-positive constant. This means it is a signed comparison, so overflow is UB, and we are free to optimize on this. Since we are sure at least one side needs a signed representation, this kind of fold is expected. |
(This example comes from looking at Rust's discriminant code from https://rust.godbolt.org/z/1138aGqdb)
Take this input IR:
Today, LLVM does simplify it a bunch, getting it down to https://llvm.godbolt.org/z/G7as7Y6o5
It could do better, though. The range information was used to determine the nsw, but that doesn't really help the unsigned icmp.
Specifically, the range restriction is enough that it'd be allowed to be just https://alive2.llvm.org/ce/z/fkVEwL
Eliminating the need for the add altogether.