-
Notifications
You must be signed in to change notification settings - Fork 13.2k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[DA] remove wrap-around check from affine definition #116632
Conversation
@llvm/pr-subscribers-testing-tools @llvm/pr-subscribers-llvm-analysis Author: Sebastian Pop (sebpop) ChangesDA uses SCEV to solve linear constraints. When it generates SCEVs with negative strides, i.e., {0,+,-1}, make sure the SCEVs are marked as non wrap arithmetic. This patch fixes #51512 Full diff: https://github.com/llvm/llvm-project/pull/116632.diff 2 Files Affected:
diff --git a/llvm/lib/Analysis/DependenceAnalysis.cpp b/llvm/lib/Analysis/DependenceAnalysis.cpp
index a4a98ea0bae146..f08e4325a0bc75 100644
--- a/llvm/lib/Analysis/DependenceAnalysis.cpp
+++ b/llvm/lib/Analysis/DependenceAnalysis.cpp
@@ -3116,11 +3116,9 @@ const SCEV *DependenceInfo::addToCoefficient(const SCEV *Expr,
const Loop *TargetLoop,
const SCEV *Value) const {
const SCEVAddRecExpr *AddRec = dyn_cast<SCEVAddRecExpr>(Expr);
- if (!AddRec) // create a new addRec
- return SE->getAddRecExpr(Expr,
- Value,
- TargetLoop,
- SCEV::FlagAnyWrap); // Worst case, with no info.
+ if (!AddRec)
+ return SE->getAddRecExpr(Expr, Value, TargetLoop, SCEV::FlagNSW);
+
if (AddRec->getLoop() == TargetLoop) {
const SCEV *Sum = SE->getAddExpr(AddRec->getStepRecurrence(*SE), Value);
if (Sum->isZero())
@@ -3131,7 +3129,7 @@ const SCEV *DependenceInfo::addToCoefficient(const SCEV *Expr,
AddRec->getNoWrapFlags());
}
if (SE->isLoopInvariant(AddRec, TargetLoop))
- return SE->getAddRecExpr(AddRec, Value, TargetLoop, SCEV::FlagAnyWrap);
+ return SE->getAddRecExpr(AddRec, Value, TargetLoop, SCEV::FlagNSW);
return SE->getAddRecExpr(
addToCoefficient(AddRec->getStart(), TargetLoop, Value),
AddRec->getStepRecurrence(*SE), AddRec->getLoop(),
diff --git a/llvm/test/Analysis/DependenceAnalysis/PR51512.ll b/llvm/test/Analysis/DependenceAnalysis/PR51512.ll
new file mode 100644
index 00000000000000..52419adff8f8b5
--- /dev/null
+++ b/llvm/test/Analysis/DependenceAnalysis/PR51512.ll
@@ -0,0 +1,34 @@
+; RUN: opt < %s -disable-output "-passes=print<da>" -aa-pipeline=basic-aa
+
+; Check that the testcase does not crash the compiler.
+; https://github.com/llvm/llvm-project/issues/51512
+
+define void @func_1() {
+entry:
+ %l_83.i.i = alloca [2 x [5 x i32]], align 1
+ br label %for.cond857.preheader.i.i
+
+for.cond857.preheader.i.i: ; preds = %cleanup.cont1138.i.i, %entry
+ %l_89.08.i.i = phi i32 [ 0, %entry ], [ %add1140.i.i, %cleanup.cont1138.i.i ]
+ %0 = trunc i32 %l_89.08.i.i to i16
+ %1 = add i16 %0, 3
+ %arrayidx916.i.i = getelementptr inbounds [2 x [5 x i32]], [2 x [5 x i32]]* %l_83.i.i, i16 0, i16 %0, i16 %1
+ br label %for.body860.i.i
+
+for.body860.i.i: ; preds = %for.body860.i.i, %for.cond857.preheader.i.i
+ %l_74.07.i.i = phi i32 [ 0, %for.cond857.preheader.i.i ], [ %add964.i.i, %for.body860.i.i ]
+ store i32 undef, i32* %arrayidx916.i.i, align 1
+ %2 = trunc i32 %l_74.07.i.i to i16
+ %arrayidx962.i.i = getelementptr inbounds [2 x [5 x i32]], [2 x [5 x i32]]* %l_83.i.i, i16 0, i16 %2, i16 %1
+ store i32 0, i32* %arrayidx962.i.i, align 1
+ %add964.i.i = add nuw nsw i32 %l_74.07.i.i, 1
+ br i1 false, label %for.body860.i.i, label %cleanup.cont1138.i.i
+
+cleanup.cont1138.i.i: ; preds = %for.body860.i.i
+ %add1140.i.i = add nuw nsw i32 %l_89.08.i.i, 1
+ %cmp602.i.i = icmp eq i32 %l_89.08.i.i, 0
+ br i1 %cmp602.i.i, label %for.cond857.preheader.i.i, label %for.cond1480.i.i.preheader
+
+for.cond1480.i.i.preheader: ; preds = %cleanup.cont1138.i.i
+ unreachable
+}
|
TargetLoop, | ||
SCEV::FlagAnyWrap); // Worst case, with no info. | ||
if (!AddRec) | ||
return SE->getAddRecExpr(Expr, Value, TargetLoop, SCEV::FlagNSW); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why is it safe to assume NSW without any additional information here?
Also, I think as @nikic mentioned, at the moment, we would need to prove that NSW holds in all contexts where the new AddRec is valid
There's #91961 to extend SCEV to allow constructing SCEV's with context-sensitive wrapping flags.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why is it safe to assume NSW without any additional information here?
Here's the additional info.
The AddRec created here is only used in the dependence test. It does not represent the evolution of a scalar variable from the compiled code. The AddRec is used to represent the remaining dependence relation after successive projections. The intended semantics is that of a linear function (i.e., a function that does not wrap around.) Adding wrap-around semantics to the SCEV makes the relation non-linear. classifyPair then return Subscript::NonLinear;
and this breaks the dependence test as the non-linear access comes as a surprise this late in the test: all non-linear memory access functions have been filtered out by this time.
we would need to prove that NSW holds in all contexts where the new AddRec is valid
The AddRec created in here is only valid in the data dependence test.
The AddRec is used as a representation of the dependence relation.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
all non-linear memory access functions have been filtered out by this time.
Where does this happen? If it already ensured that no wrap happens it should be safe.
If I am nitpicky, if the original offset expression is NSW, it would be still possible that intermediate represetions use for dependencies could wrap. E.g. two loops, one with large prime stride p, the other with large prime stride q. They have a theoretical dependence at p*q which might be outside integer precision.
Consider adding a comment into the code why NSW is OK.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thank you Michael for your review.
I updated the commit message to explain that NSW is used to have the semantics of linear function with no wrap around:
DA uses SCEVs to represent linear constraints on memory accesses.
addToCoefficient is only called from the "Constraint Propagation engine":
https://github.com/llvm/llvm-project/blob/main/llvm/lib/Analysis/DependenceAnalysis.cpp#L3877The propagation engine works on MIV subscripts.
MIV subscripts are pairs of linear memory accesses that vary in multiple loops.
At this point all memory accesses are linear, non-linear have been filtered out.When addToCoefficient creates an AddRec with flags SCEV::FlagAnyWrap,
classifyPair returns llvm::DependenceInfo::Subscript::NonLinear.
https://github.com/llvm/llvm-project/blob/main/llvm/lib/Analysis/DependenceAnalysis.cpp#L3882Having a non-linear effect this late in the dependence test is surprising, and
the test fails with llvm_unreachable("bad subscript classification").Instead of building non-linear functions, this patch modifies the flags to
SCEV::FlagNSW that has the meaning of linear functions that do not wrap around.
I also added a comment to the code:
// When Expr is invariant in TargetLoop, i.e., the stride in TargetLoop is 0,
// addToCoefficient returns an AddRec {Expr, +, Value}_TargetLoop.
// NoWrapFlags is set to FlagNSW for the evolution function to be linear.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
NoWrapFlags is set to FlagNSW for the evolution function to be linear.
The NSW flag doesn't make it linear in the mathematical sense, which is impossible on a finite domain. The SCEV still has a type limited to some integer bitwidth, and ScalarEvolution makes use of the fact, e.g. when simplifying expressions.
But if the math behind MIV subscripts are based on linear(/affine) functions on
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, the check for affine happens at the beginning of the dependence test, and then during the dep test.
dep test crashes because new non-affine AddRecs are added to the constraints.
The patch does modify the 2 places that build the new AddRecs to follow the definition of affine in checkSubscript.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The code defining "affine" was added as a fix here: c0661ae
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Both tests pass after I partial revert c0661ae the check for wrap flags.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
if (!AddRec->getNoWrapFlags())
passes for any wrap flag, including NW
and NUW
, not necessarily NSW
. So this might re-interpret an unsigned integer as a signed one?
The latest push doesn't even unclude setting SCEV::FlagNSW
anymore. Intended?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is a work in progress patch to enable loop-interchange by default and is a continuation of the RFC: https://discourse.llvm.org/t/enabling-loop-interchange/82589 Basically, we promised to fix any compile-time and correctness issues in the different components involved here (loop-interchange and dependence analaysis.) before discussing enabling interchange by default. We think are close to complete this; I would like to explain where we are and wanted to check if there are any thoughts or concerns. A quick overview of the correctness and compile-time improvements that we have made include: Correctness: - [LoopInterchange] Remove 'S' Scalar Dependencies (llvm#119345) - [LoopInterchange] Fix overflow in cost calculation (llvm#111807) - [LoopInterchange] Handle LE and GE correctly (PR llvm#124901) @kasuga-fj - [DA] disambiguate evolution of base addresses (llvm#116628) Compile-times: - [LoopInterchange] Constrain number of load/stores in a loop (llvm#118973) - [LoopInterchange] Bail out early if minimum loop nest is not met (llvm#115128) - [LoopInterchange] Hoist isComputableLoopNest() in the control flow (llvm#124247) And in terms of remaining work, we think we are very close to fixing these depenence analysis issues: - [DA] do not handle array accesses of different offsets (llvm#123436) - [DA] Dependence analysis does not handle array accesses of different sizes (llvm#116630) - [DA] use NSW arithmetic llvm#116632 The compile-time increase with a geomean increase of 0.19% looks good (after committing llvm#124247), I think: stage1-O3: Benchmark kimwitu++ +0.10% sqlite3 +0.14% consumer-typeset +0.07% Bullet +0.06% tramp3d-v4 +0.21% mafft +0.39% ClamAVi +0.06% lencod +0.61% SPASS +0.17% 7zip +0.08% geomean +0.19% See also: http://llvm-compile-time-tracker.com/compare.php?from=19a7fe03b4f58c4f73ea91d5e63bc4c6e61f987b&to=b24f1367d68ee675ea93ecda4939208c6b68ae4b&stat=instructions%3Au We might want to look into lencod to see if we can improve more, but not sure it is strictly necessary.
04ec5a0
to
e575d9d
Compare
Can you update the description? "make sure the SCEVs are marked as non wrap arithmetic" doesn't match to the current code diff. For adding NSW, I crafted some example:
there should be a RAW dependencies from (i==0 to i==0x80000001). The AddRec for |
@Meinersbur any other comments on this patch? |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It would be good to update the description explaining the latest version of the fix and why it is safe to remove the check
%alloca = alloca [2 x [5 x i32]], align 1 | ||
br label %bb1 | ||
|
||
bb1: ; preds = %bb7, %bb |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Would be good to have more descriptive names for the block make the test easier to read, e.g. bb1 -> loop.header,
, `bb7 -> loop.latch).
br label %bb1 | ||
|
||
bb1: ; preds = %bb7, %bb | ||
%phi = phi i32 [ 0, %bb ], [ %add8, %bb7 ] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
would be clearer to use more descriptive names, e..g
%phi = phi i32 [ 0, %bb ], [ %add8, %bb7 ] | |
%iv = phi i32 [ 0, %bb ], [ %iv.next, %bb7 ] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
Please also consider @fhahn's comments
Fix llvm#51512 by reverting a part of commit llvm@c0661ae that modified the definition affine subscripts.
DA uses SCEV to solve linear constraints. When it generates SCEVs with negative strides, i.e., {0,+,-1}, make sure the SCEVs are marked as non wrap arithmetic. This patch fixes #51512