[DA] remove wrap-around check from affine definition #116632

sebpop · 2024-11-18T14:27:13Z

DA uses SCEV to solve linear constraints. When it generates SCEVs with negative strides, i.e., {0,+,-1}, make sure the SCEVs are marked as non wrap arithmetic. This patch fixes #51512

llvmbot · 2024-11-18T14:27:50Z

@llvm/pr-subscribers-testing-tools

@llvm/pr-subscribers-llvm-analysis

Author: Sebastian Pop (sebpop)

Changes

DA uses SCEV to solve linear constraints. When it generates SCEVs with negative strides, i.e., {0,+,-1}, make sure the SCEVs are marked as non wrap arithmetic. This patch fixes #51512

Full diff: https://github.com/llvm/llvm-project/pull/116632.diff

2 Files Affected:

(modified) llvm/lib/Analysis/DependenceAnalysis.cpp (+4-6)
(added) llvm/test/Analysis/DependenceAnalysis/PR51512.ll (+34)

diff --git a/llvm/lib/Analysis/DependenceAnalysis.cpp b/llvm/lib/Analysis/DependenceAnalysis.cpp
index a4a98ea0bae146..f08e4325a0bc75 100644
--- a/llvm/lib/Analysis/DependenceAnalysis.cpp
+++ b/llvm/lib/Analysis/DependenceAnalysis.cpp
@@ -3116,11 +3116,9 @@ const SCEV *DependenceInfo::addToCoefficient(const SCEV *Expr,
                                              const Loop *TargetLoop,
                                              const SCEV *Value) const {
   const SCEVAddRecExpr *AddRec = dyn_cast<SCEVAddRecExpr>(Expr);
-  if (!AddRec) // create a new addRec
-    return SE->getAddRecExpr(Expr,
-                             Value,
-                             TargetLoop,
-                             SCEV::FlagAnyWrap); // Worst case, with no info.
+  if (!AddRec)
+    return SE->getAddRecExpr(Expr, Value, TargetLoop, SCEV::FlagNSW);
+
   if (AddRec->getLoop() == TargetLoop) {
     const SCEV *Sum = SE->getAddExpr(AddRec->getStepRecurrence(*SE), Value);
     if (Sum->isZero())
@@ -3131,7 +3129,7 @@ const SCEV *DependenceInfo::addToCoefficient(const SCEV *Expr,
                              AddRec->getNoWrapFlags());
   }
   if (SE->isLoopInvariant(AddRec, TargetLoop))
-    return SE->getAddRecExpr(AddRec, Value, TargetLoop, SCEV::FlagAnyWrap);
+    return SE->getAddRecExpr(AddRec, Value, TargetLoop, SCEV::FlagNSW);
   return SE->getAddRecExpr(
       addToCoefficient(AddRec->getStart(), TargetLoop, Value),
       AddRec->getStepRecurrence(*SE), AddRec->getLoop(),
diff --git a/llvm/test/Analysis/DependenceAnalysis/PR51512.ll b/llvm/test/Analysis/DependenceAnalysis/PR51512.ll
new file mode 100644
index 00000000000000..52419adff8f8b5
--- /dev/null
+++ b/llvm/test/Analysis/DependenceAnalysis/PR51512.ll
@@ -0,0 +1,34 @@
+; RUN: opt < %s -disable-output "-passes=print<da>" -aa-pipeline=basic-aa
+
+; Check that the testcase does not crash the compiler.
+; https://github.com/llvm/llvm-project/issues/51512
+
+define void @func_1() {
+entry:
+  %l_83.i.i = alloca [2 x [5 x i32]], align 1
+  br label %for.cond857.preheader.i.i
+
+for.cond857.preheader.i.i:                        ; preds = %cleanup.cont1138.i.i, %entry
+  %l_89.08.i.i = phi i32 [ 0, %entry ], [ %add1140.i.i, %cleanup.cont1138.i.i ]
+  %0 = trunc i32 %l_89.08.i.i to i16
+  %1 = add i16 %0, 3
+  %arrayidx916.i.i = getelementptr inbounds [2 x [5 x i32]], [2 x [5 x i32]]* %l_83.i.i, i16 0, i16 %0, i16 %1
+  br label %for.body860.i.i
+
+for.body860.i.i:                                  ; preds = %for.body860.i.i, %for.cond857.preheader.i.i
+  %l_74.07.i.i = phi i32 [ 0, %for.cond857.preheader.i.i ], [ %add964.i.i, %for.body860.i.i ]
+  store i32 undef, i32* %arrayidx916.i.i, align 1
+  %2 = trunc i32 %l_74.07.i.i to i16
+  %arrayidx962.i.i = getelementptr inbounds [2 x [5 x i32]], [2 x [5 x i32]]* %l_83.i.i, i16 0, i16 %2, i16 %1
+  store i32 0, i32* %arrayidx962.i.i, align 1
+  %add964.i.i = add nuw nsw i32 %l_74.07.i.i, 1
+  br i1 false, label %for.body860.i.i, label %cleanup.cont1138.i.i
+
+cleanup.cont1138.i.i:                             ; preds = %for.body860.i.i
+  %add1140.i.i = add nuw nsw i32 %l_89.08.i.i, 1
+  %cmp602.i.i = icmp eq i32 %l_89.08.i.i, 0
+  br i1 %cmp602.i.i, label %for.cond857.preheader.i.i, label %for.cond1480.i.i.preheader
+
+for.cond1480.i.i.preheader:                       ; preds = %cleanup.cont1138.i.i
+  unreachable
+}

llvm/test/Analysis/DependenceAnalysis/PR51512.ll

fhahn · 2024-11-18T15:19:27Z

llvm/lib/Analysis/DependenceAnalysis.cpp

-                             TargetLoop,
-                             SCEV::FlagAnyWrap); // Worst case, with no info.
+  if (!AddRec)
+    return SE->getAddRecExpr(Expr, Value, TargetLoop, SCEV::FlagNSW);


Why is it safe to assume NSW without any additional information here?

Also, I think as @nikic mentioned, at the moment, we would need to prove that NSW holds in all contexts where the new AddRec is valid

There's #91961 to extend SCEV to allow constructing SCEV's with context-sensitive wrapping flags.

Why is it safe to assume NSW without any additional information here?

Here's the additional info.
The AddRec created here is only used in the dependence test. It does not represent the evolution of a scalar variable from the compiled code. The AddRec is used to represent the remaining dependence relation after successive projections. The intended semantics is that of a linear function (i.e., a function that does not wrap around.) Adding wrap-around semantics to the SCEV makes the relation non-linear. classifyPair then return Subscript::NonLinear; and this breaks the dependence test as the non-linear access comes as a surprise this late in the test: all non-linear memory access functions have been filtered out by this time.

we would need to prove that NSW holds in all contexts where the new AddRec is valid

The AddRec created in here is only valid in the data dependence test.
The AddRec is used as a representation of the dependence relation.

all non-linear memory access functions have been filtered out by this time.

Where does this happen? If it already ensured that no wrap happens it should be safe.

If I am nitpicky, if the original offset expression is NSW, it would be still possible that intermediate represetions use for dependencies could wrap. E.g. two loops, one with large prime stride p, the other with large prime stride q. They have a theoretical dependence at p*q which might be outside integer precision.

Consider adding a comment into the code why NSW is OK.

Thank you Michael for your review.
I updated the commit message to explain that NSW is used to have the semantics of linear function with no wrap around:

DA uses SCEVs to represent linear constraints on memory accesses.
addToCoefficient is only called from the "Constraint Propagation engine":
https://github.com/llvm/llvm-project/blob/main/llvm/lib/Analysis/DependenceAnalysis.cpp#L3877

The propagation engine works on MIV subscripts.
MIV subscripts are pairs of linear memory accesses that vary in multiple loops.
At this point all memory accesses are linear, non-linear have been filtered out.

When addToCoefficient creates an AddRec with flags SCEV::FlagAnyWrap,
classifyPair returns llvm::DependenceInfo::Subscript::NonLinear.
https://github.com/llvm/llvm-project/blob/main/llvm/lib/Analysis/DependenceAnalysis.cpp#L3882

Having a non-linear effect this late in the dependence test is surprising, and
the test fails with llvm_unreachable("bad subscript classification").

Instead of building non-linear functions, this patch modifies the flags to
SCEV::FlagNSW that has the meaning of linear functions that do not wrap around.

I also added a comment to the code:

// When Expr is invariant in TargetLoop, i.e., the stride in TargetLoop is 0,
// addToCoefficient returns an AddRec {Expr, +, Value}_TargetLoop.
// NoWrapFlags is set to FlagNSW for the evolution function to be linear.

NoWrapFlags is set to FlagNSW for the evolution function to be linear.

The NSW flag doesn't make it linear in the mathematical sense, which is impossible on a finite domain. The SCEV still has a type limited to some integer bitwidth, and ScalarEvolution makes use of the fact, e.g. when simplifying expressions.

But if the math behind MIV subscripts are based on linear(/affine) functions on $\mathbb{R}^n \to \mathbb{R}$ with unbounded range (red race car book, p129) means that any wrapping behaviour of $a^+l$ and $a^-u$ would also not be correct indicated by the "bad subscript classification", maybe even less so.

Yes, the check for affine happens at the beginning of the dependence test, and then during the dep test.
dep test crashes because new non-affine AddRecs are added to the constraints.
The patch does modify the 2 places that build the new AddRecs to follow the definition of affine in checkSubscript.

The code defining "affine" was added as a fix here: c0661ae

Both tests pass after I partial revert c0661ae the check for wrap flags.

if (!AddRec->getNoWrapFlags()) passes for any wrap flag, including NW and NUW, not necessarily NSW. So this might re-interpret an unsigned integer as a signed one?

The latest push doesn't even unclude setting SCEV::FlagNSW anymore. Intended?

Yes, the change is intended. The last push commit message:

Revert part of c0661ae
that modified the definition of what an affine subscript is.

The code added to checkSubscripts for bug #21959 is not needed.

llvm/test/Analysis/DependenceAnalysis/PR51512.ll

@kasuga-fj

This is a work in progress patch to enable loop-interchange by default and is a continuation of the RFC: https://discourse.llvm.org/t/enabling-loop-interchange/82589 Basically, we promised to fix any compile-time and correctness issues in the different components involved here (loop-interchange and dependence analaysis.) before discussing enabling interchange by default. We think are close to complete this; I would like to explain where we are and wanted to check if there are any thoughts or concerns. A quick overview of the correctness and compile-time improvements that we have made include: Correctness: - [LoopInterchange] Remove 'S' Scalar Dependencies (llvm#119345) - [LoopInterchange] Fix overflow in cost calculation (llvm#111807) - [LoopInterchange] Handle LE and GE correctly (PR llvm#124901) @kasuga-fj - [DA] disambiguate evolution of base addresses (llvm#116628) Compile-times: - [LoopInterchange] Constrain number of load/stores in a loop (llvm#118973) - [LoopInterchange] Bail out early if minimum loop nest is not met (llvm#115128) - [LoopInterchange] Hoist isComputableLoopNest() in the control flow (llvm#124247) And in terms of remaining work, we think we are very close to fixing these depenence analysis issues: - [DA] do not handle array accesses of different offsets (llvm#123436) - [DA] Dependence analysis does not handle array accesses of different sizes (llvm#116630) - [DA] use NSW arithmetic llvm#116632 The compile-time increase with a geomean increase of 0.19% looks good (after committing llvm#124247), I think: stage1-O3: Benchmark kimwitu++ +0.10% sqlite3 +0.14% consumer-typeset +0.07% Bullet +0.06% tramp3d-v4 +0.21% mafft +0.39% ClamAVi +0.06% lencod +0.61% SPASS +0.17% 7zip +0.08% geomean +0.19% See also: http://llvm-compile-time-tracker.com/compare.php?from=19a7fe03b4f58c4f73ea91d5e63bc4c6e61f987b&to=b24f1367d68ee675ea93ecda4939208c6b68ae4b&stat=instructions%3Au We might want to look into lencod to see if we can improve more, but not sure it is strictly necessary.

Meinersbur · 2025-02-07T18:45:21Z

Can you update the description? "make sure the SCEVs are marked as non wrap arithmetic" doesn't match to the current code diff.

For adding NSW, I crafted some example:

for (uint32_t i = 0x02; i < min((uint32_t)n,2); i+= 0x7FFFFFFF) {
   if (i == 0x02)
    A[0x80000001] = 42;
  else if (i == 0x80000001)
    use(A[i]);
}

there should be a RAW dependencies from (i==0 to i==0x80000001). The AddRec for i should be marked as non-self-wrapping (and NUW) due to the restricted trip count. If we create a new AddRec for the subscript of A[i] with NSW, SCEV might the only valid value for i is i==0, because for i==1 the value 0x02 + 1*0x7FFFFFFF is wrapping a signed integer. DependenceInfo therefore might conclude there is no dependency ⚡

sebpop · 2025-02-20T15:53:51Z

@Meinersbur any other comments on this patch?
Ok to merge?

fhahn

It would be good to update the description explaining the latest version of the fix and why it is safe to remove the check

fhahn · 2025-02-20T16:07:32Z

llvm/test/Analysis/DependenceAnalysis/PR51512.ll

+  %alloca = alloca [2 x [5 x i32]], align 1
+  br label %bb1
+
+bb1:                                              ; preds = %bb7, %bb


Would be good to have more descriptive names for the block make the test easier to read, e.g. bb1 -> loop.header,, `bb7 -> loop.latch).

fhahn · 2025-02-20T16:07:59Z

llvm/test/Analysis/DependenceAnalysis/PR51512.ll

+  br label %bb1
+
+bb1:                                              ; preds = %bb7, %bb
+  %phi = phi i32 [ 0, %bb ], [ %add8, %bb7 ]


would be clearer to use more descriptive names, e..g

Suggested change

%phi = phi i32 [ 0, %bb ], [ %add8, %bb7 ]

%iv = phi i32 [ 0, %bb ], [ %iv.next, %bb7 ]

Meinersbur

LGTM

Please also consider @fhahn's comments

Fix llvm#51512 by reverting a part of commit llvm@c0661ae that modified the definition affine subscripts.

llvmbot added the llvm:analysis Includes value tracking, cost tables and constant folding label Nov 18, 2024

sebpop mentioned this pull request Nov 18, 2024

Bug fixes for Array Dependence Analysis #116389

Closed

sjoerdmeijer reviewed Nov 18, 2024

View reviewed changes

llvm/test/Analysis/DependenceAnalysis/PR51512.ll Show resolved Hide resolved

sjoerdmeijer requested review from nikic and Meinersbur November 18, 2024 15:14

fhahn reviewed Nov 18, 2024

View reviewed changes

llvm/test/Analysis/DependenceAnalysis/PR51512.ll Outdated Show resolved Hide resolved

fhahn reviewed Nov 18, 2024

View reviewed changes

sebpop force-pushed the da-4 branch from 2acdebf to fc516d3 Compare November 26, 2024 03:41

llvmbot added the testing-tools label Nov 26, 2024

sjoerdmeijer mentioned this pull request Jan 29, 2025

[LoopInterchange] Enable it by default #124911

Open

sebpop force-pushed the da-4 branch 2 times, most recently from 04ec5a0 to e575d9d Compare February 5, 2025 19:03

sebpop force-pushed the da-4 branch from e575d9d to 11aaf58 Compare February 6, 2025 19:53

sebpop force-pushed the da-4 branch from 11aaf58 to baf23e1 Compare February 11, 2025 03:05

sebpop changed the title ~~[DA] use NSW arithmetic~~ [DA] remove wrap-around check from affine definition Feb 18, 2025

fhahn reviewed Feb 20, 2025

View reviewed changes

Meinersbur approved these changes Feb 20, 2025

View reviewed changes

[DA] remove wrap-around check from affine definition

e20a742

Fix llvm#51512 by reverting a part of commit llvm@c0661ae that modified the definition affine subscripts.

sebpop force-pushed the da-4 branch from baf23e1 to e20a742 Compare February 21, 2025 22:22

sebpop merged commit 5f8da7e into llvm:main Feb 22, 2025
11 checks passed

sebpop deleted the da-4 branch February 22, 2025 00:18

	%phi = phi i32 [ 0, %bb ], [ %add8, %bb7 ]
	%iv = phi i32 [ 0, %bb ], [ %iv.next, %bb7 ]

[DA] remove wrap-around check from affine definition #116632

[DA] remove wrap-around check from affine definition #116632

Uh oh!

Conversation

sebpop commented Nov 18, 2024

Uh oh!

llvmbot commented Nov 18, 2024 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Choose a reason for hiding this comment

Uh oh!

sebpop Nov 26, 2024 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

sebpop Feb 6, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Meinersbur commented Feb 7, 2025

Uh oh!

sebpop commented Feb 20, 2025

Uh oh!

fhahn left a comment

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Meinersbur left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Uh oh!

llvmbot commented Nov 18, 2024 •

edited

Loading

sebpop Nov 26, 2024 •

edited

Loading

sebpop Feb 6, 2025 •

edited

Loading