Commit 7643bd6

fhahn authored and tstellar committed
[TBAA] Don't emit pointer-tbaa for void pointers. (llvm#122116)
While there are no special rules in the standards regarding void pointers and strict aliasing, emitting distinct tags for void pointers breaks some common idioms, and there is no good alternative for rewriting such code without strict-aliasing violations. An example is counting the entries in an array of pointers:

    int count_elements(void * values) {
      void **seq = values;
      int count;
      for (count = 0; seq && seq[count]; count++);
      return count;
    }

https://clang.godbolt.org/z/8dTv51v8W

An example in the wild is from llvm#119099.

This patch avoids emitting distinct tags for void pointers, so that those idioms do not cause miscompiles for now.

Fixes llvm#119099.
Fixes llvm#122537.

PR: llvm#122116

(cherry picked from commit 77d3f8a)
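To make the idiom concrete, here is a minimal sketch that calls the commit message's `count_elements` on a NULL-terminated array of `const char *`. The helpers `count_names` and `check_counts` and the sample array are hypothetical, added only for illustration. Each `seq[count]` load reads a `const char *` object through an l-value of type `void *`, which is the access pattern the commit message says distinct `void` pointer tags can break.

```c
#include <assert.h>
#include <stddef.h>

/* Idiom from the commit message: count the entries of a NULL-terminated
   array of pointers whose element type is unknown to the callee. */
static int count_elements(void *values) {
  void **seq = values;
  int count;
  for (count = 0; seq && seq[count]; count++)
    ;
  return count;
}

/* Hypothetical caller: the array elements are `const char *`, yet
   count_elements reads them through `void *` l-values. */
static int count_names(void) {
  const char *names[] = {"alpha", "beta", "gamma", NULL};
  return count_elements(names);
}

/* Hypothetical self-check for the sketch. */
static void check_counts(void) {
  assert(count_names() == 3);
  assert(count_elements(NULL) == 0);
}
```

With this patch, the loads of `seq[count]` carry the generic pointer tag, so they are allowed to alias the stores that initialized `names`.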
1 parent 5707853 commit 7643bd6

5 files changed, +105 -41 lines changed

Diff for: clang/docs/UsersManual.rst

+79 -9
@@ -2489,6 +2489,82 @@ are listed below.
 
   $ clang -fuse-ld=lld -Oz -Wl,--icf=safe -fcodegen-data-use code.cc
 
+.. _strict_aliasing:
+
+Strict Aliasing
+---------------
+
+The C and C++ standards require accesses to objects in memory to use l-values of
+an appropriate type for the object. This is called *strict aliasing* or
+*type-based alias analysis*. Strict aliasing enhances a variety of powerful
+memory optimizations, including reordering, combining, and eliminating memory
+accesses. These optimizations can lead to unexpected behavior in code that
+violates the strict aliasing rules. For example:
+
+.. code-block:: c++
+
+  void advance(size_t *index, double *data) {
+    double value = data[*index];
+    /* Clang may assume that this store does not change the contents of `data`. */
+    *index += 1;
+    /* Clang may assume that this store does not change the contents of `index`. */
+    data[*index] = value;
+    /* Either of these facts may create significant optimization opportunities
+     if Clang is able to inline this function. */
+  }
+
+Strict aliasing can be explicitly enabled with ``-fstrict-aliasing`` and
+disabled with ``-fno-strict-aliasing``. ``clang-cl`` defaults to
+``-fno-strict-aliasing``; see . Otherwise, Clang defaults to ``-fstrict-aliasing``.
+
+C and C++ specify slightly different rules for strict aliasing. To improve
+language interoperability, Clang allows two types to alias if either language
+would permit it. This includes applying the C++ similar types rule to C,
+allowing ``int **`` to alias ``int const * const *``. Clang also relaxes the
+standard aliasing rules in the following ways:
+
+* All integer types of the same size are permitted to alias each other,
+  including signed and unsigned types.
+* ``void*`` is permitted to alias any pointer type, ``void**`` is permitted to
+  alias any pointer to pointer type, and so on.
+
+Code which violates strict aliasing has undefined behavior. A program that
+works in one version of Clang may not work in another because of changes to the
+optimizer. Clang provides a :doc:`TypeSanitizer` to help detect
+violations of the strict aliasing rules, but it is currently still experimental.
+Code that is known to violate strict aliasing should generally be built with
+``-fno-strict-aliasing`` if the violation cannot be fixed.
+
+Clang supports several ways to fix a violation of strict aliasing:
+
+* L-values of the character types ``char`` and ``unsigned char`` (as well as
+  other types, depending on the standard) are permitted to access objects of
+  any type.
+
+* Library functions such as ``memcpy`` and ``memset`` are specified as treating
+  memory as characters and therefore are not limited by strict aliasing. If a
+  value of one type must be reinterpreted as another (e.g. to read the bits of a
+  floating-point number), use ``memcpy`` to copy the representation to an object
+  of the destination type. This has no overhead over a direct l-value access
+  because Clang should reliably optimize calls to these functions to use simple
+  loads and stores when they are used with small constant sizes.
+
+* The attribute ``may_alias`` can be added to a ``typedef`` to give l-values of
+  that type the same aliasing power as the character types.
+
+Clang makes a best effort to avoid obvious miscompilations from strict aliasing
+by only considering type information when it cannot prove that two accesses must
+refer to the same memory. However, it is not recommended that programmers
+intentionally rely on this instead of using one of the solutions above because
+it is too easy for the compiler's analysis to be blocked in surprising ways.
+
+In Clang 20, Clang strengthened its implementation of strict aliasing for
+accesses of pointer type. Previously, all accesses of pointer type were
+permitted to alias each other, but Clang now distinguishes different pointers
+by their pointee type, except as limited by the relaxations around qualifiers
+and ``void*`` described above. The previous behavior of treating all pointers as
+aliasing can be restored using ``-fno-pointer-tbaa``.
+
 Profile Guided Optimization
 ---------------------------
 
@@ -5272,12 +5348,6 @@ The Visual C++ Toolset has a slightly more elaborate mechanism for detection.
 Restrictions and Limitations compared to Clang
 ----------------------------------------------
 
-Strict Aliasing
-^^^^^^^^^^^^^^^
-
-Strict aliasing (TBAA) is always off by default in clang-cl. Whereas in clang,
-strict aliasing is turned on by default for all optimization levels.
-
-To enable LLVM optimizations based on strict aliasing rules (e.g., optimizations
-based on type of expressions in C/C++), user will need to explicitly pass
-`-fstrict-aliasing` to clang-cl.
+Strict aliasing (TBAA) is always off by default in clang-cl whereas in clang,
+strict aliasing is turned on by default for all optimization levels. For more
+details, see :ref:`Strict aliasing <strict_aliasing>`.
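The new manual text names `memcpy` and the `may_alias` attribute as ways to fix a strict-aliasing violation. The following is a minimal sketch of both, reading the bits of a `float`. The function and typedef names are hypothetical, and the sketch assumes a 32-bit `float` with the same alignment as `uint32_t`, plus a compiler that supports the GNU `may_alias` attribute (Clang and GCC do).

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Assumption of this sketch: float and uint32_t have the same size. */
_Static_assert(sizeof(float) == sizeof(uint32_t), "sketch assumes 32-bit float");

/* Fix 1: copy the representation into an object of the destination type. */
static uint32_t float_bits_memcpy(float f) {
  uint32_t bits;
  memcpy(&bits, &f, sizeof bits);
  return bits;
}

/* Fix 2: a typedef carrying may_alias. L-values of this type get the same
   aliasing power as the character types. */
typedef uint32_t __attribute__((may_alias)) aliasing_u32;

static uint32_t float_bits_may_alias(const float *f) {
  return *(const aliasing_u32 *)f;
}

/* Hypothetical self-check: both approaches read the same bit pattern. */
static void check_float_bits(void) {
  float value = 1.5f;
  assert(float_bits_memcpy(value) == float_bits_may_alias(&value));
}
```

A direct `*(const uint32_t *)f` without the attribute would be the violation that these two forms avoid.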

Diff for: clang/lib/CodeGen/CodeGenTBAA.cpp

+8
@@ -226,6 +226,14 @@ llvm::MDNode *CodeGenTBAA::getTypeInfoHelper(const Type *Ty) {
         PtrDepth++;
         Ty = Ty->getPointeeType()->getBaseElementTypeUnsafe();
       } while (Ty->isPointerType());
+
+      // While there are no special rules in the standards regarding void pointers
+      // and strict aliasing, emitting distinct tags for void pointers break some
+      // common idioms and there is no good alternative to re-write the code
+      // without strict-aliasing violations.
+      if (Ty->isVoidType())
+        return AnyPtr;
+
       assert(!isa<VariableArrayType>(Ty));
       // When the underlying type is a builtin type, we compute the pointee type
       // string recursively, which is implicitly more forgiving than the standards
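The effect of the early return can be pictured with a toy model. It is not Clang's implementation: `pointer_tag_name` and `check_tag_names` are hypothetical, and the string `"any pointer"` is an assumption standing in for the `AnyPtr` node. The `p<depth> <pointee>` names follow the `p1 int`, `p1 void`, and `p2 void` nodes that appear in this commit's test expectations.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Toy model, not Clang's code: writes the name of the TBAA type node chosen
   for a pointer with `depth` levels of indirection to `pointee`. */
static void pointer_tag_name(unsigned depth, const char *pointee,
                             char *out, size_t out_size) {
  if (strcmp(pointee, "void") == 0) {
    /* Models the new early return: void pointers of any depth share the
       generic pointer node (the name used here is an assumption). */
    snprintf(out, out_size, "any pointer");
    return;
  }
  /* Other pointee types keep a distinct node per depth and pointee. */
  snprintf(out, out_size, "p%u %s", depth, pointee);
}

/* Hypothetical self-check for the sketch. */
static void check_tag_names(void) {
  char name[64];
  pointer_tag_name(1, "int", name, sizeof name);
  assert(strcmp(name, "p1 int") == 0);
  pointer_tag_name(2, "void", name, sizeof name);
  assert(strcmp(name, "any pointer") == 0);
}
```

Before the patch, the second call in this model would have produced `p2 void`, matching the `P2VOID_TY` expectation that the test changes remove.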

Diff for: clang/test/CodeGen/tbaa-pointers.c

+3 -10
@@ -208,12 +208,9 @@ int void_ptrs(void **ptr) {
 // COMMON-LABEL: define i32 @void_ptrs(
 // COMMON-SAME: ptr noundef [[PTRA:%.+]])
 // COMMON: [[PTR_ADDR:%.+]] = alloca ptr, align 8
-// DISABLE-NEXT: store ptr [[PTRA]], ptr [[PTR_ADDR]], align 8, !tbaa [[ANYPTR]]
-// DISABLE-NEXT: [[L0:%.+]] = load ptr, ptr [[PTR_ADDR]], align 8, !tbaa [[ANYPTR]]
-// DISABLE-NEXT: [[L1:%.+]] = load ptr, ptr [[L0]], align 8, !tbaa [[ANYPTR]]
-// DEFAULT-NEXT: store ptr [[PTRA]], ptr [[PTR_ADDR]], align 8, !tbaa [[P2VOID:!.+]]
-// DEFAULT-NEXT: [[L0:%.+]] = load ptr, ptr [[PTR_ADDR]], align 8, !tbaa [[P2VOID]]
-// DEFAULT-NEXT: [[L1:%.+]] = load ptr, ptr [[L0]], align 8, !tbaa [[P1VOID:!.+]]
+// COMMON-NEXT: store ptr [[PTRA]], ptr [[PTR_ADDR]], align 8, !tbaa [[ANYPTR]]
+// COMMON-NEXT: [[L0:%.+]] = load ptr, ptr [[PTR_ADDR]], align 8, !tbaa [[ANYPTR]]
+// COMMON-NEXT: [[L1:%.+]] = load ptr, ptr [[L0]], align 8, !tbaa [[ANYPTR]]
 // COMMON-NEXT: [[BOOL:%.+]] = icmp ne ptr [[L1]], null
 // COMMON-NEXT: [[BOOL_EXT:%.+]] = zext i1 [[BOOL]] to i64
 // COMMON-NEXT: [[COND:%.+]] = select i1 [[BOOL]], i32 0, i32 1
@@ -254,7 +251,3 @@ int void_ptrs(void **ptr) {
 // COMMON: [[INT_TAG]] = !{[[INT_TY:!.+]], [[INT_TY]], i64 0}
 // COMMON: [[INT_TY]] = !{!"int", [[CHAR]], i64 0}
 // DEFAULT: [[ANYPTR]] = !{[[ANY_POINTER]], [[ANY_POINTER]], i64 0}
-// DEFAULT: [[P2VOID]] = !{[[P2VOID_TY:!.+]], [[P2VOID_TY]], i64 0}
-// DEFAULT: [[P2VOID_TY]] = !{!"p2 void", [[ANY_POINTER]], i64 0}
-// DEFAULT: [[P1VOID]] = !{[[P1VOID_TY:!.+]], [[P1VOID_TY]], i64 0}
-// DEFAULT: [[P1VOID_TY]] = !{!"p1 void", [[ANY_POINTER]], i64 0}

Diff for: clang/test/CodeGenOpenCL/amdgpu-enqueue-kernel.cl

+12 -13
@@ -651,7 +651,7 @@ kernel void test_target_features_kernel(global int *i) {
 //
 // GFX900: Function Attrs: convergent nounwind
 // GFX900-LABEL: define {{[^@]+}}@__test_block_invoke_3_kernel
-// GFX900-SAME: (<{ i32, i32, ptr, ptr addrspace(1), ptr addrspace(1), i64, i8 }> [[TMP0:%.*]], ptr addrspace(3) [[TMP1:%.*]]) #[[ATTR6]] !kernel_arg_addr_space [[META28:![0-9]+]] !kernel_arg_access_qual [[META29:![0-9]+]] !kernel_arg_type [[META30:![0-9]+]] !kernel_arg_base_type [[META30]] !kernel_arg_type_qual [[META31:![0-9]+]] {
+// GFX900-SAME: (<{ i32, i32, ptr, ptr addrspace(1), ptr addrspace(1), i64, i8 }> [[TMP0:%.*]], ptr addrspace(3) [[TMP1:%.*]]) #[[ATTR6]] !kernel_arg_addr_space [[META27:![0-9]+]] !kernel_arg_access_qual [[META28:![0-9]+]] !kernel_arg_type [[META29:![0-9]+]] !kernel_arg_base_type [[META29]] !kernel_arg_type_qual [[META30:![0-9]+]] {
 // GFX900-NEXT: entry:
 // GFX900-NEXT: [[TMP2:%.*]] = alloca <{ i32, i32, ptr, ptr addrspace(1), ptr addrspace(1), i64, i8 }>, align 8, addrspace(5)
 // GFX900-NEXT: store <{ i32, i32, ptr, ptr addrspace(1), ptr addrspace(1), i64, i8 }> [[TMP0]], ptr addrspace(5) [[TMP2]], align 8
@@ -688,7 +688,7 @@ kernel void test_target_features_kernel(global int *i) {
 //
 // GFX900: Function Attrs: convergent norecurse nounwind
 // GFX900-LABEL: define {{[^@]+}}@test_target_features_kernel
-// GFX900-SAME: (ptr addrspace(1) noundef align 4 [[I:%.*]]) #[[ATTR2]] !kernel_arg_addr_space [[META32:![0-9]+]] !kernel_arg_access_qual [[META23]] !kernel_arg_type [[META33:![0-9]+]] !kernel_arg_base_type [[META33]] !kernel_arg_type_qual [[META25]] {
+// GFX900-SAME: (ptr addrspace(1) noundef align 4 [[I:%.*]]) #[[ATTR2]] !kernel_arg_addr_space [[META31:![0-9]+]] !kernel_arg_access_qual [[META23]] !kernel_arg_type [[META32:![0-9]+]] !kernel_arg_base_type [[META32]] !kernel_arg_type_qual [[META25]] {
 // GFX900-NEXT: entry:
 // GFX900-NEXT: [[I_ADDR:%.*]] = alloca ptr addrspace(1), align 8, addrspace(5)
 // GFX900-NEXT: [[DEFAULT_QUEUE:%.*]] = alloca ptr addrspace(1), align 8, addrspace(5)
@@ -700,7 +700,7 @@ kernel void test_target_features_kernel(global int *i) {
 // GFX900-NEXT: [[FLAGS_ASCAST:%.*]] = addrspacecast ptr addrspace(5) [[FLAGS]] to ptr
 // GFX900-NEXT: [[NDRANGE_ASCAST:%.*]] = addrspacecast ptr addrspace(5) [[NDRANGE]] to ptr
 // GFX900-NEXT: [[TMP_ASCAST:%.*]] = addrspacecast ptr addrspace(5) [[TMP]] to ptr
-// GFX900-NEXT: store ptr addrspace(1) [[I]], ptr [[I_ADDR_ASCAST]], align 8, !tbaa [[TBAA34:![0-9]+]]
+// GFX900-NEXT: store ptr addrspace(1) [[I]], ptr [[I_ADDR_ASCAST]], align 8, !tbaa [[TBAA33:![0-9]+]]
 // GFX900-NEXT: call void @llvm.lifetime.start.p5(i64 8, ptr addrspace(5) [[DEFAULT_QUEUE]]) #[[ATTR8]]
 // GFX900-NEXT: call void @llvm.lifetime.start.p5(i64 4, ptr addrspace(5) [[FLAGS]]) #[[ATTR8]]
 // GFX900-NEXT: store i32 0, ptr [[FLAGS_ASCAST]], align 4, !tbaa [[TBAA17]]
@@ -803,16 +803,15 @@ kernel void test_target_features_kernel(global int *i) {
 // GFX900: [[META23]] = !{!"none"}
 // GFX900: [[META24]] = !{!"__block_literal"}
 // GFX900: [[META25]] = !{!""}
-// GFX900: [[TBAA26]] = !{[[META27:![0-9]+]], [[META27]], i64 0}
-// GFX900: [[META27]] = !{!"p1 void", [[META9]], i64 0}
-// GFX900: [[META28]] = !{i32 0, i32 3}
-// GFX900: [[META29]] = !{!"none", !"none"}
-// GFX900: [[META30]] = !{!"__block_literal", !"void*"}
-// GFX900: [[META31]] = !{!"", !""}
-// GFX900: [[META32]] = !{i32 1}
-// GFX900: [[META33]] = !{!"int*"}
-// GFX900: [[TBAA34]] = !{[[META35:![0-9]+]], [[META35]], i64 0}
-// GFX900: [[META35]] = !{!"p1 int", [[META9]], i64 0}
+// GFX900: [[TBAA26]] = !{[[META9]], [[META9]], i64 0}
+// GFX900: [[META27]] = !{i32 0, i32 3}
+// GFX900: [[META28]] = !{!"none", !"none"}
+// GFX900: [[META29]] = !{!"__block_literal", !"void*"}
+// GFX900: [[META30]] = !{!"", !""}
+// GFX900: [[META31]] = !{i32 1}
+// GFX900: [[META32]] = !{!"int*"}
+// GFX900: [[TBAA33]] = !{[[META34:![0-9]+]], [[META34]], i64 0}
+// GFX900: [[META34]] = !{!"p1 int", [[META9]], i64 0}
 //.
 //// NOTE: These prefixes are unused and the list is autogenerated. Do not add tests below this line:
 // CHECK: {{.*}}

Diff for: clang/unittests/CodeGen/TBAAMetadataTest.cpp

+3 -9
@@ -117,15 +117,9 @@ TEST(TBAAMetadataTest, BasicTypes) {
   ASSERT_TRUE(I);
 
   I = matchNext(I,
-    MInstruction(Instruction::Store,
-      MValType(PointerType::getUnqual(Compiler.Context)),
-      MMTuple(
-        MMTuple(
-          MMString("p1 void"),
-          AnyPtr,
-          MConstInt(0)),
-        MSameAs(0),
-        MConstInt(0))));
+    MInstruction(Instruction::Store,
+      MValType(PointerType::getUnqual(Compiler.Context)),
+      MMTuple(AnyPtr, MSameAs(0), MConstInt(0))));
   ASSERT_TRUE(I);
 
   I = matchNext(I,
