-
Notifications
You must be signed in to change notification settings - Fork 12.7k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[clang][Sema] Fix type of an statement expression ending with an atomic type #119711
[clang][Sema] Fix type of an statement expression ending with an atomic type #119711
Conversation
@llvm/pr-subscribers-clang Author: Alejandro Álvarez Ayllón (alejandro-alvarez-sonarsource) ChangesWhen a statement expression's last statement is an atomic variable, GCC and Clang disagree on the type of the expression. This can be made apparent using _Atomic int a = 0;
typeof(({a;})) x = "0";
Due to the type of the statement expression being deduced to be atomic, we end with three implicit casts inside the
In some situations, this can end on an assertion inside With this patch, we now have two implicit casts, since the type of the statement expression is deduced to be non-atomic:
This is consistent with the C standard (6.7.2.4, p4) > The properties associated with atomic types are meaningful only for expressions that are lvalues. But a statement expression is an rvalue.
Fixes #106576 Full diff: https://github.com/llvm/llvm-project/pull/119711.diff 2 Files Affected:
diff --git a/clang/lib/Sema/SemaExpr.cpp b/clang/lib/Sema/SemaExpr.cpp
index 20bf6f7f6f28ff..45ae97807c2032 100644
--- a/clang/lib/Sema/SemaExpr.cpp
+++ b/clang/lib/Sema/SemaExpr.cpp
@@ -15861,10 +15861,17 @@ ExprResult Sema::ActOnStmtExprResult(ExprResult ER) {
if (Cast && Cast->getCastKind() == CK_ARCConsumeObject)
return Cast->getSubExpr();
+ auto Ty = E->getType().getUnqualifiedType();
+
+ // If the type is an atomic, the statement type is the underlying type.
+ if (const AtomicType *AT = Ty->getAs<AtomicType>()) {
+ Ty = AT->getValueType().getUnqualifiedType();
+ return PerformImplicitConversion(E, Ty, AssignmentAction::Casting);
+ }
+
// FIXME: Provide a better location for the initialization.
return PerformCopyInitialization(
- InitializedEntity::InitializeStmtExprResult(
- E->getBeginLoc(), E->getType().getUnqualifiedType()),
+ InitializedEntity::InitializeStmtExprResult(E->getBeginLoc(), Ty),
SourceLocation(), E);
}
diff --git a/clang/test/Sema/gh106576.c b/clang/test/Sema/gh106576.c
new file mode 100644
index 00000000000000..792977dea14132
--- /dev/null
+++ b/clang/test/Sema/gh106576.c
@@ -0,0 +1,16 @@
+// RUN: %clang_cc1 -fsyntax-only -verify %s
+
+typedef _Atomic char atomic_char;
+
+typedef _Atomic char atomic_char;
+
+atomic_char counter;
+
+char load_plus_one(void) {
+ return ({counter;}) + 1; // no crash
+}
+
+char type_of_stmt_expr(void) {
+ typeof(({counter;})) y = ""; // expected-error-re {{incompatible pointer to integer conversion initializing 'typeof (({{{.*}}}))' (aka 'char') with an expression of type 'char[1]'}}
+ return y;
+}
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This will eventually need a release note.
Ping? |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This looks about right to me but I would like another set of eyes.
clang/lib/Sema/SemaExpr.cpp
Outdated
@@ -15919,10 +15919,17 @@ ExprResult Sema::ActOnStmtExprResult(ExprResult ER) { | |||
if (Cast && Cast->getCastKind() == CK_ARCConsumeObject) | |||
return Cast->getSubExpr(); | |||
|
|||
auto Ty = E->getType().getUnqualifiedType(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What I'm confused by is actually a few lines farther up. Why do we not want to do an lvalue-to-rvalue conversion when the result of the statement expression is an rvalue? That conversion would strip all the qualifiers, including the atomic ones, which seems like exactly the behavior we want.
That comment comes from 34376a68
CC @rjmccall in case you have some recollection here.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I believe it's because we're setting up for a PerformCopyInitialization below, so the l-value conversion would be redundant (or possibly slightly wrong for some C++ class types?). Certainly could be clearer in the comment.
As far as this patch goes, my longstanding thought is that _Atomic
is clearly a qualifier. I know the C standard doesn't agree, but the C standard is wrong, the same way that it's wrong when it says that function references aren't l-values; sometimes the formal system has an obvious truth that doesn't align with the glossary. In this case, the language treats _Atomic
as a qualifier for almost every purpose, and the compiler generally reflects that: most of the places that uniformly look through qualifiers also look through _Atomic
, and most of the places that need to look specifically for _Atomic
also need to look for certain qualifiers.
So my suggestion is that common functions like getUnqualifiedType()
should just be looking through _Atomic
, which would have fixed this bug before it happened. If there's a need for a narrower API that only looks through non-_Atomic
qualifiers (or perhaps some subset of "non-ABI-affecting" qualifiers?), we can add that with a nice name that suggests when it should be used.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
So my suggestion is that common functions like getUnqualifiedType() should just be looking through _Atomic, which would have fixed this bug before it happened.
While I understand the reasoning, I am wary of touching a method that is used (roughly counted) over 400 times. Unfortunately, I do not currently have the bandwith to commit into carefully combing all of them.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
As far as this patch goes, my longstanding thought is that _Atomic is clearly a qualifier.
Errrr, I don't agree. I think _Atomic
is a type specifier. _Atomic int
and int
are not the same type (heck, they can even have different sizes and alignments!). It's somewhat like a qualifier in that it impacts how you access the underlying object (so in that regard, it is a bit like volatile int
and int
), but I think the fundamental thing is that you rarely want int
semantics directly when the type is spelled _Atomic int
.
So my suggestion is that common functions like getUnqualifiedType() should just be looking through _Atomic, which would have fixed this bug before it happened. If there's a need for a narrower API that only looks through non-_Atomic qualifiers (or perhaps some subset of "non-ABI-affecting" qualifiers?), we can add that with a nice name that suggests when it should be used.
On the one hand, this specific bug has bitten us so many times I want to agree with your proposed approach.
On the other hand, that's going to be a massive undertaking that's pretty risky. We have times when we definitely do not want the qualified type to drop the atomic qualifier. For example, when merging function types, we call getUnqualifiedType()
to determine whether two parameters are compatible. void func(const int i);
and void func(int i);
are compatible redeclarations, while void func(_Atomic int i);
and void func(int i);
are not. There are other times when we do want the atomic qualifier dropped. It's a frustrating mess and I'm certain we have numerous bugs in this area.
So the question to me becomes: do we fix this narrow issue and kick the can down the road? Or do we try to find someone willing to undergo the more significant cleanup?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
As far as this patch goes, my longstanding thought is that _Atomic is clearly a qualifier.
Errrr, I don't agree. I think
_Atomic
is a type specifier._Atomic int
andint
are not the same type
const int
and int
are also not the same type.
(heck, they can even have different sizes and alignments!).
Size and alignment is the (back-justified) reason we draw the line where we currently do. It is not a particularly sensible way to do it. Most of our existing qualifiers affect code generation and ABI, just in ways that happen to not affect size and alignment. Several qualifiers affect the ABI rules for containing aggregates, for example.
It's somewhat like a qualifier in that it impacts how you access the underlying object (so in that regard, it is a bit like
volatile int
andint
), but I think the fundamental thing is that you rarely wantint
semantics directly when the type is spelled_Atomic int
.
I'm not quite sure what you're saying here. On the implementation level, the compiler can never ignore qualifiers when performing an access to an object; the whole point of qualifiers is that they affect the semantics of accesses. On the language design level, I agree with you that programmers should never perform normal accesses to _Atomic
objects instead of using the intrinsics. However, that is an argument that it was an unfortunate language design decision that _Atomic
behaves like a qualifier. In the language that we've actually got, _Atomic
has most of the characteristics of qualifiers, and there is no benefit to the compiler implementation in pretending otherwise.
The common characteristics of qualifiers:
- They convey an aspect of how a value is stored or accessed other than the abstract value itself.
- They are typically disallowed or silently dropped at the top level in type contexts that primarily describe an abstract value rather than storage, such as casts, return values, and parameters (when determining the function type).
- R-value expressions represent an abstract value and are (for the most part) unqualified at the top level. Using a qualified l-value in a context that requires an r-value drops all qualifiers as part of the conversion to an r-value. The qualifiers typically affect the code-generation of the load.
- Assignment consumes an abstract value and therefore expects an r-value on the RHS, which (per above) is of unqualified type. The qualifier typically affects code-generation of the store.
- The qualifier (if it can apply to structs and unions at all) propagates along member accesses, such that the type of
x.member
has all the top-level qualifiers of bothx
and the member.
Most of these apply to _Atomic
. The most notable exception is the last, which C makes UB. But it is not uncommon for qualifiers to have tweaks to various of these rules, like how C++ allows qualified return values of class type for some reason.
The subtyping rules for CV qualifiers — the ability to e.g. convert an int**
to const int * const *
— do not generally work for other qualifiers. For example, even just considering semi-standardized qualifiers, you clearly cannot add an address space in a nested position this way. I believe you have to understand this as a useful special case for the CV qualifiers. (It really should not work for restrict
, although I can't remember what rule we actually implement.)
So my suggestion is that common functions like getUnqualifiedType() should just be looking through _Atomic, which would have fixed this bug before it happened. If there's a need for a narrower API that only looks through non-_Atomic qualifiers (or perhaps some subset of "non-ABI-affecting" qualifiers?), we can add that with a nice name that suggests when it should be used.
On the one hand, this specific bug has bitten us so many times I want to agree with your proposed approach.
On the other hand, that's going to be a massive undertaking that's pretty risky.
Yeah, I don't disagree with this, and I'm not trying to force the problem to be solved immediately. I would like to try to get consensus on the right thing to do, though.
We have times when we definitely do not want the qualified type to drop the atomic qualifier. For example, when merging function types, we call
getUnqualifiedType()
to determine whether two parameters are compatible.void func(const int i);
andvoid func(int i);
are compatible redeclarations, whilevoid func(_Atomic int i);
andvoid func(int i);
are not. There are other times when we do want the atomic qualifier dropped. It's a frustrating mess and I'm certain we have numerous bugs in this area.
In this case, I have to wonder if the standard is in the same place Clang is — if this was really an intentional allowance for _Atomic T
as a parameter in function types or just an oversight. It's hard to imagine what an atomic argument could possibly mean. Perhaps it was an oversight that they no longer feel comfortable fixing.
So the question to me becomes: do we fix this narrow issue and kick the can down the road? Or do we try to find someone willing to undergo the more significant cleanup?
If we agree that the compiler probably ought to be defaulting to treating atomic types as qualifiers, I think the path is:
- Introduce a function that looks through
_Atomic
as well as the other qualifiers. - Make a best effort to switch existing call sites of
getUnqualifiedType()
that immediately also look through_Atomic
to instead use the common function. Probably ought to be the same patch as (1). - Rework this patch to just use the new function.
- Introduce a function that explicitly does not look through
_Atomic
. - Gradually go through the remaining
getUnqualifiedType()
uses and figure out which of the explicit functions they should use, adding tests as necessary. - When that's done, decide if we've got compelling evidence that
getUnqualifiedType()
should just default to looking through_Atomic
.
I think asking Alejandro to do steps 1-3 as part of this bug fix is reasonable, and then the rest can be an open project.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
As far as this patch goes, my longstanding thought is that _Atomic is clearly a qualifier.
Errrr, I don't agree. I think
_Atomic
is a type specifier._Atomic int
andint
are not the same type
const int
andint
are also not the same type.Maybe I'm being imprecise, sorry for that.
const int
andint
are both "morally" of typeint
in that the allowed accesses to either type will give the same code generation and behavior. There's no difference between read accesses toint
andconst int
: https://godbolt.org/z/GzhKWr8W6
That is mostly a special case of const
. Even volatile
requires code-gen differences, at least when generating a high-level IR subject to optimization.
(heck, they can even have different sizes and alignments!).
Size and alignment is the (back-justified) reason we draw the line where we currently do. It is not a particularly sensible way to do it. Most of our existing qualifiers affect code generation and ABI, just in ways that happen to not affect size and alignment. Several qualifiers affect the ABI rules for containing aggregates, for example.
Do you have an example of impacting the ABI rules in C?
If you mean qualifiers officially blessed in any way by WG14, I believe that's just const
, volatile
, restrict
, and the address-space qualifiers from TR 18037. Address space qualifiers do affect ABI, but they don't specifically affect struct layout because they are not allowed on struct members.
Loading an int to get its value is a different operation than loading an atomic int to get its value; the latter may require entirely different instruction selection. Probably
int
is a bad example, perhapsstruct
values makes the distinction more clear?
Needing different instruction selection is common for extended qualifiers. Even among the official qualifiers, address spaces typically require different instruction selection.
Another notable exception is that
_Atomic
is not silently dropped at the top level in type for return values and parameters when determining the function type, as mentioned above.
It actually is dropped for return types; it's just parameters that are different.
In this case, I have to wonder if the standard is in the same place Clang is — if this was really an intentional allowance for
_Atomic T
as a parameter in function types or just an oversight. It's hard to imagine what an atomic argument could possibly mean. Perhaps it was an oversight that they no longer feel comfortable fixing.Doesn't it matter in the same way it matters for
const
parameters: it impacts the accesses to the parameter within the function body itself?
Sure, but it is not in any way necessary for this to be part of the function type, just as it isn't part of the function type for const
. The parameter is an independent object and has no relation to anything in the caller, so making the atomic-ness part of the signature is pointless.
If we agree that the compiler probably ought to be defaulting to treating atomic types as qualifiers, I think the path is:
1. Introduce a function that looks through `_Atomic` as well as the other qualifiers. 2. Make a best effort to switch existing call sites of `getUnqualifiedType()` that immediately also look through `_Atomic` to instead use the common function. Probably ought to be the same patch as (1).
I think we've already done this though. We have
getAtomicUnqualifiedType()
which we use when we want to get a non-atomic, unqualified type.3. Rework this patch to just use the new function. 4. Introduce a function that explicitly does not look through `_Atomic`.
Which is
getUnqualifiedType()
currently.
Right, it would have the same behavior as getUnqualifiedType()
(and presumably would be implemented as a call to it). Making it a separate function lets you keep track of the fact that you considered a caller and deliberately did not want it to look through _Atomic
, though.
I think asking Alejandro to do steps 1-3 as part of this bug fix is reasonable, and then the rest can be an open project.
So I think what I'm getting is: we should audit all existing calls to
getUnqualifiedType()
andgetAtomicUnqualifiedType()
to determine whether we're correctly handling atomics or not, and if there are more uses ofgetAtomicUnqualifiedType()
after correcting issues, then maybe we wantgetUnqualifiedType()
to look through atomics and renamegetAtomicUnqualifiedType()
to something else?
Right, with the tweak that the audit always switches away from getUnqualifiedType()
: each call either becomes getAtomicUnqualifiedType()
(I forgot that this already exists) or getUnqualifiedTypePreservingAtomic()
, and then the audit is done when there are no remaining calls to getUnqualifiedType()
. At that point, we decide if having a getUnqualifiedType()
is a good idea or if we should just force all callers to explicitly choose.
And as mentioned, there's a short-term subset of the plan that unblocks this fix.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
And as mentioned, there's a short-term subset of the plan that unblocks this fix.
Thanks for the feedback and for the enlightening discussion. I will go in this direction, but it will take me a while, since I have quite a few things on the backlog.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Right, with the tweak that the audit always switches away from getUnqualifiedType(): each call either becomes getAtomicUnqualifiedType() (I forgot that this already exists) or getUnqualifiedTypePreservingAtomic(), and then the audit is done when there are no remaining calls to getUnqualifiedType(). At that point, we decide if having a getUnqualifiedType() is a good idea or if we should just force all callers to explicitly choose.
Okay, this is a plan I can get behind -- thank you for the good discussion!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
And as mentioned, there's a short-term subset of the plan that unblocks this fix.
Thanks for the feedback and for the enlightening discussion. I will go in this direction, but it will take me a while, since I have quite a few things on the backlog.
Before you get too far into anything else, could you try just changing the call to getUnqualifiedType()
to getAtomicUnqualifiedType()
in the original code and seeing if that fixes the problem? That should be fine for the local, short-term fix.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Before you get too far into anything else, could you try just changing the call to getUnqualifiedType() to getAtomicUnqualifiedType() in the original code and seeing if that fixes the problem? That should be fine for the local, short-term fix.
It fixes it, yes, thanks!
(Sorry for the force-push, I saw some seemingly unrelated tests were broken and I wanted to make sure it wasn't due to this, after un-merging main
all tests are green)
75b12bb
to
2cfef98
Compare
clang/test/Sema/gh106576.c
Outdated
|
||
typedef _Atomic char atomic_char; | ||
|
||
typedef _Atomic char atomic_char; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Presumably the redundant typedef is not a critical part of this test.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It isn't, I have removed the duplicated line.
Co-authored-by: John McCall <[email protected]>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks, LGTM.
Thanks! Do you mind doing the merge, since I can't? |
…ic type (llvm#119711) When a statement expression's last statement is an atomic variable, GCC and Clang disagree on the type of the expression. This can be made apparent using `typeof` and forcing a diagnostic message: ```cpp _Atomic int a = 0; typeof(({a;})) x = "0"; ``` * GCC complains about initializing `int` with `char*` * Clang complains about initializing `_Atomic(int)` with a `char[2]` Due to the type of the statement expression being deduced to be atomic, we end with three implicit casts inside the `StmtExpr` on the AST: * `LValueToRValue` -> `AtomicToNonAtomic` -> `NonAtomicToAtomic` In some situations, this can end on an assertion inside `IntExprEvaluator`, as reported in llvm#106576. With this patch, we now have two implicit casts, since the type of the statement expression is deduced to be non-atomic: * `LValueToRValue` -> `AtomicToNonAtomic` This is consistent with the C standard (6.7.2.4, p4) > The properties associated with atomic types are meaningful only for expressions that are lvalues. But a statement expression is an rvalue. `IntExprEvaluator` assumptions are now satisfied and there is no assertion error. Additionally, the `typeof` trick mentioned above shows that the type is consistently deduced between GCC and Clang. Fixes llvm#106576 --------- Co-authored-by: John McCall <[email protected]>
When a statement expression's last statement is an atomic variable, GCC and Clang disagree on the type of the expression. This can be made apparent using
typeof
and forcing a diagnostic message:int
withchar*
_Atomic(int)
with achar[2]
Due to the type of the statement expression being deduced to be atomic, we end with three implicit casts inside the
StmtExpr
on the AST:LValueToRValue
->AtomicToNonAtomic
->NonAtomicToAtomic
In some situations, this can end on an assertion inside
IntExprEvaluator
, as reported in #106576.With this patch, we now have two implicit casts, since the type of the statement expression is deduced to be non-atomic:
LValueToRValue
->AtomicToNonAtomic
This is consistent with the C standard (6.7.2.4, p4)
But a statement expression is an rvalue.
IntExprEvaluator
assumptions are now satisfied and there is no assertion error.Additionally, the
typeof
trick mentioned above shows that the type is consistently deduced between GCC and Clang.Fixes #106576