Skip to content

Grammar – Consistency & Cleanup 5 #1322

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 3 commits into from
May 14, 2025

Conversation

Nigel-Ecma
Copy link
Contributor

[Continues on from #1320 Grammar – Consistency & Cleanup 4]

This PR subsumes Treat stackalloc like array creation #1237 which should be closed if this one is accepted.

The targets this time are primary_no_array_creation_expression, primary_expression, element_access, null_conditional_element_access and pointer_element_access.

Existing Grammar

Don’t panic not all of this changes!

§12.8.1:

primary_expression
    : primary_no_array_creation_expression
    | array_creation_expression
    ;

primary_no_array_creation_expression
    : literal
    | ...
    | element_access
    | null_conditional_element_access
    | ...
    | pointer_element_access    // unsafe code support
    | stackalloc_expression
    ;

§12.8.12.1:

element_access
    : primary_no_array_creation_expression '[' argument_list ']'
    ;

§12.8.13:

null_conditional_element_access
    : primary_no_array_creation_expression '?' '[' argument_list ']'
      (null_forgiving_operator? dependent_access)*
    ;

§12.8.17.4:

array_creation_expression
    : 'new' non_array_type '[' expression_list ']' rank_specifier*
      array_initializer?
    | 'new' array_type array_initializer
    | 'new' rank_specifier array_initializer
    ;

§12.8.22:

stackalloc_expression
    : 'stackalloc' unmanaged_type '[' expression ']'
    | 'stackalloc' unmanaged_type? '[' constant_expression? ']' stackalloc_initializer
    ;

§23.6.4:

pointer_element_access
    : primary_no_array_creation_expression '[' expression ']'
    ;

Why does the primary_expression/primary_no_array_creation_expression dance exist?

The element_access, null_conditional_element_access & pointer_element_access alternatives of primary_no_array_creation_expression are the only alternatives which contain primary_no_array_creation_expression at the start of their definitions – i.e. they are mutually left recursive with it.

A number of the other alternatives (not shown above) of primary_no_array_creation_expression contain primary_expression at the start of their definitions – i.e. they form a longer mutually left recursive cycle back to primary_no_array_creation_expression via primary_expression.

Why was this done? The answer is givein in §12.8.1:

Primary expressions are divided between array_creation_expressions and primary_no_array_creation_expressions. Treating array_creation_expression in this way, rather than listing it along with the other simple expression forms, enables the grammar to disallow potentially confusing code such as

object o = new int[3][1];

which would otherwise be interpreted as

object o = (new int[3])[1];

Unfortunately this grammar arrangement to avoid the potentially confusing code is not correct

Looking at array_creation_expression you will see there are 3 alternatives but only the first one can end in ] and thus produce the potentially confusing code, and then only when there is not an optional array_initializer.

So the primary_expression/primary_no_array_creation_expression dance blocks the latter two, and the first when there is an array_initializer, none of which do not give rise to the potentially confusing code, oops.

In case you’re wondering why this hasn’t been spotted previously: probably the most used existing compiler already allows the cases when there is an array_initializer present, it only blocks the potentially confusing case.

Further down the rabbit hole…

Enter stackalloc_expression, the places where this could occur used to be quite restricted however it is now valid in more places and has become an alternative of primary_no_array_creation_expression. It can therefore give rise to the same potentially confusing situation, e.g. stackalloc int[3][1], and like array_creation_expression only via its first alternative.

To address this it has been proposed (#1237) to move it into array_creation_expression and thus join it in having too much excluded…

How to fix this?

If the current arrangement is kept it could be fixed by splitting the rules for array_creation_expression and stackalloc_expression into two, one for the potentially confusing case and one for the rest, and placing the former two into primary_expression and the latter two into primary_no_array_creation_expression… Confused? Even if you’re not this is plain messy!

Another alternative is to enforce the restriction on the potentially confusing combination using semantic rules in element_access, null_conditional_element_access & pointer_element_access, i.e. at the point where the issue can arise. And drop the primary_expression/primary_no_array_creation_expression dance.

Semantic rules are used in many other places in the Standard to augment grammar rules, so this solution is not new per se.

The PR

After that long winded preamble, this PR:

  • Merges the current primary_no_array_creation_expression into primary_expression. All existing references to the former are switched to the latter.

Note: This change does not remove MLR, that still exists, e.g. primary_expression -> post_increment_expression -> primary_expression, but it does simplify it by removing longer cycles.

  • Adds semantic restrictions to the descriptions of element_access, null_conditional_element_access & pointer_element_access to disallow only the potentially confusing code.
  • Updates the grammar checker framework to check these semantic restrictions and issue error messages if they are violated.
  • Adds a test sample to cover both valid & invalid combinations.

Easy… ;-)

Review Notes

To review the grammar & text changes to the Standard you only need to consider the changes to the *.md files in standard; the other files are machine-generated updates to parsing test samples describing the new expected parse for some samples.

If you wish to see visualisations the changes make to the parse you may download the changed *.svg files in the PR and the current versions of them in draft-v8 and view them side-by-side (multiple monitors helps!)

- Remove primary_array_no_creation and replace with semantic
  rules in element_access, null_conditional_element_access
  and pointer_member_access.
- Update sample reference files & tarball to v2.3.4
@Nigel-Ecma Nigel-Ecma added this to the C# 8.0 milestone May 6, 2025
@Nigel-Ecma Nigel-Ecma requested review from jskeet, BillWagner and gafter May 6, 2025 23:27
@Nigel-Ecma Nigel-Ecma self-assigned this May 6, 2025
Copy link
Member

@BillWagner BillWagner left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This LGTM.

@BillWagner BillWagner added the meeting: discuss This issue should be discussed at the next TC49-TG2 meeting label May 8, 2025
(null_forgiving_operator? dependent_access)*
;
```

The *argument_list* of an *null_conditional_element_access* is not allowed to contain `out` or `ref` arguments.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Shall not? May not? (May not in the paragraph below.)


#### 12.8.12.2 Array access

For an array access, the *primary_no_array_creation_expression* of the *element_access* shall be a value of an *array_type*. Furthermore, the *argument_list* of an array access is not allowed to contain named arguments. The number of expressions in the *argument_list* shall be the same as the rank of the *array_type*, and each expression shall be of type `int`, `uint`, `long`, or `ulong,` or shall be implicitly convertible to one or more of these types.
For an array access, the *primary_expression* of the *element_access* shall be a value of an *array_type*. Furthermore, the *argument_list* of an array access is not allowed to contain named arguments. The number of expressions in the *argument_list* shall be the same as the rank of the *array_type*, and each expression shall be of type `int`, `uint`, `long`, or `ulong,` or shall be implicitly convertible to one or more of these types.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

"not allowed" again

@BillWagner BillWagner merged commit 19166a6 into dotnet:draft-v8 May 14, 2025
7 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
meeting: discuss This issue should be discussed at the next TC49-TG2 meeting
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants