Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

bug: Comma operator in conditional_expression #187

Closed
2 tasks done
mingodad opened this issue Feb 8, 2024 · 6 comments · Fixed by #193
Closed
2 tasks done

bug: Comma operator in conditional_expression #187

mingodad opened this issue Feb 8, 2024 · 6 comments · Fixed by #193
Labels

Comments

@mingodad
Copy link

mingodad commented Feb 8, 2024

Did you check existing issues?

  • I have read all the tree-sitter docs if it relates to using the parser
  • I have searched the existing issues of tree-sitter-c

Tree-Sitter CLI Version, if relevant (output of tree-sitter --version)

tree-sitter 0.20.9 (df1fe842eb08232f4119deba7aae973b7782530d)

Describe the bug

While testing this parser with sqlite3.c preprocessed I found this problem and also this related issue #31 and following it's solution I changed the grammar like shown bellow and got it to parse (probably the same apply to several other places).

---------------------------------- grammar.js ----------------------------------
index 2a9fe20..2f64cc9 100644
@@ -930,7 +930,7 @@ module.exports = grammar({
     conditional_expression: $ => prec.right(PREC.CONDITIONAL, seq(
       field('condition', $._expression),
       '?',
-      optional(field('consequence', $._expression)),
+      optional(field('consequence', choice($._expression, $.comma_expression))),
       ':',
       field('alternative', $._expression),
     )),

Steps To Reproduce/Bad Parse Tree

tree-sitter parse test-cond.c

Expected Behavior/Parse Tree

(translation_unit [0, 0] - [1, 0]
(expression_statement [0, 0] - [0, 20]
(assignment_expression [0, 0] - [0, 19]
left: (identifier [0, 0] - [0, 1])
right: (conditional_expression [0, 4] - [0, 19]
condition: (identifier [0, 4] - [0, 7])
consequence: (comma_expression [0, 10] - [0, 15]
left: (assignment_expression [0, 10] - [0, 13]
left: (identifier [0, 10] - [0, 11])
right: (number_literal [0, 12] - [0, 13]))
right: (number_literal [0, 14] - [0, 15]))
alternative: (number_literal [0, 18] - [0, 19])))))

Repro

//From sqlite3
//nHeader += (u8)(((u32)(nPayload)<(u32)0x80)?(*(&pCell[nHeader])=(unsigned char)(nPayload)),1: sqlite3PutVarint((&pCell[nHeader]),(nPayload)));
a = val ? b=3,1 : 0;
@mingodad mingodad added the bug label Feb 8, 2024
@mingodad
Copy link
Author

mingodad commented Feb 8, 2024

Also to be able to parse preprocessed sqlite3.c we need this #183 (comment) and also something like shown bellow for when using gcc to preprocess:

---------------------------------- grammar.js ----------------------------------
index 2a9fe20..28c2e73 100644
@@ -743,6 +743,7 @@ module.exports = grammar({
       $.while_statement,
       $.for_statement,
       $.return_statement,
+      $.fallthrough_statement,
       $.break_statement,
       $.continue_statement,
       $.goto_statement,
@@ -852,6 +853,10 @@ module.exports = grammar({
       ';',
     ),
 
+    fallthrough_statement: _ => seq(
+      '__attribute__((fallthrough))', ';',
+    ),
+
     break_statement: _ => seq(
       'break', ';',
     ),

@mingodad
Copy link
Author

mingodad commented Feb 8, 2024

With all the above changes the remaining problem to parse preprocessed sqlite3.c is this:

extern int _IO_vfprintf (_IO_FILE *__restrict, const char *__restrict,
    __gnuc_va_list);

And it seems that is something related with __restrict because if we replace __restrict by restrict then it parses:

int vs(const char *__restrict, va_list);
[translation_unit](https://tree-sitter.github.io/tree-sitter/playground#) [0, 0] - [1, 0]
  [declaration](https://tree-sitter.github.io/tree-sitter/playground#) [0, 0] - [0, 40]
    type: [primitive_type](https://tree-sitter.github.io/tree-sitter/playground#) [0, 0] - [0, 3]
    declarator: [function_declarator](https://tree-sitter.github.io/tree-sitter/playground#) [0, 4] - [0, 39]
      declarator: [identifier](https://tree-sitter.github.io/tree-sitter/playground#) [0, 4] - [0, 6]
      parameters: [parameter_list](https://tree-sitter.github.io/tree-sitter/playground#) [0, 6] - [0, 39]
        [parameter_declaration](https://tree-sitter.github.io/tree-sitter/playground#) [0, 7] - [0, 38]
          [type_qualifier](https://tree-sitter.github.io/tree-sitter/playground#) [0, 7] - [0, 12]
          type: [primitive_type](https://tree-sitter.github.io/tree-sitter/playground#) [0, 13] - [0, 17]
          declarator: [pointer_declarator](https://tree-sitter.github.io/tree-sitter/playground#) [0, 18] - [0, 38]
            [ms_pointer_modifier](https://tree-sitter.github.io/tree-sitter/playground#) [0, 19] - [0, 29]
              [ms_restrict_modifier](https://tree-sitter.github.io/tree-sitter/playground#) [0, 19] - [0, 29]
            [ERROR](https://tree-sitter.github.io/tree-sitter/playground#) [0, 29] - [0, 30]
            declarator: [identifier](https://tree-sitter.github.io/tree-sitter/playground#) [0, 31] - [0, 38]
int vs(const char *restrict, va_list);
[translation_unit](https://tree-sitter.github.io/tree-sitter/playground#) [0, 0] - [1, 0]
  [declaration](https://tree-sitter.github.io/tree-sitter/playground#) [0, 0] - [0, 38]
    type: [primitive_type](https://tree-sitter.github.io/tree-sitter/playground#) [0, 0] - [0, 3]
    declarator: [function_declarator](https://tree-sitter.github.io/tree-sitter/playground#) [0, 4] - [0, 37]
      declarator: [identifier](https://tree-sitter.github.io/tree-sitter/playground#) [0, 4] - [0, 6]
      parameters: [parameter_list](https://tree-sitter.github.io/tree-sitter/playground#) [0, 6] - [0, 37]
        [parameter_declaration](https://tree-sitter.github.io/tree-sitter/playground#) [0, 7] - [0, 27]
          [type_qualifier](https://tree-sitter.github.io/tree-sitter/playground#) [0, 7] - [0, 12]
          type: [primitive_type](https://tree-sitter.github.io/tree-sitter/playground#) [0, 13] - [0, 17]
          declarator: [abstract_pointer_declarator](https://tree-sitter.github.io/tree-sitter/playground#) [0, 18] - [0, 27]
            [type_qualifier](https://tree-sitter.github.io/tree-sitter/playground#) [0, 19] - [0, 27]
        [parameter_declaration](https://tree-sitter.github.io/tree-sitter/playground#) [0, 29] - [0, 36]
          type: [type_identifier](https://tree-sitter.github.io/tree-sitter/playground#) [0, 29] - [0, 36]

Grammar:

...
    ms_restrict_modifier: _ => '__restrict',

    ms_unsigned_ptr_modifier: _ => '__uptr',

    ms_signed_ptr_modifier: _ => '__sptr',

    ms_unaligned_ptr_modifier: _ => choice('_unaligned', '__unaligned'),

    ms_pointer_modifier: $ => choice(
      $.ms_unaligned_ptr_modifier,
      $.ms_restrict_modifier,
      $.ms_unsigned_ptr_modifier,
      $.ms_signed_ptr_modifier,
    ),
...
    type_qualifier: _ => choice(
      'const',
      'constexpr',
      'volatile',
      'restrict',
      '__restrict__',
      '__extension__',
      '_Atomic',
      '_Noreturn',
      'noreturn',
    ),
...

@mingodad
Copy link
Author

mingodad commented Feb 8, 2024

Also the preproc_line is missing all of the above was preprocessed with -P to omit emit line info.
This parser doesn't recognize preproc_line

# 40 "/usr/lib/gcc/x86_64-linux-gnu/9/include/stdarg.h" 3 4
typedef __builtin_va_list __gnuc_va_list;
# 99 "/usr/lib/gcc/x86_64-linux-gnu/9/include/stdarg.h" 3 4
typedef __gnuc_va_list va_list;
# 346 "sqlite3.c" 2
# 495 "sqlite3.c"

@mingodad
Copy link
Author

I found the problem with __restrict, it's a missing repeat($.ms_pointer_modifier), in abstract_pointer_declarator see diff bellow.

---------------------------------- grammar.js ----------------------------------
index 2a9fe20..4b1a691 100644
@@ -441,6 +441,7 @@ module.exports = grammar({
       field('declarator', $._type_declarator),
     ))),
     abstract_pointer_declarator: $ => prec.dynamic(1, prec.right(seq('*',
+      repeat($.ms_pointer_modifier),
       repeat($.type_qualifier),
       field('declarator', optional($._abstract_declarator)),
     ))),

@mingodad
Copy link
Author

Also adding a rule for linemark allow it to parse preprocessed files with linemarks (see https://gcc.gnu.org/onlinedocs/cpp/Preprocessor-Output.html):

---------------------------------- grammar.js ----------------------------------
index 2a9fe20..5de719e 100644
@@ -40,6 +40,7 @@ module.exports = grammar({
   extras: $ => [
     /\s|\\\r?\n/,
     $.comment,
+    $.preproc_linemark,
   ],
 
   inline: $ => [
@@ -110,6 +111,32 @@ module.exports = grammar({
     ),
 
     // Preprocesser
+
+    preproc_linemark: $ =>
+       /# \d+ "(\\.|[^\\"\r\n\\])+"( [1234]){0,4}\r?\n/
+    ,
+
     preproc_include: $ => seq(
       preprocessor('include'),

@amaanq amaanq mentioned this issue Feb 16, 2024
@amaanq
Copy link
Member

amaanq commented Feb 16, 2024

I don't think we want to consider preprocessed code files, that's just asking to handle too much with compiler intrinsics, line markers, etc

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging a pull request may close this issue.

2 participants