Skip to content

OrderedChoice mathces only the first suitable rule, the next rules are ignored #133

@andr-dots

Description

@andr-dots

The OrderedChoice class can only match the first successful entry of the list of parsing expressions. But if the parent's next rule fails to match, the entire parent rule fails too.

The current realization of the OrderedChoice class is faster than any other stack-based, because it forces programmers to use And/Not operators and to rearrange their rules in such a way so that they don't need any kind of rule graph traversal. But some possible situations might still require traversing.

An example of PEG code that gives an error (arpeggio.NoMatch: Expected EOF at position (1, 2) => ' a*bc '.):

a_rule = 'a'
b_rule = 'b'
word = 'abc'
root_rule = (a_rule / b_rule / word) EOF

The input text 'abc' would always fail, though it's counterintuitive because, according to the documentation, it "succeeds if any of the given expressions matches at the current location".

As a solution within the current architecture, OrderedChoice can be refactored to return all the matches, so that the fallback mechanism could traverse through all these matches (and positions) and try each of them. But this solution makes the code much slower. The other solution is to create a new class for this purpose.

An another possible solution is to partially move the parsing logic from the parsing expression classes to the parser class and implement all the needed logic there by traversing all the possible branches of the parsing rules. The parser could return back to the first ordered choice with at least one unchecked choice left. This way the speed won't be affected too much.

Metadata

Metadata

Assignees

No one assigned

    Labels

    No labels
    No labels

    Type

    No type
    No fields configured for issues without a type.

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions