Skip to content

Latest commit

 

History

History
94 lines (78 loc) · 3.88 KB

README.md

File metadata and controls

94 lines (78 loc) · 3.88 KB

This parser is a standard LR parser based on Menhir.

Tokens, token families and their precedence is described in lexer_config.ml. Out of that we generate:

The file parse.ml ties the recursion knot: accumulate calls the parser itself on another file. The parser needs to be abstracted (via an OCaml functor) over a file resolver, this is also done here. Finally this file holds a cache of already parsed files to preserve the old semantics. It also loads signature files when loading .mod files, for backward compatibility with Teyjus. The module type Parser is defined in parse.mli, and is all that clients should use.

The error messages are maintained by a few targets in the root Makefile starting with menhir-, mainly menhir-complete-errormsgs and menhir-strip-errormsgs.

Unit tests:

While the grammar is not extensible token families provide an open ended set of mixfix symbols. The relative precedence of a mixfix is given by its family which is identified by its prefix.

When this parser encounters an mixfix declaration it outputs an error along these lines:

Mixfix directives are not supported by this parser.

The parser is based on token families.
A family is identified by some starting characters, for example
a token '+-->' belongs to the family of '+'. There is no need
to declare it.

All the tokens of a family are parsed with the same precedence and
associativity, for example 'x +--> y *--> z' is parsed as
'x +--> (y *--> z)' since the family of '*' has higher precedence
than the family of '+'.

Here the table of tokens and token families.
Token families are represented by the start symbols followed by '..'.
Tokens of families marked with [*] cannot end with the starting symbol,
eg `foo` is not an infix, while `foo is.
The listing is ordered by increasing precedence.

fixity                     | tokens / token families
-------------------------- + -----------------------------------
Infix   not   associative  | :-  ?-  
Infix   right associative  | ;  
Infix   right associative  | ==>  
Infix   right associative  | ,  &  
Infix   right associative  | ->  
Infix   right associative  | =>  
Infix   not   associative  | =  ==  =<  r<  i<  s<  r=<  i=<  s=< 
                             <..  r>  i>  s>  r>=  i>=  s>=  >.. 
                             is  
Infix   right associative  | ::  
Infix   not   associative  | '.. [*]  
Infix   left  associative  | ^..  r+  i+  s+  +..  -  r-  i-  s-  
Infix   left  associative  | r*  i*  s*  *..  /  div  mod  
Infix   right associative  | --..  
Infix   not   associative  | `.. [*]  
Infix   right associative  | ==..  
Infix   right associative  | ||..  
Infix   right associative  | &&..  
Infix   left  associative  | #..  
Prefix  not   associative  | r~  i~  ~..  
Postfix not   associative  | ?..  

If the token is a valid mixfix, and you want the file to stay compatible
with Teyjus, you can ask Elpi to skip the directive. Eg:

% elpi:skip 2  // skips the next two lines
infixr ==> 120.
infixr || 120.

As a debugging facility one can ask Elpi to print the AST in order to
verify how the text was parsed. Eg:

echo 'MyFormula = a || b ==> c && d' | elpi -parse-term

Tokens of families marked with [*] cannot end with the starting symbol, eg foo is not an infix, while `foo is. This is to enable `foo` or 'foo' to be compiled as quoted identifiers, see the API Quotation.declare_backtick and Quotation.declare_singlequote.