Complete syntax reference for descent parser specifications.
For character literal syntax, see characters.md.
|parser <name> ; Required: parser name
|type[<TypeName>] <KIND> ; Type declarations (zero or more)
|entry-point /<function> ; Required: where parsing begins
|keywords[<name>] ... ; Keyword blocks (zero or more)
|function[<name>] ... ; Function definitions (one or more)
Types determine what events are emitted for functions returning that type.
|type[<Name>] <KIND>
| KIND | On Entry | On Return |
|---|---|---|
| BRACKET | Emit NameStart |
Emit NameEnd |
| CONTENT | MARK position |
Emit Name with content |
| INTERNAL | Nothing | Nothing (internal use) |
Examples:
|type[Element] BRACKET ; ElementStart on entry, ElementEnd on return
|type[Text] CONTENT ; MARK on entry, emit Text with span on return
|type[Counter] INTERNAL ; No emit - for internal values only
|entry-point /<function>
Specifies which function begins parsing. The leading / is required.
|function[<name>] ; Void function (no auto-emit)
|function[<name>:<Type>] ; Returns/emits Type
|function[<name>:<Type>] :<param> ; With one parameter
|function[<name>:<Type>] :p1 :p2 ; Multiple parameters
Actions on the function line execute once on function entry:
|function[<name>] | <action> | <action>
Example:
|function[brace_comment] | depth = 1
Parameter types are inferred from usage:
| Usage | Inferred Type |
|---|---|
|c[:param] |
:byte (u8) |
PREPEND(:param) |
:bytes (&[u8]) |
| Arithmetic/conditions | :i32 |
|state[:<name>]
<cases...>
States contain cases that match input and execute actions.
Cases have the form: |<match> |<substate> | <actions> |<transition>
| Syntax | Matches |
|---|---|
c[x] |
Single character x |
c[abc] |
Any of a, b, or c |
c['\n'] |
Quoted character (newline) |
c[<0-9>] |
Character class (digits) |
c[:param] |
Byte parameter value |
LETTER |
ASCII letter (a-z, A-Z) |
DIGIT |
ASCII digit (0-9) |
HEX_DIGIT |
Hex digit (0-9, a-f, A-F) |
LABEL_CONT |
Letter, digit, _, or - |
default |
Fallback (any other byte) |
eof |
End of input |
if[<cond>] |
Conditional guard |
| (empty) | Bare action (unconditional) |
Optional label for debugging (appears in trace output):
|c[x] |.found | -> |>> :next ; .found is the substate label
Actions are pipe-separated and execute left-to-right:
| Action | Description |
|---|---|
-> |
Advance one byte |
->[<chars>] |
Advance TO first occurrence (SIMD scan) |
MARK |
Mark position for accumulation |
TERM |
Terminate slice (MARK to current) |
TERM(-N) |
Terminate excluding last N bytes |
/<func> |
Call function |
/<func>(<args>) |
Call with arguments |
/error |
Emit error event |
/error(<Code>) |
Emit error with custom code |
<var> = <value> |
Assignment |
<var> += <N> |
Increment |
PREPEND('<lit>') |
Prepend literal to accumulation |
PREPEND(:param) |
Prepend parameter bytes |
KEYWORDS(<name>) |
Lookup in keyword map |
<Type> |
Emit event with no payload |
<Type>('<lit>') |
Emit event with literal value |
<Type>(USE_MARK) |
Emit event with accumulated content |
->[<chars>] uses SIMD memchr and supports:
- 1-6 literal bytes only
- Quoted characters:
->['\n\t'] - NO character classes, NO parameter refs
| Syntax | Description |
|---|---|
|>> |
Self-loop (stay in current state) |
|>> :<state> |
Go to named state |
|return |
Return from function |
Single-line guards (no block structure):
|if[<condition>] | <actions> |<transition>
Followed by fallthrough case:
|if[COL <= :col] |return
| |>> :continue ; else branch
- Comparisons:
==,!=,<,<=,>,>= - Variables:
COL,LINE,PREV,:param, local vars - Parentheses allowed:
(COL == 1)
| Variable | Type | Description |
|---|---|---|
COL |
i32 | Current column (1-indexed) |
LINE |
i32 | Current line (1-indexed) |
PREV |
byte | Previous byte (0 at start) |
Perfect hash lookup for keyword matching:
|keywords[<name>] :fallback /<func>
| <keyword> => <EventType>
| <keyword> => <EventType>
Usage:
|default | TERM | KEYWORDS(<name>) |return
Example:
|keywords[bare] :fallback /identifier
| true => BoolTrue
| false => BoolFalse
| null => Nil
Semicolon starts a comment (rest of line ignored):
|parser test ; this is a comment
See characters.md for complete specification.
| Syntax | Description |
|---|---|
'x' |
Single character |
'\n' |
Escape sequence |
'\xHH' |
Hex byte |
<abc> |
Character class (a, b, c) |
<0-9> |
Predefined range (digits) |
<LETTER> |
Predefined class |
<P> |
DSL-reserved char (|) |
| Name | Characters |
|---|---|
LETTER |
a-z, A-Z |
DIGIT |
0-9 |
HEX_DIGIT |
0-9, a-f, A-F |
LABEL_CONT |
LETTER + DIGIT + _ + - |
WS |
Space + tab |
NL |
Newline |
| Name | Char | Name | Char |
|---|---|---|---|
P |
| |
SQ |
' |
L |
[ |
DQ |
" |
R |
] |
BS |
\ |
LB |
{ |
LP |
( |
RB |
} |
RP |
) |
|parser json_value
|type[StringValue] CONTENT
|type[Object] BRACKET
|entry-point /value
|keywords[kw] :fallback /identifier
| true => BoolTrue
| false => BoolFalse
| null => Nil
|function[value]
|state[:dispatch]
|c['"'] | -> | /string_value |return
|c['{'] | -> | /object |return
|LETTER | /bare_keyword |return
|default | /error |return
|function[string_value:StringValue]
|state[:main]
|c['"'] | -> |return ; Close quote
|c['\\'] | -> | -> |>> ; Escape: skip 2
|default | -> |>> ; Collect
|function[object:Object]
|state[:main]
|c['}'] | -> |return ; Close brace
|c['"'] | -> | /string_value |>> :after
|WS | -> |>>
|default | /error |return
|state[:after]
|c[':'] | -> | /value |>> :comma
|WS | -> |>>
|default | /error |return
|state[:comma]
|c[','] | -> |>> :main
|c['}'] | -> |return
|WS | -> |>>
|default | /error |return
|function[bare_keyword]
|state[:main]
|LETTER | -> |>>
|default | TERM | KEYWORDS(kw) |return