
Parser generator

Not only were other parser generators for the web not written here, they also lack a set of features we really need:

Type-safety: API of generated parser should be typed without any
AST from grammar: converting untyped trees to AST is unsafe and boring
CST: a pretty-printer has to keep comments /**/, underscores in numbers 1_234, and other details that are not represented anywhere in the AST.
Named lexemes: good error messages shouldn't report an identifier as "a-z, A-Z, 0-9, or _".
^TBD Error recovery: programming languages should report more than one error at a time.
^TBD Incremental: reparse shouldn't take time proportional to the size of the file.
High-order rules A<B>: duplicated code increases the chance of making a mistake, and high-order rules are required to avoid duplication.
^TBD No stack overflow on large expressions: nested constructions might lead to stack overflow.
Space skipping: manually annotating grammar with spaces is error-prone and boring.

Comparison to peggy

pgen mostly follows the grammar syntax of peggy, with a few notable differences.

Capitalized rules Foo = ... create AST nodes with { $: 'Foo' }.
Rules have to end with semicolon ;.
Inline semantic actions { return 42; } are not supported. We can't infer types of AST when there is some inlined JavaScript code, because JS is untyped.
High-order rules A<B> = ... were added.
Space skipping was added. It uses the space rule.
Lexification operator # was added.
Character classes do not support modifiers [a-z]i.
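
To illustrate why the { $: 'Foo' } tag helps with type-safety, here is a minimal TypeScript sketch of the kind of node types that capitalized rules could map to. It is not pgen's generated output: the names Loc, Num, Add, Expr and evaluate are hypothetical, and the shape of Loc is an assumption. Only the $ discriminant and the loc field come from this README.

```typescript
// Hypothetical location type; the real Loc produced by pgen may differ.
type Loc = { readonly start: number; readonly end: number };

// Shapes that rules such as `Num = ...;` and `Add = ...;` could produce.
// The field names besides `$` and `loc` are made up for this example.
type Num = { readonly $: "Num"; readonly value: string; readonly loc: Loc };
type Add = {
    readonly $: "Add";
    readonly left: Expr;
    readonly right: Expr;
    readonly loc: Loc;
};
type Expr = Num | Add;

// Switching on `$` narrows the union, so no casts or `any` are needed,
// and the compiler can check that every node kind is handled.
function evaluate(node: Expr): number {
    switch (node.$) {
        case "Num":
            // Underscores such as 1_234 are stripped before conversion.
            return Number(node.value.replace(/_/g, ""));
        case "Add":
            return evaluate(node.left) + evaluate(node.right);
    }
}

const loc: Loc = { start: 0, end: 0 };
const tree: Expr = {
    $: "Add",
    left: { $: "Num", value: "1_234", loc },
    right: { $: "Num", value: "6", loc },
    loc,
};
```

Inline JavaScript actions would make the type of each node depend on arbitrary code, which is why a tagged shape derived from the grammar alone is easier to type.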

Non-AST rule definition rule = ...;
AST rule definition Rule = .... Returns an object { $: 'Rule', loc: Loc }, with the rest of the fields defined by named clauses on the right-hand side.
Display override for error messaging Id "identifier" = ...;
High-order rule definition inter<A, B> = ...; and call inter<expression, ",">
Left-biased choice "A" / "B". Will match the first matching clause.
Sequence foo bar baz. All clauses should match in sequence.
Named clauses "if" "(" expr:expression ")" stmts:statements. Sequence operator generates an object, and named clauses become its fields { expr: ..., stmts: ... }.
Picked clause "if" "(" @expression ")". Sequence operator returns only a single value of picked clause.
Single clause sequence a = b. Works as a = @b.
Negative lookahead !x. Fails if x matches. Doesn't consume input.
Positive lookahead &x. Passes if x matches. Doesn't consume input.
Stringification $x. Ignores AST computed by x, returns string that x matched.
Lexification #x. Does not skip spaces inside of x. If x calls some other rules, doesn't skip spaces there either.
Repeat x*.
Repeat at least once x+.
Optional x?.
String "abc".
Character class [a-z_]. Supports ranges a-z. Supports negation [^a-z].
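
The operators listed above behave like their usual PEG counterparts. The sketch below models a few of them (string, character class, sequence, left-biased choice, negative lookahead, repeat, and stringification) as tiny hand-written TypeScript combinators. This is an illustration of the semantics only, not the code pgen generates, and every name in it (Parser, str, cls, seq, choice, not, many, stringify) is hypothetical.

```typescript
// A parser takes the input and a position and returns a value together with
// the new position, or null on failure. All names here are hypothetical.
type Result<T> = { value: T; pos: number } | null;
type Parser<T> = (input: string, pos: number) => Result<T>;

// String "abc": matches the literal text.
const str = (s: string): Parser<string> => (input, pos) =>
    input.startsWith(s, pos) ? { value: s, pos: pos + s.length } : null;

// Character class such as [a-z_], given here as a regular expression.
const cls = (re: RegExp): Parser<string> => (input, pos) =>
    pos < input.length && re.test(input.charAt(pos))
        ? { value: input.charAt(pos), pos: pos + 1 }
        : null;

// Sequence a b: both clauses have to match, one after the other.
const seq = <A, B>(a: Parser<A>, b: Parser<B>): Parser<[A, B]> => (input, pos) => {
    const ra = a(input, pos);
    if (ra === null) return null;
    const rb = b(input, ra.pos);
    return rb === null ? null : { value: [ra.value, rb.value], pos: rb.pos };
};

// Left-biased choice a / b: try a, and only if it fails try b.
const choice = <T>(a: Parser<T>, b: Parser<T>): Parser<T> => (input, pos) =>
    a(input, pos) ?? b(input, pos);

// Negative lookahead !x: succeeds only if x fails, and never consumes input.
const not = <T>(x: Parser<T>): Parser<undefined> => (input, pos) =>
    x(input, pos) === null ? { value: undefined, pos } : null;

// Repeat x*: collect matches until x fails or stops making progress.
const many = <T>(x: Parser<T>): Parser<T[]> => (input, pos) => {
    const values: T[] = [];
    for (;;) {
        const r = x(input, pos);
        if (r === null || r.pos === pos) break;
        values.push(r.value);
        pos = r.pos;
    }
    return { value: values, pos };
};

// Stringification $x: ignore the value of x and return the matched text.
const stringify = <T>(x: Parser<T>): Parser<string> => (input, pos) => {
    const r = x(input, pos);
    return r === null ? null : { value: input.slice(pos, r.pos), pos: r.pos };
};

// Roughly `identifier = $([a-z_] [a-z_]*);` in grammar notation.
const idChar = cls(/[a-z_]/);
const identifier = stringify(seq(idChar, many(idChar)));
```

With these definitions, choice(str("a"), str("ab")) applied to "ab" stops after "a": the first matching clause wins even when a later clause would match more input, which is what left-biased means.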

yarn build

To generate AST parser:

./bin/pgen grammar.gg grammar.ts

To generate CST parser:

./bin/pgen grammar.gg grammar.ts --cst