Merge pull request #456 from hildjj/library-mode
Add library mode
hildjj authored Jan 27, 2024
2 parents cbea181 + b3566fb commit 2e3cfd4
Showing 44 changed files with 3,575 additions and 4,711 deletions.
3 changes: 2 additions & 1 deletion .eslintrc.js
Original file line number Diff line number Diff line change
@@ -6,10 +6,11 @@ module.exports = {
ignorePatterns: [
"docs/",
"lib/parser.js", // Generated
"lib/compiler/passes/js-imports.js", // Generated
"examples/*.js", // Testing examples
"test/vendor/",
"test/cli/fixtures/bad.js", // Intentionally-invalid
"test/cli/fixtures/imports_peggy.js", // Generated
"test/cli/fixtures/lib.js", // Generated
"benchmark/vendor/",
"browser/",
"node_modules/",
8 changes: 8 additions & 0 deletions .ncurc.cjs
@@ -0,0 +1,8 @@
"use strict";

module.exports = {
"reject": [
"chai", // Moved to es6
"@types/chai", // Should match chai
],
};
6 changes: 6 additions & 0 deletions CHANGELOG.md
@@ -21,6 +21,12 @@ Released: TBD
names of the form `npm:<package-name>/<filename>` to load library rules from
an NPM package that is installed relative to the previous non-npm file name,
or to the current working directory if this is the first file name.
- [#456](https://github.com/peggyjs/peggy/pull/456) BREAKING: Allow imports
from external compiled grammars inside a source grammar, using `import
{rule} from "external.js"`. Note that this syntax will generate either
`import` or `require` in the JavaScript output, depending on the value of
the `format` parameter. This will need explicit support from
plugins, with a few new AST node types and a few visitor changes.

### Minor Changes

7 changes: 7 additions & 0 deletions bin/peggy-cli.js
@@ -184,6 +184,7 @@ class PeggyCLI extends Command {
.choices(MODULE_FORMATS)
.default("commonjs")
)
.addOption(new Option("--library", "Run tests in library mode. Maintainers only, for now.").hideHelp())
.option("-o, --output <file>", "Output file for generated parser. Use '-' for stdout (the default is a file next to the input file with the extension changed to '.js', unless a test is specified, in which case no parser is output without this option)")
.option(
"--plugin <module>",
@@ -218,6 +219,11 @@ class PeggyCLI extends Command {
this.inputFiles = inputFiles;
this.argv = opts;

if (this.argv.library) {
this.peg$library = true;
delete this.argv.library;
}

if ((typeof this.argv.startRule === "string")
&& !this.argv.allowedStartRules.includes(this.argv.startRule)) {
this.argv.allowedStartRules.push(this.argv.startRule);
@@ -620,6 +626,7 @@ class PeggyCLI extends Command {

const opts = {
grammarSource: this.testGrammarSource,
peg$library: this.peg$library,
};
if (typeof this.progOptions.startRule === "string") {
opts.startRule = this.progOptions.startRule;
67 changes: 61 additions & 6 deletions docs/documentation.html
@@ -35,6 +35,7 @@ <h2 id="table-of-contents">Table of Contents</h2>
<li>
<a href="#grammar-syntax-and-semantics">Grammar Syntax and Semantics</a>
<ul>
<li><a href="#importing-external-rules">Importing External Rules</a></li>
<li><a href="#grammar-syntax-and-semantics-parsing-expression-types">Parsing Expression Types</a></li>
<li><a href="#action-execution-environment">Action Execution Environment</a></li>
<li><a href="#parsing-lists">Parsing Lists</a></li>
@@ -131,12 +132,9 @@ <h3 id="generating-a-parser-command-line">Command Line</h3>
<code>import</code> statements in the top-level initializers from each of the
inputs will be moved to the top of the generated code in reverse order of the
inputs, and all other top-level initializers will be inserted directly after
those imports, also in reverse order of the inputs. Note that because the
input JavaScript is parsed using a grammar very close to the ECMAscript
grammar, if import statements are found, all comments before the first
non-import statement will move with the import statement, which may be a
little surprising. This approach can be used to keep libraries of often-used
grammar rules in separate files.</p>
those imports, also in reverse order of the inputs. This approach can be used
to keep libraries of often-used grammar rules in
<a href="#importing-external-rules">separate files</a>.</p>

<p>By default, the generated parser is in the commonjs module format. You can
override this using the <code>--format</code> option.</p>
@@ -691,6 +689,63 @@ <h2 id="grammar-syntax-and-semantics">Grammar Syntax and Semantics</h2>
of strings containing digits, as its parameter. It joins the digits together to
form a number and converts it to a JavaScript <code>number</code> object.</p>

<h3 id="importing-external-rules">Importing External Rules</h3>

<p>Sometimes, you want to split a large grammar into multiple files for ease
of editing, reuse in multiple higher-level grammars, etc. There are two ways
to accomplish this in Peggy:</p>

<ol>
<li>
<p>From the <a href="#generating-a-parser-command-line">Command Line</a>,
include multiple source files. This will generate the least total amount
of code, since the combined output will only have the runtime overhead
included once. The resulting code will be slightly more performant, as
there will be no overhead to call between the rules defined in different
files at runtime. Finally, Peggy will be able to perform better checks
and optimizations across the combined grammar with this approach, since the
combination is applied before any other rules. For example:</p>
<p><code>csv.peggy</code>:</p>
<pre><code class="language-peggy">a = number|1.., "," WS|
WS = [ \t]*</code></pre>
<p><code>number.peggy</code>:</p>
<pre><code class="language-peggy">number = n:$[0-9]+ { return parseInt(n, 10); }</code></pre>
<p>Generate:</p>
<pre><code class="language-console">$ peggy csv.peggy number.peggy</code></pre>
</li>

<li>
<p>The downside of the CLI approach is that editor tooling will not be
able to detect rules that come from another file; references to such
rules will be shown with errors like <code>Rule "number" is not
defined</code>. Furthermore, you must rely on getting the CLI or API call
correct, which is not possible in all workflows.</p>
<p>The second approach is to use ES6-style <code>import</code> statements
at the top of your grammar to import rules into the local rule namespace.
For example:</p>
<p><code>csv_imp.peggy</code>:</p>
<pre><code class="language-peggy">import {number} from "./number.js"
a = number|1.., "," WS|
WS = [ \t]*</code></pre>
<p>Note that the file imported from is the compiled version of the
grammar, NOT the source. Grammars MUST be compiled by a version that
supports imports in order to be imported. Only rules that are allowed
start rules of the imported grammar can be imported. It can be useful to
specify <code>--allowed-start-rules \*</code> when compiling library
grammars. All of the following are valid:</p>
<ul>
<li><code>import * as num from "number.js" // Call with num.number</code></li>
<li><code>import num from "number.js" // Calls the default rule</code></li>
<li><code>import {number, float} from "number.js" // Import multiple rules by name</code></li>
<li><code>import {number as NUM} from "number.js" // Rename the local rule to NUM to avoid colliding</code></li>
<li><code>import {"number" as NUM} from "number.js" // Valid in ES6</code></li>
<li><code>import integer, {float} from "number.js" // The default rule and some named rules</code></li>
<li><code>import "number.js" // Just the top-level initializer side-effects</code></li>
<li><code>import {} from "number.js" // Just the top-level initializer side-effects</code></li>
</ul>
</li>
</ol>

<h3 id="grammar-syntax-and-semantics-parsing-expression-types">Parsing Expression Types</h3>

<p>There are several types of parsing expressions, some of them containing
4 changes: 2 additions & 2 deletions docs/js/benchmark-bundle.min.js

Large diffs are not rendered by default.

22 changes: 16 additions & 6 deletions docs/js/examples.js
@@ -235,16 +235,16 @@ function peg$parse(input, options) {
var peg$f26 = function() { return location(); };
var peg$f27 = function(match, rest) { return {match, rest}; };
var peg$f28 = function(match, rest) { return {match, rest}; };
var peg$currPos = 0;
var peg$savedPos = 0;
var peg$currPos = options.peg$currPos | 0;
var peg$savedPos = peg$currPos;
var peg$posDetailsCache = [{ line: 1, column: 1 }];
var peg$maxFailPos = 0;
var peg$maxFailExpected = [];
var peg$silentFails = 0;
var peg$maxFailPos = peg$currPos;
var peg$maxFailExpected = options.peg$maxFailExpected || [];
var peg$silentFails = options.peg$silentFails | 0;

var peg$result;

if ("startRule" in options) {
if (options.startRule) {
if (!(options.startRule in peg$startRuleFunctions)) {
throw new Error("Can't start parsing from rule \"" + options.startRule + "\".");
}
@@ -1496,6 +1496,15 @@ function peg$parse(input, options) {

peg$result = peg$startRuleFunction();

if (options.peg$library) {
return /** @type {any} */ ({
peg$result,
peg$currPos,
peg$FAILED,
peg$maxFailExpected,
peg$maxFailPos
});
}
if (peg$result !== peg$FAILED && peg$currPos === input.length) {
return peg$result;
} else {
@@ -1514,6 +1523,7 @@ function peg$parse(input, options) {
}

root.peggyExamples = {
StartRules: ["literal", "literal_i", "any", "class", "not_class_i", "rule", "child", "paren", "paren_pluck", "star", "plus", "repetition", "maybe", "posAssertion", "negAssertion", "posPredicate", "negPredicate", "dollar", "label", "pluck_1", "pluck_2", "sequence", "action", "alt", "rest"],
SyntaxError: peg$SyntaxError,
parse: peg$parse
};
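
The library-mode return path in the hunks above can be illustrated with a toy stand-in. This is only a sketch: `parseDigits` is a hypothetical hand-written function, not Peggy-generated code, and it mimics just the two options visible in the diff (`peg$currPos` as a start offset and `peg$library` to get a result object instead of a throw on partial input).

```javascript
// Stand-in for the generated parser's failure sentinel.
const peg$FAILED = {};

// Hypothetical toy parser: matches one or more digits starting at
// options.peg$currPos (default 0), mirroring the option names in the diff.
function parseDigits(input, options = {}) {
  let peg$currPos = options.peg$currPos | 0;
  const start = peg$currPos;
  while (peg$currPos < input.length && /[0-9]/.test(input[peg$currPos])) {
    peg$currPos++;
  }
  const peg$result = peg$currPos > start
    ? parseInt(input.slice(start, peg$currPos), 10)
    : peg$FAILED;

  if (options.peg$library) {
    // Library mode: report what happened and let the caller decide.
    return { peg$result, peg$currPos, peg$FAILED };
  }
  if (peg$result !== peg$FAILED && peg$currPos === input.length) {
    return peg$result;
  }
  throw new Error("Parse failed at offset " + peg$currPos);
}

// Start in the middle of the input and stop before the trailing ";".
const partial = parseDigits("x=42;", { peg$library: true, peg$currPos: 2 });
console.log(partial.peg$result, partial.peg$currPos); // 42 4
```

In normal mode the same toy function returns `42` for `"42"` and throws for `"42;"`, which is the difference the `if (options.peg$library)` branch introduces.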
4 changes: 2 additions & 2 deletions docs/js/test-bundle.min.js

Large diffs are not rendered by default.

4 changes: 2 additions & 2 deletions docs/vendor/peggy/peggy.min.js

Large diffs are not rendered by default.

5 changes: 5 additions & 0 deletions lib/compiler/asts.js
@@ -110,6 +110,11 @@ const asts = {
return rule ? consumes(rule) : undefined;
},

library_ref() {
// No way to know for external rules.
return false;
},

literal(node) {
return node.value !== "";
},
8 changes: 8 additions & 0 deletions lib/compiler/index.js
@@ -1,10 +1,13 @@
"use strict";

const addImportedRules = require("./passes/add-imported-rules");
const fixLibraryNumbers = require("./passes/fix-library-numbers");
const generateBytecode = require("./passes/generate-bytecode");
const generateJS = require("./passes/generate-js");
const inferenceMatchResult = require("./passes/inference-match-result");
const removeProxyRules = require("./passes/remove-proxy-rules");
const mergeCharacterClasses = require("./passes/merge-character-classes");
const reportDuplicateImports = require("./passes/report-duplicate-imports");
const reportDuplicateLabels = require("./passes/report-duplicate-labels");
const reportDuplicateRules = require("./passes/report-duplicate-rules");
const reportInfiniteRecursion = require("./passes/report-infinite-recursion");
@@ -49,15 +52,20 @@ const compiler = {
// or modify it as needed. If the pass encounters a semantic error, it throws
// |peg.GrammarError|.
passes: {
prepare: [
addImportedRules,
],
check: [
reportUndefinedRules,
reportDuplicateRules,
reportDuplicateLabels,
reportInfiniteRecursion,
reportInfiniteRepetition,
reportIncorrectPlucking,
reportDuplicateImports,
],
transform: [
fixLibraryNumbers,
removeProxyRules,
mergeCharacterClasses,
inferenceMatchResult,
77 changes: 77 additions & 0 deletions lib/compiler/intern.js
@@ -0,0 +1,77 @@
// @ts-check
"use strict";

/**
* Intern strings or objects, so there is only one copy of each, by value.
* Objects may need to be converted to another representation before storing.
* Each input corresponds to a number, starting with 0.
*
* @template [T=string],[V=T]
*/
class Intern {
/**
* @typedef {object} InternOptions
* @property {(input: V) => string} [stringify=String] Represent the
* converted input as a string, for value comparison.
* @property {(input: T) => V} [convert=(x) => x] Convert the input to its
* stored form. Required if type V is not the same as type T. Return
* falsy value to have this input not be added; add() will return -1 in
* this case.
*/

/**
* @param {InternOptions} [options]
*/
constructor(options) {
/** @type {Required<InternOptions>} */
this.options = {
stringify: String,
convert: x => /** @type {V} */ (/** @type {unknown} */ (x)),
...options,
};
/** @type {V[]} */
this.items = [];
/** @type {Record<string, number>} */
this.offsets = {};
}

/**
* Intern an item, getting its associated number. Returns -1 for falsy
* inputs. O(1) with constants tied to the convert and stringify options.
*
* @param {T} input
* @return {number}
*/
add(input) {
const c = this.options.convert(input);
if (!c) {
return -1;
}
const s = this.options.stringify(c);
let num = this.offsets[s];
if (num === undefined) {
num = this.items.push(c) - 1;
this.offsets[s] = num;
}
return num;
}

/**
* @param {number} i
* @returns {V}
*/
get(i) {
return this.items[i];
}

/**
* @template U
* @param {(value: V, index: number, array: V[]) => U} fn
* @returns {U[]}
*/
map(fn) {
return this.items.map(fn);
}
}

module.exports = Intern;
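
A short usage sketch for the `Intern` class above. The class body is a condensed copy of the code in this diff with the JSDoc types dropped; the sample strings and the choice of `JSON.stringify` as the `stringify` option are assumptions made for illustration, not something the commit prescribes.

```javascript
// Condensed copy of Intern from lib/compiler/intern.js (types omitted).
class Intern {
  constructor(options) {
    this.options = {
      stringify: String,
      convert: x => x,
      ...options,
    };
    this.items = [];
    this.offsets = {};
  }

  add(input) {
    const c = this.options.convert(input);
    if (!c) {
      return -1;
    }
    const s = this.options.stringify(c);
    let num = this.offsets[s];
    if (num === undefined) {
      num = this.items.push(c) - 1;
      this.offsets[s] = num;
    }
    return num;
  }

  get(i) {
    return this.items[i];
  }
}

// Plain strings: equal values share one slot.
const names = new Intern();
const first = names.add("./number.js");  // 0
const second = names.add("./string.js"); // 1
const again = names.add("./number.js");  // 0, already interned
const skipped = names.add("");           // -1, falsy input is not stored

// Objects: compared by their JSON text (illustrative option choice).
const locs = new Intern({ stringify: JSON.stringify });
const a = locs.add({ line: 1, column: 1 });
const b = locs.add({ line: 1, column: 1 });

console.log(first, second, again, skipped, a === b, names.get(1));
// 0 1 0 -1 true ./string.js
```

Because comparison goes through `stringify`, two distinct object instances with the same serialized form map to the same index.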
2 changes: 2 additions & 0 deletions lib/compiler/opcodes.js
@@ -53,6 +53,7 @@ const opcodes = {
// Rules

RULE: 27, // RULE r
LIBRARY_RULE: 41, // LIBRARY_RULE moduleIndex, whatIndex

// Failure Reporting

@@ -76,6 +77,7 @@ const opcodes = {
SOURCE_MAP_POP: 38, // SOURCE_MAP_POP
SOURCE_MAP_LABEL_PUSH: 39, // SOURCE_MAP_LABEL_PUSH sp, literal-index, loc-index
SOURCE_MAP_LABEL_POP: 40, // SOURCE_MAP_LABEL_POP sp
// LIBRARY_RULE: 41,
};

module.exports = opcodes;
52 changes: 52 additions & 0 deletions lib/compiler/passes/add-imported-rules.js
@@ -0,0 +1,52 @@
// @ts-check
"use strict";

/**
* Generate trampoline stubs for each rule imported into this namespace.
*
* @example
* import bar from "./lib.js" // Default rule imported into this namespace
* import {baz} from "./lib.js" // One rule imported into this namespace by name
*
* @type {PEG.Pass}
*/
function addImportedRules(ast) {
let libraryNumber = 0;
for (const imp of ast.imports) {
for (const what of imp.what) {
let original = undefined;
switch (what.type) {
case "import_binding_all":
// Don't create stub.
continue;
case "import_binding_default":
// Use the default (usually first) rule.
break;
case "import_binding":
original = what.binding;
break;
case "import_binding_rename":
original = what.rename;
break;
default:
throw new TypeError("Unknown binding type");
}
ast.rules.push({
type: "rule",
name: what.binding,
nameLocation: what.location,
expression: {
type: "library_ref",
name: original,
library: imp.from.module,
libraryNumber,
location: what.location,
},
location: imp.from.location,
});
}
libraryNumber++;
}
}

module.exports = addImportedRules;
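
To see what the pass above produces, here is a sketch that runs a condensed copy of it on a hand-built AST. The AST shape (`imports[].what[]`, `from.module`, the `rename` field holding the original rule name) is inferred from the field accesses in the pass itself; the module names and the placeholder `loc` object are made up for the example.

```javascript
// Condensed copy of addImportedRules from the diff above.
function addImportedRules(ast) {
  let libraryNumber = 0;
  for (const imp of ast.imports) {
    for (const what of imp.what) {
      let original = undefined;
      switch (what.type) {
        case "import_binding_all":
          continue; // Namespace import: no stub rule is created.
        case "import_binding_default":
          break; // Default rule: original name stays undefined.
        case "import_binding":
          original = what.binding;
          break;
        case "import_binding_rename":
          original = what.rename;
          break;
        default:
          throw new TypeError("Unknown binding type");
      }
      ast.rules.push({
        type: "rule",
        name: what.binding,
        nameLocation: what.location,
        expression: {
          type: "library_ref",
          name: original,
          library: imp.from.module,
          libraryNumber,
          location: what.location,
        },
        location: imp.from.location,
      });
    }
    libraryNumber++;
  }
}

// Hypothetical AST fragment; loc is a placeholder, not a real location.
const loc = { source: "demo.peggy" };
const ast = {
  imports: [
    { from: { module: "./number.js", location: loc }, what: [
      { type: "import_binding", binding: "number", location: loc },
      { type: "import_binding_rename", binding: "NUM", rename: "float", location: loc },
    ] },
    { from: { module: "./lib.js", location: loc }, what: [
      { type: "import_binding_all", binding: "lib", location: loc },
      { type: "import_binding_default", binding: "start", location: loc },
    ] },
  ],
  rules: [],
};

addImportedRules(ast);
const summary = ast.rules.map(
  r => [r.name, r.expression.name, r.expression.libraryNumber]
);
console.log(summary);
```

Three stub rules result: `number` and `NUM` point into library 0 (with `NUM` referring to the original rule `float`), the namespace import adds nothing, and `start` points at the default rule of library 1 with an undefined original name.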