Framing issue: goals and roadmap of cooklang-c #23
Description
Goals
C is the lingua franca of programming languages: most other languages can call C code in one way or another. Hence a parser written in C can be wrapped in other languages, requiring very little effort from us to provide consistent Cooklang support in a variety of popular languages such as Python, Ruby, Java, Go, and Node.js. It also makes evolving Cooklang easier, because a change to the language no longer has to be repeated in each of these parsers. In the long term, this parser will replace the Swift one in the CLI.
Design principles
We want to make the experience of end users of downstream parsers as easy as possible. For example, if I'm a user of the Cooklang Python module, I shouldn't have to do anything beyond a regular package installation and an import into my project. Behind the scenes, it should compile the C extension for me.
That means:
- The repo should host the parser's C file(s) with no system dependencies other than the standard library. We want to avoid requiring third-party system libraries to be installed.
- We want to move away from the filesystem level and operate in memory only (hence working only with strings). There should be no reading of cook files from C code; we move the responsibility for working with files to downstream parsers or apps.
- C invocation should be pure. Considering that many languages have automatic garbage collection, we need to provide a way to manage object life cycles properly.
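To make these principles concrete, here is a minimal sketch of what the C surface could look like. All names (`cook_recipe`, `cook_parse`, `cook_recipe_free`) are hypothetical and not part of the current code base, and the "parsing" is a toy stand-in that only counts `@` markers. The point is the shape: standard library only, string in, one owned handle out, and one explicit free function that a binding can call from its finalizer.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical result handle: everything the parser allocates hangs off
 * this one object, so a binding can tie it to its own object lifetime. */
typedef struct cook_recipe {
    size_t ingredient_count; /* toy stand-in: number of '@' markers seen */
    char  *source_copy;      /* private copy; the caller's buffer is not retained */
} cook_recipe;

/* Parse from memory only: no file access, no global state.
 * Returns NULL on allocation failure. */
cook_recipe *cook_parse(const char *src, size_t len)
{
    assert(src != NULL);

    cook_recipe *r = malloc(sizeof *r);
    if (r == NULL)
        return NULL;

    r->source_copy = malloc(len + 1);
    if (r->source_copy == NULL) {
        free(r);
        return NULL;
    }
    memcpy(r->source_copy, src, len);
    r->source_copy[len] = '\0';

    r->ingredient_count = 0;
    for (size_t i = 0; i < len; i++) {
        if (src[i] == '@')
            r->ingredient_count++;
    }
    return r;
}

/* Release everything owned by the handle; passing NULL is a no-op. */
void cook_recipe_free(cook_recipe *r)
{
    if (r == NULL)
        return;
    free(r->source_copy);
    free(r);
}
```

Because the handle copies what it needs, a garbage-collected host language is free to move or release its own string right after `cook_parse` returns.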
Roadmap
Here I outline the first steps, which are basically about making this parser production-ready:
- Move the Python-extension-related files to https://github.com/cooklang/cooklang-py and set up a git submodule there which uses this parser. #26
- Restructure the README and CONTRIBUTING pages. The README should show how users of the C parser can use it and cover its input/output. We don't expect them to know anything about flex/bison; they just use the precompiled version. The CONTRIBUTING page should explain how people can make changes to the C parser project itself. That is where we cover flex/bison and friends. #29
- Create a GitHub action to build the parser from the flex/bison definitions, including the UTF-related tables.
- Add support for cook files with syntax errors. We definitely don't want to crash. We should make a best-effort guess at some typical errors, while reporting the position and the expected syntax in the data structure.
- Set up a unit test framework and add tests other than the canonical ones.
- Consider using code generation instead of iterating over the canonical test definitions, for readability. #25
- Create a common helper to combine ingredients, with lemmatisation support.
After that we need to support features newly added to the spec (for a rough idea, see https://github.com/cooklang/CookCLI/milestones?direction=asc&sort=title&state=open). There's a chance we will change this parser into a multi-pass parser to support servings (a first pass for metadata and a second for the recipe, adjusting amounts to the scaled ones).
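As a rough illustration of the two-pass idea, the sketch below reads the declared servings first and then scales an amount in a second step. The function names are hypothetical, the `>> servings: N` metadata line is assumed as the input format, and the search is deliberately naive (it does not check that the key starts a line).

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Pass 1 (hypothetical helper): return the servings declared in a
 * ">> servings: N" metadata line, or 0 if none is found. */
long cook_read_servings(const char *src)
{
    assert(src != NULL);

    const char *key = ">> servings:";
    const char *hit = strstr(src, key);
    if (hit == NULL)
        return 0;

    /* strtol skips the whitespace that follows the colon. */
    long n = strtol(hit + strlen(key), NULL, 10);
    return n > 0 ? n : 0;
}

/* Pass 2 (hypothetical helper): scale one amount from the declared
 * servings to the requested servings; unchanged if either is invalid. */
double cook_scale_amount(double amount, long base, long target)
{
    if (base <= 0 || target <= 0)
        return amount;
    return amount * (double)target / (double)base;
}
```

In a real parser the second pass would walk the parsed ingredients rather than take a single number, but the dependency is the same: amounts cannot be scaled until the metadata has been read.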
Challenges
- UTF support
- Evaluate whether Flex/Bison can withstand future challenges:
  - not flexible enough to implement forgiveness?
  - error handling?
  - performance? The generated parser file is huge (12 MB)!
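For comparison when evaluating the forgiveness and error-handling questions, here is a small hand-written sketch of the behaviour we are after: never crash, record the position and the expected syntax, and keep going. The `cook_diagnostic` struct and `cook_check_braces` function are hypothetical, and the only error detected is a `{` left unclosed at the end of a line. Columns are counted in bytes, so this does not address the UTF challenge.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical diagnostic record: where the problem is and what was expected. */
typedef struct cook_diagnostic {
    size_t line;          /* 1-based line of the problem */
    size_t column;        /* 1-based column, counted in bytes */
    const char *expected; /* syntax the parser expected to find */
} cook_diagnostic;

/* Scan src for '{' that is not closed before the end of its line.
 * Writes at most max_out diagnostics and returns how many were written.
 * After each error the scanner resets and continues with the next line. */
size_t cook_check_braces(const char *src, cook_diagnostic *out, size_t max_out)
{
    assert(src != NULL);

    size_t count = 0, line = 1, column = 1;
    size_t open_line = 0, open_column = 0;
    int open = 0;

    for (const char *p = src; ; p++) {
        char c = *p;

        if (c == '{' && !open) {
            open = 1;
            open_line = line;
            open_column = column;
        } else if (c == '}') {
            open = 0;
        } else if ((c == '\n' || c == '\0') && open) {
            if (count < max_out) {
                out[count].line = open_line;
                out[count].column = open_column;
                out[count].expected = "}";
                count++;
            }
            open = 0; /* recover: treat the brace as closed and carry on */
        }

        if (c == '\0')
            break;
        if (c == '\n') {
            line++;
            column = 1;
        } else {
            column++;
        }
    }
    return count;
}
```

Whether Flex/Bison can express this kind of recovery as comfortably is exactly the open question in the list above.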