Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

C backend for Catala #204

Closed
denismerigoux opened this issue Feb 18, 2022 · 6 comments · Fixed by #671
Closed

C backend for Catala #204

denismerigoux opened this issue Feb 18, 2022 · 6 comments · Fixed by #671
Assignees
Labels
🔚 backends Backend runtime or code generation 🔧 compiler Issue concerns the compiler ✨ enhancement New feature or request
Milestone

Comments

@denismerigoux
Copy link
Contributor

denismerigoux commented Feb 18, 2022

Once #158 is merged, a C backend can easily be implemented for Catala, starting from the Scalc intermediate representation. The code should roughly follow the structure of the Python backend.

Advice from @protz : start with the simple stuff first and then implement optimizations.

Enum translation

They should be translated as tagged unions. Be careful: null cannot be distinguished from an integer.

Structs

We can use struct literals of C11? But inefficient because call by value, we should call by reference but we have to do memory management (or they're all scoped locally unless your return them as done in the wasm backend of KreMLin).

Closures translation

They should be translated to function pointers after a closure conversion pass is done (closure conversion and hoisting can maybe be implemented in Lcalc).

Integers

Use GMP? But these values are heap allocated, so we need memory management... Checked u64 that error out on overflow? Have two C backends: a high-level C backend and a very low-level backend targeting GPU-isation of code.

@AltGr
Copy link
Contributor

AltGr commented Jul 12, 2024

Milestone achieved: I just successfully compiled and ran tests/arithmetic/good/rounding.catala_en.

I diverged significantly from the first draft, heading towards a simplified solution that could be improved later on:

  1. maybe the biggest change is that I disabled monomorphisation: we'll need it at some point, in particular for equality testing, but it makes everything quite a bit more complex than just handling arrays and options as structs over void* pointers, in a polymorphic way.

  2. I used GMP and implemented all of our operators over it (with a "functional" layer, more on this below)

  3. everything is a pointer: integers, structures, etc. We need them in quite a few places, the base types are pointers (allocated by GMP) anyway, and this simplifies things a lot. This makes point 1. pretty reasonable, call-by-reference is obvious and return values of functions can be handled normally. Of course, there is an indirection cost, and it'll probably be a good idea to carefully unbox specific cases at some point; but I think this is the best starting point. Mixing pointers with unboxed values will probably require extra care that is best done independently.

    Note that almost everything is tagged const, and that the C compiler might be able to do some magic.

  4. I am using a very dumb allocator that works like an eternally expanding small heap. Since we are very unlikely to be bound by memory, and are in a functional-like context where lots of very small allocations are done (gmp allocates 16 bytes for every integer). This should be extremely fast, and balance out some of the costs from having pointers everywhere. The allocated space is freed in bulk at the end of the computation.

  5. This is all tested with C89 and --pedantic ; it turns out structure literals don't really exist in C89 (only at declaration, only with constants, and without field names). All declaration also need to be lifted to the top of the block; so we proceed with some_struct* foo; [...]; foo = catala_malloc(sizeof(some_struct)); foo->field1 = ..., etc.

The base types are implemented as such:

  • options have a tag and a void* pointer (with the correct typecast done at construction and deconstruction time, which is validated by te catala type system)
  • arrays similarly have a size field, and a void** pointer that is a C array of pointers to the elements
  • tuples will be a fixed-size array of void* pointers, with the proper casts introduced like for option types

There is still a lot to do but this approach seems solid ; in particular, the handling of default terms and exceptions seems to be working.

@AltGr
Copy link
Contributor

AltGr commented Jul 12, 2024

@denismerigoux
Copy link
Contributor Author

@AltGr this is amazing, thanks so much for working out the details of how we get to a working C code. Indeed, this is a change of philosophy from my original draft. I guess your proposal is better in terms of getting to correctness quickly, and I'm glad you switched to it. In thr future though, we might want an alternative C backend way less functional in style where everything is monomorphized, all the data structures flat, allocations minimized and GMP dropped in favor of 128-bit integers and fixed precision points. This alternative backend will be more convenient for interoperability with COBOL codebases...

@AltGr
Copy link
Contributor

AltGr commented Jul 12, 2024

Yes, I expected as much, and indeed I think it'll be easier to transition to a lower-level backend from that first working one. It's probably possible to allocate almost everything statically on the stack and use reference passing everywhere, with return values allocated from the parent (GMP is mostly designed this way, although it obviously still requires dynamic allocations): somehow moving from "only allocated pointers" to "no pointers". It'll be harder to get right, though, hence that first version.

@AltGr
Copy link
Contributor

AltGr commented Jul 15, 2024

Tracking plans for that "idiomatic" backend in #648

@denismerigoux
Copy link
Contributor Author

Done, we only need to merge PRs now.

@github-project-automation github-project-automation bot moved this from In Progress to Done in Catala - language & tooling Sep 25, 2024
@denismerigoux denismerigoux linked a pull request Oct 2, 2024 that will close this issue
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
🔚 backends Backend runtime or code generation 🔧 compiler Issue concerns the compiler ✨ enhancement New feature or request
Projects
Status: Done
Development

Successfully merging a pull request may close this issue.

2 participants