@ly3xqhl8g9

Description

This PR adds full Romanian language support to Catala. The implementation follows the existing pattern for multi-language support and includes all necessary components for a complete language integration.

  • Complete Romanian translation of all Catala keywords
  • Proper handling of Romanian-specific characters using UTF-8 escape sequences
  • All operators and language constructs properly localized
  • Added Romanian lexer to the dune build configuration
  • Integrated Romanian as a backend option in clerk_driver.ml
  • Added Romanian language option to CLI (--language=ro)
  • Fixed missing Romanian case in compiler/desugared/name_resolution.ml
  • Ensured all language-dependent pattern matches include Romanian
  • Created Romanian syntax documentation (doc/syntax/syntax_ro.catala_ro)
  • Added Romanian syntax sheet build target in Makefile
  • Updated CONTRIBUTING.md to mention Romanian support
  • Added complete syntax highlighting support for Romanian (syntax_highlighting/ro/)
  • Created Romanian test files covering arithmetic, boolean logic, conditionals, and literate programming
  • All 627 tests pass with Romanian language support enabled

Checklist

If this PR adds a feature or has breaking changes

If this PR contains syntax changes

I confirm that I have checked and updated each of the following items if this PR impacts them:

@AltGr
Contributor

AltGr commented Aug 6, 2025

Wow, that was unexpected! Thanks a lot for the interest and effort, this is really appreciated.

Since we are a small team developing and maintaining the software, I must add a few preliminary words of caution though:

  • we won't be able to maintain this translation ourselves since none of us speak the language (as is the case for the Polish one), so any update we make that affects the syntax will be best-effort regarding the Romanian syntax. In other words, we might break it, and you may have to follow the master branch and fix the Romanian version afterwards.
  • the tool hasn't reached version 1.0, but that's planned in the short term, so there might be significant changes coming in the next months. For example, I am in the process of adding a standard library (that may replace some of the built-in operators): I can only do that for French and English, so source files in Romanian will by default use the English library.

With this out of the way, it seems you have been pretty thorough in your patch and we'll be happy to merge it -- one change I would request would be to remove the Romanian language tests from the main code base, since they are automatically included in our CI checks and we won't be able to maintain them.

@ly3xqhl8g9 ly3xqhl8g9 force-pushed the add-romanian-language branch from 9a436e9 to ff98ed9 Compare August 6, 2025 09:44
@ly3xqhl8g9
Author

The tests have been removed; I didn't realize they were automatically included.

Of course, I will make sure to maintain the Romanian implementation, breaking changes notwithstanding.

@ly3xqhl8g9 ly3xqhl8g9 force-pushed the add-romanian-language branch 3 times, most recently from 890230c to 3c0168a Compare August 6, 2025 10:37
@AltGr
Contributor

AltGr commented Aug 6, 2025

(don't worry about the CI failures, we're updating our docker images and runners, I'll merge as soon as it's restored)

@AltGr
Contributor

AltGr commented Aug 25, 2025

Sorry about the lack of follow-up; we have been making some breaking changes in the lexer/parser (for the better: for example, no patch should be needed in name_resolution.ml anymore). It's not completely stabilised yet (and #852 has yet to be merged), so if you are still willing to update this, I'll notify you once the ground is a bit more stable in order not to waste your time.

@ly3xqhl8g9
Author

Sure, not a problem. Make all the breaking changes as needed. Sorry I cannot help with more in-depth PRs on the codebase itself; I still have much to learn.

@AltGr
Contributor

AltGr commented Oct 9, 2025

Hi! Sorry for the delay; the syntax changes should be stabilised now -- the translated keywords should now only appear in two places:

  • the lexer_*.cppo.ml file that you already updated (there have been some minor changes)
  • the new stdlib, that we are still writing. The good side of this is that it's outside of the compiler and can be completed progressively. You'll see that in stdlib/ there are stdlib_xx.catala_xx modules that will be selected automatically when using files in language xx (that's handled from compiler/driver.ml). That module imports aliases of all the stdlib submodules, so that e.g. Money_xx can be used as just Money; but if it's not yet translated, you are allowed to rely on the reference (English) stdlib in any language.

@ly3xqhl8g9 ly3xqhl8g9 force-pushed the add-romanian-language branch 2 times, most recently from 44f2aa4 to dd97f30 Compare October 17, 2025 16:52
@ly3xqhl8g9 ly3xqhl8g9 force-pushed the add-romanian-language branch from dd97f30 to d596a1a Compare October 18, 2025 08:21
@ly3xqhl8g9
Author

ly3xqhl8g9 commented Oct 18, 2025

I have updated the branch.

Running make build_dev completes successfully, but it hits a bug in the build system caused by differences between GNU and BSD sed:

In build_system/dune lines 58-66, there's a rule that regenerates manpages.sexp:

%{bin:clerk} 2>&1 | sed -n "s/\(^\|'\)[^']*\('\|$\)/ /gp;q"

This sed command is supposed to extract command names from clerk's help output, but on BSD sed it fails and returns nothing.

When the extraction fails:

  • The for loop only runs once (with empty $page)
  • Only generates 1 rule (for base clerk command)
  • The (mode promote) writes this truncated version

Old command (GNU-only):
sed -n "s/\(^\|'\)[^']*\('\|$\)/ /gp;q"

  • Uses \| for alternation inside \( \) groups in basic regex mode ← \| is a GNU extension
  • Fails silently on BSD sed (macOS)

New command (Portable):
sed -E "s/[^']*'([^']*)'[^']*/\1 /g" | head -1

  • Uses -E flag ← Supported by both BSD and GNU
  • Uses ( ) for groups in extended mode
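
For illustration, here is a minimal sketch of the portable command applied to a made-up help line. The sample text and the extract_commands helper name are assumptions introduced for this example; the real first line printed by clerk may be worded differently.

```shell
#!/bin/sh
# Hypothetical first line of help output; the real clerk message may differ.
sample_help="clerk: required COMMAND name is missing, must be one of 'build', 'test' or 'run'."

# Keep only the text found between single quotes on the first line.
# -E selects extended regex syntax, accepted by both GNU and BSD sed,
# so plain ( ) can be used for the capture group.
extract_commands() {
  sed -E "s/[^']*'([^']*)'[^']*/\1 /g" | head -1
}

printf '%s\n' "$sample_help" | extract_commands
# prints: build test run 
```

Each quoted name is followed by a single space, so the result can be consumed directly by a shell for loop.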

@AltGr
Contributor

AltGr commented Oct 18, 2025

Thanks a lot! The help with making our build script portable is very much appreciated too.
