Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

semantic integers values? (SUBATOMIC ABI ANALYSIS) #44

Open
Gankra opened this issue Jul 1, 2024 · 0 comments
Open

semantic integers values? (SUBATOMIC ABI ANALYSIS) #44

Gankra opened this issue Jul 1, 2024 · 0 comments
Labels
enhancement New feature or request question Further information is requested

Comments

@Gankra
Copy link
Owner

Gankra commented Jul 1, 2024

This idea is impractical in terms of effort-to-reward but I think about it a lot so I'm writing it down.

Problem

Ok so abi-cafe considers it sufficient for the two sides of a call to find the same bytes when asked to report the the bytes of a primitive. This largely proves (over sufficiently many tests) that everything agrees on offsets, aligns, and sizes.

What it doesn't prove is that:

  • The two sides agrees on endianess
  • The two sides agree on signedness
  • The two sides agree on repr (twos complement, ones complement, ...)
  • The two sides agree on type (maybe one thinks this is a bool while the other thinks it's a u8)

Most of these are fairly far-fetched mistakes for an even vaguely working ABI (you'll probably figure out these are wrong very fast). But, hey, we could try to catch these things!

To put it metaphorically: our analysis views primitives as atoms, but we could do subatomic analysis.

We already do this kind of subatomic semantic analysis for tagged unions, where we introduce an artificial tag field reflecting which case the language thinks the variant is in (set by performing an if-let).

Approach

The general approach here is to introduce a challenge-response system to the codegen/abi. That is, the caller will generate the value, report it, and then modify it in some reversible way. The callee is then expected to reverse the operation, and then report the restored value.

So for instance, if we had x: u32 = 0x1234_5678, we could do x = 0x1234_5678 + 0x6789_3589, with the callee expected to compute x - 0x6789_3589. Successfully restoring 0x1234_5678 here would pretty strongly indicate we're using the same representation.

Note that importantly 0x6789_3589 is not passed, both sides are generated with a literal to the effect of 0x6789_3589, ensuring that they both agree that they are working with that particular value. This essentially results in us comparing the native semantics of the literal to the transmute semantics of the ABI -- which ideally should match. It's relatively easy for us to do this kind of "generate both sides with the same assumptions" thing, as the ValueTree exists to do exactly that, and is already used in this way for our subatomic tagged union handling.

In this particular example we can be pretty confident that endianess now matches, and that we aren't in a weird u8 ~= bool situation. Signedness is still technically up in the air, and I don't know enough about non-twos-complement to say if we really tested for that.

Checking for sign while being reversible is harder, if you don't want to rely on detecting overflows at runtime. I'm sure someone could come up with some fun match tricks, but this is as far as I'm willing to take the thought experiment. Also no idea how to do float stuff safely.

@Gankra Gankra added enhancement New feature or request question Further information is requested labels Jul 1, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement New feature or request question Further information is requested
Projects
None yet
Development

No branches or pull requests

1 participant