Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Undefined C++ behavior on intToUnboxedInt() #873

Open
vitor1001 opened this issue Jan 27, 2025 · 2 comments
Open

Undefined C++ behavior on intToUnboxedInt() #873

vitor1001 opened this issue Jan 27, 2025 · 2 comments

Comments

@vitor1001
Copy link

According to the C++ specification, most code manipulating invalid pointers is invalid. Quoting the spec:

Indirection through an invalid pointer value and passing an invalid pointer value to a deallocation function have undefined behavior. Any other use of an invalid pointer value has implementation-defined behavior. Some implementations might define that copying an invalid pointer value causes a system-generated runtime fault.

MiniZinc is using a Tagged pointer optimization to store integer constants, but it uses a variable of type Expression* to store the tagged pointer. This cannot be made valid with C++. To use a tagged pointer in C++ one needs to store the pointer-or-value value in a uintptr_t instead.

Since the compiler can compile undefined behavior to anything, this can result in it trying to do something "fast", but very different from the intent.

More concretely, this makes MiniZinc crash if Clang's UndefinedBehaviorSanitizer is enabled.

@guidotack
Copy link
Member

We are aware of this, but changing this would require a major refactoring of the code base. While it's undefined behaviour, all compilers and platforms we have tested on are currently producing the intended behaviour (and it would be very obvious if they didn't, since the compiler would simply crash immediately).

We have been careful to make sure that the only thing that happens (and the only thing the UndefinedBehaviorSanitizer complains about) is up- and downcasts of unaligned pointers, i.e., we never dereference an unaligned pointer or call a member function through an unaligned pointer (which actually result in incorrect code being generated).

It would be possible to replace all casts of x to type T* with additional reinterpret_cast<T*>(reinterpret_cast<uintptr_t>(x)) calls, which would get rid of the UndefinedBehaviorSanitizer warnings. But I'm not sure it's worth the effort at this point. If you'd like to give it a go and submit a pull request, we're happy to take a look.

The cleaner approach would be to have a dedicated ExpressionValue class for passing around, which would contain the uintptr_r (or other sufficiently sized) value, so that no casts are required anywhere. That would really require a complete rewrite of the entire code base, and this doesn't seem to be important enough to justify the effort.

@vitor1001
Copy link
Author

Hi Guido!

I indeed realized it would be a lot of work to fix it cleanly, otherwise I would have sent my bug report in the form of a pull request :) I also opened the issue to make sure it is tracked and documented.

I also have not seen any real bug coming from this and I only saw a single another issue that was caught by compiler tooling. For this other issue you can have a look at pull request #874.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

3 participants