Basic support for FFI callbacks (at least stop segfaulting and report a proper error) #4728

weiznich · 2025-11-27T18:28:15Z

This commit adds basic support for FFI callbacks by registering a shim function via libffi. This shim function currently only notes if it's actually called and registers an error for these cases. The main motivation for this is to prevent miri segfaulting as described in #4639. Obviously the code is likely highly unsafe, especially if something goes wrong, etc as it's going back and forth over an ffi boundary and just casts pointers as it likes.

In the future miri could try to continue execution in the registered callback, although as far as I understand Ralf that is no easy problem. There are already preparations for this, like actually receiving the arguments and setting up the structure to return something.

This produces the following error for diesel:

error: unsupported operation: Tried to call a function pointer via FFI boundary. That's not supported yet by miri
        This function pointer was registered by a call to `sqlite3_create_function_v2` using an argument of the type `std::option::Option<unsafe extern "C" fn(*mut std::ffi::c_void)>`
    --> /home/weiznich/Documents/rust/diesel/diesel/src/sqlite/connection/raw.rs:270:37
     |
 270 |         let close_result = unsafe { ffi::sqlite3_close(self.internal_connection.as_ptr()) };
     |                                     ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ unsupported operation occurred here
     |
     = help: this is likely not a bug in the program; it indicates that the program performed an operation that Miri does not support
     = note: BACKTRACE on thread `types::i32_to_s`:
     = note: inside `<diesel::sqlite::connection::raw::RawConnection as std::ops::Drop>::drop` at /home/weiznich/Documents/rust/diesel/diesel/src/sqlite/connection/raw.rs:270:37: 270:90
     = note: inside `std::ptr::drop_in_place::<diesel::sqlite::connection::raw::RawConnection> - shim(Some(diesel::sqlite::connection::raw::RawConnection))` at /home/weiznich/.rustup/toolchains/miri/lib/rustlib/src/rust/library/core/src/ptr/mod.rs:805:1: 807:25
     = note: inside `std::ptr::drop_in_place::<diesel::SqliteConnection> - shim(Some(diesel::SqliteConnection))` at /home/weiznich/.rustup/toolchains/miri/lib/rustlib/src/rust/library/core/src/ptr/mod.rs:805:1: 807:25
note: inside `types::query_to_sql_equality::<diesel::sql_types::Integer, i32>`
    --> diesel_tests/tests/types.rs:1428:1
     |
1428 | }
     | ^
note: inside `types::i32_to_sql_integer`
    --> diesel_tests/tests/types.rs:192:13
     |
 192 |     assert!(query_to_sql_equality::<Integer, i32>("0", 0));
     |             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
note: inside closure
    --> diesel_tests/tests/types.rs:191:24
     |
 190 | #[diesel_test_helper::test]
     | --------------------------- in this attribute macro expansion
 191 | fn i32_to_sql_integer() {
     |                        ^
     = note: this error originates in the attribute macro `test` which comes from the expansion of the attribute macro `diesel_test_helper::test` (in Nightly builds, run with -Z macro-backtrace for more info)

I feel that's better than nothing, although a better error would include trace information about the registering side and the call side as well.

There are no tests for this yet, as I'm not familiar with miri's test setup, how to structure them and where to put them. Any pointers for this would be helpful.

rustbot · 2025-11-27T18:28:18Z

Thank you for contributing to Miri! A reviewer will take a look at your PR, typically within a week or two.
Please remember to not force-push to the PR branch except when you need to rebase due to a conflict or when the reviewer asks you for it.

This commit adds basic support for FFI callbacks by registering a shim function via libffi. This shim function currently only notes if it's actually called and registers an error for these cases. The main motivation for this is to prevent miri segfaulting as described in [4639](rust-lang#4639). In the future miri could try to continue execution in the registered callback, although as far as I understand Ralf that is no easy problem.

oli-obk · 2025-11-28T07:32:26Z

src/shims/native_lib/mod.rs

            either::Either::Left(mplace) => {
+                let ptr_overwrite = match v.layout.ty.kind() {
+                    ty::Adt(_adt_def, args) =>
+                        if let ty::FnPtr(fn_ptr, _header) = args.type_at(0).kind() {


What is this adt + fndef match detecting? I assumed you'd just want to detect fn ptrs being converted to FFI data and inject your callback there, but that doesn't seem to be what is happening.

You are correct that this is likely wrong. The underlying problem there is that this does not only need to handle plain function pointers but also something like Option<unsafe fn()>. Now I do not know much about the involved compiler types, so that is my crude attempt to get the relevant information. If you know a better way I'm certainly open for suggestions.

(Beside of that it seems like I did not actually handle the plain function pointer case, I should definitely add that)

The purpose of this function is to transmit raw bytes to the C side. So matching on the type at all here is almost certainly not the right call.

Ah I think I see what you are doing here... you are changing the bytes if they represent a function pointer. That doesn't work, we need the bytes to be exactly the same on both sides. After all, both sides might be casting those pointers to integers and print them, and something would be seriously cursed if the same function pointer then printed as different addresses on the two sides of the FFI call. Also, if you store a function pointer in a static and then pass a pointer to that static to C, this function will never see the function pointer, it will only ever see the pointer to the function pointer.

The code that determines the addresses of allocations (which includes both data allocations for normal static/stack/heap memory, and function allocations for the "memory that a function pointer points to") is here:

miri/src/alloc_addresses/mod.rs

Lines 139 to 178 in 5a47c25

// In native lib mode, we use the "real" address of the bytes for this allocation.

// This ensures the interpreted program and native code have the same view of memory.

let params = this.machine.get_default_alloc_params();

let base_ptr = match info.kind {

AllocKind::LiveData => {

if memory_kind == MiriMemoryKind::Global.into() {

// For new global allocations, we always pre-allocate the memory to be able use the machine address directly.

let prepared_bytes = MiriAllocBytes::zeroed(info.size, info.align, params)

.unwrap_or_else(|| {

panic!("Miri ran out of memory: cannot create allocation of {size:?} bytes", size = info.size)

});

let ptr = prepared_bytes.as_ptr();

// Store prepared allocation to be picked up for use later.

global_state

.prepared_alloc_bytes

.as_mut()

.unwrap()

.try_insert(alloc_id, prepared_bytes)

.unwrap();

ptr

} else {

// Non-global allocations are already in memory at this point so

// we can just get a pointer to where their data is stored.

this.get_alloc_bytes_unchecked_raw(alloc_id)?

}

}

AllocKind::Function | AllocKind::VTable => {

// Allocate some dummy memory to get a unique address for this function/vtable.

let alloc_bytes = MiriAllocBytes::from_bytes(

&[0u8; 1],

Align::from_bytes(1).unwrap(),

params,

);

let ptr = alloc_bytes.as_ptr();

// Leak the underlying memory to ensure it remains unique.

std::mem::forget(alloc_bytes);

ptr

}

AllocKind::TypeId | AllocKind::Dead => unreachable!(),

};

The FFI closure allocation needs to happen in the AllocKind::Function case there, so that we can then make the "virtual" address (i.e., the address in Miri's purely logical interpreter memory) of this function pointer the same as the real address of the closure. This is a key invariant of Miri in native_lib mode: Miri's logical/virtual addresses are the same as the real underlying addresses, and therefore C code can follow pointers stored in Miri memory and everything works out.

Thanks for the pointer. This sounds reasonable, but I fail to see how I would access information about the callback/function/allocation type there. Without this information it's not possible to construct the corresponding libffi type.

There is get_fn_alloc but it's private (in compiler/rustc_const_eval/src/interpret/memory.rs in the rustc tree). We should probably make it public, and then also rename it to try_get_alloc_fn for better consistency with other, similar methods.

This should hopefully be addressed now with 9ccb197. Thanks for pointing to the right locations, it looks much robuster now 🎉

oli-obk · 2025-11-28T07:33:39Z

Please add a test for this, too.

weiznich · 2025-11-28T07:53:00Z

Please add a test for this, too.

I'm more than happy to do that, but as pointed out in the OP I don't know where and how. Is there any documentation for this or can you provide a pointer to the right direction?

RalfJung · 2025-11-28T09:51:35Z

src/shims/native_lib/mod.rs

+            // Functions with no declared return type (i.e., the default return)
+            // have the output_type `Tuple([])`.
+            ty::Tuple(t_list) if (*t_list).deref().is_empty() => FfiType::void(),


This should not be needed -- function pointers are scalar types.

The function I happened to use for testing returned void. As this function is now also called to construct the return value types for the libffi closure type this was required to make my test case working.

That written: You are correct that this is not needed for a minimal support of callback over fro, it just happens to help with my particular test case and was rather straightforward to add.

this function is now also called to construct the return value types for the libffi closure type

Ah, that makes sense.

Given that I think we should never return from that closure (see my other comments), I think we should not need this either.

This is still required as we need the FfiType while constructing the libffi Closure type to get the correct ABI as far as I understand.

Could handle this separately in another PR and make all tests here have an already implemented return type. But not too important

I can do this, but in that case I likely also need to change existing tests to make it work with the implemented feature as they also use void as return/argument type for the callback.

RalfJung · 2025-11-28T09:53:18Z

There are no tests for this yet, as I'm not familiar with miri's test setup, how to structure them and where to put them. Any pointers for this would be helpful.

The native-lib tests are in tests/native-lib. You can just add new .rs files inside pass or fail there and they will be picked up by the test suite. Use ./miri test to run the entire test suite; ./miri test native runs only tests whose name contains "native" which gets you the relevant results for this PR more quickly. See https://github.com/rust-lang/miri/blob/master/CONTRIBUTING.md#building-and-testing-miri for more details.

RalfJung · 2025-11-28T09:57:21Z

src/shims/native_lib/mod.rs

+) {
+    debug_assert_eq!(cif.nargs as usize, infos.args.len());
+    let mut rust_args = Vec::with_capacity(infos.args.len());
+    // cast away the pointer to pointer


As a general style note, please use full sentences in comments, including an upper-case first word and a period at the end.

RalfJung · 2025-11-28T09:57:58Z

src/shims/native_lib/mod.rs

+    // write here the output
+    // For now we just try to write some dummy output
+    // by using some "reasonable" default values
+    // to prevent crashing


I don't think we should return to the FFI code here. That code clearly wanted this function pointer to do something, and arbitrary nonsense could happen if we just skip whatever that is and return. This could be worse than a segfault. We have to abort execution here.

This moves the creation of the closure to a more meaningful location. Additionally it adds a test + aborts the execution as soon as the closure is called as we currently cannot return a reasonable value there.

weiznich · 2025-12-06T11:26:26Z

src/shims/native_lib/mod.rs

+    eprintln!(
+        "Tried to call a function pointer via FFI boundary. \
+         That's not supported yet by miri\nThis function pointer was registered by a call to `{}` \
+         using an argument of the type `{}`",
+        "todo: fill in name", "todo: fill in type"
+    );


This is the place I'm currently rather unhappy with, but I don't know how to solve better.

I would like to emit a proper "error" here and not just print something. Is there a easy way to do this?

Also I would like to include the name of the function that registered the closure and the type of the argument, but I'm not sure how I get that information in src/alloc_address/mod.rs.

Maybe you have some pointer for both problems as well?

Also I expect that you want to tweak this message to something better, so consider that a placeholder until someone suggest a better wording.

Via the InterpCx you should be able to report an error, but we def still need the exit as there is no way to leave this FFI code and go back to the original Rust/miri code

The problem will be passing in the InterpCx. We can not just use the this that was passed to build_libffi_closure -- that reference gets invalidated when build_libffi_closure returns.

We need some way to pass this value from call_native_with_args to the callback that gets invoked while this native code runs.

Would it be possible to just put the context in a (thread local) global variable just before calling a native function and take it from there in the callback?

That seems like the easiest option. The closure may find itself called on a different thread, at which point we have to just hard abort without a nice error, but at least for the common case this should work (and anyway if the C code spawns threads that interact in any way with Rust code, this native code integration is pretty much completely broken).

weiznich · 2025-12-06T11:27:13Z

tests/native-lib/fail/call_function_ptr.rs

+
+    unsafe {
+        call_fn_ptr(None); // this one is fine
+        call_fn_ptr(Some(nop)); // this one is not


As of now this test fails, even as it emits the "right" output as it doesn't emit an actual error, but just a eprintln + non-zero exist code.

weiznich · 2025-12-06T11:29:12Z

src/shims/native_lib/mod.rs

+    for arg in &infos.args {
+        let scalar = match arg.kind() {
+            ty::RawPtr(..) => {
+                let ptr = StrictPointer::new(Provenance::Wildcard, Size::from_bytes(args.addr()));
+                // This offset moves the pointer to the next argument
+                args = unsafe { args.offset(1) };
+                Scalar::from_pointer(ptr, infos.this)
+            }
+            // the other types
+            _ => todo!(),
+        };
+        rust_args.push(scalar);
+    }


Not sure if we should remove this block as well or fill in more types instead? For now this is unused but as soon as this callback starts actually to continue the execution of the rust code here, this would be required as first step.

weiznich · 2025-12-06T11:30:30Z

@rustbot ready

(Not really ready as in ready to merge, but more ready as in this is waiting on reviewer input for now)

oli-obk · 2025-12-06T12:43:29Z

src/shims/native_lib/mod.rs

+            // Functions with no declared return type (i.e., the default return)
+            // have the output_type `Tuple([])`.
+            ty::Tuple(t_list) if (*t_list).deref().is_empty() => FfiType::void(),


Could handle this separately in another PR and make all tests here have an already implemented return type. But not too important

oli-obk · 2025-12-06T12:46:46Z

src/alloc_addresses/mod.rs

+                    let mut ptr = alloc_bytes.as_ptr();
                    // Leak the underlying memory to ensure it remains unique.
                    std::mem::forget(alloc_bytes);
+                    if let Some(GlobalAlloc::Function { instance, .. }) =


The above code is irrelevant for this code path. Split the function code path from the vtable code path and unwrap the function global alloc as all AllocKind::Function should be backed by a GlobalAlloc::Function I think

oli-obk · 2025-12-06T12:49:48Z

src/shims/native_lib/mod.rs

+    eprintln!(
+        "Tried to call a function pointer via FFI boundary. \
+         That's not supported yet by miri\nThis function pointer was registered by a call to `{}` \
+         using an argument of the type `{}`",
+        "todo: fill in name", "todo: fill in type"
+    );


Via the InterpCx you should be able to report an error, but we def still need the exit as there is no way to leave this FFI code and go back to the original Rust/miri code

rustbot · 2025-12-06T12:49:56Z

Reminder, once the PR becomes ready for a review, use @rustbot ready.

This commit brings better error reporting and some additional cleanups.

weiznich · 2025-12-06T17:43:44Z

@rustbot ready

rustbot added the S-waiting-on-review Status: Waiting for a review to complete label Nov 27, 2025

This comment has been minimized.

Sign in to view

weiznich force-pushed the ffi_callbacks branch from e553417 to b539d51 Compare November 27, 2025 18:29

This comment has been minimized.

Sign in to view

weiznich force-pushed the ffi_callbacks branch from b539d51 to 6699145 Compare November 27, 2025 18:30

weiznich mentioned this pull request Nov 27, 2025

native-lib mode segfaults when calling Rust function pointer from C #4639

Open

oli-obk reviewed Nov 28, 2025

View reviewed changes

RalfJung reviewed Nov 28, 2025

View reviewed changes

RalfJung added S-waiting-on-author Status: Waiting for the PR author to address review comments and removed S-waiting-on-review Status: Waiting for a review to complete labels Nov 28, 2025

Address review comments

a464170

This moves the creation of the closure to a more meaningful location. Additionally it adds a test + aborts the execution as soon as the closure is called as we currently cannot return a reasonable value there.

weiznich force-pushed the ffi_callbacks branch from 9ccb197 to a464170 Compare December 6, 2025 11:24

weiznich commented Dec 6, 2025

View reviewed changes

rustbot added S-waiting-on-review Status: Waiting for a review to complete and removed S-waiting-on-author Status: Waiting for the PR author to address review comments labels Dec 6, 2025

oli-obk requested changes Dec 6, 2025

View reviewed changes

rustbot added S-waiting-on-author Status: Waiting for the PR author to address review comments and removed S-waiting-on-review Status: Waiting for a review to complete labels Dec 6, 2025

Additional improvements

3cb1af6

This commit brings better error reporting and some additional cleanups.

weiznich force-pushed the ffi_callbacks branch from 17ab519 to 3cb1af6 Compare December 6, 2025 17:16

rustbot added S-waiting-on-review Status: Waiting for a review to complete and removed S-waiting-on-author Status: Waiting for the PR author to address review comments labels Dec 6, 2025

	// In native lib mode, we use the "real" address of the bytes for this allocation.
	// This ensures the interpreted program and native code have the same view of memory.
	let params = this.machine.get_default_alloc_params();
	let base_ptr = match info.kind {
	AllocKind::LiveData => {
	if memory_kind == MiriMemoryKind::Global.into() {
	// For new global allocations, we always pre-allocate the memory to be able use the machine address directly.
	let prepared_bytes = MiriAllocBytes::zeroed(info.size, info.align, params)
	.unwrap_or_else(\|\| {
	panic!("Miri ran out of memory: cannot create allocation of {size:?} bytes", size = info.size)
	});
	let ptr = prepared_bytes.as_ptr();
	// Store prepared allocation to be picked up for use later.
	global_state
	.prepared_alloc_bytes
	.as_mut()
	.unwrap()
	.try_insert(alloc_id, prepared_bytes)
	.unwrap();
	ptr
	} else {
	// Non-global allocations are already in memory at this point so
	// we can just get a pointer to where their data is stored.
	this.get_alloc_bytes_unchecked_raw(alloc_id)?
	}
	}
	AllocKind::Function \| AllocKind::VTable => {
	// Allocate some dummy memory to get a unique address for this function/vtable.
	let alloc_bytes = MiriAllocBytes::from_bytes(
	&[0u8; 1],
	Align::from_bytes(1).unwrap(),
	params,
	);
	let ptr = alloc_bytes.as_ptr();
	// Leak the underlying memory to ensure it remains unique.
	std::mem::forget(alloc_bytes);
	ptr
	}
	AllocKind::TypeId \| AllocKind::Dead => unreachable!(),
	};

Basic support for FFI callbacks (at least stop segfaulting and report a proper error) #4728

Are you sure you want to change the base?

Basic support for FFI callbacks (at least stop segfaulting and report a proper error) #4728

Conversation

weiznich commented Nov 27, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

rustbot commented Nov 27, 2025

Uh oh!

This comment has been minimized.

This comment has been minimized.

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

RalfJung Nov 28, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

oli-obk commented Nov 28, 2025

Uh oh!

weiznich commented Nov 28, 2025

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

RalfJung commented Nov 28, 2025

Uh oh!

Choose a reason for hiding this comment

Uh oh!

RalfJung Nov 28, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

RalfJung Dec 6, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

weiznich commented Dec 6, 2025

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

rustbot commented Dec 6, 2025

Uh oh!

weiznich commented Dec 6, 2025

Uh oh!

Reviewers

weiznich commented Nov 27, 2025 •

edited

Loading

RalfJung Nov 28, 2025 •

edited

Loading

RalfJung Nov 28, 2025 •

edited

Loading

RalfJung Dec 6, 2025 •

edited

Loading