Skip to content

Commit d565c74

Browse files
committed
Auto merge of #81858 - ijackson:fork-no-unwind, r=m-ou-se
Do not allocate or unwind after fork ### Objective scenarios * Make (simple) panics safe in `Command::pre_exec_hook`, including most `panic!` calls, `Option::unwrap`, and array bounds check failures. * Make it possible to `libc::fork` and then safely panic in the child (needed for the above, but this requirement means exposing the new raw hook API which the `Command` implementation needs). * In singlethreaded programs, where panic in `pre_exec_hook` is already memory-safe, prevent the double-unwinding malfunction #79740. I think we want to make panic after fork safe even though the post-fork child environment is only experienced by users of `unsafe`, beause the subset of Rust in which any panic is UB is really far too hazardous and unnatural. #### Approach * Provide a way for a program to, at runtime, switch to having panics abort. This makes it possible to panic without making *any* heap allocations, which is needed because on some platforms malloc is UB in a child forked from a multithreaded program (see #80263 (comment), and maybe also the SuS [spec](https://pubs.opengroup.org/onlinepubs/9699919799/functions/fork.html)). * Make that change in the child spawned by `Command`. * Document the rules comprehensively enough that a programmer has a fighting chance of writing correct code. * Test that this all works as expected (and in particular, that there aren't any heap allocations we missed) Fixes #79740 #### Rejected (or previously attempted) approaches * Change the panic machinery to be able to unwind without allocating, at least when the payload and message are both `'static`. This seems like it would be even more subtle. Also that is a potentially-hot path which I don't want to mess with. * Change the existing panic hook mechanism to not convert the message to a `String` before calling the hook. This would be a surprising change for existing code and would not be detected by the type system. * Provide a `raw_panic_hook` function to intercept panics in a way that doesn't allocate. (That was an earlier version of this MR.) ### History This MR could be considered a v2 of #80263. Thanks to everyone who commented there. In particular, thanks to `@m-ou-se,` `@Mark-Simulacrum` and `@hyd-dev.` (Tagging you since I think you might be interested in this new MR.) Compared to #80263, this MR has very substantial changes and additions. Additionally, I have recently (2021-04-20) completely revised this series following very helpful comments from `@m-ou-se.` r? `@m-ou-se`
2 parents 8cf990c + 88ccaa7 commit d565c74

File tree

7 files changed

+334
-40
lines changed

7 files changed

+334
-40
lines changed

library/std/src/os/unix/process.rs

+6
Original file line numberDiff line numberDiff line change
@@ -75,6 +75,12 @@ pub trait CommandExt: Sealed {
7575
/// sure that the closure does not violate library invariants by making
7676
/// invalid use of these duplicates.
7777
///
78+
/// Panicking in the closure is safe only if all the format arguments for the
79+
/// panic message can be safely formatted; this is because although
80+
/// `Command` calls [`std::panic::always_abort`](crate::panic::always_abort)
81+
/// before calling the pre_exec hook, panic will still try to format the
82+
/// panic message.
83+
///
7884
/// When this closure is run, aspects such as the stdio file descriptors and
7985
/// working directory have successfully been changed, so output to these
8086
/// locations may not appear where intended.

library/std/src/panic.rs

+36
Original file line numberDiff line numberDiff line change
@@ -463,5 +463,41 @@ pub fn resume_unwind(payload: Box<dyn Any + Send>) -> ! {
463463
panicking::rust_panic_without_hook(payload)
464464
}
465465

466+
/// Make all future panics abort directly without running the panic hook or unwinding.
467+
///
468+
/// There is no way to undo this; the effect lasts until the process exits or
469+
/// execs (or the equivalent).
470+
///
471+
/// # Use after fork
472+
///
473+
/// This function is particularly useful for calling after `libc::fork`. After `fork`, in a
474+
/// multithreaded program it is (on many platforms) not safe to call the allocator. It is also
475+
/// generally highly undesirable for an unwind to unwind past the `fork`, because that results in
476+
/// the unwind propagating to code that was only ever expecting to run in the parent.
477+
///
478+
/// `panic::always_abort()` helps avoid both of these. It directly avoids any further unwinding,
479+
/// and if there is a panic, the abort will occur without allocating provided that the arguments to
480+
/// panic can be formatted without allocating.
481+
///
482+
/// Examples
483+
///
484+
/// ```no_run
485+
/// #![feature(panic_always_abort)]
486+
/// use std::panic;
487+
///
488+
/// panic::always_abort();
489+
///
490+
/// let _ = panic::catch_unwind(|| {
491+
/// panic!("inside the catch");
492+
/// });
493+
///
494+
/// // We will have aborted already, due to the panic.
495+
/// unreachable!();
496+
/// ```
497+
#[unstable(feature = "panic_always_abort", issue = "84438")]
498+
pub fn always_abort() {
499+
crate::panicking::panic_count::set_always_abort();
500+
}
501+
466502
#[cfg(test)]
467503
mod tests;

library/std/src/panicking.rs

+51-17
Original file line numberDiff line numberDiff line change
@@ -180,7 +180,7 @@ pub fn take_hook() -> Box<dyn Fn(&PanicInfo<'_>) + 'static + Sync + Send> {
180180
fn default_hook(info: &PanicInfo<'_>) {
181181
// If this is a double panic, make sure that we print a backtrace
182182
// for this panic. Otherwise only print it if logging is enabled.
183-
let backtrace_env = if panic_count::get() >= 2 {
183+
let backtrace_env = if panic_count::get_count() >= 2 {
184184
RustBacktrace::Print(crate::backtrace_rs::PrintFmt::Full)
185185
} else {
186186
backtrace::rust_backtrace_env()
@@ -233,6 +233,8 @@ pub mod panic_count {
233233
use crate::cell::Cell;
234234
use crate::sync::atomic::{AtomicUsize, Ordering};
235235

236+
pub const ALWAYS_ABORT_FLAG: usize = 1 << (usize::BITS - 1);
237+
236238
// Panic count for the current thread.
237239
thread_local! { static LOCAL_PANIC_COUNT: Cell<usize> = Cell::new(0) }
238240

@@ -241,33 +243,53 @@ pub mod panic_count {
241243
// thread, if that thread currently views `GLOBAL_PANIC_COUNT` as being zero,
242244
// then `LOCAL_PANIC_COUNT` in that thread is zero. This invariant holds before
243245
// and after increase and decrease, but not necessarily during their execution.
246+
//
247+
// Additionally, the top bit of GLOBAL_PANIC_COUNT (GLOBAL_ALWAYS_ABORT_FLAG)
248+
// records whether panic::always_abort() has been called. This can only be
249+
// set, never cleared.
250+
//
251+
// This could be viewed as a struct containing a single bit and an n-1-bit
252+
// value, but if we wrote it like that it would be more than a single word,
253+
// and even a newtype around usize would be clumsy because we need atomics.
254+
// But we use such a tuple for the return type of increase().
255+
//
256+
// Stealing a bit is fine because it just amounts to assuming that each
257+
// panicking thread consumes at least 2 bytes of address space.
244258
static GLOBAL_PANIC_COUNT: AtomicUsize = AtomicUsize::new(0);
245259

246-
pub fn increase() -> usize {
247-
GLOBAL_PANIC_COUNT.fetch_add(1, Ordering::Relaxed);
248-
LOCAL_PANIC_COUNT.with(|c| {
249-
let next = c.get() + 1;
250-
c.set(next);
251-
next
252-
})
260+
pub fn increase() -> (bool, usize) {
261+
(
262+
GLOBAL_PANIC_COUNT.fetch_add(1, Ordering::Relaxed) & ALWAYS_ABORT_FLAG != 0,
263+
LOCAL_PANIC_COUNT.with(|c| {
264+
let next = c.get() + 1;
265+
c.set(next);
266+
next
267+
}),
268+
)
253269
}
254270

255-
pub fn decrease() -> usize {
271+
pub fn decrease() {
256272
GLOBAL_PANIC_COUNT.fetch_sub(1, Ordering::Relaxed);
257273
LOCAL_PANIC_COUNT.with(|c| {
258274
let next = c.get() - 1;
259275
c.set(next);
260276
next
261-
})
277+
});
278+
}
279+
280+
pub fn set_always_abort() {
281+
GLOBAL_PANIC_COUNT.fetch_or(ALWAYS_ABORT_FLAG, Ordering::Relaxed);
262282
}
263283

264-
pub fn get() -> usize {
284+
// Disregards ALWAYS_ABORT_FLAG
285+
pub fn get_count() -> usize {
265286
LOCAL_PANIC_COUNT.with(|c| c.get())
266287
}
267288

289+
// Disregards ALWAYS_ABORT_FLAG
268290
#[inline]
269-
pub fn is_zero() -> bool {
270-
if GLOBAL_PANIC_COUNT.load(Ordering::Relaxed) == 0 {
291+
pub fn count_is_zero() -> bool {
292+
if GLOBAL_PANIC_COUNT.load(Ordering::Relaxed) & !ALWAYS_ABORT_FLAG == 0 {
271293
// Fast path: if `GLOBAL_PANIC_COUNT` is zero, all threads
272294
// (including the current one) will have `LOCAL_PANIC_COUNT`
273295
// equal to zero, so TLS access can be avoided.
@@ -410,7 +432,7 @@ pub unsafe fn r#try<R, F: FnOnce() -> R>(f: F) -> Result<R, Box<dyn Any + Send>>
410432
/// Determines whether the current thread is unwinding because of panic.
411433
#[inline]
412434
pub fn panicking() -> bool {
413-
!panic_count::is_zero()
435+
!panic_count::count_is_zero()
414436
}
415437

416438
/// The entry point for panicking with a formatted message.
@@ -563,15 +585,27 @@ fn rust_panic_with_hook(
563585
message: Option<&fmt::Arguments<'_>>,
564586
location: &Location<'_>,
565587
) -> ! {
566-
let panics = panic_count::increase();
588+
let (must_abort, panics) = panic_count::increase();
567589

568590
// If this is the third nested call (e.g., panics == 2, this is 0-indexed),
569591
// the panic hook probably triggered the last panic, otherwise the
570592
// double-panic check would have aborted the process. In this case abort the
571593
// process real quickly as we don't want to try calling it again as it'll
572594
// probably just panic again.
573-
if panics > 2 {
574-
util::dumb_print(format_args!("thread panicked while processing panic. aborting.\n"));
595+
if must_abort || panics > 2 {
596+
if panics > 2 {
597+
// Don't try to print the message in this case
598+
// - perhaps that is causing the recursive panics.
599+
util::dumb_print(format_args!("thread panicked while processing panic. aborting.\n"));
600+
} else {
601+
// Unfortunately, this does not print a backtrace, because creating
602+
// a `Backtrace` will allocate, which we must to avoid here.
603+
let panicinfo = PanicInfo::internal_constructor(message, location);
604+
util::dumb_print(format_args!(
605+
"{}\npanicked after panic::always_abort(), aborting.\n",
606+
panicinfo
607+
));
608+
}
575609
intrinsics::abort()
576610
}
577611

library/std/src/sys/unix/process/process_unix.rs

+1
Original file line numberDiff line numberDiff line change
@@ -54,6 +54,7 @@ impl Command {
5454
let (env_lock, pid) = unsafe { (sys::os::env_read_lock(), cvt(libc::fork())?) };
5555

5656
if pid == 0 {
57+
crate::panic::always_abort();
5758
mem::forget(env_lock);
5859
drop(input);
5960
let Err(err) = unsafe { self.do_exec(theirs, envp.as_ref()) };

library/std/src/sys/unix/process/process_unix/tests.rs

+27
Original file line numberDiff line numberDiff line change
@@ -1,3 +1,10 @@
1+
use crate::os::unix::process::{CommandExt, ExitStatusExt};
2+
use crate::panic::catch_unwind;
3+
use crate::process::Command;
4+
5+
// Many of the other aspects of this situation, including heap alloc concurrency
6+
// safety etc., are tested in src/test/ui/process/process-panic-after-fork.rs
7+
18
#[test]
29
fn exitstatus_display_tests() {
310
// In practice this is the same on every Unix.
@@ -28,3 +35,23 @@ fn exitstatus_display_tests() {
2835
t(0x000ff, "unrecognised wait status: 255 0xff");
2936
}
3037
}
38+
39+
#[test]
40+
#[cfg_attr(target_os = "emscripten", ignore)]
41+
fn test_command_fork_no_unwind() {
42+
let got = catch_unwind(|| {
43+
let mut c = Command::new("echo");
44+
c.arg("hi");
45+
unsafe {
46+
c.pre_exec(|| panic!("{}", "crash now!"));
47+
}
48+
let st = c.status().expect("failed to get command status");
49+
dbg!(st);
50+
st
51+
});
52+
dbg!(&got);
53+
let status = got.expect("panic unexpectedly propagated");
54+
dbg!(status);
55+
let signal = status.signal().expect("expected child process to die of signal");
56+
assert!(signal == libc::SIGABRT || signal == libc::SIGILL || signal == libc::SIGTRAP);
57+
}

src/test/ui/panics/abort-on-panic.rs

+62-23
Original file line numberDiff line numberDiff line change
@@ -2,6 +2,7 @@
22

33
#![allow(unused_must_use)]
44
#![feature(unwind_attributes)]
5+
#![feature(panic_always_abort)]
56
// Since we mark some ABIs as "nounwind" to LLVM, we must make sure that
67
// we never unwind through them.
78

@@ -11,7 +12,9 @@
1112
use std::{env, panic};
1213
use std::io::prelude::*;
1314
use std::io;
14-
use std::process::{Command, Stdio};
15+
use std::process::{exit, Command, Stdio};
16+
use std::sync::{Arc, Barrier};
17+
use std::thread;
1518

1619
#[unwind(aborts)] // FIXME(#58794) should work even without the attribute
1720
extern "C" fn panic_in_ffi() {
@@ -23,41 +26,77 @@ extern "Rust" fn panic_in_rust_abi() {
2326
panic!("TestRust");
2427
}
2528

26-
fn test() {
27-
let _ = panic::catch_unwind(|| { panic_in_ffi(); });
28-
// The process should have aborted by now.
29+
fn should_have_aborted() {
2930
io::stdout().write(b"This should never be printed.\n");
3031
let _ = io::stdout().flush();
3132
}
3233

34+
fn bomb_out_but_not_abort(msg: &str) {
35+
eprintln!("bombing out: {}", msg);
36+
exit(1);
37+
}
38+
39+
fn test() {
40+
let _ = panic::catch_unwind(|| { panic_in_ffi(); });
41+
should_have_aborted();
42+
}
43+
3344
fn testrust() {
3445
let _ = panic::catch_unwind(|| { panic_in_rust_abi(); });
35-
// The process should have aborted by now.
36-
io::stdout().write(b"This should never be printed.\n");
37-
let _ = io::stdout().flush();
46+
should_have_aborted();
47+
}
48+
49+
fn test_always_abort() {
50+
panic::always_abort();
51+
let _ = panic::catch_unwind(|| { panic!(); });
52+
should_have_aborted();
53+
}
54+
55+
fn test_always_abort_thread() {
56+
let barrier = Arc::new(Barrier::new(2));
57+
let thr = {
58+
let barrier = barrier.clone();
59+
thread::spawn(move ||{
60+
barrier.wait();
61+
panic!("in thread");
62+
})
63+
};
64+
panic::always_abort();
65+
barrier.wait();
66+
let _ = thr.join();
67+
bomb_out_but_not_abort("joined - but we were supposed to panic!");
3868
}
3969

4070
fn main() {
71+
let tests: &[(_, fn())] = &[
72+
("test", test),
73+
("testrust", testrust),
74+
("test_always_abort", test_always_abort),
75+
("test_always_abort_thread", test_always_abort_thread),
76+
];
77+
4178
let args: Vec<String> = env::args().collect();
4279
if args.len() > 1 {
4380
// This is inside the self-executed command.
44-
match &*args[1] {
45-
"test" => return test(),
46-
"testrust" => return testrust(),
47-
_ => panic!("bad test"),
81+
for (a,f) in tests {
82+
if &args[1] == a { return f() }
4883
}
84+
bomb_out_but_not_abort("bad test");
4985
}
5086

51-
// These end up calling the self-execution branches above.
52-
let mut p = Command::new(&args[0])
53-
.stdout(Stdio::piped())
54-
.stdin(Stdio::piped())
55-
.arg("test").spawn().unwrap();
56-
assert!(!p.wait().unwrap().success());
57-
58-
let mut p = Command::new(&args[0])
59-
.stdout(Stdio::piped())
60-
.stdin(Stdio::piped())
61-
.arg("testrust").spawn().unwrap();
62-
assert!(!p.wait().unwrap().success());
87+
let execute_self_expecting_abort = |arg| {
88+
let mut p = Command::new(&args[0])
89+
.stdout(Stdio::piped())
90+
.stdin(Stdio::piped())
91+
.arg(arg).spawn().unwrap();
92+
let status = p.wait().unwrap();
93+
assert!(!status.success());
94+
// Any reasonable platform can distinguish a process which
95+
// called exit(1) from one which panicked.
96+
assert_ne!(status.code(), Some(1));
97+
};
98+
99+
for (a,_f) in tests {
100+
execute_self_expecting_abort(a);
101+
}
63102
}

0 commit comments

Comments
 (0)