Skip to content

Commit 5a5c984

Browse files
committed
materialized: attempt to produce nice backtraces on stack overflow
Rust produces bad error messages on stack overflow, like "thread 'foo' has overflowed its stack" which provides very little insight into where the recursion that caused the stack to overflow occurred. See rust-lang/rust#51405 for details. This commit adds a SIGSEGV handler that attempts to print a backtrace, following the approach in the backtrace-on-stack-overflow crate. I copied the code from that crate into Materialize and tweaked it because it's a very small amount of code that we'll likely need to modify, and I wanted to improve its error handling. In my manual testing this produces a nice backtrace when Materialize overflows its stack.
1 parent 6a2c77a commit 5a5c984

File tree

4 files changed

+72
-2
lines changed

4 files changed

+72
-2
lines changed

Cargo.lock

Lines changed: 4 additions & 2 deletions
Some generated files are not rendered by default. Learn more about customizing how changed files appear on GitHub.

src/materialized/Cargo.toml

Lines changed: 2 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -48,8 +48,10 @@ include_dir = "0.6.0"
4848
itertools = "0.9.0"
4949
krb5-src = { version = "0.2.3", features = ["binaries"] }
5050
lazy_static = "1.4.0"
51+
libc = "0.2.84"
5152
log = "0.4.13"
5253
mz-process-collector = { path = "../mz-process-collector" }
54+
nix = "0.19.1"
5355
num_cpus = "1.0.0"
5456
openssl = { version = "0.10.32", features = ["vendored"] }
5557
openssl-sys = { version = "0.9.59", features = ["vendored"] }

src/materialized/src/bin/materialized/main.rs

Lines changed: 1 addition & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -213,6 +213,7 @@ fn main() {
213213

214214
fn run(args: Args) -> Result<(), anyhow::Error> {
215215
panic::set_hook(Box::new(handle_panic));
216+
sys::enable_sigsegv_backtraces()?;
216217

217218
if args.version > 0 {
218219
println!("materialized {}", materialized::BUILD_INFO.human_version());

src/materialized/src/bin/materialized/sys.rs

Lines changed: 65 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -9,7 +9,13 @@
99

1010
//! System support functions.
1111
12+
use std::alloc::{self, Layout};
13+
use std::ptr;
14+
15+
use anyhow::{bail, Context};
1216
use log::{trace, warn};
17+
use nix::errno;
18+
use nix::sys::signal;
1319

1420
#[cfg(not(any(target_os = "macos", target_os = "linux", target_os = "ios")))]
1521
pub fn adjust_rlimits() {
@@ -90,3 +96,62 @@ pub fn adjust_rlimits() {
9096
)
9197
}
9298
}
99+
100+
/// Attempts to enable backtraces when SIGSEGV occurs.
101+
///
102+
/// In particular, this means producing backtraces on stack overflow, as
103+
/// stack overflow raises SIGSEGV. The approach here involves making system
104+
/// calls to handle SIGSEGV on an alternate signal stack, which seems to work
105+
/// well in practice but may technically be undefined behavior.
106+
///
107+
/// Rust may someday do this by default.
108+
/// Follow: https://github.com/rust-lang/rust/issues/51405.
109+
pub fn enable_sigsegv_backtraces() -> Result<(), anyhow::Error> {
110+
// This code is derived from the code in the backtrace-on-stack-overflow
111+
// crate, which is freely available under the terms of the Apache 2.0
112+
// license. The modifications here provide better error messages if any of
113+
// the various system calls fail.
114+
//
115+
// See: https://github.com/matklad/backtrace-on-stack-overflow
116+
117+
// Chosen to match the default Rust thread stack size of 2MiB. Probably
118+
// overkill, but we'd much rather have backtraces on stack overflow than
119+
// squabble over a few megabytes.
120+
const STACK_SIZE: usize = 2 << 20;
121+
122+
// x86_64 requires 16-byte alignment. Hopefully other platforms don't
123+
// have more stringent requirements.
124+
const STACK_ALIGN: usize = 16;
125+
126+
// Allocate a stack.
127+
let buf_layout =
128+
Layout::from_size_align(STACK_SIZE, STACK_ALIGN).expect("layout known to be valid");
129+
let buf = unsafe { alloc::alloc(buf_layout) };
130+
131+
// Request that signals be delivered to this alternate stack.
132+
let stack = libc::stack_t {
133+
ss_sp: buf as *mut libc::c_void,
134+
ss_flags: 0,
135+
ss_size: STACK_SIZE,
136+
};
137+
let ret = unsafe { libc::sigaltstack(&stack, ptr::null_mut()) };
138+
if ret == -1 {
139+
let errno = errno::from_i32(errno::errno());
140+
bail!("failed to configure alternate signal stack: {}", errno);
141+
}
142+
143+
// Install a handler for SIGSEGV.
144+
let action = signal::SigAction::new(
145+
signal::SigHandler::Handler(handle_sigsegv),
146+
signal::SaFlags::SA_NODEFER | signal::SaFlags::SA_ONSTACK,
147+
signal::SigSet::empty(),
148+
);
149+
unsafe { signal::sigaction(signal::SIGSEGV, &action) }
150+
.context("failed to install SIGSEGV handler")?;
151+
152+
Ok(())
153+
}
154+
155+
extern "C" fn handle_sigsegv(_: i32) {
156+
panic!("stack overflow");
157+
}

0 commit comments

Comments
 (0)