-
Notifications
You must be signed in to change notification settings - Fork 805
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
fix multithreaded access to freelist pyclasses #4902
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks, gave this another pass with fresh eyes quickly this morning.
if let Some(obj) = T::get_free_list(Python::assume_gil_acquired()).insert(obj) { | ||
let mut free_list = T::get_free_list(Python::assume_gil_acquired()) | ||
.lock() | ||
.unwrap(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I guess that the unwrap()
here could produce a panic. We might need to install panic traps. Is there ever a way this lock gets poisoned? (similar in alloc function). 🤔
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If I manually insert a panic on the line below it does indeed crash Python if I modify one of the examples to use a freelist pyclass:
tests/test_getitem.py Got a slice! 3-5, step: 1, value: 2
Got an index! 2 : value: 2
thread '<unnamed>' panicked at /Users/goldbaum/Documents/pyo3/src/impl_/pyclass.rs:962:5:
panic in freelist free
note: run with `RUST_BACKTRACE=1` environment variable to display a backtrace
thread '<unnamed>' panicked at core/src/panicking.rs:223:5:
panic in a function that cannot unwind
stack backtrace:
0: 0x103eb9268 - std::backtrace_rs::backtrace::libunwind::trace::h4965e0a7b4ac11d7
at /rustc/e71f9a9a98b0faf423844bf0ba7438f29dc27d58/library/std/src/../../backtrace/src/backtrace/libunwind.rs:116:5
1: 0x103eb9268 - std::backtrace_rs::backtrace::trace_unsynchronized::hf4fa2da75bbd5d09
at /rustc/e71f9a9a98b0faf423844bf0ba7438f29dc27d58/library/std/src/../../backtrace/src/backtrace/mod.rs:66:5
2: 0x103eb9268 - std::sys::backtrace::_print_fmt::h75773692a17404c8
at /rustc/e71f9a9a98b0faf423844bf0ba7438f29dc27d58/library/std/src/sys/backtrace.rs:66:9
3: 0x103eb9268 - <std::sys::backtrace::BacktraceLock::print::DisplayBacktrace as core::fmt::Display>::fmt::h39ba3129e355bb22
at /rustc/e71f9a9a98b0faf423844bf0ba7438f29dc27d58/library/std/src/sys/backtrace.rs:39:26
4: 0x103ecd278 - core::fmt::rt::Argument::fmt::h34f25d464889fcc7
at /rustc/e71f9a9a98b0faf423844bf0ba7438f29dc27d58/library/core/src/fmt/rt.rs:177:76
5: 0x103ecd278 - core::fmt::write::h8b50d3a0f616451a
at /rustc/e71f9a9a98b0faf423844bf0ba7438f29dc27d58/library/core/src/fmt/mod.rs:1189:21
6: 0x103eb7740 - std::io::Write::write_fmt::h4b3bbae7048e35f8
at /rustc/e71f9a9a98b0faf423844bf0ba7438f29dc27d58/library/std/src/io/mod.rs:1884:15
7: 0x103eb911c - std::sys::backtrace::BacktraceLock::print::h7934b1e389160086
at /rustc/e71f9a9a98b0faf423844bf0ba7438f29dc27d58/library/std/src/sys/backtrace.rs:42:9
8: 0x103eba068 - std::panicking::default_hook::{{closure}}::hbcd636b20f603d1e
at /rustc/e71f9a9a98b0faf423844bf0ba7438f29dc27d58/library/std/src/panicking.rs:268:22
9: 0x103eb9e9c - std::panicking::default_hook::ha9081970ba26bc6c
at /rustc/e71f9a9a98b0faf423844bf0ba7438f29dc27d58/library/std/src/panicking.rs:295:9
10: 0x103eba858 - std::panicking::rust_panic_with_hook::h9a5dc30b684e2ff4
at /rustc/e71f9a9a98b0faf423844bf0ba7438f29dc27d58/library/std/src/panicking.rs:801:13
11: 0x103eba4a8 - std::panicking::begin_panic_handler::{{closure}}::hbcb5de8b840ae91c
at /rustc/e71f9a9a98b0faf423844bf0ba7438f29dc27d58/library/std/src/panicking.rs:667:13
12: 0x103eb972c - std::sys::backtrace::__rust_end_short_backtrace::ha657d4b4d65dc993
at /rustc/e71f9a9a98b0faf423844bf0ba7438f29dc27d58/library/std/src/sys/backtrace.rs:170:18
13: 0x103eba188 - rust_begin_unwind
at /rustc/e71f9a9a98b0faf423844bf0ba7438f29dc27d58/library/std/src/panicking.rs:665:5
14: 0x103ed574c - core::panicking::panic_nounwind_fmt::runtime::h13e8a6e8075ea543
at /rustc/e71f9a9a98b0faf423844bf0ba7438f29dc27d58/library/core/src/panicking.rs:119:22
15: 0x103ed574c - core::panicking::panic_nounwind_fmt::h4a10ecea0e21f67a
at /rustc/e71f9a9a98b0faf423844bf0ba7438f29dc27d58/library/core/src/intrinsics/mod.rs:3535:9
16: 0x103ed57c4 - core::panicking::panic_nounwind::ha9a59379b5f3f41a
at /rustc/e71f9a9a98b0faf423844bf0ba7438f29dc27d58/library/core/src/panicking.rs:223:5
17: 0x103ed58f4 - core::panicking::panic_cannot_unwind::h1bb1158913507f0a
at /rustc/e71f9a9a98b0faf423844bf0ba7438f29dc27d58/library/core/src/panicking.rs:315:5
18: 0x103e6be70 - pyo3::impl_::pyclass::free_with_freelist::h4db796234226fcb0
at /Users/goldbaum/Documents/pyo3/src/impl_/pyclass.rs:953:1
19: 0x103e67bd4 - <pyo3::pycell::impl_::PyClassObjectBase<U> as pyo3::pycell::impl_::PyClassObjectLayout<T>>::tp_dealloc::hd994ca52f6b1b031
at /Users/goldbaum/Documents/pyo3/src/pycell/impl_.rs:248:20
20: 0x103e67978 - <pyo3::pycell::impl_::PyClassObject<T> as pyo3::pycell::impl_::PyClassObjectLayout<T>>::tp_dealloc::h639038ac314c91b3
at /Users/goldbaum/Documents/pyo3/src/pycell/impl_.rs:353:9
21: 0x103e693ec - pyo3::impl_::trampoline::dealloc::{{closure}}::h48d3ff4ddbf5ef19
at /Users/goldbaum/Documents/pyo3/src/impl_/trampoline.rs:151:13
22: 0x103e693b4 - pyo3::impl_::trampoline::trampoline_unraisable::{{closure}}::ha3808cd18eb9c05a
at /Users/goldbaum/Documents/pyo3/src/impl_/trampoline.rs:235:54
23: 0x103e67f00 - std::panicking::try::do_call::haf1c85a8e3efdaf4
at /Users/goldbaum/.rustup/toolchains/stable-aarch64-apple-darwin/lib/rustlib/src/rust/library/std/src/panicking.rs:557:40
24: 0x103e69c74 - ___rust_try
25: 0x103e69bc8 - std::panicking::try::h1a8dc01c8f0edfde
at /Users/goldbaum/.rustup/toolchains/stable-aarch64-apple-darwin/lib/rustlib/src/rust/library/std/src/panicking.rs:520:19
26: 0x103e69bc8 - std::panic::catch_unwind::h7cbb307187149027
at /Users/goldbaum/.rustup/toolchains/stable-aarch64-apple-darwin/lib/rustlib/src/rust/library/std/src/panic.rs:358:14
27: 0x103e691dc - pyo3::impl_::trampoline::trampoline_unraisable::h192e13d6550f4cab
at /Users/goldbaum/Documents/pyo3/src/impl_/trampoline.rs:235:26
28: 0x103e63584 - pyo3::impl_::trampoline::dealloc::h55f3f62fd56929e3
at /Users/goldbaum/Documents/pyo3/src/impl_/trampoline.rs:149:5
29: 0x103e6bcc8 - pyo3::impl_::pyclass::tp_dealloc::h8d0e77cfb936ebd4
at /Users/goldbaum/Documents/pyo3/src/impl_/pyclass.rs:1154:5
30: 0x1032fe340 - __PyFrame_ClearExceptCode
31: 0x1032cf7c4 - __PyEval_EvalFrameDefault
32: 0x1031cbeac - __PyObject_FastCallDictTstate
33: 0x1031ccf70 - __PyObject_Call_Prepend
34: 0x10323be04 - _slot_tp_call
35: 0x1031cc0fc - __PyObject_MakeTpCall
36: 0x1032d6280 - __PyEval_EvalFrameDefault
37: 0x1031cbeac - __PyObject_FastCallDictTstate
38: 0x1031ccf70 - __PyObject_Call_Prepend
39: 0x10323be04 - _slot_tp_call
40: 0x1031cca68 - __PyObject_Call
41: 0x1032d7c14 - __PyEval_EvalFrameDefault
42: 0x1031cbeac - __PyObject_FastCallDictTstate
43: 0x1031ccf70 - __PyObject_Call_Prepend
44: 0x10323be04 - _slot_tp_call
45: 0x1031cc0fc - __PyObject_MakeTpCall
46: 0x1032d6280 - __PyEval_EvalFrameDefault
47: 0x1031cbeac - __PyObject_FastCallDictTstate
48: 0x1031ccf70 - __PyObject_Call_Prepend
49: 0x10323be04 - _slot_tp_call
50: 0x1031cc0fc - __PyObject_MakeTpCall
51: 0x1032d6280 - __PyEval_EvalFrameDefault
52: 0x1031cbeac - __PyObject_FastCallDictTstate
53: 0x1031ccf70 - __PyObject_Call_Prepend
54: 0x10323be04 - _slot_tp_call
55: 0x1031cc0fc - __PyObject_MakeTpCall
56: 0x1032d6280 - __PyEval_EvalFrameDefault
57: 0x1032ccc54 - _PyEval_EvalCode
58: 0x10332dc40 - _run_mod
59: 0x10332c028 - __PyRun_SimpleFileObject
60: 0x10332ba5c - __PyRun_AnyFileObject
61: 0x10334faf4 - _Py_RunMain
62: 0x10334fec8 - _pymain_main
63: 0x10334ff68 - _Py_BytesMain
thread caused non-unwinding panic. aborting.
Fatal Python error: Aborted
Current thread 0x00000001f8d0c840 (most recent call first):
File "/Users/goldbaum/Documents/pyo3/examples/getitem/.nox/python/lib/python3.12/site-packages/_pytest/python.py", line 159 in pytest_pyfunc_call
File "/Users/goldbaum/Documents/pyo3/examples/getitem/.nox/python/lib/python3.12/site-packages/pluggy/_callers.py", line 103 in _multicall
File "/Users/goldbaum/Documents/pyo3/examples/getitem/.nox/python/lib/python3.12/site-packages/pluggy/_manager.py", line 120 in _hookexec
File "/Users/goldbaum/Documents/pyo3/examples/getitem/.nox/python/lib/python3.12/site-packages/pluggy/_hooks.py", line 513 in __call__
File "/Users/goldbaum/Documents/pyo3/examples/getitem/.nox/python/lib/python3.12/site-packages/_pytest/python.py", line 1627 in runtest
File "/Users/goldbaum/Documents/pyo3/examples/getitem/.nox/python/lib/python3.12/site-packages/_pytest/runner.py", line 174 in pytest_runtest_call
File "/Users/goldbaum/Documents/pyo3/examples/getitem/.nox/python/lib/python3.12/site-packages/pluggy/_callers.py", line 103 in _multicall
File "/Users/goldbaum/Documents/pyo3/examples/getitem/.nox/python/lib/python3.12/site-packages/pluggy/_manager.py", line 120 in _hookexec
File "/Users/goldbaum/Documents/pyo3/examples/getitem/.nox/python/lib/python3.12/site-packages/pluggy/_hooks.py", line 513 in __call__
File "/Users/goldbaum/Documents/pyo3/examples/getitem/.nox/python/lib/python3.12/site-packages/_pytest/runner.py", line 242 in <lambda>
File "/Users/goldbaum/Documents/pyo3/examples/getitem/.nox/python/lib/python3.12/site-packages/_pytest/runner.py", line 341 in from_call
File "/Users/goldbaum/Documents/pyo3/examples/getitem/.nox/python/lib/python3.12/site-packages/_pytest/runner.py", line 241 in call_and_report
File "/Users/goldbaum/Documents/pyo3/examples/getitem/.nox/python/lib/python3.12/site-packages/_pytest/runner.py", line 132 in runtestprotocol
File "/Users/goldbaum/Documents/pyo3/examples/getitem/.nox/python/lib/python3.12/site-packages/_pytest/runner.py", line 113 in pytest_runtest_protocol
File "/Users/goldbaum/Documents/pyo3/examples/getitem/.nox/python/lib/python3.12/site-packages/pluggy/_callers.py", line 103 in _multicall
File "/Users/goldbaum/Documents/pyo3/examples/getitem/.nox/python/lib/python3.12/site-packages/pluggy/_manager.py", line 120 in _hookexec
File "/Users/goldbaum/Documents/pyo3/examples/getitem/.nox/python/lib/python3.12/site-packages/pluggy/_hooks.py", line 513 in __call__
File "/Users/goldbaum/Documents/pyo3/examples/getitem/.nox/python/lib/python3.12/site-packages/_pytest/main.py", line 362 in pytest_runtestloop
File "/Users/goldbaum/Documents/pyo3/examples/getitem/.nox/python/lib/python3.12/site-packages/pluggy/_callers.py", line 103 in _multicall
File "/Users/goldbaum/Documents/pyo3/examples/getitem/.nox/python/lib/python3.12/site-packages/pluggy/_manager.py", line 120 in _hookexec
File "/Users/goldbaum/Documents/pyo3/examples/getitem/.nox/python/lib/python3.12/site-packages/pluggy/_hooks.py", line 513 in __call__
File "/Users/goldbaum/Documents/pyo3/examples/getitem/.nox/python/lib/python3.12/site-packages/_pytest/main.py", line 337 in _main
File "/Users/goldbaum/Documents/pyo3/examples/getitem/.nox/python/lib/python3.12/site-packages/_pytest/main.py", line 283 in wrap_session
File "/Users/goldbaum/Documents/pyo3/examples/getitem/.nox/python/lib/python3.12/site-packages/_pytest/main.py", line 330 in pytest_cmdline_main
File "/Users/goldbaum/Documents/pyo3/examples/getitem/.nox/python/lib/python3.12/site-packages/pluggy/_callers.py", line 103 in _multicall
File "/Users/goldbaum/Documents/pyo3/examples/getitem/.nox/python/lib/python3.12/site-packages/pluggy/_manager.py", line 120 in _hookexec
File "/Users/goldbaum/Documents/pyo3/examples/getitem/.nox/python/lib/python3.12/site-packages/pluggy/_hooks.py", line 513 in __call__
File "/Users/goldbaum/Documents/pyo3/examples/getitem/.nox/python/lib/python3.12/site-packages/_pytest/config/__init__.py", line 175 in main
File "/Users/goldbaum/Documents/pyo3/examples/getitem/.nox/python/lib/python3.12/site-packages/_pytest/config/__init__.py", line 201 in console_main
File "/Users/goldbaum/Documents/pyo3/examples/getitem/.nox/python/bin/pytest", line 8 in <module>
nox > Command pytest -s failed with exit code -6
nox > Session python failed.
I did the test with this diff:
diff --git a/examples/getitem/noxfile.py b/examples/getitem/noxfile.py
index 23d967cc..551007a9 100644
--- a/examples/getitem/noxfile.py
+++ b/examples/getitem/noxfile.py
@@ -5,4 +5,4 @@ import nox
def python(session: nox.Session):
session.env["MATURIN_PEP517_ARGS"] = "--profile=dev"
session.install(".[dev]")
- session.run("pytest")
+ session.run("pytest", "-s")
diff --git a/examples/getitem/src/lib.rs b/examples/getitem/src/lib.rs
index ba850a06..cea2ef7b 100644
--- a/examples/getitem/src/lib.rs
+++ b/examples/getitem/src/lib.rs
@@ -9,7 +9,7 @@ enum IntOrSlice<'py> {
Slice(Bound<'py, PySlice>),
}
-#[pyclass]
+#[pyclass(freelist = 2)]
struct ExampleContainer {
// represent the maximum length our container is pretending to be
max_length: i32,
Are there any examples elsewhere of how we set up panic traps that I might look at?
FWIW, any panics in this function aren't new in this PR, so I don't think using unwrap()
on a poisoned mutex actually changes any observable behavior if this code can panic in practice. I also think that if the mutex is poisoned there's not much we can do besides propagate the panic.
The freelist wraps a boxed slice (with your suggestion) and it uses indexing so on debug builds it's possible for an OOB access to generate a panic. I don't immediately see other ways.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks, this LGTM 👍
Based on your observation that poisoned mutexes are only a problem if these functions already had panics, let's not worry here.
* make FreeList explicitly a wrapper for *mut PyObject * fix multithreaded access to freelist pyclasses * add changelog entry * use a GILOnceCell to initialize the freelist * respond to code review * skip test on wasm --------- Co-authored-by: David Hewitt <[email protected]>
* make FreeList explicitly a wrapper for *mut PyObject * fix multithreaded access to freelist pyclasses * add changelog entry * use a GILOnceCell to initialize the freelist * respond to code review * skip test on wasm --------- Co-authored-by: David Hewitt <[email protected]>
Fixes #4894
Wraps the freelist in a mutex.
I removed the generic from the
FreeList
struct and renamed itPyObjectFreeList
. That makes it much easier to add anunsafe impl for Send
. The generic parameter inFreeList
is not used for anything but*mut PyObject
.Note: I haven't benchmarked whether this causes multithreaded scaling issues, but also scaling badly is better than segfaulting 🤷♂️