Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

fix multithreaded access to freelist pyclasses #4902

Merged
merged 6 commits into from
Feb 19, 2025

Conversation

ngoldbaum
Copy link
Contributor

@ngoldbaum ngoldbaum commented Feb 10, 2025

Fixes #4894

Wraps the freelist in a mutex.

I removed the generic from the FreeList struct and renamed it PyObjectFreeList. That makes it much easier to add an unsafe impl for Send. The generic parameter in FreeList is not used for anything but *mut PyObject.

Note: I haven't benchmarked whether this causes multithreaded scaling issues, but also scaling badly is better than segfaulting 🤷‍♂️

Copy link
Member

@davidhewitt davidhewitt left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks, gave this another pass with fresh eyes quickly this morning.

if let Some(obj) = T::get_free_list(Python::assume_gil_acquired()).insert(obj) {
let mut free_list = T::get_free_list(Python::assume_gil_acquired())
.lock()
.unwrap();
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I guess that the unwrap() here could produce a panic. We might need to install panic traps. Is there ever a way this lock gets poisoned? (similar in alloc function). 🤔

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If I manually insert a panic on the line below it does indeed crash Python if I modify one of the examples to use a freelist pyclass:

tests/test_getitem.py Got a slice! 3-5, step: 1, value: 2
Got an index! 2 : value: 2
thread '<unnamed>' panicked at /Users/goldbaum/Documents/pyo3/src/impl_/pyclass.rs:962:5:
panic in freelist free
note: run with `RUST_BACKTRACE=1` environment variable to display a backtrace
thread '<unnamed>' panicked at core/src/panicking.rs:223:5:
panic in a function that cannot unwind
stack backtrace:
   0:        0x103eb9268 - std::backtrace_rs::backtrace::libunwind::trace::h4965e0a7b4ac11d7
                               at /rustc/e71f9a9a98b0faf423844bf0ba7438f29dc27d58/library/std/src/../../backtrace/src/backtrace/libunwind.rs:116:5
   1:        0x103eb9268 - std::backtrace_rs::backtrace::trace_unsynchronized::hf4fa2da75bbd5d09
                               at /rustc/e71f9a9a98b0faf423844bf0ba7438f29dc27d58/library/std/src/../../backtrace/src/backtrace/mod.rs:66:5
   2:        0x103eb9268 - std::sys::backtrace::_print_fmt::h75773692a17404c8
                               at /rustc/e71f9a9a98b0faf423844bf0ba7438f29dc27d58/library/std/src/sys/backtrace.rs:66:9
   3:        0x103eb9268 - <std::sys::backtrace::BacktraceLock::print::DisplayBacktrace as core::fmt::Display>::fmt::h39ba3129e355bb22
                               at /rustc/e71f9a9a98b0faf423844bf0ba7438f29dc27d58/library/std/src/sys/backtrace.rs:39:26
   4:        0x103ecd278 - core::fmt::rt::Argument::fmt::h34f25d464889fcc7
                               at /rustc/e71f9a9a98b0faf423844bf0ba7438f29dc27d58/library/core/src/fmt/rt.rs:177:76
   5:        0x103ecd278 - core::fmt::write::h8b50d3a0f616451a
                               at /rustc/e71f9a9a98b0faf423844bf0ba7438f29dc27d58/library/core/src/fmt/mod.rs:1189:21
   6:        0x103eb7740 - std::io::Write::write_fmt::h4b3bbae7048e35f8
                               at /rustc/e71f9a9a98b0faf423844bf0ba7438f29dc27d58/library/std/src/io/mod.rs:1884:15
   7:        0x103eb911c - std::sys::backtrace::BacktraceLock::print::h7934b1e389160086
                               at /rustc/e71f9a9a98b0faf423844bf0ba7438f29dc27d58/library/std/src/sys/backtrace.rs:42:9
   8:        0x103eba068 - std::panicking::default_hook::{{closure}}::hbcd636b20f603d1e
                               at /rustc/e71f9a9a98b0faf423844bf0ba7438f29dc27d58/library/std/src/panicking.rs:268:22
   9:        0x103eb9e9c - std::panicking::default_hook::ha9081970ba26bc6c
                               at /rustc/e71f9a9a98b0faf423844bf0ba7438f29dc27d58/library/std/src/panicking.rs:295:9
  10:        0x103eba858 - std::panicking::rust_panic_with_hook::h9a5dc30b684e2ff4
                               at /rustc/e71f9a9a98b0faf423844bf0ba7438f29dc27d58/library/std/src/panicking.rs:801:13
  11:        0x103eba4a8 - std::panicking::begin_panic_handler::{{closure}}::hbcb5de8b840ae91c
                               at /rustc/e71f9a9a98b0faf423844bf0ba7438f29dc27d58/library/std/src/panicking.rs:667:13
  12:        0x103eb972c - std::sys::backtrace::__rust_end_short_backtrace::ha657d4b4d65dc993
                               at /rustc/e71f9a9a98b0faf423844bf0ba7438f29dc27d58/library/std/src/sys/backtrace.rs:170:18
  13:        0x103eba188 - rust_begin_unwind
                               at /rustc/e71f9a9a98b0faf423844bf0ba7438f29dc27d58/library/std/src/panicking.rs:665:5
  14:        0x103ed574c - core::panicking::panic_nounwind_fmt::runtime::h13e8a6e8075ea543
                               at /rustc/e71f9a9a98b0faf423844bf0ba7438f29dc27d58/library/core/src/panicking.rs:119:22
  15:        0x103ed574c - core::panicking::panic_nounwind_fmt::h4a10ecea0e21f67a
                               at /rustc/e71f9a9a98b0faf423844bf0ba7438f29dc27d58/library/core/src/intrinsics/mod.rs:3535:9
  16:        0x103ed57c4 - core::panicking::panic_nounwind::ha9a59379b5f3f41a
                               at /rustc/e71f9a9a98b0faf423844bf0ba7438f29dc27d58/library/core/src/panicking.rs:223:5
  17:        0x103ed58f4 - core::panicking::panic_cannot_unwind::h1bb1158913507f0a
                               at /rustc/e71f9a9a98b0faf423844bf0ba7438f29dc27d58/library/core/src/panicking.rs:315:5
  18:        0x103e6be70 - pyo3::impl_::pyclass::free_with_freelist::h4db796234226fcb0
                               at /Users/goldbaum/Documents/pyo3/src/impl_/pyclass.rs:953:1
  19:        0x103e67bd4 - <pyo3::pycell::impl_::PyClassObjectBase<U> as pyo3::pycell::impl_::PyClassObjectLayout<T>>::tp_dealloc::hd994ca52f6b1b031
                               at /Users/goldbaum/Documents/pyo3/src/pycell/impl_.rs:248:20
  20:        0x103e67978 - <pyo3::pycell::impl_::PyClassObject<T> as pyo3::pycell::impl_::PyClassObjectLayout<T>>::tp_dealloc::h639038ac314c91b3
                               at /Users/goldbaum/Documents/pyo3/src/pycell/impl_.rs:353:9
  21:        0x103e693ec - pyo3::impl_::trampoline::dealloc::{{closure}}::h48d3ff4ddbf5ef19
                               at /Users/goldbaum/Documents/pyo3/src/impl_/trampoline.rs:151:13
  22:        0x103e693b4 - pyo3::impl_::trampoline::trampoline_unraisable::{{closure}}::ha3808cd18eb9c05a
                               at /Users/goldbaum/Documents/pyo3/src/impl_/trampoline.rs:235:54
  23:        0x103e67f00 - std::panicking::try::do_call::haf1c85a8e3efdaf4
                               at /Users/goldbaum/.rustup/toolchains/stable-aarch64-apple-darwin/lib/rustlib/src/rust/library/std/src/panicking.rs:557:40
  24:        0x103e69c74 - ___rust_try
  25:        0x103e69bc8 - std::panicking::try::h1a8dc01c8f0edfde
                               at /Users/goldbaum/.rustup/toolchains/stable-aarch64-apple-darwin/lib/rustlib/src/rust/library/std/src/panicking.rs:520:19
  26:        0x103e69bc8 - std::panic::catch_unwind::h7cbb307187149027
                               at /Users/goldbaum/.rustup/toolchains/stable-aarch64-apple-darwin/lib/rustlib/src/rust/library/std/src/panic.rs:358:14
  27:        0x103e691dc - pyo3::impl_::trampoline::trampoline_unraisable::h192e13d6550f4cab
                               at /Users/goldbaum/Documents/pyo3/src/impl_/trampoline.rs:235:26
  28:        0x103e63584 - pyo3::impl_::trampoline::dealloc::h55f3f62fd56929e3
                               at /Users/goldbaum/Documents/pyo3/src/impl_/trampoline.rs:149:5
  29:        0x103e6bcc8 - pyo3::impl_::pyclass::tp_dealloc::h8d0e77cfb936ebd4
                               at /Users/goldbaum/Documents/pyo3/src/impl_/pyclass.rs:1154:5
  30:        0x1032fe340 - __PyFrame_ClearExceptCode
  31:        0x1032cf7c4 - __PyEval_EvalFrameDefault
  32:        0x1031cbeac - __PyObject_FastCallDictTstate
  33:        0x1031ccf70 - __PyObject_Call_Prepend
  34:        0x10323be04 - _slot_tp_call
  35:        0x1031cc0fc - __PyObject_MakeTpCall
  36:        0x1032d6280 - __PyEval_EvalFrameDefault
  37:        0x1031cbeac - __PyObject_FastCallDictTstate
  38:        0x1031ccf70 - __PyObject_Call_Prepend
  39:        0x10323be04 - _slot_tp_call
  40:        0x1031cca68 - __PyObject_Call
  41:        0x1032d7c14 - __PyEval_EvalFrameDefault
  42:        0x1031cbeac - __PyObject_FastCallDictTstate
  43:        0x1031ccf70 - __PyObject_Call_Prepend
  44:        0x10323be04 - _slot_tp_call
  45:        0x1031cc0fc - __PyObject_MakeTpCall
  46:        0x1032d6280 - __PyEval_EvalFrameDefault
  47:        0x1031cbeac - __PyObject_FastCallDictTstate
  48:        0x1031ccf70 - __PyObject_Call_Prepend
  49:        0x10323be04 - _slot_tp_call
  50:        0x1031cc0fc - __PyObject_MakeTpCall
  51:        0x1032d6280 - __PyEval_EvalFrameDefault
  52:        0x1031cbeac - __PyObject_FastCallDictTstate
  53:        0x1031ccf70 - __PyObject_Call_Prepend
  54:        0x10323be04 - _slot_tp_call
  55:        0x1031cc0fc - __PyObject_MakeTpCall
  56:        0x1032d6280 - __PyEval_EvalFrameDefault
  57:        0x1032ccc54 - _PyEval_EvalCode
  58:        0x10332dc40 - _run_mod
  59:        0x10332c028 - __PyRun_SimpleFileObject
  60:        0x10332ba5c - __PyRun_AnyFileObject
  61:        0x10334faf4 - _Py_RunMain
  62:        0x10334fec8 - _pymain_main
  63:        0x10334ff68 - _Py_BytesMain
thread caused non-unwinding panic. aborting.
Fatal Python error: Aborted

Current thread 0x00000001f8d0c840 (most recent call first):
  File "/Users/goldbaum/Documents/pyo3/examples/getitem/.nox/python/lib/python3.12/site-packages/_pytest/python.py", line 159 in pytest_pyfunc_call
  File "/Users/goldbaum/Documents/pyo3/examples/getitem/.nox/python/lib/python3.12/site-packages/pluggy/_callers.py", line 103 in _multicall
  File "/Users/goldbaum/Documents/pyo3/examples/getitem/.nox/python/lib/python3.12/site-packages/pluggy/_manager.py", line 120 in _hookexec
  File "/Users/goldbaum/Documents/pyo3/examples/getitem/.nox/python/lib/python3.12/site-packages/pluggy/_hooks.py", line 513 in __call__
  File "/Users/goldbaum/Documents/pyo3/examples/getitem/.nox/python/lib/python3.12/site-packages/_pytest/python.py", line 1627 in runtest
  File "/Users/goldbaum/Documents/pyo3/examples/getitem/.nox/python/lib/python3.12/site-packages/_pytest/runner.py", line 174 in pytest_runtest_call
  File "/Users/goldbaum/Documents/pyo3/examples/getitem/.nox/python/lib/python3.12/site-packages/pluggy/_callers.py", line 103 in _multicall
  File "/Users/goldbaum/Documents/pyo3/examples/getitem/.nox/python/lib/python3.12/site-packages/pluggy/_manager.py", line 120 in _hookexec
  File "/Users/goldbaum/Documents/pyo3/examples/getitem/.nox/python/lib/python3.12/site-packages/pluggy/_hooks.py", line 513 in __call__
  File "/Users/goldbaum/Documents/pyo3/examples/getitem/.nox/python/lib/python3.12/site-packages/_pytest/runner.py", line 242 in <lambda>
  File "/Users/goldbaum/Documents/pyo3/examples/getitem/.nox/python/lib/python3.12/site-packages/_pytest/runner.py", line 341 in from_call
  File "/Users/goldbaum/Documents/pyo3/examples/getitem/.nox/python/lib/python3.12/site-packages/_pytest/runner.py", line 241 in call_and_report
  File "/Users/goldbaum/Documents/pyo3/examples/getitem/.nox/python/lib/python3.12/site-packages/_pytest/runner.py", line 132 in runtestprotocol
  File "/Users/goldbaum/Documents/pyo3/examples/getitem/.nox/python/lib/python3.12/site-packages/_pytest/runner.py", line 113 in pytest_runtest_protocol
  File "/Users/goldbaum/Documents/pyo3/examples/getitem/.nox/python/lib/python3.12/site-packages/pluggy/_callers.py", line 103 in _multicall
  File "/Users/goldbaum/Documents/pyo3/examples/getitem/.nox/python/lib/python3.12/site-packages/pluggy/_manager.py", line 120 in _hookexec
  File "/Users/goldbaum/Documents/pyo3/examples/getitem/.nox/python/lib/python3.12/site-packages/pluggy/_hooks.py", line 513 in __call__
  File "/Users/goldbaum/Documents/pyo3/examples/getitem/.nox/python/lib/python3.12/site-packages/_pytest/main.py", line 362 in pytest_runtestloop
  File "/Users/goldbaum/Documents/pyo3/examples/getitem/.nox/python/lib/python3.12/site-packages/pluggy/_callers.py", line 103 in _multicall
  File "/Users/goldbaum/Documents/pyo3/examples/getitem/.nox/python/lib/python3.12/site-packages/pluggy/_manager.py", line 120 in _hookexec
  File "/Users/goldbaum/Documents/pyo3/examples/getitem/.nox/python/lib/python3.12/site-packages/pluggy/_hooks.py", line 513 in __call__
  File "/Users/goldbaum/Documents/pyo3/examples/getitem/.nox/python/lib/python3.12/site-packages/_pytest/main.py", line 337 in _main
  File "/Users/goldbaum/Documents/pyo3/examples/getitem/.nox/python/lib/python3.12/site-packages/_pytest/main.py", line 283 in wrap_session
  File "/Users/goldbaum/Documents/pyo3/examples/getitem/.nox/python/lib/python3.12/site-packages/_pytest/main.py", line 330 in pytest_cmdline_main
  File "/Users/goldbaum/Documents/pyo3/examples/getitem/.nox/python/lib/python3.12/site-packages/pluggy/_callers.py", line 103 in _multicall
  File "/Users/goldbaum/Documents/pyo3/examples/getitem/.nox/python/lib/python3.12/site-packages/pluggy/_manager.py", line 120 in _hookexec
  File "/Users/goldbaum/Documents/pyo3/examples/getitem/.nox/python/lib/python3.12/site-packages/pluggy/_hooks.py", line 513 in __call__
  File "/Users/goldbaum/Documents/pyo3/examples/getitem/.nox/python/lib/python3.12/site-packages/_pytest/config/__init__.py", line 175 in main
  File "/Users/goldbaum/Documents/pyo3/examples/getitem/.nox/python/lib/python3.12/site-packages/_pytest/config/__init__.py", line 201 in console_main
  File "/Users/goldbaum/Documents/pyo3/examples/getitem/.nox/python/bin/pytest", line 8 in <module>
nox > Command pytest -s failed with exit code -6
nox > Session python failed.

I did the test with this diff:

diff --git a/examples/getitem/noxfile.py b/examples/getitem/noxfile.py
index 23d967cc..551007a9 100644
--- a/examples/getitem/noxfile.py
+++ b/examples/getitem/noxfile.py
@@ -5,4 +5,4 @@ import nox
 def python(session: nox.Session):
     session.env["MATURIN_PEP517_ARGS"] = "--profile=dev"
     session.install(".[dev]")
-    session.run("pytest")
+    session.run("pytest", "-s")
diff --git a/examples/getitem/src/lib.rs b/examples/getitem/src/lib.rs
index ba850a06..cea2ef7b 100644
--- a/examples/getitem/src/lib.rs
+++ b/examples/getitem/src/lib.rs
@@ -9,7 +9,7 @@ enum IntOrSlice<'py> {
     Slice(Bound<'py, PySlice>),
 }

-#[pyclass]
+#[pyclass(freelist = 2)]
 struct ExampleContainer {
     // represent the maximum length our container is pretending to be
     max_length: i32,

Are there any examples elsewhere of how we set up panic traps that I might look at?

FWIW, any panics in this function aren't new in this PR, so I don't think using unwrap() on a poisoned mutex actually changes any observable behavior if this code can panic in practice. I also think that if the mutex is poisoned there's not much we can do besides propagate the panic.

The freelist wraps a boxed slice (with your suggestion) and it uses indexing so on debug builds it's possible for an OOB access to generate a panic. I don't immediately see other ways.

Copy link
Member

@davidhewitt davidhewitt left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks, this LGTM 👍

Based on your observation that poisoned mutexes are only a problem if these functions already had panics, let's not worry here.

@davidhewitt davidhewitt added this pull request to the merge queue Feb 19, 2025
@github-merge-queue github-merge-queue bot removed this pull request from the merge queue due to failed status checks Feb 19, 2025
@davidhewitt davidhewitt added this pull request to the merge queue Feb 19, 2025
@mgorny mgorny mentioned this pull request Feb 19, 2025
7 tasks
Merged via the queue into PyO3:main with commit 03bb5b4 Feb 19, 2025
48 checks passed
mgorny pushed a commit to mgorny/pyo3 that referenced this pull request Feb 19, 2025
* make FreeList explicitly a wrapper for *mut PyObject

* fix multithreaded access to freelist pyclasses

* add changelog entry

* use a GILOnceCell to initialize the freelist

* respond to code review

* skip test on wasm

---------

Co-authored-by: David Hewitt <[email protected]>
davidhewitt added a commit that referenced this pull request Feb 20, 2025
* make FreeList explicitly a wrapper for *mut PyObject

* fix multithreaded access to freelist pyclasses

* add changelog entry

* use a GILOnceCell to initialize the freelist

* respond to code review

* skip test on wasm

---------

Co-authored-by: David Hewitt <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

freelist option for pyclass is not thread-safe on 3.13t
3 participants