fix(profiling): fix SystemError when collecting memory profiler events [backport 2.19] (#12125)

Backports #12075 to 2.19

We added locking to the memory profiler to address crashes. These locks
are mostly "try" locks, meaning we bail out if we can't acquire them
right away. This was done defensively to mitigate the possibility of
deadlock until we fully understood why the locks are needed and could
guarantee their correctness. But as a result of using try locks, the
`iter_events` function in particular can fail if the memory profiler
lock is contended when it tries to collect profiling events. The
function then returns NULL without setting a Python exception, which
the interpreter reports as a SystemError.
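
A minimal sketch of the failure mode, using a plain pthread mutex rather
than the profiler's own lock type. The names `collect_events_try`,
`g_lock`, and `g_event_count` are hypothetical and only illustrate the
pattern; they are not taken from `_memalloc.c`:

```c
#include <assert.h>
#include <pthread.h>

/* Hypothetical stand-ins for the profiler lock and its event buffer. */
static pthread_mutex_t g_lock = PTHREAD_MUTEX_INITIALIZER;
static int g_event_count = 0;

/* Try-lock flavor: gives up immediately if the lock is already held.
 * Returns the number of events collected, or -1 if the lock was busy.
 * On the -1 path the buffered events are left in place, so they carry
 * over into the next collection attempt. */
static int
collect_events_try(void)
{
    if (pthread_mutex_trylock(&g_lock) != 0) {
        return -1; /* contended: nothing collected */
    }
    int collected = g_event_count;
    g_event_count = 0;
    int rc = pthread_mutex_unlock(&g_lock);
    assert(rc == 0);
    (void)rc;
    return collected;
}
```

If the caller treats the -1 path as "return NULL" without also setting
an exception, the interpreter has no error to report and raises
SystemError instead.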

Even if we set an error, returning NULL isn't the right thing to do.
It would mean we wait until the next profile iteration, still
accumulating events in the same buffer, and then try again to upload
the events, so one upload would carry multiple iterations' worth of
events. The right thing to do is take the lock unconditionally in
`iter_events`. We
can allocate the new tracker outside the memory allocation profiler lock
so that we don't need to worry about reentrancy/deadlock issues if
we start profiling that allocation. Then, the only thing we do under the
lock is swap out the global tracker, so it's safe to take the lock
unconditionally.
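
The allocate-outside, swap-under-lock pattern described above can be
sketched as follows. This is an illustration under assumptions, not the
code in `_memalloc.c`: it uses a pthread mutex, and `tracker_t`,
`tracker_new`, `tracker_swap`, `g_tracker`, and `g_tracker_lock` are
hypothetical names:

```c
#include <assert.h>
#include <pthread.h>
#include <stdlib.h>

/* Hypothetical tracker type; the real one holds sampled tracebacks. */
typedef struct {
    size_t count;
} tracker_t;

static pthread_mutex_t g_tracker_lock = PTHREAD_MUTEX_INITIALIZER;
static tracker_t* g_tracker = NULL;

/* Allocate a zeroed tracker; returns NULL on allocation failure. */
static tracker_t*
tracker_new(void)
{
    return calloc(1, sizeof(tracker_t));
}

/* Replace the global tracker with a fresh one and hand the old one to
 * the caller. Returns 0 on success, -1 if allocation failed (in which
 * case the global tracker is left untouched). */
static int
tracker_swap(tracker_t** out_old)
{
    assert(out_old != NULL);

    /* Allocate before taking the lock, so an allocator hook that itself
     * needs the lock cannot deadlock against us. */
    tracker_t* fresh = tracker_new();
    if (!fresh) {
        return -1;
    }

    /* The critical section is only a pointer swap, so a blocking lock
     * is held very briefly and never across an allocation. */
    pthread_mutex_lock(&g_tracker_lock);
    *out_old = g_tracker;
    g_tracker = fresh;
    pthread_mutex_unlock(&g_tracker_lock);
    return 0;
}
```

Because the swap always happens once the allocation succeeds, each call
hands back exactly one iteration's worth of events.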

Fixes #11831

TODO - regression test?
nsrip-dd authored Jan 28, 2025
1 parent 5dd85b4 commit aa1fbaa
Showing 2 changed files with 19 additions and 7 deletions.
22 changes: 15 additions & 7 deletions ddtrace/profiling/collector/_memalloc.c
@@ -380,18 +380,26 @@ iterevents_new(PyTypeObject* type, PyObject* Py_UNUSED(args), PyObject* Py_UNUSE
     }
 
     IterEventsState* iestate = (IterEventsState*)type->tp_alloc(type, 0);
-    if (!iestate)
+    if (!iestate) {
+        PyErr_SetString(PyExc_RuntimeError, "failed to allocate IterEventsState");
         return NULL;
+    }
 
-    /* reset the current traceback list */
-    if (memlock_trylock(&g_memalloc_lock)) {
-        iestate->alloc_tracker = global_alloc_tracker;
-        global_alloc_tracker = alloc_tracker_new();
-        memlock_unlock(&g_memalloc_lock);
-    } else {
+    /* Reset the current traceback list. Do this outside lock so we can track it,
+     * and avoid reentrancy/deadlock problems, if we start tracking the raw
+     * allocator domain */
+    alloc_tracker_t* tracker = alloc_tracker_new();
+    if (!tracker) {
+        PyErr_SetString(PyExc_RuntimeError, "failed to allocate new allocation tracker");
         Py_TYPE(iestate)->tp_free(iestate);
         return NULL;
     }
+
+    memlock_lock(&g_memalloc_lock);
+    iestate->alloc_tracker = global_alloc_tracker;
+    global_alloc_tracker = tracker;
+    memlock_unlock(&g_memalloc_lock);
+
     iestate->seq_index = 0;
 
     PyObject* iter_and_count = PyTuple_New(3);
@@ -0,0 +1,4 @@
+---
+fixes:
+  - |
+    profiling: fix SystemError from the memory profiler returning NULL when collecting events
