Skip to content

Commit a39d78e

Browse files
fix(profiling): fix SystemError when collecting memory profiler events [backport 2.20] (#12110)
Backport e3045a1 from #12075 to 2.20. We added locking to the memory profiler to address crashes. These locks are mostly "try" locks, meaning we bail out if we can't acquire them right away. This was done defensively to mitigate the possibility of deadlock until we fully understood why the locks are needed and could guarantee their correctness. But as a result of using try locks, the `iter_events` function in particular can fail if the memory profiler lock is contended when it tries to collect profiling events. The function then returns NULL, leading to SystemError exceptions because we don't set an error. Even if we set an error, returning NULL isn't the right thing to do. It'll basically mean we wait until the next profile iteration, still accumulating events in the same buffer, and try again to upload the events. So we're going to get multiple iteration's worth of events. The right thing to do is take the lock unconditionally in `iter_events`. We can allocate the new tracker outside the memory allocation profiler lock so that we don't need to worry about reentrancy/deadlock issues if we start profiling that allocation. Then, the only thing we do under the lock is swap out the global tracker, so it's safe to take the lock unconditionally. Fixes #11831 TODO - regression test?
1 parent 2ce46df commit a39d78e

File tree

2 files changed

+19
-7
lines changed

2 files changed

+19
-7
lines changed

ddtrace/profiling/collector/_memalloc.c

Lines changed: 15 additions & 7 deletions
Original file line numberDiff line numberDiff line change
@@ -394,20 +394,28 @@ iterevents_new(PyTypeObject* type, PyObject* Py_UNUSED(args), PyObject* Py_UNUSE
394394
}
395395

396396
IterEventsState* iestate = (IterEventsState*)type->tp_alloc(type, 0);
397-
if (!iestate)
397+
if (!iestate) {
398+
PyErr_SetString(PyExc_RuntimeError, "failed to allocate IterEventsState");
398399
return NULL;
400+
}
399401

400402
memalloc_assert_gil();
401403

402-
/* reset the current traceback list */
403-
if (memlock_trylock(&g_memalloc_lock)) {
404-
iestate->alloc_tracker = global_alloc_tracker;
405-
global_alloc_tracker = alloc_tracker_new();
406-
memlock_unlock(&g_memalloc_lock);
407-
} else {
404+
/* Reset the current traceback list. Do this outside lock so we can track it,
405+
* and avoid reentrancy/deadlock problems, if we start tracking the raw
406+
* allocator domain */
407+
alloc_tracker_t* tracker = alloc_tracker_new();
408+
if (!tracker) {
409+
PyErr_SetString(PyExc_RuntimeError, "failed to allocate new allocation tracker");
408410
Py_TYPE(iestate)->tp_free(iestate);
409411
return NULL;
410412
}
413+
414+
memlock_lock(&g_memalloc_lock);
415+
iestate->alloc_tracker = global_alloc_tracker;
416+
global_alloc_tracker = tracker;
417+
memlock_unlock(&g_memalloc_lock);
418+
411419
iestate->seq_index = 0;
412420

413421
PyObject* iter_and_count = PyTuple_New(3);
Lines changed: 4 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,4 @@
1+
---
2+
fixes:
3+
- |
4+
profiling: fix SystemError from the memory profiler returning NULL when collecting events

0 commit comments

Comments
 (0)