Skip to content

Commit aa1fbaa

Browse files
authored
fix(profiling): fix SystemError when collecting memory profiler events [backport 2.19] (#12125)
Backports #12075 to 2.19 We added locking to the memory profiler to address crashes. These locks are mostly "try" locks, meaning we bail out if we can't acquire them right away. This was done defensively to mitigate the possibility of deadlock until we fully understood why the locks are needed and could guarantee their correctness. But as a result of using try locks, the `iter_events` function in particular can fail if the memory profiler lock is contended when it tries to collect profiling events. The function then returns NULL, leading to SystemError exceptions because we don't set an error. Even if we set an error, returning NULL isn't the right thing to do. It'll basically mean we wait until the next profile iteration, still accumulating events in the same buffer, and try again to upload the events. So we're going to get multiple iteration's worth of events. The right thing to do is take the lock unconditionally in `iter_events`. We can allocate the new tracker outside the memory allocation profiler lock so that we don't need to worry about reentrancy/deadlock issues if we start profiling that allocation. Then, the only thing we do under the lock is swap out the global tracker, so it's safe to take the lock unconditionally. Fixes #11831 TODO - regression test?
1 parent 5dd85b4 commit aa1fbaa

File tree

2 files changed

+19
-7
lines changed

2 files changed

+19
-7
lines changed

ddtrace/profiling/collector/_memalloc.c

+15-7
Original file line numberDiff line numberDiff line change
@@ -380,18 +380,26 @@ iterevents_new(PyTypeObject* type, PyObject* Py_UNUSED(args), PyObject* Py_UNUSE
380380
}
381381

382382
IterEventsState* iestate = (IterEventsState*)type->tp_alloc(type, 0);
383-
if (!iestate)
383+
if (!iestate) {
384+
PyErr_SetString(PyExc_RuntimeError, "failed to allocate IterEventsState");
384385
return NULL;
386+
}
385387

386-
/* reset the current traceback list */
387-
if (memlock_trylock(&g_memalloc_lock)) {
388-
iestate->alloc_tracker = global_alloc_tracker;
389-
global_alloc_tracker = alloc_tracker_new();
390-
memlock_unlock(&g_memalloc_lock);
391-
} else {
388+
/* Reset the current traceback list. Do this outside lock so we can track it,
389+
* and avoid reentrancy/deadlock problems, if we start tracking the raw
390+
* allocator domain */
391+
alloc_tracker_t* tracker = alloc_tracker_new();
392+
if (!tracker) {
393+
PyErr_SetString(PyExc_RuntimeError, "failed to allocate new allocation tracker");
392394
Py_TYPE(iestate)->tp_free(iestate);
393395
return NULL;
394396
}
397+
398+
memlock_lock(&g_memalloc_lock);
399+
iestate->alloc_tracker = global_alloc_tracker;
400+
global_alloc_tracker = tracker;
401+
memlock_unlock(&g_memalloc_lock);
402+
395403
iestate->seq_index = 0;
396404

397405
PyObject* iter_and_count = PyTuple_New(3);
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,4 @@
1+
---
2+
fixes:
3+
- |
4+
profiling: fix SystemError from the memory profiler returning NULL when collecting events

0 commit comments

Comments
 (0)