test(profiling): unwind_greenlets RSS test on main without the fix [DO NOT MERGE] #18422
Draft: taegyunkim wants to merge 4 commits into main from taegyunkim/prof-14423-test-regression-check
+187
−0
caf0b62 test(profiling): add unwind_greenlets RSS-stability test (without the… (taegyunkim)
41d2a4d test(profiling): experiment - measure greenlet sampling malloc churn (taegyunkim)
087072c test(profiling): white-box guard for greenlet unwind buffer reuse (taegyunkim)
a73dcff test(profiling): access greenlet buffer alloc counter via _stack subm… (taegyunkim)
tests/profiling/collector/test_greenlet_buffer_reuse.py (100 changes: 100 additions & 0 deletions)

| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,100 @@ | ||
| """Regression guard: unwind_greenlets reuses its per-thread StackInfo buffer. | ||
|
|
||
| The buffer-reuse fix keeps the `current_greenlets` vector (and each entry's | ||
| FrameStack) alive between samples, so it only allocates new StackInfo objects | ||
| when a sample exceeds the prior peak greenlet count. Before the fix, every | ||
| sample allocated a fresh StackInfo (with a std::deque<Frame>) per tracked | ||
| greenlet -- a large source of native heap churn under gevent. | ||
|
|
||
| RSS/arena size cannot distinguish the two implementations (on the single | ||
| sampling thread, freed blocks are reused next sample, so the footprint is | ||
| identical). What differs is the *number of allocations*. The native module | ||
| exposes a cumulative counter, ``stack._stack._greenlet_buffer_alloc_count()``, that is | ||
| incremented only on the buffer-growth path. With reuse it plateaus once the | ||
| working set is sampled; a per-sample-allocation regression makes it grow | ||
| without bound. | ||
|
|
||
| This test asserts the counter stops growing after warmup. It fails on builds | ||
| without the fix (the counter symbol does not exist there). | ||
| """ | ||
|
|
||
| import os | ||
| import sys | ||
|
|
||
| import pytest | ||
|
|
||
|
|
||
| GEVENT_COMPATIBLE_WITH_PYTHON_VERSION = os.getenv("DD_PROFILE_TEST_GEVENT", False) and ( | ||
| sys.version_info[:2] < (3, 13) or (sys.version_info[:2] == (3, 13) and "free-threading" not in sys.version) | ||
| ) | ||
|
|
||
|
|
||
| @pytest.mark.skipif( | ||
| not GEVENT_COMPATIBLE_WITH_PYTHON_VERSION, | ||
| reason="gevent not compatible / DD_PROFILE_TEST_GEVENT not set", | ||
| ) | ||
| @pytest.mark.subprocess( | ||
| env=dict(DD_PROFILING_OUTPUT_PPROF="/tmp/test_greenlet_buffer_reuse"), | ||
| out=None, | ||
| err=None, | ||
| ) | ||
| def test_greenlet_unwind_buffer_reuse() -> None: | ||
| from gevent import monkey | ||
|
|
||
| monkey.patch_all() | ||
|
|
||
| import gevent | ||
|
|
||
| from ddtrace.internal.datadog.profiling import stack | ||
| from ddtrace.profiling import profiler | ||
|
|
||
| N_IDLE = 500 | ||
| STACK_DEPTH = 30 | ||
| WARMUP_S = 3.0 | ||
| MEASURE_S = 5.0 | ||
|
|
||
| def _idle_deep(depth: int) -> None: | ||
| if depth > 0: | ||
| _idle_deep(depth - 1) | ||
| else: | ||
| gevent.sleep(1000) | ||
|
|
||
| def idle_greenlet() -> None: | ||
| _idle_deep(STACK_DEPTH) | ||
|
|
||
| p = profiler.Profiler() | ||
| p.start() | ||
| stack.set_interval(0.005) # 5ms (minimum) for aggressive sampling | ||
| stack.set_adaptive_sampling(False) | ||
| try: | ||
| idles = [gevent.spawn(idle_greenlet) for _ in range(N_IDLE)] | ||
| gevent.sleep(0.2) # let them register | ||
|
|
||
| # Warm up: across several full samples the reuse buffer grows to the | ||
| # peak greenlet count and then stops allocating. | ||
| gevent.sleep(WARMUP_S) | ||
| c1 = stack._stack._greenlet_buffer_alloc_count() | ||
|
|
||
| # Sustained sampling: with reuse the counter must not keep climbing. | ||
| gevent.sleep(MEASURE_S) | ||
| c2 = stack._stack._greenlet_buffer_alloc_count() | ||
|
|
||
| gevent.killall(idles, timeout=5) | ||
| finally: | ||
| p.stop() | ||
|
|
||
| # Sanity: the buffer was actually populated, i.e. greenlets were sampled and | ||
| # unwound. One full sample grows one StackInfo per leaf greenlet. | ||
| assert c1 >= N_IDLE // 2, ( | ||
| f"greenlet reuse buffer was barely populated (c1={c1}, N_IDLE={N_IDLE}); " | ||
| f"greenlets may not have been sampled, so this guard would be vacuous." | ||
| ) | ||
|
|
||
| # The actual guard: after warmup the buffer is reused, so growth ~ 0. | ||
| # Allow generous slack for incidental greenlet churn / parent-chain entries. | ||
| growth = c2 - c1 | ||
| assert growth <= N_IDLE // 10, ( | ||
| f"unwind_greenlets is allocating StackInfo per sample instead of reusing " | ||
| f"its buffer: it grew by {growth} over {MEASURE_S:.0f}s after warmup " | ||
| f"(c1={c1}, c2={c2}). Per-sample greenlet buffers are no longer reused." | ||
| ) |
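
The plateau that this test asserts can be illustrated with a small pure-Python model. This is only a sketch: `ReusingSampler`, `PerSampleSampler`, and `growth_after_warmup` are hypothetical names invented for illustration, and the model's counter is a stand-in for the native `_greenlet_buffer_alloc_count()`, not a reproduction of it.

```python
from typing import List


class ReusingSampler:
    """Toy model of the fixed path: slots survive between samples."""

    def __init__(self) -> None:
        self._slots: List[List[str]] = []
        self.alloc_count = 0  # bumped only when the buffer has to grow

    def sample(self, n_greenlets: int) -> None:
        # Grow only when this sample exceeds the prior peak greenlet count.
        while len(self._slots) < n_greenlets:
            self._slots.append([])
            self.alloc_count += 1
        # Reuse existing slots: clear them in place instead of reallocating.
        for slot in self._slots[:n_greenlets]:
            slot.clear()
            slot.append("frame")


class PerSampleSampler:
    """Toy model of a regression: every sample allocates fresh slots."""

    def __init__(self) -> None:
        self.alloc_count = 0

    def sample(self, n_greenlets: int) -> None:
        slots = [["frame"] for _ in range(n_greenlets)]
        self.alloc_count += len(slots)


def growth_after_warmup(sampler, n_greenlets: int, warmup: int, measure: int) -> int:
    """Return how much the counter grows during the measurement samples."""
    for _ in range(warmup):
        sampler.sample(n_greenlets)
    c1 = sampler.alloc_count
    for _ in range(measure):
        sampler.sample(n_greenlets)
    return sampler.alloc_count - c1
```

In this model, `growth_after_warmup(ReusingSampler(), 500, 10, 100)` is 0, while the per-sample variant grows by 500 × 100 = 50,000, far above the `N_IDLE // 10` slack the test allows.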
Because the baseline is taken only after the 3s warmup, this test can pass on the unfixed path whenever the allocator has already grown or cached enough memory during warmup to satisfy the repeated `StackInfo`/`deque` allocations during the 10s measurement window. In that scenario the per-sample churn still exists, but `peak_rss - rss_after_warmup` stays under 20 MB, so the regression guard produces the false negative the test is meant to rule out. Assert on allocation or heap-live-size churn, or compare fixed-vs-baseline behavior, instead of post-warmup RSS growth.
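
The false negative described here can be sketched with a toy allocator. `FreeListAllocator` and `run_unfixed_samples` are hypothetical names, and the model assumes an allocator that caches freed blocks instead of returning them to the OS; real allocators are more complicated, so treat this as an illustration of the argument only.

```python
class FreeListAllocator:
    """Toy allocator that caches freed blocks and hands them back out."""

    def __init__(self) -> None:
        self.free_blocks = 0  # blocks currently cached on the free list
        self.footprint = 0    # blocks ever obtained from the OS (stand-in for RSS)
        self.alloc_calls = 0  # total number of allocation requests

    def alloc(self) -> None:
        self.alloc_calls += 1
        if self.free_blocks > 0:
            self.free_blocks -= 1  # satisfied from the cache, footprint unchanged
        else:
            self.footprint += 1    # cache empty, so the footprint has to grow

    def free(self) -> None:
        self.free_blocks += 1


def run_unfixed_samples(allocator: FreeListAllocator, n_greenlets: int, n_samples: int) -> None:
    """Model the unfixed path: allocate one block per greenlet, then free them all."""
    for _ in range(n_samples):
        for _ in range(n_greenlets):
            allocator.alloc()
        for _ in range(n_greenlets):
            allocator.free()


allocator = FreeListAllocator()
run_unfixed_samples(allocator, n_greenlets=500, n_samples=10)   # warmup
footprint_after_warmup = allocator.footprint
calls_after_warmup = allocator.alloc_calls

run_unfixed_samples(allocator, n_greenlets=500, n_samples=100)  # measurement
footprint_growth = allocator.footprint - footprint_after_warmup
call_growth = allocator.alloc_calls - calls_after_warmup
```

In this model `footprint_growth` is 0 even though `call_growth` is 50,000: a footprint-based guard cannot see the churn, whereas a guard on the allocation count can.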