Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add scheduler group to cpu profiler #169

Merged
merged 3 commits into from
Feb 20, 2025

Conversation

travisdowns
Copy link
Member

Capture the scheduler group running at the moment a sample is taken, so we can expose this in the profile json.

Include the current scheduling group in each CPU sample. In the case of
running tasks this is the SG associated with the task, or if the reactor
non-task code is running (e.g., pollers) it will be the default (aka
"main") scheduling group.
Test that scheduling groups are recorded in the sample. Additionally for
all the tests that don't set up any explicit scheduling groups, assert
that all samples are in the main group.
@StephanDollberg
Copy link
Member

t will be the default (aka "main") scheduling group.

Feels wrong but I guess everything else will overcomplicate things.

@travisdowns
Copy link
Member Author

Feels wrong but I guess everything else will overcomplicate things.

Not sure if it was clear from the comment, but this is actually what ss does: when run_tasks finishes it sets the SG to the main group. So it kind of reflects reality in a way.

H/e agree that it is wrong, one option is to if the SG is main, to check if a task is executing (e.g., currently_executing_task), if not set it to a dummy "reactor" sg or something like that. WDYT?

@StephanDollberg
Copy link
Member

H/e agree that it is wrong, one option is to if the SG is main, to check if a task is executing (e.g., currently_executing_task), if not set it to a dummy "reactor" sg or something like that. WDYT?

Yeah something like that would do it but I am not sure it's worth doing now? Should only be a slight inconvenience in pprof.

@travisdowns
Copy link
Member Author

Yeah something like that would do it but I am not sure it's worth doing now? Should only be a slight inconvenience in pprof.

I did look into it a bit more, but it seems a bit risky since we need to access more stuff in the signal context to make this determination. Maybe let's keep it in main for now. Arguably we could just have something like "redpanda main" SG which switch to at the very start of RP init, to keep almost everything out of the main group so then we know the remaining stuff is reactor-only (and also catch leaks of main group into places we don't expect it).

@travisdowns travisdowns merged commit 45984d2 into redpanda-data:v25.1.x Feb 20, 2025
0 of 14 checks passed
@@ -42,6 +43,9 @@ struct cpu_profiler_trace {
using kernel_trace_vec = boost::container::static_vector<uintptr_t, 64>;
simple_backtrace user_backtrace;
kernel_trace_vec kernel_backtrace;
// The scheduling group active at the time the same was taken. Note that
// non-task reactor work (such as polling)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

sentence incomplete

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

thanks will fix

#include <random>
#include <chrono>
#include <optional>
#include <random>
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

looks like wrong formatting to me

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

silly clang format, it's trying to match the alignment of the comment above or somthing, will fix

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants