Skip to content

Commit febce1f

Browse files
committed
Auto merge of rust-lang#95689 - lqd:self-profiler, r=wesleywiser
Allow self-profiler to only record potentially costly arguments when argument recording is turned on As discussed [on zulip](https://rust-lang.zulipchat.com/#narrow/stream/247081-t-compiler.2Fperformance/topic/Identifying.20proc-macro.20slowdowns/near/277304909) with `@wesleywiser,` I'd like to record proc-macro expansions in the self-profiler, with some detailed data (per-expansion spans for example, to follow rust-lang#95473). At the same time, I'd also like to avoid doing expensive things when tracking a generic activity's arguments, if they were not specifically opted into the event filter mask, to allow the self-profiler to be used in hotter contexts. This PR tries to offer: - a way to ensure a closure to record arguments will only be called in that situation, so that potentially costly arguments can still be recorded when needed. With the additional requirement that, if possible, it would offer a way to record non-owned data without adding many `generic_activity_with_arg_{...}`-style methods. This lead to the `generic_activity_with_arg_recorder` single entry-point, and the closure parameter would offer the new methods, able to be executed in a context where costly argument could be created without disturbing the profiled piece of code. - some facilities/patterns allowing to record more rustc specific data in this situation, without making `rustc_data_structures` where the self-profiler is defined, depend on other rustc crates (causing circular dependencies): in particular, spans. They are quite tricky to turn into strings (if the default `Debug` impl output does not match the context one needs them for), and since I'd also like to avoid the allocation there when arg recording is turned off today, that has turned into another flexibility requirement for the API in this PR (separating the span-specific recording into an extension trait). **edit**: I've removed this from the PR so that it's easier to review, and opened rust-lang#95739. - allow for extensibility in the future: other ways to record arguments, or additional data attached to them could be added in the future (e.g. recording the argument's name as well as its data). Some areas where I'd love feedback: - the API and names: the `EventArgRecorder` and its method for example. As well as the verbosity that comes from the increased flexibility. - if I should convert the existing `generic_activity_with_arg{s}` to just forward to `generic_activity_with_arg_recorder` + `recorder.record_arg` (or remove them altogether ? Probably not): I've used the new API in the simple case I could find of allocating for an arg that may not be recorded, and the rest don't seem costly. - [x] whether this API should panic if no arguments were recorded by the user-provided closure (like this PR currently does: it seems like an error to use an API dedicated to record arguments but not call the methods to then do so) or if this should just record a generic activity without arguments ? - whether the `record_arg` function should be `#[inline(always)]`, like the `generic_activity_*` functions ? As mentioned, r? `@wesleywiser` following our recent discussion.
2 parents c842240 + 1906b7e commit febce1f

File tree

5 files changed

+103
-13
lines changed

5 files changed

+103
-13
lines changed

compiler/rustc_codegen_gcc/src/back/write.rs

+1-1
Original file line numberDiff line numberDiff line change
@@ -11,7 +11,7 @@ use rustc_target::spec::SplitDebuginfo;
1111
use crate::{GccCodegenBackend, GccContext};
1212

1313
pub(crate) unsafe fn codegen(cgcx: &CodegenContext<GccCodegenBackend>, _diag_handler: &Handler, module: ModuleCodegen<GccContext>, config: &ModuleConfig) -> Result<CompiledModule, FatalError> {
14-
let _timer = cgcx.prof.generic_activity_with_arg("LLVM_module_codegen", &module.name[..]);
14+
let _timer = cgcx.prof.generic_activity_with_arg("LLVM_module_codegen", &*module.name);
1515
{
1616
let context = &module.module_llvm.context;
1717

compiler/rustc_codegen_llvm/src/back/lto.rs

+3-1
Original file line numberDiff line numberDiff line change
@@ -313,7 +313,9 @@ fn fat_lto(
313313
for (bc_decoded, name) in serialized_modules {
314314
let _timer = cgcx
315315
.prof
316-
.generic_activity_with_arg("LLVM_fat_lto_link_module", format!("{:?}", name));
316+
.generic_activity_with_arg_recorder("LLVM_fat_lto_link_module", |recorder| {
317+
recorder.record_arg(format!("{:?}", name))
318+
});
317319
info!("linking {:?}", name);
318320
let data = bc_decoded.data();
319321
linker.add(data).map_err(|()| {

compiler/rustc_codegen_llvm/src/back/write.rs

+1-2
Original file line numberDiff line numberDiff line change
@@ -721,8 +721,7 @@ pub(crate) fn link(
721721

722722
let mut linker = Linker::new(first.module_llvm.llmod());
723723
for module in elements {
724-
let _timer =
725-
cgcx.prof.generic_activity_with_arg("LLVM_link_module", format!("{:?}", module.name));
724+
let _timer = cgcx.prof.generic_activity_with_arg("LLVM_link_module", &*module.name);
726725
let buffer = ModuleBuffer::new(module.module_llvm.llmod());
727726
linker.add(buffer.data()).map_err(|()| {
728727
let msg = format!("failed to serialize module {:?}", module.name);

compiler/rustc_codegen_llvm/src/base.rs

+5-4
Original file line numberDiff line numberDiff line change
@@ -74,10 +74,11 @@ pub fn compile_codegen_unit(tcx: TyCtxt<'_>, cgu_name: Symbol) -> (ModuleCodegen
7474

7575
fn module_codegen(tcx: TyCtxt<'_>, cgu_name: Symbol) -> ModuleCodegen<ModuleLlvm> {
7676
let cgu = tcx.codegen_unit(cgu_name);
77-
let _prof_timer = tcx.prof.generic_activity_with_args(
78-
"codegen_module",
79-
&[cgu_name.to_string(), cgu.size_estimate().to_string()],
80-
);
77+
let _prof_timer =
78+
tcx.prof.generic_activity_with_arg_recorder("codegen_module", |recorder| {
79+
recorder.record_arg(cgu_name.to_string());
80+
recorder.record_arg(cgu.size_estimate().to_string());
81+
});
8182
// Instantiate monomorphizations without filling out definitions yet...
8283
let llvm_module = ModuleLlvm::new(tcx, cgu_name.as_str());
8384
{

compiler/rustc_data_structures/src/profiling.rs

+93-5
Original file line numberDiff line numberDiff line change
@@ -97,6 +97,7 @@ use std::time::{Duration, Instant};
9797
pub use measureme::EventId;
9898
use measureme::{EventIdBuilder, Profiler, SerializableString, StringId};
9999
use parking_lot::RwLock;
100+
use smallvec::SmallVec;
100101

101102
bitflags::bitflags! {
102103
struct EventFilter: u32 {
@@ -183,11 +184,11 @@ impl SelfProfilerRef {
183184
}
184185
}
185186

186-
// This shim makes sure that calls only get executed if the filter mask
187-
// lets them pass. It also contains some trickery to make sure that
188-
// code is optimized for non-profiling compilation sessions, i.e. anything
189-
// past the filter check is never inlined so it doesn't clutter the fast
190-
// path.
187+
/// This shim makes sure that calls only get executed if the filter mask
188+
/// lets them pass. It also contains some trickery to make sure that
189+
/// code is optimized for non-profiling compilation sessions, i.e. anything
190+
/// past the filter check is never inlined so it doesn't clutter the fast
191+
/// path.
191192
#[inline(always)]
192193
fn exec<F>(&self, event_filter: EventFilter, f: F) -> TimingGuard<'_>
193194
where
@@ -288,6 +289,66 @@ impl SelfProfilerRef {
288289
})
289290
}
290291

292+
/// Start profiling a generic activity, allowing costly arguments to be recorded. Profiling
293+
/// continues until the `TimingGuard` returned from this call is dropped.
294+
///
295+
/// If the arguments to a generic activity are cheap to create, use `generic_activity_with_arg`
296+
/// or `generic_activity_with_args` for their simpler API. However, if they are costly or
297+
/// require allocation in sufficiently hot contexts, then this allows for a closure to be called
298+
/// only when arguments were asked to be recorded via `-Z self-profile-events=args`.
299+
///
300+
/// In this case, the closure will be passed a `&mut EventArgRecorder`, to help with recording
301+
/// one or many arguments within the generic activity being profiled, by calling its
302+
/// `record_arg` method for example.
303+
///
304+
/// This `EventArgRecorder` may implement more specific traits from other rustc crates, e.g. for
305+
/// richer handling of rustc-specific argument types, while keeping this single entry-point API
306+
/// for recording arguments.
307+
///
308+
/// Note: recording at least one argument is *required* for the self-profiler to create the
309+
/// `TimingGuard`. A panic will be triggered if that doesn't happen. This function exists
310+
/// explicitly to record arguments, so it fails loudly when there are none to record.
311+
///
312+
#[inline(always)]
313+
pub fn generic_activity_with_arg_recorder<F>(
314+
&self,
315+
event_label: &'static str,
316+
mut f: F,
317+
) -> TimingGuard<'_>
318+
where
319+
F: FnMut(&mut EventArgRecorder<'_>),
320+
{
321+
// Ensure this event will only be recorded when self-profiling is turned on.
322+
self.exec(EventFilter::GENERIC_ACTIVITIES, |profiler| {
323+
let builder = EventIdBuilder::new(&profiler.profiler);
324+
let event_label = profiler.get_or_alloc_cached_string(event_label);
325+
326+
// Ensure the closure to create event arguments will only be called when argument
327+
// recording is turned on.
328+
let event_id = if profiler.event_filter_mask.contains(EventFilter::FUNCTION_ARGS) {
329+
// Set up the builder and call the user-provided closure to record potentially
330+
// costly event arguments.
331+
let mut recorder = EventArgRecorder { profiler, args: SmallVec::new() };
332+
f(&mut recorder);
333+
334+
// It is expected that the closure will record at least one argument. If that
335+
// doesn't happen, it's a bug: we've been explicitly called in order to record
336+
// arguments, so we fail loudly when there are none to record.
337+
if recorder.args.is_empty() {
338+
panic!(
339+
"The closure passed to `generic_activity_with_arg_recorder` needs to \
340+
record at least one argument"
341+
);
342+
}
343+
344+
builder.from_label_and_args(event_label, &recorder.args)
345+
} else {
346+
builder.from_label(event_label)
347+
};
348+
TimingGuard::start(profiler, profiler.generic_activity_event_kind, event_id)
349+
})
350+
}
351+
291352
/// Record the size of an artifact that the compiler produces
292353
///
293354
/// `artifact_kind` is the class of artifact (e.g., query_cache, object_file, etc.)
@@ -443,6 +504,33 @@ impl SelfProfilerRef {
443504
}
444505
}
445506

507+
/// A helper for recording costly arguments to self-profiling events. Used with
508+
/// `SelfProfilerRef::generic_activity_with_arg_recorder`.
509+
pub struct EventArgRecorder<'p> {
510+
/// The `SelfProfiler` used to intern the event arguments that users will ask to record.
511+
profiler: &'p SelfProfiler,
512+
513+
/// The interned event arguments to be recorded in the generic activity event.
514+
///
515+
/// The most common case, when actually recording event arguments, is to have one argument. Then
516+
/// followed by recording two, in a couple places.
517+
args: SmallVec<[StringId; 2]>,
518+
}
519+
520+
impl EventArgRecorder<'_> {
521+
/// Records a single argument within the current generic activity being profiled.
522+
///
523+
/// Note: when self-profiling with costly event arguments, at least one argument
524+
/// needs to be recorded. A panic will be triggered if that doesn't happen.
525+
pub fn record_arg<A>(&mut self, event_arg: A)
526+
where
527+
A: Borrow<str> + Into<String>,
528+
{
529+
let event_arg = self.profiler.get_or_alloc_cached_string(event_arg);
530+
self.args.push(event_arg);
531+
}
532+
}
533+
446534
pub struct SelfProfiler {
447535
profiler: Profiler,
448536
event_filter_mask: EventFilter,

0 commit comments

Comments
 (0)