enhancement(core): Add efficient trace payload (aka v1) encoder#1794
enhancement(core): Add efficient trace payload (aka v1) encoder#1794ajgajg1134 wants to merge 8 commits into
Conversation
Replaces the `add_tracer_payloads` (protobuf field 42) call with `add_idx_tracer_payloads` (protobuf field 90), which uses a string interning table for deduplication. New fields now carried: 128-bit trace ID on the chunk, sampling mechanism, origin, language version, runtime ID, and per-span env/version/component/kind. Co-Authored-By: Claude Sonnet 4.6 (1M context) <noreply@anthropic.com>
|
Regression Detector (Agent Data Plane)Run ID: Optimization Goals: ❌ 3 regressions detected
Fine details of change detection per experiment (32)Experiments configured
Bounds Checks: ✅ Passed (5)
ExplanationA change is flagged as a regression when |Δ mean %| > 5.00% in the regressing direction for its optimization goal AND SMP marks the experiment as a regression ( |
| let mut new_spans = Span::get_spans_from_agent_payload(&payload); | ||
| new_spans.extend(Span::get_spans_from_idx_bytes(&payload, raw_body)); |
There was a problem hiding this comment.
Are these always mutually exclusive depending on payload version?
There was a problem hiding this comment.
structurally no, but functionally at the moment yes. The trace-agent will only ever emit one or the other in a single agent payload. I'd have to double check if trace intake can support receiving both at the same time though
| pub use super::super::trace_piecemeal_include::datadog::trace::*; | ||
| } | ||
|
|
||
| /// String-indexed (`idx`) trace payload types used by the v0.2 trace intake format. |
There was a problem hiding this comment.
haha yeah, versions are... confusing, this new payload format is v1 from tracer -> trace agent, but the endpoint the trace-agent uses at intake is called api/v0.2/traces.... which internally they actually call v1... but since this is a new field it's the same endpoint version on the backend...and the non indexed version of a tracer payload is v0.7 for tracers.... it's a mess
| .customize_callback(SerdeCapableStructs) | ||
| .run_from_script(); | ||
|
|
||
| // Separate invocation for idx proto types to avoid filename collision with |
There was a problem hiding this comment.
Totally a take-it-or-leave-it thing, but it feels very opaque as an outsider to see "idx proto types" instead of just "v1"... like "v1" is immediately grokable/intuitive, but "idx" less so. Like we talk about it as the v1 protocol internally, not the "idx protocol."
There was a problem hiding this comment.
yeah this is kinda my bad. See my other comment on trace versioning. I've started to try and talk about the project as the "Efficient trace payload" aka ETP to avoid this "V1" naming collision that happens everywhere. I went with idx in the trace-agent code since the biggest difference is that strings are 'indexed'. But I'm fine with renaming this to whatever is clearest here :P
There was a problem hiding this comment.
I think ETP or v1 make the most sense since they give something more tangible / intuitive to use when discussing.
| #[derive(Debug)] | ||
| struct StringTable { | ||
| strings: Vec<MetaString>, | ||
| indices: FastHashMap<MetaString, u32>, |
There was a problem hiding this comment.
If you switch to FastIndexMap, you can drop strings and just use self.indices.insert_full to get back a tuple of (index, <maybe previous value>).. and then for lookups, you'd do self.indices.get_index_of to check if it existed and, if so, get the index for it in one fell swoop.
Although as I say that, I realize all we'd really need in that scenario is a set, not a full map.... so you would probably want to end up adding an equivalent alias for indexmap::IndexSet to lib/saluki-common/src/collections/mod.rs and then actually use FastIndexSet instead.
There was a problem hiding this comment.
ooo good call. TIL about IndexSet/IndexMap. The intern function is so much simpler now too
Binary Size Analysis (Agent Data Plane)Baseline: 695ecfa · Comparison: 8995438 · diff ✅ Binary size difference within thresholdChanges by Module
Detailed Symbol Changes |
| self.strings.push(ms.clone()); | ||
| self.indices.insert(ms, idx); | ||
| idx | ||
| let (idx, _) = self.indices.insert_full(MetaString::from(s)); |
There was a problem hiding this comment.
This will reconstruct a MetaString for every call, even if it doesn't end up actually inserting it due to already being present, so we should prefer the split get_index_of/insert_full paradigm.
There was a problem hiding this comment.
Another thought, which I'll flesh out when I have a chance: making this generic over stringtheory::CheapMetaString to avoid reconstructing when we already are passing in something that is derived from MetaString vs something that only ever comes as &str by the time we want to intern it.
pls no look tobz
Summary
Change Type
How did you test this PR?
References