TESTING Make DogStatsD stats collection lazy#1727

Draft
aqian01 wants to merge 2 commits into main from andrewq/lazy-dsd-stats-collector

@aqian01
Contributor

@aqian01 aqian01 commented May 22, 2026

Summary

Replace the always-connected DogStatsD stats destination with a shared lazy collector. The /dogstatsd/stats API remains available, but normal DogStatsD ingest no longer fans out every metric batch into a stats-only topology destination.

What changed

  • Added DogStatsDStatsCollector with an atomic inactive fast path and mutex-protected active collection state.
  • Kept /dogstatsd/stats response behavior, max-duration validation, 429 AlreadyRunning, and cancellation cleanup.
  • Wired the collector into the DogStatsD source so decoded metric events are recorded only when a stats request is active.
  • Removed the permanent dsd_stats_out destination and topology connection.
  • Added collector unit tests and DogStatsD source wiring coverage.
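The collector design in the list above can be sketched roughly as follows. This is a minimal illustration, not the PR's implementation: `LazyStatsCollector`, `CollectionGuard`, `StartError`, and the per-name count map are hypothetical stand-ins for `DogStatsDStatsCollector` and its real state. It shows the three behaviors the list names: an atomic check that returns immediately when nothing is collecting, a mutex that is only taken while a request is active, and a guard whose `Drop` deactivates the collector so a cancelled request cleans up after itself.

```rust
use std::collections::HashMap;
use std::sync::atomic::{AtomicBool, Ordering};
use std::sync::Mutex;

/// Hypothetical error type. The PR only says a concurrent request gets a 429 AlreadyRunning.
#[derive(Debug, PartialEq)]
pub enum StartError {
    AlreadyRunning,
}

/// Illustrative stand-in for the PR's collector. Field names are assumptions.
#[derive(Default)]
pub struct LazyStatsCollector {
    /// True only while a stats request is collecting.
    active: AtomicBool,
    /// Per-metric-name sample counts, touched only while active.
    counts: Mutex<HashMap<String, u64>>,
}

impl LazyStatsCollector {
    /// Called from the ingest hot path for every decoded metric.
    pub fn record(&self, metric_name: &str) {
        // Inactive fast path: one atomic load, no lock, no allocation.
        if !self.active.load(Ordering::Acquire) {
            return;
        }
        let mut counts = self.counts.lock().unwrap();
        *counts.entry(metric_name.to_owned()).or_insert(0) += 1;
    }

    /// Begins a collection window, or fails if one is already running.
    pub fn start(&self) -> Result<CollectionGuard<'_>, StartError> {
        let mut counts = self.counts.lock().unwrap();
        // `swap` returns the previous value, so `true` means another request owns the window.
        if self.active.swap(true, Ordering::AcqRel) {
            return Err(StartError::AlreadyRunning);
        }
        // Discard anything left over from a previous window.
        counts.clear();
        Ok(CollectionGuard { collector: self })
    }
}

/// Held by the request handler. Dropping it (normally or on cancellation) stops collection.
pub struct CollectionGuard<'a> {
    collector: &'a LazyStatsCollector,
}

impl CollectionGuard<'_> {
    /// Takes the collected counts. The guard is dropped on return, which deactivates the collector.
    pub fn finish(self) -> HashMap<String, u64> {
        let snapshot = std::mem::take(&mut *self.collector.counts.lock().unwrap());
        snapshot
    }
}

impl Drop for CollectionGuard<'_> {
    fn drop(&mut self) {
        self.collector.active.store(false, Ordering::Release);
    }
}
```

In this sketch a handler would call `start()`, wait for the requested duration, and then call `finish()`. If the handler's future is dropped early, the guard's `Drop` still flips the flag back, so later traffic returns to the single atomic load.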

Why

The previous topology connected dsd_stats_out as a second consumer of dsd_in.metrics, which forced dispatcher fanout cloning on normal DogStatsD traffic even when no stats request was collecting. This keeps the API available while reducing normal hot-path work to an inactive atomic check.
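To make the fanout cost concrete, here is a toy model of a dispatcher that delivers one batch to several consumers. It is an assumption-laden sketch: `dispatch` and the `Vec`-backed consumers are hypothetical and do not reflect the actual Saluki dispatcher API. The point is only that a second consumer forces one clone per batch, while a single consumer can take the batch by move.

```rust
/// Delivers `batch` to every consumer, cloning for all but the last one.
/// Returns the number of clones that were needed.
pub fn dispatch<T: Clone>(batch: T, consumers: &mut [Vec<T>]) -> usize {
    let Some((last, rest)) = consumers.split_last_mut() else {
        return 0; // No consumers: the batch is simply dropped.
    };
    for consumer in rest.iter_mut() {
        consumer.push(batch.clone());
    }
    // The final consumer receives the original batch by move.
    last.push(batch);
    rest.len()
}
```

With one consumer the function returns 0 clones. With two consumers, which models the old topology where `dsd_stats_out` was also attached to `dsd_in.metrics`, it returns 1 clone for every batch.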

Validation

  • cargo check --workspace && cargo check --workspace --tests
  • cargo nextest run -p saluki-components (584 passed, 1 skipped)
  • make fmt
  • git diff --check
  • make check-all reached formatting and Clippy successfully, then stopped because local vale is not installed: Please install Vale: https://vale.sh/docs/install

@dd-octo-sts dd-octo-sts Bot added area/components Sources, transforms, and destinations. source/dogstatsd DogStatsD source. destination/dogstatsd-stats DogStatsD Statistics destination. labels May 22, 2026
@aqian01 aqian01 changed the title [codex] Make DogStatsD stats collection lazy TESTING Make DogStatsD stats collection lazy May 22, 2026
@datadog-datadog-prod-us1

datadog-datadog-prod-us1 Bot commented May 22, 2026

Pipelines

⚠️ Warnings

🚦 1 Pipeline job failed

Semantic PR Title Check | Check For Semantic PR Title (GitHub Actions)

🛟 This job is unlikely to succeed on retry. Please review your pipeline configuration. No release type found in pull request title. Required prefix missing for release indication.

This comment will be updated automatically if new data arrives.
🔗 Commit SHA: 510b247

@pr-commenter

pr-commenter Bot commented May 22, 2026

Binary Size Analysis (Agent Data Plane)

Baseline: 54dc37e · Comparison: 510b247 · diff
Analysis Configuration: stripped binaries · Pass/Fail Threshold: +5%
Sizes: 37.68 MiB (baseline) vs 37.65 MiB (comparison)
Size Change: -28.88 KiB (-0.07%)

✅ Binary size difference within threshold

Changes by Module

| Module | File Size | Symbols |
| --- | --- | --- |
| figment | -72.87 KiB | 169 |
| hyper | -59.73 KiB | 222 |
| hyper_util | +39.48 KiB | 57 |
| core | +37.66 KiB | 1996 |
| alloc | -29.13 KiB | 431 |
| h2 | +14.54 KiB | 148 |
| [sections] | -13.48 KiB | 7 |
| rustls | +12.88 KiB | 138 |
| serde_core | -10.90 KiB | 86 |
| tokio_rustls | -9.88 KiB | 6 |
| tokio | -9.79 KiB | 1147 |
| prost | +8.78 KiB | 155 |
| saluki_components::transforms::apm_stats | -8.44 KiB | 23 |
| saluki_components::sources::otlp | +8.35 KiB | 37 |
| saluki_components::destinations::prometheus | -8.23 KiB | 8 |
| hashbrown | -7.14 KiB | 91 |
| serde_json | +6.96 KiB | 69 |
| &mut serde_json | -6.47 KiB | 11 |
| datadog_protos::checks_include::datadog | +6.03 KiB | 10 |
| bytes | -6.00 KiB | 28 |

Detailed Symbol Changes
    FILE SIZE        VM SIZE    
 --------------  -------------- 
  [NEW] +66.5Ki  [NEW] +66.4Ki    agent_data_plane::cli::run::create_topology::_{{closure}}::h1b011ad4ddc53a6d
  [NEW] +23.6Ki  [NEW] +23.4Ki    saluki_components::sources::dogstatsd::drive_stream::_{{closure}}::h5c096335e4534134
  +0.2% +19.4Ki  +0.2% +21.9Ki    [11348 Others]
  [NEW] +19.5Ki  [NEW] +19.3Ki    _<saluki_components::sources::dogstatsd::DogStatsDConfiguration as saluki_core::components::sources::builder::SourceBuilder>::build::_{{closure}}::hf2af4e9852048fee
  [NEW] +16.2Ki  [NEW] +16.1Ki    saluki_components::transforms::apm_stats::span_concentrator::SpanConcentrator::flush::h8876ba1803d9a2ce
  [NEW] +14.1Ki  [NEW] +13.9Ki    _<tracing::instrument::Instrumented<T> as core::future::future::Future>::poll::hf0b9bc13d27eb0ad
  [NEW] +12.8Ki  [NEW] +12.7Ki    saluki_components::transforms::apm_stats::span_concentrator::SpanConcentrator::add_span::h6e30846771675808
  +470% +10.5Ki  +498% +10.5Ki    _<core::pin::Pin<P> as core::future::future::Future>::poll::h5cdd25ed07207a7e
  +640% +9.90Ki  +714% +9.90Ki    _<hyper_util::server::conn::auto::Connection<I,S,E> as core::future::future::Future>::poll::h788172fb71973b6c
  [NEW] +9.73Ki  [NEW] +9.64Ki    h2::server::Connection<T,B>::poll_closed::hcb8a31ee425742aa
  +680% +9.62Ki  +767% +9.62Ki    _<hyper_util::server::conn::auto::Connection<I,S,E> as core::future::future::Future>::poll::hd82027b733d70d19
 -93.5% -11.7Ki -94.3% -11.7Ki    agent_data_plane::state::metrics::rules::get_datadog_agent_remappings::h377928bc659a9572
  -1.1% -11.9Ki  -1.1% -11.9Ki    [section .gcc_except_table]
  [DEL] -13.8Ki  [DEL] -13.6Ki    _<tracing::instrument::Instrumented<T> as core::future::future::Future>::poll::h59e4ebbb57eb7f91
 -38.5% -13.9Ki -38.7% -13.9Ki    _<saluki_components::transforms::apm_stats::ApmStats as saluki_core::components::transforms::Transform>::run::_{{closure}}::h84b9d860b81c1dae
 -98.3% -14.2Ki -99.7% -14.2Ki    _<saluki_components::transforms::trace_sampler::TraceSampler as saluki_core::components::transforms::SynchronousTransform>::transform_buffer::h95266a14ee7bdae6
  [DEL] -19.5Ki  [DEL] -19.3Ki    _<saluki_components::sources::dogstatsd::DogStatsDConfiguration as saluki_core::components::sources::builder::SourceBuilder>::build::_{{closure}}::hf04f3aa1c0fae428
  [DEL] -21.2Ki  [DEL] -21.0Ki    _<figment::value::de::ConfiguredValueDe<I> as serde_core::de::Deserializer>::deserialize_struct::h1495d50bfb28ee14
  [DEL] -22.0Ki  [DEL] -21.9Ki    saluki_components::sources::dogstatsd::drive_stream::_{{closure}}::h22208f5cc9094b54
  [DEL] -44.5Ki  [DEL] -44.4Ki    saluki_components::transforms::apm_stats::ApmStats::process_trace::h319e78c858c0338a
  [DEL] -68.2Ki  [DEL] -68.1Ki    agent_data_plane::cli::run::create_topology::_{{closure}}::h41fee3c69708d7c8
  -0.1% -28.9Ki  -0.1% -26.5Ki    TOTAL

@pr-commenter

pr-commenter Bot commented May 22, 2026

Regression Detector (Agent Data Plane)

Run ID: 422ff05d-2e8d-4d92-a839-5563baadb33e
Baseline: 54dc37e7 · Comparison: 510b247b · diff

Optimization Goals: ✅ No significant changes detected

Fine details of change detection per experiment (35)

Experiments configured erratic: true are tagged (ignored) and skipped when determining which experiments regressed or improved. Experiments which are detected as erratic at runtime are tagged (erratic) to flag that the run's sample dispersion was high, but their regression / improvement signal still counts.

| experiment | goal | Δ mean % |
| --- | --- | --- |
| dsd_uds_512kb_3k_contexts_cpu (erratic) | cpu | ⚪ +16.48 |
| otlp_ingest_metrics_5mb_memory | memory | ⚪ +2.94 |
| otlp_ingest_logs_5mb_cpu (ignored) | cpu | ⚪ +1.66 |
| otlp_ingest_traces_5mb_cpu (erratic) | cpu | ⚪ +1.21 |
| otlp_ingest_traces_ottl_transform_5mb_cpu (erratic) | cpu | ⚪ +0.92 |
| otlp_ingest_traces_ottl_filtering_5mb_cpu (erratic) | cpu | ⚪ +0.92 |
| otlp_ingest_logs_5mb_memory (ignored) | memory | ⚪ +0.85 |
| otlp_ingest_metrics_5mb_cpu (erratic) | cpu | ⚪ +0.41 |
| quality_gates_rss_dsd_ultraheavy | memory | ⚪ +0.20 |
| otlp_ingest_traces_ottl_filtering_5mb_throughput | throughput | ⚪ -0.07 |
| quality_gates_rss_dsd_heavy | memory | ⚪ +0.04 |
| otlp_ingest_traces_5mb_throughput | throughput | ⚪ -0.03 |
| otlp_ingest_traces_ottl_transform_5mb_memory | memory | ⚪ +0.03 |
| otlp_ingest_logs_5mb_throughput (ignored) | throughput | ⚪ -0.01 |
| otlp_ingest_metrics_5mb_throughput | throughput | ⚪ -0.00 |
| dsd_uds_512kb_3k_contexts_throughput | throughput | ⚪ -0.00 |
| dsd_uds_1mb_3k_contexts_throughput | throughput | ⚪ -0.00 |
| dsd_uds_100mb_3k_contexts_throughput | throughput | ⚪ +0.00 |
| dsd_uds_10mb_3k_contexts_throughput | throughput | ⚪ +0.01 |
| otlp_ingest_traces_5mb_memory | memory | ⚪ -0.07 |
| dsd_uds_512kb_3k_contexts_memory | memory | ⚪ -0.18 |
| dsd_uds_1mb_3k_contexts_memory | memory | ⚪ -0.22 |
| otlp_ingest_traces_ottl_filtering_5mb_memory | memory | ⚪ -0.27 |
| dsd_uds_10mb_3k_contexts_memory | memory | ⚪ -0.31 |
| quality_gates_rss_dsd_medium | memory | ⚪ -0.32 |
| otlp_ingest_traces_ottl_transform_5mb_throughput | throughput | ⚪ +0.34 |
| dsd_uds_500mb_3k_contexts_cpu (erratic) | cpu | ⚪ -0.45 |
| quality_gates_rss_dsd_low | memory | ⚪ -0.49 |
| dsd_uds_10mb_3k_contexts_cpu (erratic) | cpu | ⚪ -0.56 |
| dsd_uds_100mb_3k_contexts_memory | memory | ⚪ -0.64 |
| quality_gates_rss_idle | memory | ⚪ -0.69 |
| dsd_uds_500mb_3k_contexts_memory | memory | ⚪ -0.77 |
| dsd_uds_100mb_3k_contexts_cpu (erratic) | cpu | ⚪ -4.58 |
| dsd_uds_500mb_3k_contexts_throughput | throughput | 🟢 +5.54 |
| dsd_uds_1mb_3k_contexts_cpu (erratic) | cpu | 🟢 -5.60 |

Bounds Checks: ✅ Passed (5)

| experiment | check | replicates | observed |
| --- | --- | --- | --- |
| quality_gates_rss_dsd_heavy | memory_usage | 10/10 ✅ | 122 MiB ≤ 140 MiB |
| quality_gates_rss_dsd_low | memory_usage | 10/10 ✅ | 39.7 MiB ≤ 50 MiB |
| quality_gates_rss_dsd_medium | memory_usage | 10/10 ✅ | 60.1 MiB ≤ 75 MiB |
| quality_gates_rss_dsd_ultraheavy | memory_usage | 10/10 ✅ | 176 MiB ≤ 200 MiB |
| quality_gates_rss_idle | memory_usage | 10/10 ✅ | 26.6 MiB ≤ 40 MiB |

Explanation

A change is flagged as a regression when |Δ mean %| > 5.00% in the regressing direction for its optimization goal AND SMP marks the experiment as a regression (is_regression: true). Improvements use the matching criteria for the improving direction. Experiments configured erratic: true (tagged (ignored)) are skipped outright; experiments detected as erratic at runtime (tagged (erratic)) still count, since that flag describes sample dispersion rather than directional certainty. The Δ mean % cell is colored accordingly: 🟢 = improvement, 🔴 = regression, ⚪ = neutral. Reduction in CPU or memory is an improvement; reduction in ingress throughput is a regression.
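The flagging rule described above can be written as a small decision function. This is a sketch of the stated criteria only, not the detector's code: the `Experiment` fields, the `Verdict` enum, and in particular the `smp_is_improvement` flag (assumed to mirror `is_regression` for the improving direction) are hypothetical names introduced for illustration.

```rust
#[derive(Debug, Clone, Copy, PartialEq)]
pub enum Goal {
    Cpu,
    Memory,
    Throughput,
}

#[derive(Debug, Clone, Copy, PartialEq)]
pub enum Verdict {
    Improvement, // 🟢
    Regression,  // 🔴
    Neutral,     // ⚪
    Ignored,     // configured `erratic: true`, skipped outright
}

/// Hypothetical shape of one experiment result.
pub struct Experiment {
    pub goal: Goal,
    pub delta_mean_pct: f64,
    pub configured_erratic: bool,
    pub smp_is_regression: bool,
    /// Assumption: SMP exposes a matching flag for the improving direction.
    pub smp_is_improvement: bool,
}

const THRESHOLD_PCT: f64 = 5.0;

pub fn classify(e: &Experiment) -> Verdict {
    if e.configured_erratic {
        return Verdict::Ignored;
    }
    // Normalize so a positive value always points in the regressing direction:
    // more CPU or memory is worse, less throughput is worse.
    let regressing_delta = match e.goal {
        Goal::Cpu | Goal::Memory => e.delta_mean_pct,
        Goal::Throughput => -e.delta_mean_pct,
    };
    if regressing_delta > THRESHOLD_PCT && e.smp_is_regression {
        Verdict::Regression
    } else if regressing_delta < -THRESHOLD_PCT && e.smp_is_improvement {
        Verdict::Improvement
    } else {
        Verdict::Neutral
    }
}
```

Under this reading, a +16.48% CPU delta can still be reported as neutral when SMP does not mark the experiment as a regression, because both conditions must hold. The report itself does not show the SMP flags, so their values in any worked example are assumed.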

@dd-octo-sts dd-octo-sts Bot added the area/docs Reference documentation. label May 26, 2026