
enhancement(antithesis): enable forwarder disk persistence #1800

Draft
blt wants to merge 1 commit into blt/antithesis-workload-sampling from blt/antithesis-disk-persistence

Contributor

@blt blt commented Jun 2, 2026

Summary

Sample forwarder_storage_max_size_in_bytes on or off with equal (50/50) probability, with forwarder_storage_path pointing at a persistent compose volume, so that the on-disk retry queue and restart-recovery paths are exercised.
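As a rough illustration of the sampling described above, the sketch below maps a single coin flip to a set of forwarder settings. The harness code in this PR is not shown here, so everything in the block is hypothetical: the function and struct names, the 64 MiB cap, the storage path, and the DD_-prefixed environment variable names (which follow the Agent's usual convention). It also assumes that a size of 0 leaves disk persistence off.

```rust
/// Forwarder persistence settings chosen for one test run.
#[derive(Debug, PartialEq)]
struct ForwarderPersistence {
    /// Value for forwarder_storage_max_size_in_bytes; 0 is assumed to mean "off".
    max_size_in_bytes: u64,
    /// Value for forwarder_storage_path, set only when persistence is on.
    storage_path: Option<String>,
}

/// Maps one fair coin flip to a persistence configuration (50/50 on/off).
/// The 64 MiB cap and the /var/lib/adp-retry path are illustrative values only.
fn sample_persistence(coin: bool) -> ForwarderPersistence {
    if coin {
        ForwarderPersistence {
            max_size_in_bytes: 64 * 1024 * 1024,
            storage_path: Some("/var/lib/adp-retry".to_string()),
        }
    } else {
        ForwarderPersistence { max_size_in_bytes: 0, storage_path: None }
    }
}

/// Renders the configuration as environment variables for a compose service.
/// The variable names are assumed, not taken from this PR.
fn to_env(cfg: &ForwarderPersistence) -> Vec<(String, String)> {
    let mut env = vec![(
        "DD_FORWARDER_STORAGE_MAX_SIZE_IN_BYTES".to_string(),
        cfg.max_size_in_bytes.to_string(),
    )];
    if let Some(path) = &cfg.storage_path {
        env.push(("DD_FORWARDER_STORAGE_PATH".to_string(), path.clone()));
    }
    env
}
```

Taking the coin as an argument keeps the choice in the hands of whatever randomness source the test harness provides, so a run can be replayed with the same decision.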

Bug discovered: with persistence on, a network partition fills the disk-backed retry queue, and the forwarder logs error! once per failed retry attempt. The volume of log output grows with the size of the inbound backlog.
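To make the scaling concrete, here is a small model of the behavior, not the forwarder's code and not a proposed fix for this PR. It counts how many lines are written for a backlog of failed retries when every failure is logged (interval of 0) versus when a hypothetical throttle allows one line per interval. All names are invented for the sketch, and timestamps are passed in as plain milliseconds so the result is deterministic.

```rust
/// Emits at most one log line per `interval_ms`, counting what it suppressed.
struct ThrottledLog {
    interval_ms: u64,
    last_emit_ms: Option<u64>,
    suppressed: u64,
}

impl ThrottledLog {
    fn new(interval_ms: u64) -> Self {
        Self { interval_ms, last_emit_ms: None, suppressed: 0 }
    }

    /// Returns Some(line) when a line should be written at time `now_ms`,
    /// or None when the event is only counted.
    fn record(&mut self, now_ms: u64, msg: &str) -> Option<String> {
        let due = match self.last_emit_ms {
            None => true,
            Some(last) => now_ms.saturating_sub(last) >= self.interval_ms,
        };
        if due {
            let line = format!("{} ({} similar errors suppressed)", msg, self.suppressed);
            self.last_emit_ms = Some(now_ms);
            self.suppressed = 0;
            Some(line)
        } else {
            self.suppressed += 1;
            None
        }
    }
}

/// Counts lines written for `backlog` failed retries spaced 1 ms apart.
/// An interval of 0 models logging on every failed attempt.
fn lines_for_backlog(backlog: u64, interval_ms: u64) -> u64 {
    let mut log = ThrottledLog::new(interval_ms);
    (0..backlog)
        .filter(|t| log.record(*t, "retry failed").is_some())
        .count() as u64
}
```

With a backlog of 10,000 failures, `lines_for_backlog(10_000, 0)` yields 10,000 lines while `lines_for_backlog(10_000, 1_000)` yields 10, which is the difference between output proportional to the backlog and output bounded by elapsed time.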

Change Type

  • Bug fix
  • New feature
  • Non-functional (chore, refactoring, docs)
  • Performance

How did you test this PR?

References

N/A

Contributor Author

blt commented Jun 2, 2026


datadog-prod-us1-4 Bot commented Jun 2, 2026

Pipelines


⚠️ Warnings

🚦 1 Pipeline job failed

DataDog/saluki | unit-tests-windows-amd64

Compilation error in parallel_driver_sketchburst.rs:5:14: cannot find `unix` in `os` due to item being gated.
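This error is what rustc reports when code references std::os::unix while building for a non-Unix target. The contents of parallel_driver_sketchburst.rs are not shown in this PR, so the following is only a sketch of the usual remedy, gating the Unix-only import and code behind `#[cfg(unix)]`. The function name `send_probe` and its behavior are hypothetical.

```rust
// Unix-only import: compiled only when the target is a Unix platform.
#[cfg(unix)]
use std::os::unix::net::UnixDatagram;

/// Sends one datagram across a connected socket pair and returns the number
/// of bytes received on the other end.
#[cfg(unix)]
fn send_probe(payload: &[u8]) -> std::io::Result<usize> {
    let (tx, rx) = UnixDatagram::pair()?;
    tx.send(payload)?;
    let mut buf = [0u8; 64];
    rx.recv(&mut buf)
}

/// Fallback for targets without Unix sockets, so the crate still compiles.
#[cfg(not(unix))]
fn send_probe(_payload: &[u8]) -> std::io::Result<usize> {
    Err(std::io::Error::new(
        std::io::ErrorKind::Unsupported,
        "unix sockets are unavailable on this target",
    ))
}
```

If the driver is only meaningful on Unix, gating the whole file or excluding it from the Windows test job would be an alternative; which option fits depends on how the binary is wired into the build.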

Commit SHA: a8535e6


pr-commenter Bot commented Jun 2, 2026

Binary Size Analysis (Agent Data Plane)

Baseline: 93c4763 · Comparison: a8535e6 · diff
Analysis Configuration: stripped binaries · Pass/Fail Threshold: +5%
Sizes: 38.05 MiB (baseline) vs 38.00 MiB (comparison)
Size Change: -49.59 KiB (-0.13%)

✅ Binary size difference within threshold
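For reference, the reported percentage follows from the two sizes. The sketch below recomputes it under the assumption that the check passes when growth does not exceed the threshold; the exact comparison the bot uses is not stated here, and the function names are made up for illustration. The byte counts in the usage note are derived from the rounded figures above, so they are approximate.

```rust
/// Percentage change of `comparison` relative to `baseline` (both in bytes).
fn size_change_percent(baseline: u64, comparison: u64) -> f64 {
    (comparison as f64 - baseline as f64) / baseline as f64 * 100.0
}

/// Passes when growth does not exceed the threshold; shrinkage always passes.
/// Whether the real check treats growth of exactly the threshold as a pass is an assumption.
fn within_threshold(baseline: u64, comparison: u64, threshold_percent: f64) -> bool {
    size_change_percent(baseline, comparison) <= threshold_percent
}
```

Taking 38.05 MiB as roughly 39,898,317 bytes and subtracting 49.59 KiB (about 50,780 bytes) gives `size_change_percent(39_898_317, 39_847_537)` of about -0.127, which rounds to the -0.13% shown and is well inside the +5% threshold.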

Changes by Module
| Module | File Size | Symbols |
| --- | --- | --- |
| core | +18.77 KiB | 1473 |
| figment | -17.90 KiB | 145 |
| tokio | -10.58 KiB | 432 |
| [sections] | -9.16 KiB | 7 |
| alloc | -7.15 KiB | 365 |
| h2 | +6.74 KiB | 64 |
| axum | +6.72 KiB | 146 |
| agent_data_plane::internal::env | -6.69 KiB | 19 |
| hyper | -5.95 KiB | 158 |
| tonic | -5.63 KiB | 166 |
| serde_core | -4.48 KiB | 57 |
| anon.b1989804c321c6229c83a9d121c3ffb0.48.llvm.15956482214170139656 | +4.00 KiB | 1 |
| anon.ae75cd3060d8b952a6fd466cba97c1c5.103.llvm.298110800033293764 | -3.91 KiB | 1 |
| agent_data_plane::components::ottl_transform_processor | -3.70 KiB | 8 |
| http_body_util | +3.55 KiB | 21 |
| anon.6d801117e0fee96d3bee97e0acea3070.11.llvm.12731641473472338063 | -3.06 KiB | 1 |
| anon.571659addcceebb7758f58c63ff9e871.918.llvm.17979140543825394571 | +3.06 KiB | 1 |
| serde_json | +3.03 KiB | 33 |
| anon.ea9ef83e85fbace166653a6d1027cb2d.349.llvm.6885723478920268962 | +2.98 KiB | 1 |
| serde_yaml | -2.96 KiB | 30 |

Detailed Symbol Changes
    FILE SIZE        VM SIZE    
 --------------  -------------- 
  [NEW] +31.2Ki  [NEW] +31.0Ki    agent_data_plane::internal::env::workload::RemoteAgentWorkloadProvider::from_configuration::_{{closure}}::h7d89a227ad85041c
  [NEW] +5.44Ki  [NEW] +5.29Ki    _<figment::value::magic::RelativePathBuf as figment::value::magic::Magic>::deserialize_from::hcfd625ee9333661b
  [NEW] +5.19Ki  [NEW] +4.89Ki    alloc::collections::btree::node::Handle<alloc::collections::btree::node::NodeRef<alloc::collections::btree::node::marker::Mut,K,V,alloc::collections::btree::node::marker::Leaf>,alloc::collections::btree::node::marker::Edge>::insert_recursing::h5a7d8ef71dfe1d76
  [NEW] +5.00Ki  [NEW] +4.70Ki    alloc::collections::btree::node::Handle<alloc::collections::btree::node::NodeRef<alloc::collections::btree::node::marker::Mut,K,V,alloc::collections::btree::node::marker::Leaf>,alloc::collections::btree::node::marker::Edge>::insert_recursing::h0fe2dbb6a6a94df1
  [NEW] +4.99Ki  [NEW] +4.84Ki    _<figment::value::magic::Tagged<T> as figment::value::magic::Magic>::deserialize_from::he774f97a97ff55b1
  [NEW] +4.83Ki  [NEW] +4.58Ki    agent_data_plane::components::ottl_filter_processor::config::_::_<impl serde_core::de::Deserialize for agent_data_plane::components::ottl_filter_processor::config::OttlFilterConfig>::deserialize::h4f4429612e2653e9
  [NEW] +4.77Ki  [NEW] +4.47Ki    alloc::collections::btree::node::Handle<alloc::collections::btree::node::NodeRef<alloc::collections::btree::node::marker::Mut,K,V,alloc::collections::btree::node::marker::Leaf>,alloc::collections::btree::node::marker::Edge>::insert_recursing::h859019f427a7299f
  [NEW] +4.74Ki  [NEW] +4.58Ki    _<figment::value::de::ConfiguredValueDe<I> as serde_core::de::Deserializer>::deserialize_struct::ha66bb36cad767a6d
  [NEW] +4.52Ki  [NEW] +4.22Ki    alloc::collections::btree::node::Handle<alloc::collections::btree::node::NodeRef<alloc::collections::btree::node::marker::Mut,K,V,alloc::collections::btree::node::marker::Leaf>,alloc::collections::btree::node::marker::Edge>::insert_recursing::hea023ce38d3adc08
  [DEL] -4.43Ki  [DEL] -4.28Ki    _<figment::value::magic::Tagged<T> as figment::value::magic::Magic>::deserialize_from::hde75f4635d40ad0a
 -84.8% -4.51Ki -86.5% -4.51Ki    alloc::collections::btree::map::BTreeMap<K,V,A>::insert::h25c7c1495d390921
  [DEL] -4.64Ki  [DEL] -4.49Ki    _<figment::value::magic::RelativePathBuf as figment::value::magic::Magic>::deserialize_from::h6ff9f06009392c01
  [DEL] -4.79Ki  [DEL] -4.63Ki    _<figment::value::de::ConfiguredValueDe<I> as serde_core::de::Deserializer>::deserialize_struct::h7a39627207e3f3a4
  [DEL] -4.86Ki  [DEL] -4.60Ki    agent_data_plane::components::ottl_transform_processor::config::_::_<impl serde_core::de::Deserialize for agent_data_plane::components::ottl_transform_processor::config::OttlTransformConfig>::deserialize::hc55bc685ff741938
  [DEL] -5.02Ki  [DEL] -4.87Ki    _<figment::value::magic::Tagged<T> as figment::value::magic::Magic>::deserialize_from::h4482918f844c4c1b
  [DEL] -5.08Ki  [DEL] -4.98Ki    alloc::collections::btree::map::BTreeMap<K,V,A>::insert::hba0bacef74150a71
  [DEL] -5.53Ki  [DEL] -5.38Ki    _<figment::value::magic::RelativePathBuf as figment::value::magic::Magic>::deserialize_from::hd72d65f5176e7f06
  -1.0% -11.2Ki  -1.0% -11.2Ki    [section .gcc_except_table]
 -32.6% -18.7Ki -32.6% -18.6Ki    agent_data_plane::internal::env::workload::build_collector::_{{closure}}::h8518305f145d7e76
 -32.7% -18.7Ki -32.8% -18.7Ki    agent_data_plane::internal::env::ADPEnvironmentProvider::from_configuration::_{{closure}}::h01476d2b9c923f6d
  -0.3% -32.8Ki  -0.5% -43.1Ki    [10722 Others]
  -0.1% -49.6Ki  -0.2% -60.8Ki    TOTAL


pr-commenter Bot commented Jun 2, 2026

Regression Detector (Agent Data Plane)

Run ID: 215add7c-e2c0-4cef-905e-9db8c5b9fef1
Baseline: 93c47633 · Comparison: a8535e63 · diff

Optimization Goals: ✅ No significant changes detected

Fine details of change detection per experiment (35)

Experiments configured erratic: true are tagged (ignored) and skipped when determining which experiments regressed or improved. Experiments which are detected as erratic at runtime are tagged (erratic) to flag that the run's sample dispersion was high, but their regression / improvement signal still counts.

| experiment | goal | Δ mean % |
| --- | --- | --- |
| dsd_uds_10mb_3k_contexts_cpu (erratic) | cpu | ⚪ +5.88 |
| otlp_ingest_traces_5mb_cpu (erratic) | cpu | ⚪ +3.00 |
| otlp_ingest_traces_ottl_transform_5mb_cpu (erratic) | cpu | ⚪ +2.47 |
| otlp_ingest_logs_5mb_cpu (ignored) | cpu | ⚪ +2.26 |
| dsd_uds_100mb_3k_contexts_cpu (erratic) | cpu | ⚪ +0.99 |
| otlp_ingest_traces_ottl_filtering_5mb_throughput | throughput | ⚪ -0.21 |
| quality_gates_rss_dsd_ultraheavy | memory | ⚪ +0.20 |
| dsd_uds_100mb_3k_contexts_memory | memory | ⚪ +0.04 |
| otlp_ingest_logs_5mb_throughput (ignored) | throughput | ⚪ -0.02 |
| quality_gates_rss_dsd_heavy | memory | ⚪ +0.01 |
| dsd_uds_10mb_3k_contexts_throughput | throughput | ⚪ -0.00 |
| dsd_uds_1mb_3k_contexts_throughput | throughput | ⚪ -0.00 |
| dsd_uds_512kb_3k_contexts_throughput | throughput | ⚪ +0.00 |
| dsd_uds_100mb_3k_contexts_throughput | throughput | ⚪ +0.00 |
| otlp_ingest_metrics_5mb_throughput | throughput | ⚪ +0.00 |
| otlp_ingest_traces_5mb_memory | memory | ⚪ -0.06 |
| quality_gates_rss_dsd_low | memory | ⚪ -0.17 |
| dsd_uds_1mb_3k_contexts_memory | memory | ⚪ -0.23 |
| dsd_uds_500mb_3k_contexts_memory | memory | ⚪ -0.29 |
| otlp_ingest_traces_ottl_filtering_5mb_memory | memory | ⚪ -0.30 |
| dsd_uds_10mb_3k_contexts_memory | memory | ⚪ -0.33 |
| otlp_ingest_traces_5mb_throughput | throughput | ⚪ +0.34 |
| quality_gates_rss_dsd_medium | memory | ⚪ -0.38 |
| otlp_ingest_traces_ottl_transform_5mb_memory | memory | ⚪ -0.39 |
| dsd_uds_512kb_3k_contexts_memory | memory | ⚪ -0.43 |
| otlp_ingest_traces_ottl_transform_5mb_throughput | throughput | ⚪ +0.57 |
| quality_gates_rss_idle | memory | ⚪ -0.81 |
| otlp_ingest_metrics_5mb_memory | memory | ⚪ -0.86 |
| dsd_uds_1mb_3k_contexts_cpu (erratic) | cpu | ⚪ -0.90 |
| dsd_uds_500mb_3k_contexts_cpu (erratic) | cpu | ⚪ -1.72 |
| dsd_uds_500mb_3k_contexts_throughput | throughput | ⚪ +1.73 |
| otlp_ingest_metrics_5mb_cpu (erratic) | cpu | ⚪ -2.18 |
| otlp_ingest_traces_ottl_filtering_5mb_cpu (erratic) | cpu | ⚪ -3.06 |
| otlp_ingest_logs_5mb_memory (ignored) | memory | ⚪ -4.15 |
| dsd_uds_512kb_3k_contexts_cpu (erratic) | cpu | 🟢 -6.55 |

Bounds Checks: ✅ Passed (5)
| experiment | check | replicates | observed |
| --- | --- | --- | --- |
| quality_gates_rss_dsd_heavy | memory_usage | 10/10 | ✅ 124 MiB ≤ 140 MiB |
| quality_gates_rss_dsd_low | memory_usage | 10/10 | ✅ 39.8 MiB ≤ 50 MiB |
| quality_gates_rss_dsd_medium | memory_usage | 10/10 | ✅ 61.7 MiB ≤ 75 MiB |
| quality_gates_rss_dsd_ultraheavy | memory_usage | 10/10 | ✅ 186 MiB ≤ 200 MiB |
| quality_gates_rss_idle | memory_usage | 10/10 | ✅ 26.9 MiB ≤ 40 MiB |

Explanation

A change is flagged as a regression when |Δ mean %| > 5.00% in the regressing direction for its optimization goal AND SMP marks the experiment as a regression (is_regression: true). Improvements use the matching criteria for the improving direction. Experiments configured erratic: true (tagged (ignored)) are skipped outright; experiments detected as erratic at runtime (tagged (erratic)) still count, since that flag describes sample dispersion rather than directional certainty. The Δ mean % cell is colored accordingly: 🟢 = improvement, 🔴 = regression, ⚪ = neutral. Reduction in CPU or memory is an improvement; reduction in ingress throughput is a regression.
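The rule above can be sketched as a small classifier. This is an illustration of the stated criteria, not the detector's source: the type and function names are invented, and a single `smp_flag` argument stands in for SMP's verdict in the relevant direction (is_regression for regressions, and an assumed equivalent for improvements).

```rust
#[derive(Clone, Copy)]
enum Goal {
    Cpu,
    Memory,
    Throughput,
}

#[derive(Debug, PartialEq)]
enum Verdict {
    Improvement, // 🟢
    Regression,  // 🔴
    Neutral,     // ⚪
    Skipped,     // configured erratic: true, tagged (ignored)
}

/// Classifies one experiment from its Δ mean %, SMP's flag for the matching
/// direction, and whether it was configured as erratic.
fn classify(goal: Goal, delta_mean_pct: f64, smp_flag: bool, configured_erratic: bool) -> Verdict {
    if configured_erratic {
        return Verdict::Skipped;
    }
    // Both conditions must hold: the 5% magnitude threshold AND SMP's flag.
    if delta_mean_pct.abs() <= 5.0 || !smp_flag {
        return Verdict::Neutral;
    }
    // For CPU and memory an increase is bad; for throughput a decrease is bad.
    let increase_is_bad = matches!(goal, Goal::Cpu | Goal::Memory);
    if (delta_mean_pct > 0.0) == increase_is_bad {
        Verdict::Regression
    } else {
        Verdict::Improvement
    }
}
```

Under this reading, a CPU experiment at -6.55 with SMP's flag set is an improvement, while one at +5.88 without the flag stays neutral even though it exceeds 5%. Runtime-detected erratic experiments are deliberately not special-cased, since the explanation says their signal still counts.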

blt force-pushed the blt/antithesis-workload-sampling branch from dd0c580 to 6945527 on June 2, 2026 16:26
blt force-pushed the blt/antithesis-disk-persistence branch from 04cada1 to e3b7d43 on June 2, 2026 16:27
blt force-pushed the blt/antithesis-workload-sampling branch from 6945527 to 6e47ff1 on June 2, 2026 20:16
blt force-pushed the blt/antithesis-disk-persistence branch from e3b7d43 to 1d74c55 on June 2, 2026 20:16
blt force-pushed the blt/antithesis-workload-sampling branch from 6e47ff1 to b291254 on June 2, 2026 20:33
blt force-pushed the blt/antithesis-disk-persistence branch 2 times, most recently from 1a275ee to 52a95f4, on June 2, 2026 20:48
blt force-pushed the blt/antithesis-workload-sampling branch 2 times, most recently from 2ac1e74 to 90323e5, on June 2, 2026 21:00
blt force-pushed the blt/antithesis-disk-persistence branch 2 times, most recently from 1925259 to 709025d, on June 2, 2026 21:27
blt force-pushed the blt/antithesis-workload-sampling branch 2 times, most recently from 5a9453c to e5880cc, on June 2, 2026 21:37
blt force-pushed the blt/antithesis-disk-persistence branch from 709025d to 00d479d on June 2, 2026 21:37
blt force-pushed the blt/antithesis-workload-sampling branch from e5880cc to 3662eca on June 2, 2026 22:26
blt force-pushed the blt/antithesis-disk-persistence branch from 00d479d to 9c8062a on June 2, 2026 22:26
blt force-pushed the blt/antithesis-workload-sampling branch from 3662eca to 648ccda on June 2, 2026 22:47
blt force-pushed the blt/antithesis-disk-persistence branch from 9c8062a to 6c92cb7 on June 2, 2026 22:47
blt force-pushed the blt/antithesis-workload-sampling branch from 648ccda to ed10986 on June 2, 2026 23:36
blt force-pushed the blt/antithesis-disk-persistence branch from 6c92cb7 to 564e457 on June 2, 2026 23:36
@blt blt changed the title test(antithesis): enable forwarder disk persistence — flags a log-amplification bug test(antithesis): enable forwarder disk persistence Jun 3, 2026
blt force-pushed the blt/antithesis-workload-sampling branch from ed10986 to 46dd5e3 on June 3, 2026 17:08
blt force-pushed the blt/antithesis-disk-persistence branch from 564e457 to b799f47 on June 3, 2026 17:09
blt force-pushed the blt/antithesis-workload-sampling branch from 46dd5e3 to ac24172 on June 3, 2026 20:51
blt force-pushed the blt/antithesis-disk-persistence branch from b799f47 to 7863c49 on June 3, 2026 20:51
blt force-pushed the blt/antithesis-workload-sampling branch from ac24172 to 7d8bd70 on June 3, 2026 21:18
blt force-pushed the blt/antithesis-disk-persistence branch 2 times, most recently from 3f9f36c to 6226dd1, on June 3, 2026 21:30
blt force-pushed the blt/antithesis-workload-sampling branch 2 times, most recently from cf3b123 to 6320589, on June 3, 2026 23:58
blt force-pushed the blt/antithesis-disk-persistence branch from 6226dd1 to 728c300 on June 3, 2026 23:58
blt changed the title from "test(antithesis): enable forwarder disk persistence" to "enhancement(antithesis): enable forwarder disk persistence" on Jun 4, 2026
…lification bug

Sample forwarder_storage_max_size_in_bytes 50/50 on/off with forwarder_storage_path
on a persistent compose volume, so the on-disk retry queue and restart-recovery paths
run for the first time.

Bug surfaced by this branch: with persistence on, a network partition fills the
disk-backed retry queue, and the forwarder logs error! once per failed retry attempt
(io.rs:462/472/421). Over a large backlog this amounts to unbounded log amplification: it
floods per-moment output, tripping 'very high output ... fail to materialize' at
cx=134896 on run 4ecf6d1b, which masks other findings. The same path also opens up the
hunt for a non-atomic torn write at persisted.rs:184 under node termination.
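
For readers unfamiliar with the term, a torn write is a file left partially written when the process dies mid-write. The code at persisted.rs:184 is not shown in this PR, so the following is only a generic sketch of the common write-to-temporary-then-rename pattern that avoids the problem; the function name is hypothetical and nothing here describes how the forwarder actually persists entries.

```rust
use std::fs::{self, File};
use std::io::{self, Write};
use std::path::Path;

/// Writes `bytes` to `dest` so that a reader sees either the previous file or
/// the complete new one, never a partial write. The temporary file lives next
/// to `dest`, so both are on the same filesystem and the rename stays atomic.
fn write_atomically(dest: &Path, bytes: &[u8]) -> io::Result<()> {
    let tmp = dest.with_extension("tmp");
    {
        let mut file = File::create(&tmp)?;
        file.write_all(bytes)?;
        // Flush file contents to disk before the rename makes them visible.
        file.sync_all()?;
    }
    // On POSIX filesystems, rename replaces `dest` in a single atomic step.
    // Syncing the parent directory for full durability is omitted for brevity.
    fs::rename(&tmp, dest)
}
```

If the process is killed before the rename, only the temporary file is incomplete and `dest` still holds its previous contents, which is the property a node-termination fault would be probing for.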
blt force-pushed the blt/antithesis-workload-sampling branch from 6320589 to 59e0673 on June 4, 2026 18:59
blt force-pushed the blt/antithesis-disk-persistence branch from 728c300 to a8535e6 on June 4, 2026 18:59

Labels

area/test All things testing: unit/integration, correctness, SMP regression, etc.
