Migrate RSV to volume-invariant sampling and harden skiplist economics #498

Merged
HudsonGraeme merged 2 commits into testnet from investigate/rsv-probabilistic-and-uncap-dispatch on May 14, 2026

@HudsonGraeme HudsonGraeme commented May 14, 2026

Why

The 14.8.4 throughput uplift exposed a sampling gap. RSV used a per-(hotkey, tempo) budget of 20 verifications, designed against ~500 expected submissions per miner per tempo (20/500 = 4%). At our current throughput on this validator (~7,000 dispatches per miner per tempo), the same budget yields 20/7,000, so the realised sample rate collapsed from the design target of 4% to ~0.3%.

Stake-weighted consensus magnifies this. On subnet 2:

| validator | stake share |
|---|---|
| uid 49 | 44.40% |
| uid 14 (ours) | 35.70% |
| uid 5 | 7.09% |
| uid 187 | 6.24% |
| uid 89 | 4.49% |
| top 5 cumulative | 97.91% |
| effective N (1/HHI) | 3.0 |

So aggregate-detection security is dominated by a handful of high-stake validators rather than the count of all validators. Our per-validator detection rate is the load-bearing factor.
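The effective-N figure can be reproduced from the stake shares listed above. The following is a minimal sketch, not code from this PR: `effective_n` is a hypothetical helper, and it assumes that the ~2% of stake outside the top five is small enough to ignore.

```rust
/// Effective number of validators as the inverse Herfindahl-Hirschman index.
/// `shares` are stake fractions in [0, 1]. Hypothetical helper for illustration.
fn effective_n(shares: &[f64]) -> f64 {
    // HHI is the sum of squared shares; concentrated stake pushes it towards 1.
    let hhi: f64 = shares.iter().map(|s| s * s).sum();
    1.0 / hhi
}

/// Top-5 stake shares quoted in the PR description (uids 49, 14, 5, 187, 89).
/// The long tail is omitted here, which is an assumption of this sketch.
const TOP5_SHARES: [f64; 5] = [0.4440, 0.3570, 0.0709, 0.0624, 0.0449];
```

Calling `effective_n(&TOP5_SHARES)` gives a value close to 3, which is why two or three validators effectively decide whether a cheat is detected in aggregate.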

Changes

  1. Probabilistic-pure sampling. Replace the budget-gated should_sample with a flat rng.range(0..RSV_EXPECTED_SUBS_PER_TEMPO) < VERIFICATION_SAMPLES_PER_TEMPO roll. Restores 4% sample rate at any throughput. Drop the sample_budget HashMap and its retention sweep.
  2. VERIFICATION_STRIKES_REQUIRED: 3 → 1. Closes the rate-of-attack loophole where a miner trickling cheats <3 per 60-min window never accumulates the threshold. zk verification is deterministic; legitimate false positives are ~zero, so single-strike is the cleaner choice.
  3. VERIFICATION_SKIPLIST_TEMPOS: 5 → 20 (~24h zero-weight from us). At 35.7% stake × 24h × ~0.02 TAO/day top-miner emission, per-cheat E[penalty] is ~3× E[gain] at restored 4% sample rate. EV-negative.
  4. --dispatch-ceiling Option<usize>, default None (uncapped). Was verification_concurrency * 8 — an artifact of the pre-RSV era when every proof entered the verifier. With RSV skipping ~96% of verifications, dispatch CPU and verify CPU decouple. Adaptive per-miner caps (sum currently ~6,400 across the metagraph) and pending_verifications cap remain the real backpressure.
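To make the first change concrete, here is a rough sketch of a flat probabilistic roll next to a simplified model of the old budget gate. It is not the code in rsv.rs: the constant values (20 and 500) are inferred from the description above, `XorShift64` stands in for whatever RNG the validator actually uses, and `budget_gated_rate` and `observed_rate` are hypothetical helpers.

```rust
// Values inferred from the PR description (20 samples against ~500 expected
// submissions); the real constants may differ.
const VERIFICATION_SAMPLES_PER_TEMPO: u64 = 20;
const RSV_EXPECTED_SUBS_PER_TEMPO: u64 = 500;

/// Tiny xorshift64 generator, used only to keep this sketch dependency-free.
/// The seed must be non-zero.
struct XorShift64(u64);

impl XorShift64 {
    fn next_u64(&mut self) -> u64 {
        let mut x = self.0;
        x ^= x << 13;
        x ^= x >> 7;
        x ^= x << 17;
        self.0 = x;
        x
    }
}

/// Flat roll: each submission is sampled independently with probability
/// SAMPLES / EXPECTED_SUBS, no matter how many submissions came before it.
fn should_sample(rng: &mut XorShift64) -> bool {
    rng.next_u64() % RSV_EXPECTED_SUBS_PER_TEMPO < VERIFICATION_SAMPLES_PER_TEMPO
}

/// Simplified model of the old behaviour: at most a fixed budget of samples
/// per tempo, so the rate is capped at budget / volume.
fn budget_gated_rate(volume: u64) -> f64 {
    VERIFICATION_SAMPLES_PER_TEMPO.min(volume) as f64 / volume as f64
}

/// Fraction of `volume` submissions that the flat roll selects.
fn observed_rate(volume: u64, seed: u64) -> f64 {
    let mut rng = XorShift64(seed);
    let hits = (0..volume).filter(|_| should_sample(&mut rng)).count();
    hits as f64 / volume as f64
}
```

Under this model `budget_gated_rate(7_000)` is about 0.3% while `observed_rate` stays near 4% at any volume, which is the property the new sampling is meant to restore.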

Verification

cargo check --workspace, cargo clippy --workspace --tests -- -D warnings, cargo fmt --check, cargo test --workspace --lib all clean. New unit test (should_sample_rate_is_volume_invariant) confirms sample rate stays within 0.5% of the 4% target over 100k trials.

Rollout

Ship as 14.8.5. Operators on default config see:

  • ~4% verification rate per request regardless of dispatch volume (was ~0.3% at high volume)
  • Skiplist on first detected cheat, 24h zero-weight contribution
  • Dispatch fan-out unbounded by default — adaptive caps govern

Operators wanting to keep the conservative cap can pass --dispatch-ceiling=N.
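As an illustration of how an optional ceiling can be resolved, the sketch below treats `None` as unbounded and compares it with the old `verification_concurrency * 8` formula. The function names are hypothetical and the in-flight accounting is simplified to a single counter; the actual logic in validator_loop/dispatch.rs may be structured differently.

```rust
/// Resolve the optional CLI ceiling into a hard bound.
/// `None` means uncapped, modelled here as usize::MAX.
fn effective_dispatch_ceiling(dispatch_ceiling: Option<usize>) -> usize {
    dispatch_ceiling.unwrap_or(usize::MAX)
}

/// The previous derivation, kept for comparison: eight in-flight requests
/// per verification worker.
fn legacy_dispatch_ceiling(verification_concurrency: usize) -> usize {
    verification_concurrency.saturating_mul(8)
}

/// Whether one more request may be dispatched given the current in-flight
/// count. Adaptive per-miner caps are assumed to be enforced elsewhere.
fn may_dispatch(in_flight: usize, dispatch_ceiling: Option<usize>) -> bool {
    in_flight < effective_dispatch_ceiling(dispatch_ceiling)
}
```

An operator who wants the old behaviour with, say, 64 verification workers would pass `--dispatch-ceiling=512`, which matches `legacy_dispatch_ceiling(64)`.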

Summary by CodeRabbit

  • New Features

    • Added a CLI option to set a configurable request dispatch ceiling.
  • Chores

    • Adjusted verification thresholds to reduce required strikes and increase tempo skiplist tolerance.
    • Simplified sampling behavior to a probabilistic, volume-invariant approach.
    • Dispatch-rate limiting now honors the configured ceiling when provided.

The 14.8.4 throughput uplift exposed a dilution gap in the previous
sample-budget mechanism: with 20 verifications budgeted per hotkey per
tempo and our validator now driving ~7000 dispatches per miner per
tempo, the realised per-validator sample rate collapsed from the design
target (4%) to ~0.3%. Stake distribution analysis on subnet 2 shows
top-2 validators hold 80% of stake (effective N via 1/HHI = 3.0), so
our individual detection rate dominates aggregate consensus security
rather than averaging it out.

Replace the budget gate with a flat probabilistic roll
(VERIFICATION_SAMPLES_PER_TEMPO / RSV_EXPECTED_SUBS_PER_TEMPO = 4%),
restoring the design-intent sample rate at any throughput. Drop the
sample_budget field and its persistence ratchets. Tighten strike
threshold to 1 (closes the rate-of-attack loophole where a miner
trickling cheats below 3-per-60-min escapes accumulation) and extend
skiplist to 20 tempos (~24h zero-weight contribution from us, sized
against the ~0.02 TAO/day top-miner emission at our stake share).
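The penalty sizing can be written out as arithmetic. This is a sketch under stated assumptions: with a single-strike threshold the chance of being caught on a given cheat equals the sample rate, and the penalty is our stake-weighted share of the miner's emission over the skiplist window. The gain per cheat is not stated in the description, so it is left as an input rather than guessed, and both function names are hypothetical.

```rust
/// Expected penalty per cheat, in TAO:
/// P(sampled) x stake share x emission per day x days on the skiplist.
fn expected_penalty_per_cheat(
    sample_rate: f64,
    stake_share: f64,
    emission_tao_per_day: f64,
    skiplist_days: f64,
) -> f64 {
    sample_rate * stake_share * emission_tao_per_day * skiplist_days
}

/// A cheat is EV-negative when its expected gain is below the expected penalty.
fn is_ev_negative(expected_gain_tao: f64, expected_penalty_tao: f64) -> bool {
    expected_gain_tao < expected_penalty_tao
}
```

With the figures quoted here (4% sample rate, 35.7% stake share, ~0.02 TAO/day, ~1 day), `expected_penalty_per_cheat(0.04, 0.357, 0.02, 1.0)` is roughly 0.00029 TAO per cheat.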

Separately introduce --dispatch-ceiling Option<usize> so the validator
no longer ties in-flight QUIC fan-out to verification CPU. Default
None (uncapped) — adaptive per-miner caps and the pending_verifications
buffer remain the real backpressure. The verification_concurrency * 8
formula was an artifact of the pre-RSV era when every proof entered
the verifier and CPU saturation was the dominant constraint.
coderabbitai Bot commented May 14, 2026

No actionable comments were generated in the recent review. 🎉

ℹ️ Recent review info
📥 Commits

Reviewing files that changed from the base of the PR and between 50d847e and 4723666.

📒 Files selected for processing (2)
  • crates/sn2-validator/src/cli.rs
  • crates/sn2-validator/src/rsv.rs
🚧 Files skipped from review as they are similar to previous changes (2)
  • crates/sn2-validator/src/cli.rs
  • crates/sn2-validator/src/rsv.rs

Walkthrough

Verification constants changed; RsvManager removes per-(hotkey,tempo) sampling budgets and uses volume-invariant random sampling; a new optional dispatch_ceiling CLI/config option is added and used to bound dispatching instead of deriving it from concurrency.

Changes

Verification and Dispatch Tuning

| Layer / File(s) | Summary |
|---|---|
| Verification Constants Update (crates/sn2-types/src/constants.rs) | VERIFICATION_STRIKES_REQUIRED is lowered to 1 (from 3) and VERIFICATION_SKIPLIST_TEMPOS is raised to 20 (from 5); VERIFICATION_STRIKES_WINDOW_BLOCKS remains 7200. |
| RSV Sampling Budget Removal and Simplification (crates/sn2-validator/src/rsv.rs) | RsvManager removes the sample_budget field and per-(hotkey, tempo) budget tracking. should_sample is simplified to a volume-invariant random decision; tests are replaced/updated to assert statistical rate invariance and hotkey-keyed strike behavior. |
| Dispatch Ceiling Configuration (crates/sn2-validator/src/cli.rs, crates/sn2-validator/src/config.rs, crates/sn2-validator/src/validator_loop/dispatch.rs) | Adds --dispatch-ceiling CLI option wired into ValidatorConfig and changes dispatch logic to use config.dispatch_ceiling (default usize::MAX) instead of verification_concurrency × 8. |

Suggested labels

run-build

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~25 minutes

🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 warning)

| Check name | Status | Explanation | Resolution |
|---|---|---|---|
| Docstring Coverage | ⚠️ Warning | Docstring coverage is 40.00% which is insufficient. The required threshold is 80.00%. | Write docstrings for the functions missing them to satisfy the coverage threshold. |

✅ Passed checks (4 passed)

| Check name | Status | Explanation |
|---|---|---|
| Description Check | ✅ Passed | Check skipped - CodeRabbit’s high-level summary is enabled. |
| Title check | ✅ Passed | The title accurately captures the main changes: migrating RSV to volume-invariant sampling and hardening skiplist economics through constant adjustments and sampling logic redesign. |
| Linked Issues check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |
| Out of Scope Changes check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |

@coderabbitai coderabbitai Bot left a comment

🧹 Nitpick comments (1)
crates/sn2-validator/src/cli.rs (1)

57-58: ⚡ Quick win

Add help text for the new CLI flag.

The --dispatch-ceiling flag lacks documentation. Users won't understand its purpose, valid values, or when to set it without examining the code or external documentation.

📝 Suggested help text
```diff
-    #[arg(long)]
+    #[arg(long, help = "Maximum concurrent requests in flight (tasks + verifications + pending). Defaults to unbounded; backpressure from pending_verifications cap and per-miner limits still applies")]
     pub dispatch_ceiling: Option<usize>,
```
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@crates/sn2-validator/src/cli.rs` around lines 57 - 58, Add user-facing help
text for the CLI flag by annotating the dispatch_ceiling field (the struct field
named dispatch_ceiling currently annotated with #[arg(long)]) with a descriptive
help string explaining its purpose, acceptable values (e.g., positive integer or
"none"), and when to use it; update the attribute to include help = "..." (or
add a doc comment above the field) so the --dispatch-ceiling flag shows this
guidance in the CLI help output.

ℹ️ Review info
📥 Commits

Reviewing files that changed from the base of the PR and between 66d4493 and 50d847e.

📒 Files selected for processing (5)
  • crates/sn2-types/src/constants.rs
  • crates/sn2-validator/src/cli.rs
  • crates/sn2-validator/src/config.rs
  • crates/sn2-validator/src/rsv.rs
  • crates/sn2-validator/src/validator_loop/dispatch.rs

…flag

Two rsv tests asserted behavior that is unreachable at the new
STRIKES_REQUIRED=1 setting: record_strike_below_threshold_no_skiplist
checked that the first strike was a no-op, and strike_aging_removes_old_strikes
required accumulating multiple strikes for the in-window pruning loop
to be observable. Both behaviors only manifest at threshold >= 2;
deleting them rather than reshaping since the underlying aging
code path remains intact for any future threshold tuning.
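To illustrate why those two behaviours only show up at a threshold of 2 or more, here is a hedged sketch of a windowed strike counter. `StrikeTracker` is a hypothetical type, not the RsvManager implementation; the window constant is taken from the walkthrough above, and the ageing rule (drop strikes at least one full window old) is an assumption.

```rust
use std::collections::{HashMap, VecDeque};

// From the walkthrough: the strike window stays at 7200 blocks.
const VERIFICATION_STRIKES_WINDOW_BLOCKS: u64 = 7200;

/// Hypothetical per-hotkey strike tracker with a sliding block window.
struct StrikeTracker {
    strikes_required: usize,
    strikes: HashMap<String, VecDeque<u64>>,
}

impl StrikeTracker {
    fn new(strikes_required: usize) -> Self {
        Self { strikes_required, strikes: HashMap::new() }
    }

    /// Records a failed verification at `block` and returns true when the
    /// hotkey has reached the threshold and should be skiplisted.
    fn record_strike(&mut self, hotkey: &str, block: u64) -> bool {
        let window = self.strikes.entry(hotkey.to_string()).or_default();
        // Age out strikes that fell outside the window before counting.
        while let Some(&oldest) = window.front() {
            if block.saturating_sub(oldest) >= VERIFICATION_STRIKES_WINDOW_BLOCKS {
                window.pop_front();
            } else {
                break;
            }
        }
        window.push_back(block);
        window.len() >= self.strikes_required
    }
}
```

With `strikes_required = 3`, a miner spacing strikes half a window apart never has more than two live strikes and is never skiplisted in this model. With `strikes_required = 1` the first strike always returns true, so neither the below-threshold case nor the ageing loop can be observed from the outside.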

Annotate --dispatch-ceiling with a help string covering default behavior
(uncapped, governed by adaptive caps + pending buffer) and when an
operator would set a hard cap.
@HudsonGraeme HudsonGraeme merged commit bac1d9f into testnet May 14, 2026
18 checks passed
@HudsonGraeme HudsonGraeme deleted the investigate/rsv-probabilistic-and-uncap-dispatch branch May 14, 2026 02:59