Problem
With v3 the block building pipeline is finally sound for normal async and sync backing. Elastic scaling is not sound yet: collators today can land a candidate in an earlier relay chain block than the slot budget intended for it.
The pipeline assumes 2s build + 2s validation + 2s network propagation & statement distribution = 6s total per candidate. If a candidate built mid-slot is squeezed into the previous slot's RC block, that budget is violated: validators outside the EU cluster don't have time to receive it and statement-distribute before backing closes. This is the symptom in #12028 / #10921 — non-EU validators see fewer backable candidates than EU authors put in blocks.
The relay chain runtime doesn't enforce this today. parse_ump_signals / check_core_index only check that a core is assigned to the para at the candidate's cq_offset; they do not check that the candidate is backed no earlier than intended.
Solution (v3 candidates)
Enforce a minimum claim-queue position on the relay chain (runtime + provisioner): a candidate may be backed at its declared position or later, never earlier.
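The rule above can be sketched as a single comparison. This is a minimal illustration, not the runtime code: check_min_cq_position, BackingCheck, and the idea of passing the "actual" claim-queue position as a plain integer are hypothetical names and simplifications introduced here.

```rust
/// Hypothetical result type for the minimum-position check (not a runtime type).
#[derive(Debug, PartialEq, Eq)]
pub enum BackingCheck {
    /// The candidate is backed at its declared position or later.
    Allowed,
    /// The candidate would be backed earlier than it declared.
    TooEarly { declared: u8, actual: u8 },
}

/// `declared` is the cq_offset carried by the candidate.
/// `actual` is the claim-queue position at which the relay chain would back it
/// (assumed here to be directly comparable to the declared offset).
pub fn check_min_cq_position(declared: u8, actual: u8) -> BackingCheck {
    if actual >= declared {
        BackingCheck::Allowed
    } else {
        BackingCheck::TooEarly { declared, actual }
    }
}
```

A candidate that declared offset 2 and is offered for backing at position 1 is rejected under this sketch, while a candidate that declared offset 1 is accepted at position 1 or 2.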
Collator side:
- First 2s of the slot: cq_offset = 1 (grandchild of scheduling parent).
- Rest of the slot: cq_offset = 2 (great-grandchild of scheduling parent).
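A minimal sketch of the collator-side choice, assuming the only input is the time elapsed since the start of the slot. pick_cq_offset and FIRST_WINDOW are hypothetical names, and treating exactly 2s as already belonging to "the rest of the slot" is an assumption the text does not settle.

```rust
use std::time::Duration;

/// Length of the initial window in which offset 1 is still used ("first 2s" in the text).
const FIRST_WINDOW: Duration = Duration::from_secs(2);

/// Pick the cq_offset for a candidate from the time elapsed into the current slot.
/// Assumption: the boundary at exactly 2s falls into the offset-2 case.
pub fn pick_cq_offset(elapsed_into_slot: Duration) -> u8 {
    if elapsed_into_slot < FIRST_WINDOW {
        1 // grandchild of the scheduling parent
    } else {
        2 // one relay chain block further out
    }
}
```

With one block every 2s, calls at 0s, 2s and 4s into the slot yield offsets 1, 2 and 2 under this sketch.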
Picture (3 cores/slot)
A candidate lands in the RC block of slot X iff its 2s-build + 4s-tail ends before slot X starts. Under one block per 2s, only the first-of-slot candidate qualifies for offset 1; the other two cores roll into offset 2. Steady state: 3 candidates per RC block, full ES throughput preserved, fairness restored for distant validators.
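The arithmetic behind this picture can be checked with a small sketch. It assumes 6s slots, build starts at 0s, 2s and 4s into the slot, and reads "ends before slot X starts" as "ends at or before the slot start". slots_ahead and candidates_per_rc_block are hypothetical helper names.

```rust
const SLOT_SECS: u64 = 6; // assumed relay chain slot length
const BUILD_SECS: u64 = 2; // build budget from the text
const TAIL_SECS: u64 = 4; // validation + propagation budget from the text

/// Number of slots after the build slot in which the candidate can first land,
/// given the second (within the slot) at which building starts.
pub fn slots_ahead(build_start_in_slot: u64) -> u64 {
    let ready = build_start_in_slot + BUILD_SECS + TAIL_SECS;
    // Smallest n such that n * SLOT_SECS >= ready (ceiling division).
    (ready + SLOT_SECS - 1) / SLOT_SECS
}

/// Steady-state count of candidates arriving in one RC block:
/// those built one slot earlier that are 1 slot ahead, plus
/// those built two slots earlier that are 2 slots ahead.
pub fn candidates_per_rc_block() -> usize {
    let starts = [0u64, 2, 4];
    let from_prev = starts.iter().filter(|&&s| slots_ahead(s) == 1).count();
    let from_prev_prev = starts.iter().filter(|&&s| slots_ahead(s) == 2).count();
    from_prev + from_prev_prev
}
```

Under these assumptions the build starting at 0s is ready at 6s and lands one slot ahead, while the builds starting at 2s and 4s are ready at 8s and 10s and land two slots ahead, which gives one plus two candidates per RC block.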
Cooperative for now
The minimum is enforced on chain, but the collator-side rule ("offset 1 for first 2s, offset 2 after") is cooperative. We do not try to enforce intra-slot timing on the relay chain — for now.
Reason: we plan to allocate more cores to a para than it actually needs and leave them idle most of the time, keeping them as spare capacity. After a resubmission the para can burn through the spares to catch back up, instead of pausing block production while the unincluded segment buffer drains. Hard timing enforcement on the relay chain would interact badly with that: a candidate using a "spare" core legitimately needs to look like it's coming in early, and a strict time check would reject it.
Misbehavior also has little upside: a candidate that lies and sets cq_offset = 1 when it isn't actually ready will just miss the seal and be re-tried as offset 2 later.
We can revisit strict enforcement later if it ever becomes worthwhile.
Impact on resubmissions
Open: does the resubmission reasoning in #11903 still hold under the new timings? Specifically:
Those drawings assumed candidates could be backed early. With minimum-offset enforcement the situation should improve — the naive path may no longer waste a core in the same way. Needs the diagrams redrawn against 2s build + 4s tail + the minimum-offset rule, then re-evaluate whether the smart-resubmission variant is still worth the extra complexity.
Scope
- check_descriptor_version_and_signals (paras_inherent)
- Collator (slot_based/block_builder_task.rs): pick cq_offset per core based on elapsed-into-slot time, instead of a single value per slot iteration
Notes