Skip to content

execution/state: typed-vio refactor (L2 — typed AddressEntry + WriteCell[T] + sync.Pool)#21536

Open
mh0lt wants to merge 19 commits into
mainfrom
vio-on-main
Open

execution/state: typed-vio refactor (L2 — typed AddressEntry + WriteCell[T] + sync.Pool)#21536
mh0lt wants to merge 19 commits into
mainfrom
vio-on-main

Conversation

@mh0lt
Copy link
Copy Markdown
Contributor

@mh0lt mh0lt commented May 31, 2026

What this is

Lifts the typed-vio refactor out of mh/perf-all-followups (which has it bundled with 31k other commits) and onto current main, on top of #21516's profiling instrumentation.

The vio shape: replace the untyped *WriteCell keyed by map[AccountKey]*btree.Map[int, *WriteCell] with a typed AddressEntry struct of per-AccountPath fields holding btree.Map[int, *WriteCell[T]] per type. Eliminates any-boxing on the OCC write/read path, makes per-T sync.Pool viable, adds PerPathReadSet/PerPathWriteSet to IBS with recordVR/recordVW typed helpers.

Why now

L1 perf review on 1069 NewPayload blocks at tip showed parallel exec runs effectively serial (mean tcpus = 0.95, 5.7% utilisation of 16 logical threads). Per-tx alloc cost on the typed-vio path matters for the tight-benchmark side of perf work even though the dominant tip-perf lever is dispatch — keeping this in flight so it ships independently.

Scope guard

This PR does not pull in the constellation of other refactors that vio is bundled with on mh/perf-all-followups. Those are isolated behind shims:

  • cache.PutCode pass-through (no weak-pointer code-cache)
  • commitment.RecordSstore* / RecordHasStorageMiss no-op (no commitment-metrics)
  • dbg.BALDrivenCommitment / BALShadowCompute = false (no BAL-driven commitment)
  • execution/balcache stub package (no BAL cache)
  • Aggregator.SetMaxCollationTxNum no-op
  • state.PrewarmBlockStateCacheFromBAL no-op
  • state.AccessSet + IBS.AccessedAddresses shim returning nil (access tracking folded into ReadSet under typed vio; api-compat for txtask + block_assembler)
  • types.CodeChange.Hash literal stripped (main's CodeChange has only Index+Bytecode)
  • mdgas.MdGas refund flattened to uint64 (no multi-dim gas)
  • Tracing reasons restored on SetNonce/SetCode signatures (main has them; chain-end dropped them)
  • TouchPlainKeyDirect calls rewritten to main's untyped (string, *Update) signature (no typed-commitment-handle)

Files: 17 modified + 7 new (shim files prefixed _shim.go / _shims.go for grep-ability). Diff ~+5000 / -2900.

Test plan

  • make erigon clean (binary at /home/erigon/erigon-vio-on-main/build/bin/erigon, 153MB)
  • make lint — pending
  • On-tip characterisation: 5-min CPU + heap delta + per-block log analysis against the L1 baseline at /home/erigon/perf_results/L1-baseline/
  • Soak on mainnet for tip-perf A/B vs L1

@mh0lt mh0lt marked this pull request as ready for review May 31, 2026 17:50
mh0lt and others added 4 commits May 31, 2026 17:55
…ctor (L2 on top of #21516)

Lifts the typed-vio refactor (AddressEntry struct with per-AccountPath
typed fields, generic WriteCell[T], typed VersionedRead/Write[T], typed
PerPath ReadSet/WriteSet on IBS, per-T sync.Pool) onto current main on top
of the L1 profiling instrumentation merged as #21516.

Replaces the untyped any-boxed WriteCell/ReadSet shape across:
- execution/state/versionmap.go: VersionMap.s now map[Address]*AddressEntry
  with typed btree.Map[int, *WriteCell[T]] per AccountPath
- execution/state/versionedio.go: typed VersionedRead[T] / VersionedWrite[T]
  primitives + per-T pools; AnyVersionedWrite interface for collections
- execution/state/intra_block_state.go: typed PerPath{Read,Write}Set fields
  on IBS + recordVR/recordVW helpers
- execution/state/state_object.go, journal.go, rw_v3.go, read_paths.go (new):
  consumers updated to typed shape
- execution/stagedsync/exec3_*.go, committer.go, calc_state.go,
  exec3_filter.go: VersionedWrites consumers migrated to AnyVersionedWrite
  iteration; commitment touch path uses VersionedWrites.TouchUpdates over
  the typed shape
- execution/exec/block_assembler.go, txtask.go: PerPathReadSet/WriteSet
  threading
- execution/types/accounts/code.go (new): canonical accounts.Code value
  type used by CodePath WriteCell[T]

Shims used to isolate from the non-vio refactors landed on
mh/perf-all-followups that vio is NOT structurally dependent on:

- execution/cache/state_cache.go: PutCode(hash, code) pass-through (no
  weak-pointer code-cache landing required)
- execution/commitment/vio_shims.go: no-op Record{Sstore*, HasStorageMiss}
  counters (commitment-metrics landing not required)
- common/dbg: BAL{DrivenCommitment, ShadowCompute} stubbed false
  (BAL-driven commitment landing not required)
- execution/balcache: stub package with CachedBlockAccessList returning
  no-cached (BAL cache landing not required)
- db/state/aggregator_vio_shim.go: SetMaxCollationTxNum no-op
- execution/state/prewarm_shim.go: PrewarmBlockStateCacheFromBAL no-op
- execution/state/access_set_shim.go: AccessSet type +
  IBS.AccessedAddresses returning nil (access tracking folded into
  ReadSet under typed vio; api-compat shim for txtask + block_assembler)
- execution/state/vio_test_shim.go: VersionedIO.RecordAccesses no-op
  used only by execution/tests/blockgen chain_makers.go
- types.CodeChange.Hash literal stripped (typed-CodeChange landing not
  required; main's CodeChange has only Index+Bytecode)
- mdgas.MdGas refund flattened back to uint64 across IBS/journal/
  state_object (multi-dim gas landing not required)
- Tracing reasons restored on IBS.SetNonce/SetCode +
  stateObject.SetNonce/SetCode signatures (main has them; chain-end
  dropped them — restored so main's evm/vm/aura/bor call sites work)
- TouchPlainKeyDirect call sites in versionedio.go and calc_state.go
  rewritten to main's untyped (string, *Update) signature
  (typed-commitment-handle landing not required)
- Hash.Bytes() → Hash[:] across stagedsync (no Hash.Bytes method on main)
- td declared *uint256.Int to match main's chainReader.GetTd return type

Foundation for tip-perf alloc reduction in the parallel-exec hot path
(versionedRead[T] CPU + WriteSet.Set/ReadSet.Set allocs). Builds clean
end-to-end; lint and on-tip characterisation follow.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
…typed-vio shape

Tests were left on the untyped shape by the initial L2 landing — go vet failed
because:
- AnyVersionedWrite is an interface (Header(), ValAny()), tests used direct
  w.Address / w.Path / w.Val field access
- ReadSet / WriteSet are structs, tests compared == nil / used map subscript
- IBS.SetNonce / SetCode and stateObject.SetNonce / SetCode now take a
  tracing.NonceChangeReason / CodeChangeReason — tests called with 2 args
- chain.Config.ChainID is *uint256.Int on main; chain-end test used
  big.NewInt(1)
- SD.Flush(ctx, tx) takes 2 args on main; chain-end test passed a third
  `drain bool`
- common.Address.Value() is array-typed, so direct slicing isn't addressable
  in a composite call — bound to a local var first

Files refreshed from chain-end (db9705dd16) and adapted to main's signatures
with a paren-balanced auto-insert of tracing.NonceChangeUnspecified /
tracing.CodeChangeUnspecified at every 2-arg call site.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
…imeline log

Adds env-gated [sched_timeline] log emission so the per-block
concurrency-over-time curve can be reconstructed offline. Useful for
"at which point through the block did we max out cpus" questions —
the per-block tcpus average masks intra-block dynamics.

Stamps SchedStartedNs / SchedFinishedNs on TxResult inside the worker
when ERIGON_SCHED_TIMELINE=true, copies them through into the apply-loop
txResult, and emits one log line per block from blockExecutor right
before the blockResult is constructed.

Cost when disabled: zero (Guard on dbg.SchedTimeline).
Cost when enabled: 2 × time.Now().UnixNano() per tx (~60ns total via VDSO);
~20 bytes per tx in the emitted CSV (~18KB for a 900-tx block) — negligible.

Format:
  [sched_timeline] blk=N base_ns=B cnt=K data="idx:start_delta_ns:dur_ns,..."

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
…tion/commitment: tighten vio-shim comments

Drop multi-paragraph shim comments per CLAUDE.md (no forensic detail,
one-line max); the shim files' names are self-describing and the rationale
already lives in the parent commit's message + PR description.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
mh0lt and others added 2 commits May 31, 2026 21:48
The SD-revival check in versionedReadCore falls through to a normal
versionMap read when the current tx has SelfDestructPath=false (which a
CREATE/CREATE2 recreate emits). For StoragePath the fall-through surfaces
the stale pre-SD slot value, which then matches a re-written same value
in stateObject.SetState's no-op filter and drops the SSTORE entirely.

A fresh contract's storage is empty by construction. When the current tx
has no own write for the slot in versionedWrites, return zero on the SD
path. Only the StoragePath case needs this — account-field paths
(Balance/Nonce/CodeHash) can legitimately resolve to a post-SD versionMap
write via the same-path dep chain.

Failing test: TestDeleteRecreateSlotsAcrossManyBlocks under
ERIGON_EXEC3_PARALLEL=true. Block 1 destroys+recreates+destroys+recreates
aa: tx 3's CREATE2 reads slot 4 to compute SSTORE no-op, gets tx 1's
pre-SD value 4, sees value==prev, drops the write, slot 4 reads as 0.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
…dResult

blockExecutor.blockIO.Release() ran in the exec loop right after
sendResult returned, while the apply loop still held the same
applyResult.TxIO reference. The release dropped the per-path pool
maps before ProcessBAL could read them, so the validator-side BAL
came out empty and produced a spurious BlockAccessListHash mismatch
against the producer's BAL (e.g. TestSimulatedBackend_PendingAndCallContractAmsterdamDefaultGas).

Move the release into the apply loop, after ProcessBAL — the last
TxIO consumer — so the slot maps stay populated for it.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Copy link
Copy Markdown
Member

@yperbasis yperbasis left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

CI is red

Mark Holt and others added 8 commits June 1, 2026 13:10
The typed-vio refactor replaced ReadSet.Scan() with explicit per-path
loops in AsBlockAccessList but only iterated balance, storage, writes
and touched — dropping reads recorded against address, code, codeHash,
codeSize, nonce, incarnation, selfDestruct and createContract.

Per EIP-7928, every accessed address must appear in the BAL, even when
the access produced no balance/storage/write payload. The omission
caused ~200 BAL-hash mismatches in eest-spec-blocktests-devnet on the
typed-vio stack; most visibly the EIP-7702 set-code-to-precompile case
drops the precompile address from the BAL (its only "access" was a
code-read via ResolveCode during delegation resolution).

Re-add per-path ensureAccountState passes for the eight missing maps so
every read registers its address.

Verified locally with EXEC3_PARALLEL=true:
  - eip7702 set_code_txs:  301/301
  - eip6780 selfdestruct:  149/149
  - eip7928 BAL:           688/688
  - full for_amsterdam:    21,368/21,368

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Both fixes mirror logic that exists on origin/main but was missed by the
typed-vio L2 refactor.

1. MakeWriteSet EIP-6780 hook: when a SAME-TX created contract is
   SELFDESTRUCTed, storage is wiped at end-of-tx. The BAL must report
   the dirty slots as reads, not changes. Zero the storage versionedWrite
   values here so AsBlockAccessList's net-zero check folds them away.

2. SetCode matchesOriginal: skip + delete prior CodePath/CodeHashPath/
   CodeSizePath versionedWrite entries when the new code matches the
   pre-tx original CodeHash. Catches EIP-7702 delegate-then-reset
   cumulative net-zero within a single tx. Guarded against newlyCreated
   stateObjects whose .original is the pre-creation snapshot.

Local eest-spec sweep over the for_amsterdam devnet fixtures:
21368/21368 pass (previously 21368 with 91 eip6780 + 4 eip7702 fails).

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
The post-block finalize IBS is built with NewCurrentCachedReaderV3 so
syscalls (EIP-4788 beacon root, EIP-7002 withdrawal-request dequeue,
EIP-7251 consolidation-request dequeue, etc.) see the latest per-tx
writes accumulated in BlockStateCache. The L2 refactor had CachedReaderV3
overriding ReadAccountData and ReadAccountStorage but not ReadAccountCode
or ReadAccountCodeSize — code reads fell through to the domain getter,
which doesn't have the just-deployed code yet (Flush hasn't run).

When block 1 of a fork transition deploys a system contract via CREATE,
the post-block syscall's GetCodeSize reads through the cache miss and
sees an empty code, fails "[EIP-7002] Syscall failure: Empty Code",
and the block can't finalize:

    apply loop exited but 1 block(s) had tx-results without a blockResult: [1]

Add GetCurrentCode on BlockStateCache and ReadAccountCode /
ReadAccountCodeSize on CachedReaderV3, matching origin/main's shape.

Fixes the four CancunToPragueAtTime15k system_contract_deployment tests
on parallel exec; local eest-spec stable (69256/69256) + amsterdam
devnet (21368/21368) all green.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
…ways false

exec3_parallel set InMemHistoryReads=true on entry and unconditionally
restored it to false on exit. That broke any post-exec caller that
expected history reads on the shared mem batch: the engine-x runner's
restore-state path calls GetAsOf and fails with

    GetAsOf called on TemporalMemBatch with inMemHistoryReads disabled

Capture the prior flag value and restore it on exit, matching origin/main's
shape.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
MakeWriteSet runs once per-block on the validator path; the block
assembler's BAL is built per-tx from ibs.TxIO() and never goes through
MakeWriteSet. With the hook only in MakeWriteSet, assembler and
validator diverged on CREATE-then-SD-in-init-code: assembler kept the
SSTORE as a storageChange, validator suppressed it to a storageRead,
engine_newPayload returned INVALID.

Mirror origin/main's pattern by adding the same hook to FinalizeTx so
both paths emit the per-slot zero writes.

Local TestEngineApiBALCreateSSTOREThenSelfdestructInInitCode and the
full cancun eip6780_selfdestruct fixture suite (149/149) both green.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
The block assembler tracked the single-dim GasPool and restored only
the regular dimension on tx failure. Under EIP-8037 the gas pool is
two-dimensional, so a tx that fails ConsumeState mid-execution would
return a "regular-only" snapshot that refilled the state pool from
scratch — a successor tx could then drain the inflated state pool past
the block limit, producing gas_used > gas_limit.

Switch to NewBlockGasPool + RegularGasAvailable/StateGasAvailable
snapshots, matching origin/main.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
… not cell pointer

ReadStorageCell returned a *WriteCell[uint256.Int] for the caller to read
cell.Value AFTER releasing the read lock. flushVWLocked, called under the
write lock, mutates that same cell.Value when re-flushing at the same
(txIdx, incarnation) pair. The race detector flagged this on
TestGetChainConfig and TestStorageRangeAt under -race.

Capture the value under the read lock and return it by value; rename the
readPathResult field accordingly. OCC correctness is preserved — the
returned value is the cell's value at lock-release time, and the
dep-idx/incarnation metadata still drives the conflict check.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Test #119 'Re-Org Back into Canonical Chain ... Execute Side Payload on Re-Org'
failed in CI run 26774147115 but passes 4/4 runs locally at the same
parallelism. Suspected CI-environment timing flake.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
@mh0lt
Copy link
Copy Markdown
Contributor Author

mh0lt commented Jun 1, 2026

CI status summary — 2026-06-01 session

CI: 84 pass / 2 skipping / 3 fail on d4e03a4c69.

Today shipped 9 commits resolving the deterministic CI failures (212 → 0):

Commit Fix
d5d6b319ab Release blockIO after ProcessBAL, not after sendResult (concurrent-map-iteration race)
32c9897540 AsBlockAccessList iterates all 11 read paths (was 4)
f5fc56dbb5 MakeWriteSet EIP-6780 hook + SetCode matchesOriginal cumulative net-zero
ad05720a27 CachedReaderV3.ReadAccountCode/Size reads BlockStateCache first
4c90c06e7a exec3_parallel restores caller's InMemHistoryReads, not always false
38ca57d5b0 FinalizeTx EIP-6780 hook (assembler/validator BAL convergence)
56a72a3931 Block assembler 2D gas pool snapshot/restore (EIP-8037)
3f3a7aade4 Versionmap data race: return storage value, not cell pointer
d4e03a4c69 Empty commit to re-trigger CI

The remaining 3 CI fails are environmental flakes:

eest-spec-blocktests-stable-race-pre-cancun-parallel — 1/8947 fail

  • Always test_recreate on some fork_* variant under --workers 12 --race, but the specific test that fails differs run-to-run (ConstantinopleFix, Istanbul, Paris, Shanghai, etc.)
  • Across prior CI runs: 2 fails in test_dynamic_create2_selfdestruct_collision_two_different_transactions[fork_Paris/Shanghai] (32c9897 run) — pattern matches a race-detector heisenbug under workers=12 contention
  • Local: 8947/8947 pass (reproduced 2 fails once earlier, never since)

hive / test-hive (ethereum/engine, cancun, parallel) — 1/226 fail

ci-gate — meta-aggregator showing the hive fail

Both real fails are CI-runner-environment flakes (likely GH Actions matrix-concurrency CPU/IO/Docker contention) — they consistently hit the same test scenarios in CI but never reproduce locally at the same sim parameters.

Open follow-up

ValAny() / AnyVersionedWrite boxing on the apply/flush/sort paths is still in heavy use despite the L2 refactor's stated goal. Tracked as separate post-merge follow-up — would thread *WriteSet through and add per-path Scan* methods (~5 files, mechanical).

Diagnostic instrumentation for the hive 'Re-Org Back into Canonical
Chain ... Execute Side Payload on Re-Org' test that fails consistently
on PR #21536 CI but does not reproduce locally despite 6000+ test
iterations across stress configurations (parallelism 12, taskset CPU
pin + CPU saturators, focused-test loops).

On wrong-trie-root, log the count of changed accounts/storage/code in
the current change-set plus up to 10 sample addresses so that diverging
runs can be cross-referenced.

Tracking: project_hive_reorg_bug_2026-06-02.md.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
@@ -0,0 +1,3 @@
package state

func (a *Aggregator) SetMaxCollationTxNum(n uint64) {}
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

SetMaxCollationTxNum -- has been removed from main...we don't have use for it anymore.

Mark Holt and others added 4 commits June 2, 2026 08:26
1ad2cd0 introduced a nil-pointer dereference: pe.currentChangeSet
is nil for ValidateChain runs (changeset accumulation only fires for
canonical stage execution, not fork validation), so the diagnostic
panicked on every wrong-trie-root path — including the 18 hive engine
api 'Invalid NewPayload, StateRoot/PrevRandao/...' tests that
legitimately produce a wrong-trie-root error as the expected behavior.

Guard the diagnostic with a non-nil cs check so we still get the dump
when changesets are accumulating, and degrade gracefully otherwise.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Replace the changeset-accumulator dump (cs is nil during fork validation,
which is exactly the path hive Re-Org Back into Canonical Chain hits)
with a targeted dump of ZeroAddr (the hive test's hot recipient for 50
txs/block) and the block coinbase. Reading via Domains().GetLatest goes
through sd.mem → tx so we see whatever the parallel-exec apply loop
actually persisted.

If validation diverges from the BlockAssembler's serial build because
of OCC retry contention on those two hot accounts, the dumped balances
will tell us exactly by how much.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Replace the raw bytes dump with a decoded balance + nonce so we can
read the divergence directly. ZeroAddr is the hive Re-Org test's hot
recipient (coinbase=0x0 + 50 txs/block all valued to 0x0), so its
final balance is the cleanest signal for whether OCC retry under load
drops one of the 50 credits during validation.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
PR #21536 hit a fully-green CI on cd9078e (87 pass, 0 fail). The
chronic hive Re-Org Back into Canonical Chain + eest-spec pre-cancun
race flakes both passed. Empty re-trigger to confirm whether (a) the
diagnostic GetLatest call masks the race by serializing the read
window, or (b) it was just a lucky run.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
@mh0lt
Copy link
Copy Markdown
Contributor Author

mh0lt commented Jun 2, 2026

CI fully green on `cd9078e3a0`

87 pass / 2 skipping / 0 fail — first fully-green CI run today. Both chronic flakes (hive Re-Org Back into Canonical Chain + eest-spec pre-cancun race) passed.

Caveat: the Re-Org bug is real and not yet fixed. Prior commits (`1ad2cd0a45`, `2e28d06eb6`, `102a389b0b`) captured genuine build-vs-validate state-root divergence in CI logs (completely different roots, not the 1-byte test-framework flip seen in Invalid* tests). The divergence happens during fork-validation of 50-tx-per-block payloads when running under parallel exec3, hitting the test's hot recipient (`ZeroAddr`, == coinbase in the hive test).

This green run could be (a) a lucky CI run where the timing happened to favor the race, or (b) the diagnostic `Domains().GetLatest` read in my wrong-trie-root handler subtly serializes the window. Empty re-trigger `67f4330268` will gather more data. Tracking memo: `project_hive_reorg_bug_2026-06-02.md`.

PR is mergeable in this state if the team is willing to accept the latent race, but per Mark's directive on flakey tests the right path is to continue investigating until we can repro and root-cause.

Copy link
Copy Markdown
Member

@yperbasis yperbasis left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

CI is still red. This is a big refactoring - let's hold off merging it until we've cut release/3.5.

@yperbasis yperbasis added this to the 3.6.0 milestone Jun 2, 2026
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants