wip: attempt to use upstream snap sync #771

Draft
wants to merge 32 commits into base: libevm
Conversation

@darioush (Collaborator) commented on Jan 29, 2025

  • eth/protocols/snap: This is mostly the eth code, with some minor changes

    • Some imports were modified to have an eth prefix (e.g., "ethstate", "ethrawdb"). This was to silence the linter, but I will fix it to reduce the diff with upstream so you can review the diff more easily.
    • sync.go: this file is the "client" portion of the sync and is mostly untouched, except for omitting this check: `rawdb.HasTrieNode(s.db, res.hashes[i], nil, account.Root, s.scheme)`
    • The check is omitted because the sync process also populates the snapshot: if two accounts share the same storage root (for example, two small tries whose few key/value pairs happen to match), the snapshot for one of the accounts would be missing with the check in place. This only applies to the hash scheme, but I omitted it unconditionally so we can keep the sync running as is for now.
    • handler.go: I modified ServiceGetAccountRangeQuery and ServiceGetStorageRangesQuery to "patch in" the existing mechanism for responding to state sync requests, just to get the key/values. We need this because upstream keeps S snapshot layers in memory (I think S = 128), with the disk layer matching head - S + 1 or something of that sort, whereas in coreth/subnet-evm the disk snapshot matches the last accepted block. We may want to match upstream behavior here, so that is another "chapter" of this project.
    • So, for now, I replaced how the upstream server gets the key/values:

      ```go
      it, err := chain.Snapshots().AccountIterator(req.Root, req.Origin, false)
      ```

      with a call to the mechanism we use today:

      ```go
      handler := handlers.NewResponseBuilder(
          leafsRequest, leafsResponse, tr, chain.Snapshots(), stateKeyLength, maxLeavesLimit, handlerStats,
      )
      ```
      
    • To enable this, I added NewResponseBuilder to sync/handlers/leafs_request.go, as the responseBuilder is unexported without this change.
    • This way we get the benefit of the optimistic snapshot read, falling back to iterating the trie only when the snapshot data has actually been overwritten, and it matches the existing behavior (a minimal sketch of this pattern follows this list).
    • I'm still using the upstream code to generate the proofs, even though we should be able to use the responseBuilder's proofs if we wanted; feel free to experiment.
    • I reduced the soft limit on the response size from 2 MB to 1 MB, but I'm not sure of the actual p2p message limit; best to confirm this with those more familiar with the networking stack.
  • For the proof of concept we can keep the code copied into coreth, but for production the eth/protocols/snap package should ideally be used from libevm.
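
Picking up the forward reference above, here is a minimal, self-contained Go sketch of the optimistic snapshot read with trie fallback. All types and names are illustrative stand-ins for the sketch, not coreth's actual API:

```go
package sync

// kv is a single leaf returned to the requesting peer.
type kv struct{ Key, Val []byte }

// snapshotReader serves leaves from the flat snapshot. ok is false when the
// snapshot no longer matches the requested root (e.g., its data has been
// overwritten by newer state), in which case the caller falls back to the trie.
type snapshotReader interface {
	Leaves(root [32]byte, start, end []byte, max int) (leaves []kv, ok bool)
}

// trieReader serves the same range by iterating the account/storage trie.
type trieReader interface {
	Leaves(root [32]byte, start, end []byte, max int) ([]kv, error)
}

// serveLeaves tries the snapshot first and falls back to trie iteration,
// mirroring the behavior the responseBuilder provides in this PR.
func serveLeaves(sn snapshotReader, tr trieReader, root [32]byte, start, end []byte, max int) ([]kv, error) {
	if leaves, ok := sn.Leaves(root, start, end, max); ok {
		return leaves, nil // fast path: flat snapshot matched the root
	}
	return tr.Leaves(root, start, end, max) // slow path: walk the trie
}
```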

  • Glue code: to connect the avalanchego p2p layer to the eth sync protocol, I used some basic shim layers: for the client in plugin/evm/statesync/connector.go and for the handler in plugin/evm/statesync/handler.go. These currently panic on unexpected code paths, but we should handle all failures correctly, especially those that can be triggered by incoming peer messages. (A sketch of the connector's shape follows this bullet.)
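
The connector shim might look roughly like the following; the interface and method names are assumptions for this sketch, not the PR's exact code:

```go
package statesync

// appSender abstracts the avalanchego app-request channel used to reach peers
// (assumed interface for the sketch).
type appSender interface {
	SendAppRequest(nodeID string, requestID uint32, msg []byte) error
}

// connector forwards already-encoded snap protocol messages over the
// avalanchego p2p layer on behalf of the eth snap sync client.
type connector struct {
	sender    appSender
	requestID uint32
}

// sendSnapRequest sends one snap request to the given peer. In a production
// version, errors here (including peer-triggered ones) must be surfaced to
// the syncer rather than panicking, per the note above.
func (c *connector) sendSnapRequest(nodeID string, msg []byte) error {
	c.requestID++
	return c.sender.SendAppRequest(nodeID, c.requestID, msg)
}
```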

  • Modifications to existing state sync handler:

    • I set the More flag on the response as a hint to the code in handler.go as to whether the response needs a proof. This helps emulate the behavior of upstream's abort variable in the case of a partial storage response (see the sketch after this list).
    • It seems responses must be at least one key long, or they will not pass range-proof verification. This looks like a (minor) bug in the existing code.
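
A minimal sketch of the More-flag hint for a partial storage response, assuming a response shape along the lines of coreth's leafs response (the field and helper names are illustrative):

```go
package handlers

// leafsResponse is a simplified stand-in for the handler's response type.
type leafsResponse struct {
	Keys, Vals [][]byte
	More       bool // set by the handler when the returned range was truncated
}

// proofIfNeeded attaches a range proof only when the response is partial,
// emulating upstream's abort behavior as described above. proveRange is a
// hypothetical helper that proves the boundaries of the returned key range.
// Note the response must contain at least one key to be verifiable.
func proofIfNeeded(resp *leafsResponse, proveRange func(keys [][]byte) [][]byte) [][]byte {
	if resp.More {
		return proveRange(resp.Keys) // partial response: prove the served range
	}
	return nil // complete response: no proof hint needed
}
```
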
  • Putting it all together:

    • Added a config option state-sync-use-upstream to plugin/evm/config/config.go.
    • Registered the new handler type with vm.Network in vm.go.
    • Passed some additional options to the state sync client in vm.go.
    • In plugin/evm/syncervm_client.go (sketched below):
      • If the useUpstream option is set, the step that syncs the eth trie (client.syncStateTrie) is replaced with the new syncer.Sync.
      • If stateSyncNodes is set, the explicitly specified peers are registered with the syncer as the peers to sync from;
      • if it is not set, the syncer is configured as a listener that registers and unregisters peers as they connect to and disconnect from the node.
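
The client-side branch might look roughly like this; everything except client.syncStateTrie, syncer.Sync, useUpstream, and stateSyncNodes (all named above) is an assumption of the sketch:

```go
package evm

// upstreamSyncer is a stand-in for the new snap-sync client's surface.
type upstreamSyncer interface {
	RegisterPeer(nodeID string) // assumed registration hook
	Sync(root [32]byte) error
}

type stateSyncClient struct {
	useUpstream    bool     // from the state-sync-use-upstream config option
	stateSyncNodes []string // explicitly configured peers; may be empty
	syncer         upstreamSyncer
	root           [32]byte // root of the state to sync to
}

// syncStateTrie is the existing coreth trie sync step (stubbed here).
func (c *stateSyncClient) syncStateTrie() error { return nil }

func (c *stateSyncClient) syncState() error {
	if !c.useUpstream {
		return c.syncStateTrie()
	}
	// A static peer list takes precedence; otherwise the syncer is registered
	// elsewhere as a connect/disconnect listener (not shown).
	for _, nodeID := range c.stateSyncNodes {
		c.syncer.RegisterPeer(nodeID)
	}
	return c.syncer.Sync(c.root)
}
```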

Note: when stateSyncNodes is set explicitly (in config.json), the same nodes should also be passed to avalanchego via the --state-sync-ids and --state-sync-ips options, so those nodes are also used to determine the "sync summary", i.e., where to start syncing from. An example invocation is sketched below.
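
For example, the avalanchego side might be started along these lines (node IDs and IPs are placeholders; the flags themselves are the ones named above):

```sh
avalanchego \
  --state-sync-ids=<nodeID1>,<nodeID2> \
  --state-sync-ips=<ip1:port>,<ip2:port>
```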
