Refactor get_validator_blocks_v3 fallback #8186
Open
Issue Addressed
#7727 introduced a logging bug: whenever a beacon node failed the SSZ `get_validator_blocks_v3` endpoint, the failure was logged as `Beacon node does not support...`. However, the failure can have other causes, such as the timeout found by @jimmygchen:
WARN Beacon node does not support SSZ in block production, falling back to JSON slot: 5283379, error: HttpClient(url: https://ho-h-bn-cowl.spesi.io:15052/, kind: timeout, detail: operation timed out
Proposed Changes
This PR makes the error log more generic to reduce confusion.
Additionally, as suggested by @michaelsproul, this PR refactors the `get_validator_blocks_v3` calls to try all beacon nodes on the SSZ endpoint first, and to fall back to JSON only if every beacon node fails on SSZ. The logic changes from:
"SSZ -> JSON for primary beacon node, followed by SSZ -> JSON for second beacon node and so on" to
"SSZ for all beacon nodes -> JSON for all beacon nodes"
The advantage is that if the primary beacon node is having issues and fails the SSZ request, we do not immediately retry it with JSON (where it may well fail again); instead, we switch to the second beacon node.
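The ordering described above can be illustrated with a minimal, self-contained sketch. This is not the code in this PR: `fetch_block_with_fallback`, `Encoding`, and `FetchError` are hypothetical names invented for illustration, and the `fetch` closure stands in for whatever actually calls the beacon node.

```rust
/// Hypothetical encodings, tried in this order: SSZ first, then JSON.
#[derive(Debug, Clone, Copy, PartialEq)]
enum Encoding {
    Ssz,
    Json,
}

/// Hypothetical error type standing in for the real HTTP client error.
#[derive(Debug, Clone, PartialEq)]
enum FetchError {
    Timeout,
    Unsupported,
}

/// Try every node with SSZ; only if all of them fail, try every node with JSON.
/// Returns the first successful result, or every error seen along the way.
fn fetch_block_with_fallback<T, F>(
    nodes: &[&str],
    mut fetch: F,
) -> Result<T, Vec<(String, Encoding, FetchError)>>
where
    F: FnMut(&str, Encoding) -> Result<T, FetchError>,
{
    let mut errors = Vec::new();
    // Outer loop over encodings, inner loop over nodes: this is what gives
    // "SSZ for all beacon nodes -> JSON for all beacon nodes".
    for encoding in [Encoding::Ssz, Encoding::Json] {
        for &node in nodes {
            match fetch(node, encoding) {
                Ok(block) => return Ok(block),
                // Keep the actual error so a log can report the real cause
                // (e.g. a timeout) instead of assuming SSZ is unsupported.
                Err(e) => errors.push((node.to_string(), encoding, e)),
            }
        }
    }
    Err(errors)
}
```

Swapping the two loops would reproduce the old order, "SSZ -> JSON" per node. Collecting the errors alongside the node and encoding is a design choice of this sketch, made so the caller can log the real failure reason.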
Additional Info
Because the calls to `get_validator_blocks_v3` (and its SSZ counterpart) moved to another function, the remaining part of the `get_validator_block` function was left dangling, so I moved it into the now renamed `get_validator_block_and_publish_block` function and deleted the original `get_validator_block` function.