Listen to the beacon node SSE endpoint and expose via prom metrics #3580

OisinKyne · 2025-03-10T13:20:22Z

🎯 Problem to be solved

During the Holesky fork incident, we did not have much visibility into which chain each beacon node was on. We can improve this observability.

We also hit the BN with many requests every slot. We could potentially make Charon more efficient by leveraging the beacon node's server-sent events (SSE) feature.

🛠️ Proposed solution

I believe we can make progress on both of these problems with a low-impact feature that listens to BN events, debug-logs them, and exposes them to monitoring.

First we should create a design doc outlining exactly which events we will listen to, what we log, and which of them we expose to monitoring and why. We should consider scraping every type of BN for the standard-ish metrics in every config Charon is deployed in, and we should consider a Charon-based client that either sends outbound requests to BNs or subscribes to the SSE endpoint.

Register a handler on the BN SSE endpoints.

When events come in, debug-log them and update Prometheus gauges as appropriate.

Create Grafana panels that display this info.

Approved design doc: link
Core team consensus on the proposed solution

🧪 Tests

Tested by new automated unit/integration/smoke tests
Manually tested on core team/canary/test clusters
Manually tested on local compose simnet

👐 Additional acceptance criteria

We can plan a future feature where we share something like the parentRoot observed on peerInfo, such that we can warn if it looks like peers are on different forks (we can spot this on our central prom before shipping this).
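
To make the fork-divergence warning concrete, here is a small sketch in Go of the comparison step only. It assumes each peer has somehow shared the root it observed for the same slot (the transport via peerInfo is not shown, and comparing roots from different slots would trivially differ). The function names `groupByRoot` and `diverged` and the peer names are hypothetical.

```go
package main

import (
	"fmt"
	"sort"
)

// groupByRoot groups peer names by the root each peer reported.
// Keys of observed are peer names; values are hex-encoded roots for the same slot.
func groupByRoot(observed map[string]string) map[string][]string {
	groups := make(map[string][]string)
	for peer, root := range observed {
		groups[root] = append(groups[root], peer)
	}
	// Sort peer names so log output is stable across runs.
	for _, peers := range groups {
		sort.Strings(peers)
	}
	return groups
}

// diverged reports whether the peers reported more than one distinct root.
func diverged(observed map[string]string) bool {
	return len(groupByRoot(observed)) > 1
}

func main() {
	observed := map[string]string{
		"peer-a": "0xaaa",
		"peer-b": "0xaaa",
		"peer-c": "0xbbb",
	}
	if diverged(observed) {
		fmt.Println("warning: peers report different roots:", groupByRoot(observed))
	}
}
```

The same grouping could be done as a query on our central Prometheus first, which is the cheaper way to validate the idea before shipping anything in Charon.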

❌ Out of Scope

Making changes to the scheduler, triggering retries, etc. if we detect a re-org (though we should plan to make optimisations here).

Pushing this code into an eth2-client library/package (though this would be a good candidate for something to abstract into a package early in the development of one).

OisinKyne added this to the v1.4.0 milestone Mar 10, 2025

github-actions bot added the protocol Protocol Team tickets label Mar 10, 2025

KaloyanTanev added the needs refining Solution is unclear and needs refining label Mar 11, 2025
