Commit 626e0ea

Performance and Tracing update 2025-02-03 (#535)
1 parent f6eb5fd commit 626e0ea

1 file changed

+71
-0
lines changed
@@ -0,0 +1,71 @@
---
title: Performance & Tracing Update
slug: 2025-02-03-performance-and-tracing
authors: mgmeier
tags: [performance-tracing]
hide_table_of_contents: false
---

## High level summary

* Benchmarking: Release benchmarks and performance baselines on `10.2` for UTxO-HD, new GHC, Genesis; 'Periodic tracer' benchmarks.
* Development: Pervasive thread labeling in the Node; fix a race condition in monitoring dependency `ekg-wai`.
* Infrastructure: Haskell profile definition work passed testing, ready for merge; continued 'Byron' support in our tooling.
* Tracing: C library for trace forwarding reached prototype stage; last batch of documentation updates ready for publication.
* Community: Support and valuable feedback on Discord for new tracing system rollout.

## Low level overview

### Benchmarking

We've performed a full set of release benchmarks and analyses for Node version `10.2`. We could not detect any performance risks, and expect network performance to be equivalent to or slightly better
than that of the `10.1.x` releases, albeit using slightly more CPU resources under rare conditions.

Furthermore, we're building several performance baselines with `10.2` against which to compare future changes, features or node flavours. For comparative benchmarks, it's vital that every change be measured
individually, so that we can discern its individual performance impact. For Node `10.3`, there are several of those we want to capture, such as crypto class simplifications in Ledger, UTxO-HD with a new in-memory backend,
Ouroboros Genesis, and last but not least a new GHC 9.6 release addressing a remaining performance blocker when building Cardano.

Additionally, we've validated the 'Periodic tracer' feature on cluster benchmarks and now have evidence of its positive impact on performance. This feature decorrelates the gathering of ledger metrics
from the start of a block producer's forging loop, without sacrificing predictability of performance. Because the two no longer compete for certain synchronization primitives, the hot code path in the forging
loop now executes faster. The feature will be integrated in a future version of the Node.

### Development

We've tracked down a race condition in a community package that both tracing systems depend on for exposing metrics. In `ekg-wai`, a `ThreadKilled` exception could be re-thrown to the thread
it originated from. It is a low-risk condition, as it occurs only when the Node process terminates; however, when terminating due to an error condition, it caused the process to end prematurely, before the
error could be logged. We've opened a [PR (ekg-wai#12)](https://github.com/tvh/ekg-wai/pull/12) against the package containing the fix, and pre-released the fix on CHaP.

Tracking down this condition would have been easier with pervasive, human-readable labels for all the threads that the Node process spawns. So in coordination with the Consensus team,
we made sure this is the case for future builds of the Node - including locations in the code where dependency packages internally use `forkIO` to create green threads. This will
enhance usability of debug output when looking into concurrency issues.

### Infrastructure

The Haskell definition of benchmarking workloads - and the removal of its `bash`/`jq` counterpart - is complete, and has passed the testing phase. This includes a final alignment between all profile content
defined using either option. Once merged, this will open up the path for simplification of how `nix` interacts with the performance `workbench` - and hopefully reduce complexity for our CI runners.

As `cardano-api` is deprecating some protocol parameter related data types which no longer have relevance for Cardano, we've had a discussion with stakeholders about the implications for our tooling:
the deprecation would effectively disable our ability to benchmark clusters of BFT nodes which do not use a staking / reward-based consensus algorithm - as was the case in Cardano's Byron era. The decision
was made not to drop that ability from our tooling, as there are potential applications for the benchmarks outside of Cardano. As a consequence, we've started porting those types to live on in our toolchain,
representing an additional maintenance item within our team.


### Tracing

The self-contained C library implementing trace forwarding is now in prototype state. It contains a pure C implementation of our forwarding protocol over a socket,
as well as pure C CBOR codecs for the data payload to match the `TraceObject` schema used within the context of Cardano. That ensures existing tooling can process traces emitted
by non-Cardano applications, written in languages other than Haskell.
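
To give a feel for what a pure C CBOR codec involves, here is a minimal sketch of a CBOR writer following RFC 8949. It is not taken from the prototype library: all names (`cbor_writer`, `encode_trace_event`) and the three-field map with keys `ns`, `sev` and `msg` are hypothetical and chosen for illustration only; the real `TraceObject` wire layout is defined by the Haskell side and is not reproduced here.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Append-only writer over a caller-supplied buffer; `overflow` is set
   (and stays set) once a write would not fit. */
typedef struct {
    uint8_t *buf;
    size_t   cap;
    size_t   len;
    int      overflow;
} cbor_writer;

void cbor_init(cbor_writer *w, uint8_t *buf, size_t cap) {
    w->buf = buf; w->cap = cap; w->len = 0; w->overflow = 0;
}

void cbor_put(cbor_writer *w, const uint8_t *src, size_t n) {
    if (w->overflow || n > w->cap - w->len) { w->overflow = 1; return; }
    memcpy(w->buf + w->len, src, n);
    w->len += n;
}

/* CBOR head (RFC 8949, section 3): 3-bit major type plus an argument,
   written in the shortest possible form, big-endian. */
void cbor_head(cbor_writer *w, uint8_t major, uint64_t arg) {
    uint8_t tmp[9];
    size_t extra = arg < 24 ? 0 : arg <= 0xFF ? 1 : arg <= 0xFFFF ? 2
                 : arg <= 0xFFFFFFFFu ? 4 : 8;
    uint8_t info = extra == 0 ? (uint8_t)arg
                 : extra == 1 ? 24 : extra == 2 ? 25 : extra == 4 ? 26 : 27;
    tmp[0] = (uint8_t)((major << 5) | info);
    for (size_t i = 0; i < extra; i++)
        tmp[1 + i] = (uint8_t)(arg >> (8 * (extra - 1 - i)));
    cbor_put(w, tmp, 1 + extra);
}

/* Major type 0: unsigned integer. */
void cbor_uint(cbor_writer *w, uint64_t v) { cbor_head(w, 0, v); }

/* Major type 3: text string (caller must pass valid UTF-8). */
void cbor_text(cbor_writer *w, const char *s) {
    size_t n = strlen(s);
    cbor_head(w, 3, n);
    cbor_put(w, (const uint8_t *)s, n);
}

/* Major type 5: map header announcing `pairs` key/value pairs. */
void cbor_map(cbor_writer *w, uint64_t pairs) { cbor_head(w, 5, pairs); }

/* HYPOTHETICAL event layout, for illustration only: a 3-entry map.
   Returns 0 on success, -1 if the buffer was too small. */
int encode_trace_event(cbor_writer *w, const char *ns,
                       uint64_t severity, const char *msg) {
    cbor_map(w, 3);
    cbor_text(w, "ns");  cbor_text(w, ns);
    cbor_text(w, "sev"); cbor_uint(w, severity);
    cbor_text(w, "msg"); cbor_text(w, msg);
    return w->overflow ? -1 : 0;
}
```

A caller would initialise a writer over a stack or heap buffer with `cbor_init`, call `encode_trace_event`, and then hand the first `len` bytes of the buffer to whatever transport it uses. Writing into a caller-supplied buffer with a sticky overflow flag keeps the sketch free of allocation, so errors only need to be checked once per message.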

The latest updates to [Developer Portal: `cardano-tracer`] are ready to be published and awaiting a PR review on the Cardano Developer Portal.

### Community

We've been quite busy on our new Discord channel [_#tracing-monitoring_](https://discord.com/channels/826816523368005654/1332375957528514590) on the *IOG's Technical Community* server. There's been
an initial spike of interest and we've been able to provide support and explain various design decisions of the new tracing system. Additionally, we've gotten valuable feedback about potential
features that would greatly help adoption of the new system. These are typically highly localized in their implementation, and non-breaking with respect to API and design, such that addressing this
feedback promptly adds much value at low risk - thank you for your input!



[Developer Portal: `cardano-tracer`]: https://developers.cardano.org/docs/get-started/cardano-node/new-tracing-system/cardano-tracer
