feat(chain)!: taint-aware `CanonicalView::balance` via canonical ancestor walk by evanlinjin · Pull Request #2235 · bitcoindevkit/bdk

evanlinjin · 2026-06-23T10:39:53Z

Warning

This PR — including this description — is currently entirely AI-generated.
It still needs proper human review (by me, @evanlinjin) before it should be taken seriously. Treat everything below as a draft proposal, not a vetted design.

Description

This is an alternative direction to #2221. Where #2221 makes CanonicalView::balance less restrictive by removing the min_confirmations parameter, this PR instead tries to make it more useful — handing control to the caller via closures, deriving trust from ancestry rather than a per-script heuristic, and exposing a per-output spend-eligibility classifier that's the building block for coin control.

The core new primitive is CanonicalView::classify_outpoints, built on a reusable, sans-TxGraph canonical ancestor walk. balance becomes a thin fold over it.

What's here (commit by commit)

1. Add CanonicalAncestors reverse-topological walk. Canonical::ancestors(seeds, map, should_walk) walks the canonical ancestors of a set of seed txids backwards (the seeds are the leaf-most txs, closer to the leaves of the ancestry DAG than its roots). Each tx is visited once in reverse-topological order, folding a Mergeable accumulator from descendants into ancestors (diamonds merge correctly). should_walk prunes per-tx with the ChainPosition available; the iterator is ExactSizeIterator.

2. Add Canonical::ancestors_inclusive. A variant that also yields the seeds (each with its own final accumulator), so a per-seed contribution can be folded uniformly with the ancestry.

3. Taint-aware CanonicalView::balance (breaking). Replaces trust_predicate + min_confirmations with two closures:

does_taint(CanonicalTx) -> bool: a pending output is untrusted if it, or any of its unsettled ancestors, taints. The unsettled ancestry is walked once (deduped across outputs, stopping at settled txs) via ancestors_inclusive. This lets callers demote unconfirmed coins received from — or chained on top of — a third party. Trust is now derived from ancestry instead of a per-script heuristic.
is_settled(&ChainPosition) -> bool: generalizes the numeric min_confirmations into a caller-defined "confident this won't be replaced" boundary, and is the sole authority on it — a settled output is never dropped or tainted (even an unconfirmed output a caller chooses to treat as settled is counted, not lost).
balance also drops its O identifier generic and now takes bare OutPoints.

4. Rename Balance::confirmed to Balance::settled (breaking). The bucket is driven by is_settled (confidence a tx won't be replaced), not strictly confirmation status, so the field is renamed to match. Breaking serde key change (confirmed → settled, no alias).

5. Add classify_outpoints spend-eligibility classifier. CanonicalView::classify_outpoints pairs each unspent output with a chain-level Eligibility (Settled / Immature / TrustedPending / UntrustedPending) from is_settled + the does_taint walk. This is the primitive for coin selection / coin control (e.g. "prefer settled coins, fall back to trusted-pending"): the caller decides how to aggregate and can layer wallet-specific categories like "locked" on top, instead of being constrained to the fixed Balance buckets. balance is re-expressed as a thin fold over it.

Direction / open questions for human review

The longer-term idea is that classify_outpoints is the real API and Balance is just one possible fold — wallets that want categories like "locked"/"reserved" aggregate themselves. This PR keeps Balance (as a fold) for now; a follow-up could move it down into bdk_wallet and drop it from bdk_chain.
Is replacing trust_predicate with ancestry-based does_taint the right call, or should both coexist? (does_taint can't express per-output trust, but per-tx ancestry is arguably the more principled model.)
Balance::confirmed → Balance::settled: worth the rename / serde break, or keep the conventional name?
bdk_wallet will need its balance call updated (out of this repo's tree).

Notes to reviewers

Tests: ancestor-walk tests (diamond merge / reverse-topo order / pruning / exact size), a balance + classify_outpoints ancestry-taint test, and a regression test that a settled-everything is_settled never drops an unconfirmed output. Existing balance tests were migrated (the old per-keychain trust test is recomputed for the new model, since none of its txs actually spend third-party coins).
cargo fmt, clippy --all-features --all-targets -p bdk_chain, and RUSTDOCFLAGS="-D warnings" cargo doc are clean; the bdk_chain suite + doctests pass.

Changelog notice

Changed

CanonicalView::balance now takes does_taint and is_settled closures instead of trust_predicate and min_confirmations, and accepts bare OutPoints.
Renamed Balance::confirmed to Balance::settled.

Added

CanonicalView::classify_outpoints + Eligibility: per-output spend-eligibility classification (the building block for coin control).
Canonical::ancestors / Canonical::ancestors_inclusive: reverse-topological canonical ancestor walk with a Mergeable accumulator.

🤖 Generated with Claude Code

codecov · 2026-06-23T10:46:10Z

Codecov Report

❌ Patch coverage is 92.52336% with 16 lines in your changes missing coverage. Please review.
✅ Project coverage is 78.86%. Comparing base (6d03fc3) to head (71c35d9).

Files with missing lines	Patch %	Lines
crates/chain/src/canonical.rs	87.65%	10 Missing ⚠️
crates/chain/src/balance.rs	0.00%	3 Missing ⚠️
crates/chain/src/canonical_ancestors.rs	97.69%	1 Missing and 2 partials ⚠️

Additional details and impacted files

@@            Coverage Diff             @@
##           master    #2235      +/-   ##
==========================================
+ Coverage   78.65%   78.86%   +0.20%     
==========================================
  Files          30       31       +1     
  Lines        5909     6076     +167     
  Branches      279      285       +6     
==========================================
+ Hits         4648     4792     +144     
- Misses       1185     1209      +24     
+ Partials       76       75       -1

Flag	Coverage Δ
rust	`78.86% <92.52%> (+0.20%)`	⬆️

Flags with carried forward coverage won't be shown. Click here to find out more.

☔ View full report in Codecov by Harness.
📢 Have feedback on the report? Share it here.

🚀 New features to boost your workflow:

❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.

Add `Canonical::ancestors`, returning a `CanonicalAncestors` iterator that walks the canonical ancestors of a set of seed transactions. - Ancestors are yielded in reverse topological order (descendants before the ancestors they spend) and each transaction is visited exactly once. - Accumulation is a fold over the ancestor DAG: `map` computes each tx's own contribution from its `CanonicalTx`, and a tx's final accumulator is its contribution `Merge`d with the accumulators of every in-set descendant. - `should_walk` prunes the walk per-tx, with the chain position available so callers can decide cutoffs (e.g. stop at confirmed). Seeds are given as txids (looked up in the canonical set) and seed the accumulation without being yielded. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

Add a variant of `Canonical::ancestors` that also yields the seed transactions (each with its own final accumulator), not just their ancestors. Pruning, deduplication and the reverse-topological order are unchanged; the seeds are emitted alongside the walked ancestors. Useful when the per-seed contribution must be folded uniformly with the ancestry (e.g. a seed that taints itself), avoiding a separate pass. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

Replace `balance`'s `trust_predicate` and `min_confirmations` parameters with two closures: * `does_taint(CanonicalTx) -> bool` decides whether a transaction taints its descendants. A pending output is `untrusted_pending` if it, or any of its unsettled ancestors, taints; otherwise `trusted_pending`. The unsettled ancestry of each pending output is walked via `ancestors_inclusive` (deduped across outputs, stopping at settled transactions). This lets callers demote unconfirmed coins received from (or chained on top of) a third party. * `is_settled(&ChainPosition) -> bool` decides the confirmed/pending boundary, generalizing the old numeric `min_confirmations` (e.g. it can treat shallowly-confirmed outputs as pending and taintable). `does_taint` replaces the per-spk `trust_predicate`: trust is now derived from ancestry rather than a per-script heuristic. Call sites and tests are updated; the old per-keychain trust test is recomputed for the new model. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

The balance "confirmed" bucket is now driven by the caller-supplied `is_settled` predicate (transactions we are confident will not be replaced), not strictly by confirmation status. Rename the field to match the concept and update `Display`, `Add`, `total`, `trusted_spendable` and all call sites. This is a breaking change: the serde key changes from "confirmed" to "settled" (no alias). Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

Add `CanonicalView::classify_outpoints`, which pairs each unspent output with a chain-level `Eligibility` (`Settled` / `Immature` / `TrustedPending` / `UntrustedPending`) computed from `is_settled` and the `does_taint` ancestry walk. This is the primitive for coin selection / coin control: the caller decides how to aggregate, and can layer wallet-specific categories (e.g. "locked") on top — rather than being constrained to the fixed `Balance` buckets. `CanonicalView::balance` is re-expressed as a thin fold over `classify_outpoints`, adding each output's value to the `Balance` bucket that matches its `Eligibility`. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

… walk Close review-identified coverage gaps: - `test_balance_taint_stops_at_settled_ancestor`: a settled ancestor that `does_taint` would flag must not taint its unsettled descendant — isolates the `!is_settled` guard in the taint walk. - `test_classify_immature_and_settled`: `classify_outpoints` returns `Immature` for an immature coinbase and `Settled` for a mature output. - `ancestors_inclusive_yields_seeds`: the inclusive walk yields the seeds (with their own accumulators) and `len()` accounts for them. - `ancestors_multiple_seeds_dedup_shared_ancestor`: an ancestor shared by two seeds is visited once. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

evanlinjin · 2026-06-27T23:56:04Z

@oleonardolima It will be ideal if this can reuse logic from #2219.

Dmenec

Hey @evanlinjin thank you for pushing this.

As there are many things involved, I tried to go point-by-point on some thoughts I have in here.

Related to your questions:

The longer-term idea is that classify_outpoints is the real API and Balance is just one possible fold — wallets that want categories like "locked"/"reserved" aggregate themselves. This PR keeps Balance (as a fold) for now; a follow-up could move it down into bdk_wallet and drop it from bdk_chain.

Agree, classify_outpoints + Eligibility is a good primitive, Balance is just a convenience fold on top so it makes sense to me. I think moving Balance out of bdk_chain would be a breaking change for consumers calling it directly, so maybe the migration needs care.

Is replacing trust_predicate with ancestry-based does_taint the right call, or should both coexist? (does_taint can't express per-output trust, but per-tx ancestry is arguably the more principled model.)

As said in one comment below, trust_predicate had this identifier that provided specific assumptions per-output. I don't think that's a problem though, that policy was always wallet-specific. It makes more sense for bdk_wallet to handle it externally.

bdk_wallet will need its balance call updated (out of this repo's tree).

This is the part I'm least sure about, so I'll just lay out what I found and leave it open.

While looking into the walk I tried a different approach: a per-UTXO memoized DFS (one bool cached per tx, shared across UTXOs so each ancestor is visited once) instead of the topological accumulator pass. Same does_taint/is_settled contract, and I checked it gives the same Eligibility as classify_outpoints on a diamond. Benchmarked both across a few topologies:

topology	size	memoized	classify_outpoints	speedup
fan-in	10w×3d	9.0 µs	21.9 µs	2.4x
fan-in	50w×5d	77.7 µs	192.9 µs	2.5x
fan-in	100w×10d	300 µs	878 µs	2.9x
fan-in	500w×20d	3.3 ms	14.5 ms	4.4x
disjoint	10w×3d	9.9 µs	24.4 µs	2.5x
disjoint	50w×5d	83.9 µs	194.6 µs	2.3x
disjoint	100w×10d	311 µs	842 µs	2.7x
disjoint	500w×20d	3.3 ms	14.0 ms	4.2x
untrusted-fan-in	10w×3d	8.1 µs	24.2 µs	3.0x
untrusted-fan-in	50w×5d	73.2 µs	209.7 µs	2.9x
untrusted-fan-in	100w×10d	295 µs	940 µs	3.2x
untrusted-fan-in	500w×20d	3.2 ms	16.3 ms	5.1x
diamond	5s×20u	13.5 µs	39.2 µs	2.9x
diamond	10s×50u	49.2 µs	169.1 µs	3.4x
diamond	20s×100u	158 µs	627 µs	4.0x
diamond	50s×500u	1.8 ms	8.3 ms	4.6x

(w = number of parallel chains, d = depth of each chain; for diamond s = shared ancestor txs, u = UTXOs spending all of them.)

Both walk the same O(U+S) txs but the BTreeSet accumulator grows with the number of downstream UTXOs, while the memoized walk just stores one bool per tx and stops early once does_taint fires.

What I genuinely can't decide is where this belongs. It doesn't use CanonicalAncestors at all, just the public tx/txout, so it feels a bit out of place sitting next to classify_outpoints in bdk_chain, not really taking advantage of the ancestor-walk machinery this PR adds. It could just as easily live in bdk_wallet. But then direct bdk_chain consumers of balance() wouldn't benefit.

I'm not sure if it makes sense here, wallet-side, or not at all, some thoughts?

Code + bench: https://github.com/Dmenec/bdk/tree/bench/trust-classification

Dmenec · 2026-06-30T21:34:18Z

-    /// Confirmed and immediately spendable balance
-    pub confirmed: Amount,
+    /// Settled balance: outputs whose transaction we are confident will not be replaced (e.g.
+    /// confirmed deeply enough), as determined by the balance query's `is_settled` predicate.
+    pub settled: Amount,


Agree, settled better reflects that the threshold is a wallet policy decision.

Dmenec · 2026-06-30T22:02:15Z

+    /// Pending (not settled), and neither it nor any of its unsettled ancestors taints it.
+    TrustedPending,
+    /// Pending (not settled), and it or one of its unsettled ancestors taints it.
+    UntrustedPending,


I like the taint concept more than saying some previous transaction is untrusted, what's actually untrusted is the pending UTXO, not the ancestor.

Dmenec · 2026-06-30T22:10:15Z

+    pub fn classify_outpoints(
+        &self,
+        outpoints: impl IntoIterator<Item = OutPoint>,
+        mut does_taint: impl FnMut(CanonicalTx<ChainPosition<A>>) -> bool,


trust_predicate let callers express per-output trust by keychain. does_taint only sees the ancestor transaction, which does not allow for a "blind trust assumption" (i.e. this output belongs to my keychain so trust it).

Not sure if this policy now has to be layered on top of the Eligibility results.

Dmenec · 2026-06-30T22:19:13Z

+            for (descendants, c_tx) in self.ancestors_inclusive::<BTreeSet<Txid>, _, _>(
+                seeds.iter().copied(),
+                |c_tx| core::iter::once(c_tx.txid).collect(),
+                |c_tx| !is_settled(&c_tx.pos),
+            ) {


does_taint is called eagerly for every unsettled ancestor here and before the returned iterator is built. So if a caller provides an expensive closure, we'd have to evaluate it against every walked ancestor upfront, even if we only end up needing a small part of the UTXOs.

Dmenec · 2026-06-30T22:24:56Z

+/// Builds a diamond of confirmed transactions and returns the chain, graph and the four txs.
+///
+/// ```text
+///        tx_a            (2 outputs)
+///       /    \
+///    tx_b    tx_c
+///       \    /
+///        tx_d            (root we walk ancestors from)
+/// ```


Maybe checking with different topolgies would be useful to cover.

For instance: Disjoint seeds with no shared ancestry or a seed that is itself an ancestor of another seed.

github-project-automation Bot added this to BDK Chain Jun 23, 2026

evanlinjin force-pushed the feat/canonical-ancestors branch 2 times, most recently from 1fb7787 to 0a9fa74 Compare June 24, 2026 03:11

evanlinjin and others added 4 commits June 24, 2026 03:27

evanlinjin force-pushed the feat/canonical-ancestors branch from 0a9fa74 to 7737e30 Compare June 24, 2026 03:27

evanlinjin mentioned this pull request Jun 24, 2026

refactor(chain)!: remove min_confirmations parameter from CanonicalView::balance #2221

Closed

2 tasks

evanlinjin force-pushed the feat/canonical-ancestors branch from a286ab4 to ab0c5a2 Compare June 24, 2026 05:01

evanlinjin force-pushed the feat/canonical-ancestors branch from c9c6f24 to 71c35d9 Compare June 25, 2026 11:37

Dmenec reviewed Jun 30, 2026

View reviewed changes

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

feat(chain)!: taint-aware `CanonicalView::balance` via canonical ancestor walk#2235

feat(chain)!: taint-aware `CanonicalView::balance` via canonical ancestor walk#2235
evanlinjin wants to merge 6 commits into
bitcoindevkit:masterfrom
evanlinjin:feat/canonical-ancestors

evanlinjin commented Jun 23, 2026 •

edited

Loading

Uh oh!

codecov Bot commented Jun 23, 2026 •

edited

Loading

Uh oh!

evanlinjin commented Jun 27, 2026

Uh oh!

Dmenec left a comment

Uh oh!

Dmenec Jun 30, 2026

Uh oh!

Dmenec Jun 30, 2026

Uh oh!

Dmenec Jun 30, 2026

Uh oh!

Dmenec Jun 30, 2026

Uh oh!

Dmenec Jun 30, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

Uh oh!

Conversation

evanlinjin commented Jun 23, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Description

What's here (commit by commit)

Direction / open questions for human review

Notes to reviewers

Changelog notice

Changed

Added

Uh oh!

codecov Bot commented Jun 23, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Codecov Report

Uh oh!

evanlinjin commented Jun 27, 2026

Uh oh!

Dmenec left a comment

Choose a reason for hiding this comment

Uh oh!

Dmenec Jun 30, 2026

Choose a reason for hiding this comment

Uh oh!

Dmenec Jun 30, 2026

Choose a reason for hiding this comment

Uh oh!

Dmenec Jun 30, 2026

Choose a reason for hiding this comment

Uh oh!

Dmenec Jun 30, 2026

Choose a reason for hiding this comment

Uh oh!

Dmenec Jun 30, 2026

Choose a reason for hiding this comment

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

evanlinjin commented Jun 23, 2026 •

edited

Loading

codecov Bot commented Jun 23, 2026 •

edited

Loading