Cache computed target build graphs across repeated builds #1124

Open
jverkoey wants to merge 7 commits into swiftlang:main from ClutchEngineering:clutch/target-build-graph-cache

Conversation

@jverkoey
Contributor

Problem

TargetBuildGraph.init calls computeGraph() on every build, even when nothing in
the workspace has changed. For large workspaces this is expensive — in our monorepo
(2100+ targets), Instruments traces showed computeGraph() taking 7–37 seconds per
call, with 83% of the time spent in the recursive addDependencies traversal.

Xcode also issues multiple graph requests per build action with different
parameters (dependency graph vs actual build, index preparation vs normal build),
compounding the cost. In our workspace each no-change build was spending ~42 seconds
in computeGraph() across all requests.

Solution

Add a process-level multi-entry cache (TargetBuildGraphCache) that stores computed
graph topologies keyed by a signature of:

  • Normalized PIF workspace signature (stripping volatile subobject GUIDs)
  • Build parameters and per-target parameter overrides
  • Topology-affecting flags (useImplicitDependencies, useParallelTargets,
    skipDependencies, dependencyScope, buildCommand)
  • Graph purpose (.build vs .dependencyGraph)

The cache is static (process-level) because WorkspaceContext is recreated on every
PIF transfer, even when nothing has changed. Multiple entries (up to 8) are stored so
that Xcode's different request types don't evict each other.

On cache hit, unapproved target diagnostics are re-emitted to preserve correctness.
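
For illustration, here is a minimal sketch of how a signature over those inputs might be computed (all names are hypothetical, not the PR's actual API):

```swift
/// Hypothetical sketch: fold every topology-affecting input into a single
/// Int signature using Swift's Hasher. Field names are illustrative.
enum GraphPurpose: Hashable { case build, dependencyGraph }

func computeGraphSignature(
    workspaceSignature: String,      // normalized PIF workspace signature
    buildParametersHash: Int,        // build params + per-target overrides
    useImplicitDependencies: Bool,
    useParallelTargets: Bool,
    skipDependencies: Bool,
    purpose: GraphPurpose            // .build vs .dependencyGraph
) -> Int {
    var hasher = Hasher()
    hasher.combine(workspaceSignature)
    hasher.combine(buildParametersHash)
    hasher.combine(useImplicitDependencies)
    hasher.combine(useParallelTargets)
    hasher.combine(skipDependencies)
    hasher.combine(purpose)
    return hasher.finalize()
}
```

Note that Swift's Hasher is randomly seeded per process launch; that is fine for an in-memory, process-level cache like this one, but it would rule out persisting signatures to disk.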

Measurements

Tested on a 2100+ target monorepo (5159 configured targets in the full graph).
Each no-change build triggers two graph requests:

| Request type       | Targets | Without cache | With cache     |
| ------------------ | ------- | ------------- | -------------- |
| Dependency graph   | 1,654   | 6.4–7.6s      | 0s (cache hit) |
| Build / index prep | 5,159   | 34.6–37.5s    | 0s (cache hit) |
| Total per build    |         | ~42s          | ~0s            |

The first build after launch populates the cache (cold miss); every subsequent
no-change build hits. The cache is automatically invalidated when the PIF workspace
signature changes (i.e. when the project structure actually changes).

Collaborator

@owenv owenv left a comment


We may eventually want to fold more of this into the build description itself, but I think an in-memory cache of the TargetBuildGraph makes sense. However, I think some more preparatory refactoring might be needed to help ensure the caching is correct; I left more detailed comments below. I think we'll also need to add some tests for this functionality.

Comment thread Sources/SWBCore/TargetBuildGraphCache.swift Outdated
/// index preparation vs normal build). A multi-entry cache ensures
/// these don't evict each other.
///
/// The cache is static (process-level) because `WorkspaceContext` is
Collaborator


All caches should be at least Session-scoped because in the context of an IDE/BSP/etc. the build service may regularly open and close unrelated sessions. Instead of working around the WorkspaceContext recreation, we should look at being less aggressive about tearing it down unnecessarily if we're going to do more caching at this layer.

Contributor Author


Agreed. The cached Target objects use reference identity, so serving them across sessions with different WorkspaceContexts would be incorrect. In practice cross-session signature collisions are unlikely (PIF signatures include all target GUIDs and project paths), but it's not safe in general.

Deferred for now since it requires threading a cache instance through the call chain (PlanningOperation, DependencyGraphMessages, CleanOperation, etc.). Session seems like the right owner since it already manages BuildDescriptionManager. Open to guidance on where you'd want this to live.

Collaborator


I'm more concerned about ensuring cached dependency graphs don't persist in memory when a session is closed but the process is continuing to run builds for other sessions.

public enum TargetBuildGraphCache {
/// The data we cache — everything needed by the
/// `TargetBuildGraph` memberwise init except the live context
/// objects (workspaceContext, buildRequest, buildRequestContext).
Collaborator


This comment is incorrectly referring to the build request as context; we should consider it a primary input to the computation.

Contributor Author


Agreed, the resolver reads inputs through context objects that aren't reflected in the signature. Deferred since the fix requires refactoring the resolver to declare explicit inputs. Low practical risk within a single Xcode session but not correct in general.

Comment thread Sources/SWBCore/TargetBuildGraphCache.swift
case .dependencyGraph:
hasher.combine("depgraph")
}

Collaborator


This cache key computation is missing at least a few important components, like the signature of xcconfig files loaded while computing settings in the dependency resolver, user preferences, etc.

I think I'd recommend first refactoring the dependency resolver to eliminate direct access to the buildRequestContext and workspaceContext in favor of finer-grained inputs. This should make it easier to audit the inputs they're indirectly introducing today and ensure the cache key is complete.
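
A rough illustration of what those finer-grained inputs could look like (entirely hypothetical; the real field list would come from auditing what the resolver actually reads):

```swift
/// Hypothetical sketch: an explicit input bundle for target dependency
/// resolution. Because every field is Hashable, the complete set can be
/// folded into the cache key, unlike reads that reach through
/// buildRequestContext/workspaceContext indirectly.
struct DependencyResolutionInputs: Hashable {
    var workspaceSignature: String        // PIF content signature
    var xcconfigSignatures: [String]      // signatures of loaded xcconfig files
    var userPreferencesSignature: Int     // relevant user preferences
    var buildParametersSignature: Int     // request parameters + overrides
}
```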

Contributor Author


Agreed, the resolver reads inputs through context objects that aren't reflected in the signature. Deferred since the fix requires refactoring the resolver to declare explicit inputs. Low practical risk within a single Xcode session but not correct in general.

@jverkoey
Contributor Author

Thank you for the feedback! While I work through it, I've shared #1111 (comment), which includes a few Instruments runs that might be of use for understanding the dragon I'm trying to slay here.

Xcode issues multiple TargetBuildGraph requests per build action with
different parameters (dependency graph vs build, index prep vs normal).
For large workspaces (2000+ targets), computeGraph() takes 7-37s per
call. Without caching, this cost is paid on every build even when
nothing has changed.

Add a process-level multi-entry cache keyed by a signature of the
normalized PIF workspace signature, build parameters, and request
flags. Each distinct request type gets its own cache slot (up to 8
entries). The cache is static because WorkspaceContext is recreated
on every PIF transfer even when nothing has changed.

On cache hit, unapproved target diagnostics are re-emitted to
preserve correctness.
- Rename CachedTopology → CachedDependencyGraph
- Remove leading underscores from private properties
- Replace full-reset eviction with LRU (track lastAccess per entry)
- Skip caching for prepareForIndexing (low-QoS, large, rarely reused)
- Cache all diagnostics during resolution via DiagnosticCollectingDelegate,
  re-emit on cache hit. Don't cache graphs where errors were emitted.
- Make DependencyScope and Purpose Hashable, replacing inline switches
- Don't sort build targets in signature (preserve input order)
- Don't normalize workspace signature (use as-is for reference safety)
- Convert caching init to static factory: TargetBuildGraph.cached(...)
@jverkoey jverkoey force-pushed the clutch/target-build-graph-cache branch from 6245fe9 to 59f080e on February 26, 2026 at 04:21
@jverkoey
Contributor Author

All feedback addressed in the latest force-push. Summary:

Implemented:

  • LRU eviction (tracks lastAccess per entry)
  • QoS-aware caching (skip prepareForIndexing for both lookup and store)
  • Preserve build target order (removed .sorted(by:))
  • DependencyScope and Purpose conform to Hashable
  • DiagnosticCollectingDelegate collects all diagnostics during resolution, re-emits on cache hit, skips caching graphs with errors
  • Use workspaceSignature as-is (no normalization)
  • Renamed CachedTopology → CachedDependencyGraph
  • Removed leading underscores (entries, accessCounter)
  • Caching init → TargetBuildGraph.cached(...) static factory
  • Copyright year updated to 2026 on new files
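
For reference, a minimal sketch of the LRU bookkeeping described above (locking and entry layout are illustrative, not the exact implementation):

```swift
import Foundation

/// Illustrative sketch: a fixed-capacity cache where each entry carries a
/// monotonically increasing lastAccess stamp; the stalest entry is evicted
/// when the cache is full. Names are hypothetical.
final class LRUGraphCache<Graph> {
    private struct Entry {
        let graph: Graph
        var lastAccess: UInt64
    }

    private let maxEntries = 8
    private var entries: [Int: Entry] = [:]
    private var accessCounter: UInt64 = 0
    private let lock = NSLock()

    func lookup(signature: Int) -> Graph? {
        lock.lock(); defer { lock.unlock() }
        guard var entry = entries[signature] else { return nil }
        accessCounter += 1
        entry.lastAccess = accessCounter
        entries[signature] = entry          // write back the refreshed stamp
        return entry.graph
    }

    func store(signature: Int, graph: Graph) {
        lock.lock(); defer { lock.unlock() }
        if entries[signature] == nil, entries.count >= maxEntries,
           let stalest = entries.min(by: { $0.value.lastAccess < $1.value.lastAccess }) {
            entries.removeValue(forKey: stalest.key)   // evict least recently used
        }
        accessCounter += 1
        entries[signature] = Entry(graph: graph, lastAccess: accessCounter)
    }
}
```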

Remaining work (deferred):

  1. Session-scoped cache: The cache is still process-wide (static). Scoping to Session requires threading a cache instance through the call chain. Session seems like the right owner since it already manages BuildDescriptionManager. Open to guidance on the preferred approach.
  2. Complete cache key: The signature doesn't capture xcconfig file contents or user preferences accessed indirectly through context objects. Fixing this requires refactoring the resolver to declare explicit inputs rather than reaching into buildRequestContext/workspaceContext. Low practical risk within a single session, but not correct in general.

Happy to tackle either of these in a follow-up or in this PR if you'd prefer.

@jverkoey
Contributor Author

jverkoey commented Feb 26, 2026

Swift Build performance metrics comparisons between my mainline build and this PR:

[Screenshot: Swift Build performance metrics comparison, mainline vs. this PR]

Note that the number of builds is relatively small because I just got this infra up and running. Hoping to have a bigger dataset over time but I can confirm on local builds that I'm getting significantly faster builds with this change.

@jverkoey jverkoey marked this pull request as draft February 26, 2026 22:33
@jverkoey
Contributor Author

Moving this to a draft state because I discovered a critical bug in this implementation where changing a file doesn't result in a new build being kicked off.

owenv and others added 2 commits February 26, 2026 19:28
The cached dependency graph stores live ConfiguredTarget objects whose
Target references use reference identity (ObjectIdentifier) for
hash/equality. When the PIF is re-transferred, IncrementalPIFLoader
may create new Target objects even when the content is unchanged. The
workspace signature (content-based) stays the same, causing a cache
hit that serves stale Target references — downstream dictionary lookups
using these references silently fail, skipping recompilation of changed
source files.

Fix: include ObjectIdentifier(workspaceContext.workspace) in the cache
signature. Same Workspace instance (reused within a session) produces
a cache hit. New Workspace instance (PIF re-transferred) produces a
cache miss, which is correct because Target references are tied to the
Workspace instance lifetime.
@jverkoey
Contributor Author

Pushed a fix for a cache invalidation bug I discovered during testing: 91afdfa

Bug: The cached dependency graph stores live ConfiguredTarget objects whose Target references use reference identity (ObjectIdentifier) for hash(into:) / ==. When the PIF is re-transferred, IncrementalPIFLoader creates new Target objects even when the content is unchanged. The workspace signature (content-based) stays the same → cache hit → stale Target references → dictionary lookups in targetDependencies and provisioningInputs silently fail → build skips recompilation of changed source files.

Fix: Include ObjectIdentifier(workspaceContext.workspace) in the cache signature. Same Workspace instance (reused by the incremental loader within a session) → cache hit. New Workspace instance (PIF re-transferred) → cache miss. This is correct because Target references are tied to the Workspace instance lifetime.

This also partially addresses the session-scoping feedback — within a single session the Workspace identity naturally partitions the cache, so cross-session stale hits cannot occur as long as different sessions use different Workspace instances.
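
The fix boils down to one extra component in the signature, sketched here with hypothetical names:

```swift
/// Sketch: fold the Workspace *instance* identity into the full cache
/// signature. A re-transferred PIF allocates a new Workspace (and possibly
/// new Target objects), so it can never hit a cached graph that holds
/// Target references tied to the old instance.
func computeFullSignature(contentSignature: Int, workspace: AnyObject) -> Int {
    var hasher = Hasher()
    hasher.combine(contentSignature)              // PIF content + params + flags
    hasher.combine(ObjectIdentifier(workspace))   // identity of the live Workspace
    return hasher.finalize()
}
```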

@jverkoey jverkoey marked this pull request as ready for review February 27, 2026 00:29
When Xcode re-transfers the PIF after source-only changes, the workspace
object identity changes but the graph structure is identical. Instead of
recomputing the full dependency graph (~9.5s for large projects), detect
content-signature matches and remap all Target references to point at the
new workspace's objects.

The remap bails out if any cached Target GUID is missing from the new
workspace, falling through to a full recompute for structural changes.
@jverkoey
Contributor Author

New commit: Remap cached target graph on PIF re-transfer

I've pushed a follow-up commit that addresses the performance cost of the workspace identity fix (91afdfa).

Problem

The ObjectIdentifier(workspace) inclusion in the cache signature is correct — it prevents serving stale Target references after PIF re-transfer. However, it's heavy-handed: every PIF re-transfer forces a full graph recompute (~9.5s for our 2100-target workspace), even when only source files changed and the graph structure is identical. In iterative development, Xcode re-transfers the PIF on every source change, so this cost was paid on every build.

Solution

When the full signature (with workspace identity) misses, check if any cached entry has the same content signature (everything except workspace identity). If found, remap all Target references in the cached graph to point at the new workspace's Target objects via workspace.target(for: guid), then store the remapped graph under the new full signature.

New methods on TargetBuildGraphCache:

  • computeContentSignature() — same as computeSignature() but omits workspaceIdentity
  • lookupByContentSignature() — scans cached entries for content match
  • remapGraph(_:to:) — replaces all Target references; returns nil if any GUID is missing (structural change)

The remap path in TargetBuildGraph.cached() sits between the exact-match hit and the full computation miss:

1. Full signature hit → return cached (unchanged, <1ms)
2. Content signature match + remap succeeds → return remapped (~100-150ms)
3. Both miss → full recompute (~9.5s)
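
In sketch form (stub types and helpers stand in for the real SWBCore API; timings are the PR's own measurements on its 2100-target workspace):

```swift
// Stubs so the sketch is self-contained; the real types live in SWBCore.
final class Workspace {}
struct DependencyGraph {}
enum Cache {
    static func lookup(signature: Int) -> DependencyGraph? { nil }
    static func lookupByContentSignature(_ signature: Int) -> DependencyGraph? { nil }
    static func store(signature: Int, graph: DependencyGraph) {}
}
func remapGraph(_ graph: DependencyGraph, to workspace: Workspace) -> DependencyGraph? { nil }

/// Sketch of the three-tier lookup described above.
func cachedGraph(
    fullSignature: Int,
    contentSignature: Int,
    workspace: Workspace,
    compute: () -> DependencyGraph
) -> DependencyGraph {
    // 1. Exact hit: same content and same Workspace instance (<1ms).
    if let hit = Cache.lookup(signature: fullSignature) {
        return hit
    }
    // 2. Content match from an older Workspace instance: remap Target
    //    references onto the new workspace's objects (~100-150ms).
    if let stale = Cache.lookupByContentSignature(contentSignature),
       let remapped = remapGraph(stale, to: workspace) {
        Cache.store(signature: fullSignature, graph: remapped)
        return remapped
    }
    // 3. Both miss: full recompute (~9.5s here), then populate the cache.
    let graph = compute()
    Cache.store(signature: fullSignature, graph: graph)
    return graph
}
```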

Safety

remapGraph bails out (returns nil) if any cached Target GUID is not found in the new workspace. This means structural PIF changes (targets added, removed, or renamed) always fall through to a full rebuild. The remap only succeeds when the graph is structurally identical and only the object references differ.

Every Target reference in the remapped graph comes from workspace.target(for: guid) on the current workspace object, so the stale reference problem from before cannot recur.
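
A simplified sketch of that bail-out (placeholder types; the real remap also rebuilds ConfiguredTarget GUIDs via replacingTarget()):

```swift
// Placeholder model types so the sketch compiles on its own.
final class Target {
    let guid: String
    init(guid: String) { self.guid = guid }
}
final class Workspace {
    private let targetsByGUID: [String: Target]
    init(targets: [Target]) {
        targetsByGUID = Dictionary(uniqueKeysWithValues: targets.map { ($0.guid, $0) })
    }
    func target(for guid: String) -> Target? { targetsByGUID[guid] }
}

/// Sketch: remap every cached Target reference onto the new workspace's
/// objects, returning nil the moment any GUID is missing; that means the
/// PIF structure changed and a full recompute is required.
func remapTargets(_ cached: [Target], to workspace: Workspace) -> [Target]? {
    var remapped: [Target] = []
    remapped.reserveCapacity(cached.count)
    for old in cached {
        guard let fresh = workspace.target(for: old.guid) else {
            return nil   // structural change: bail out to a full rebuild
        }
        remapped.append(fresh)
    }
    return remapped
}
```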

Confirmed with Xcode iterative builds

Tested with 3 consecutive builds in Xcode where only a single source file was changed between each build. Instruments Time Profiler traces confirm:

  • TargetBuildGraph.cached() calls remapGraph() on each build (content signature match)
  • remapGraph takes ~136ms (vs 9.5s for a full recompute)
  • The bulk of the remap time is ConfiguredTarget.computeGuid during replacingTarget()
  • Zero full dependency resolution (TargetDependencyResolver.computeGraph) appears in the trace
  • Builds compile correctly with no missing dependencies

Tests

Added TargetBuildGraphCacheTests.swift with 7 tests:

  • Existing signature behavior (subobjects differ, identical match, different purposes)
  • Content signature matches across workspace objects
  • Remap produces valid graph with fresh Target references
  • Remap fails when a target is removed (returns nil → full rebuild)
  • Anti-regression: dictionary lookups with new-workspace ConfiguredTargets succeed after remap

Collaborator

@owenv owenv left a comment


I left a few additional comments, but at a high level I think we probably need to take a different approach to introducing caching of the dependency graph than what is currently in this PR. The current soundness holes will lead to incorrect incremental builds in real-world projects, for example, when modifying OTHER_LDFLAGS in an xcconfig to introduce a new target dependency.

I think my suggested approach would be something like:

  1. Refactor PIF loading to reduce unnecessary invalidation of model objects. This eliminates the need for fragile remapping of cached graphs in addition to potentially speeding up null builds a bit.
  2. Extract the workspaceContext from target dependency resolution. This is largely only used for access to the project model objects, user preferences, and the SDKRegistry. The first two can be modeled as explicit inputs to the computation, and the third can largely be treated as immutable for the lifetime of the service process.
  3. Extract the buildRequestContext from target dependency resolution. This is largely used for settings computation, which might introduce additional inputs to target dependency resolution, especially of implicit dependencies. Currently, changes to dependencies triggered by settings changes won't always correctly invalidate the cached graph.
  4. Eliminate feature flags which impact the dependency graph or move to modeling them as explicit inputs.
  5. Introduce a simpler in-memory cache tied to the lifetime of the session and cleared upon cleaning which explicitly tracks all of the inputs and does not attempt to remap stale project model objects.
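
Point 5 might look something like this (a hypothetical sketch of a session-owned cache keyed purely by explicit inputs):

```swift
import Foundation

/// Hypothetical sketch: a deliberately simple cache owned by the session,
/// keyed only by explicit Hashable inputs, holding no stale model objects,
/// and emptied on clean builds and session teardown.
final class SessionGraphCache<Inputs: Hashable, Graph> {
    private var storage: [Inputs: Graph] = [:]
    private let lock = NSLock()

    /// Returns the cached graph for these inputs, computing and storing it
    /// on a miss. (The lock is held across compute for simplicity only.)
    func graph(for inputs: Inputs, compute: () throws -> Graph) rethrows -> Graph {
        lock.lock(); defer { lock.unlock() }
        if let cached = storage[inputs] { return cached }
        let graph = try compute()
        storage[inputs] = graph
        return graph
    }

    /// Called when the session cleans or closes, so cached graphs never
    /// outlive the session (the memory-lifetime concern raised above).
    func invalidate() {
        lock.lock(); defer { lock.unlock() }
        storage.removeAll()
    }
}
```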

Collaborator


This seems to have been accidentally included in the history


/// produces large dependency graphs that are rarely reused —
/// the memory overhead of caching them is not worth it.
static func shouldSkipCache(buildCommand: BuildCommand) -> Bool {
buildCommand.isPrepareForIndexing
Collaborator


My earlier comment was suggesting looking at the QoS property itself rather than hardcoding in a special case for indexing.

}

/// Store a computed dependency graph for the given signature.
static func store(signature: Int, graph: CachedDependencyGraph) {
Collaborator


I'd prefer we use existing cache types instead of reimplementing this from scratch, a lot of the surrounding code is redundant

/// The data we cache — everything needed by the
/// `TargetBuildGraph` memberwise init except the live context
/// objects (workspaceContext, buildRequest, buildRequestContext).
package struct CachedDependencyGraph: @unchecked Sendable {
Collaborator


Why is this @unchecked?

/// Returns nil if any cached Target GUID is not found in the new
/// workspace, which means the PIF structure changed and a full
/// rebuild is needed.
package static func remapGraph(
Collaborator


This remapping feels very fragile and error-prone. I think the underlying issue is best addressed by the PIF transfer infrastructure rather than attempting to work around it here.


// Build command affects the early-return for
// assembly/preprocessor. BuildCommand has associated values
// that prevent auto-Hashable, so we hash only the
Collaborator


This comment is incorrect

@@ -0,0 +1,51 @@
//===----------------------------------------------------------------------===//
Collaborator


This file needs to be added to CMakeLists.txt.

// FIXME: Report cycles via the delegate.
//
/// Construct a new graph for the given build request.
/// Construct a new dependency graph with caching.
Collaborator


A lot of these comments are just repeating information from comments elsewhere

@@ -0,0 +1,722 @@
//===----------------------------------------------------------------------===//
Collaborator


These tests currently fail to compile, and only cover the signature computation itself. I think this change needs more comprehensive testing of incremental builds which trigger hits/misses in the cache and verify rebuilds (or not).
