Split PowdrConfig into per-stage GenerateConfig + SelectConfig#3763
Open
georgwiese wants to merge 52 commits into
Open
Split PowdrConfig into per-stage GenerateConfig + SelectConfig#3763georgwiese wants to merge 52 commits into
georgwiese wants to merge 52 commits into
Conversation
`PgoAdapter` gains `generate_apcs` (build + rank); the trim becomes a free `select_apcs(ranked, n, skip)`. `filter_blocks_and_create_apcs_with_pgo` stays as a default-impl one-shot for SP1's call site (auto-fills `apc_candidates = autoprecompiles + skip` for instruction/none; Cell overrides to skip the auto-fill). CLI gains a `generate-apcs` subcommand that caches under `generate/<h>`. `--pgo` and `--max-columns` move to `GenerateApcsArgs` (they shape the ranking, not the trim); `--apc-candidates <N>` caps the build size. `select-apcs` chains through `generate-apcs` and auto-fills `--apc-candidates = --autoprecompiles + --skip` for instruction/none so the common case stays ergonomic. Headline win: sweeping `--autoprecompiles` under `--pgo cell` against `--artifacts-dir` reuses a single generate-cache entry across iterations. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Cell PGO now always builds every eligible candidate. `--apc-candidates` is ignored with a warning (in both standalone and fused contexts) — the dynamic density ranking needs the full post-opt cost of every candidate. The pre-build heuristic that previously kicked in when the user set `--apc-candidates` is gone. For instruction/none in a fused pipeline (`select-apcs` and beyond), unset `--apc-candidates` now defaults to `--autoprecompiles` (was `--autoprecompiles + --skip`). Validation: error if `--autoprecompiles + --skip > --apc-candidates`. Standalone `generate-apcs <guest>` keeps "unset = build all" for instruction/none — there's no fused selection to derive a default from. The CLI auto-fill is applied before computing each stage's cache hash, so the resolved value lives in the cache key. Sweeping `--autoprecompiles` under `--pgo cell` still shares a single `generate/<h>` entry regardless of whether the user also passes `--apc-candidates`. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Too noisy for the main README — the basic examples cover the common cases. Sweep workflows can be a doc/follow-up if needed. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Its only caller was openvm-riscv::compile_exe; inline generate_apcs + select_apcs there directly. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
When `--autoprecompiles 0` is selected downstream, the CLI sets `apc_candidates = Some(0)` for Cell so the library short-circuits without building anything. Positive caps are still ignored with a warning (Cell's density ranking needs every candidate). Restores the early-exit that the previous fused `create_apcs_with_pgo` had for `autoprecompiles == 0`, expressed through `apc_candidates` since the build stage no longer sees the selection count directly. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
The trim to (skip, autoprecompiles) now lives in `select_apcs`, so the greedy loop always runs to the column budget (or until candidates are exhausted). `max_selected` was always `usize::MAX` from the only caller. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
The trait one-shot is no longer used inside powdr — openvm composes detect_blocks + generate_apcs + select_apcs directly. The internal helper run_filter_generate_select goes away with it. The SP1 fork at `powdr-labs/v6.0.2-apc-base` still calls this method. That fork needs an equivalent update (compose the three steps directly, or wrap them in a fork-local helper) before `cargo clippy --workspace` can pass. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Each guest section runs `run_bench` with multiple `apcs` values (apc000, apc003, apc010, apc030, …) against the same guest source and profile input. With cell PGO (the default), the generate stage doesn't depend on `--autoprecompiles`, so adding a shared `--artifacts-dir` per (guest, profile-input) lets the second-and-later sweeps cache-hit on the expensive build step. Only the cheap select+setup re-run each time. `--apc-candidates-dir` has to be shared too because it's part of the generate-stage cache key (it's a per-stage arg, not a global). Set it per (guest, profile-input) under the same cache root. `apc_candidates.json` is now in the shared candidates dir; per-run effectiveness plots read from there. Per-block `apc_candidate_*` files are still cleaned up after each run, but they only get written on the first cache-miss anyway — subsequent cache-hits don't re-create them. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
succinctlabs/sp1#2805 follows #3752's PgoAdapter split. Tracking that branch unblocks `cargo clippy --workspace`. Once SP1 merges #2805 into `powdr-labs/v6.0.2-apc-base`, this pin can revert. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Document the rule in code, restore the `--autoprecompiles + --skip` default in the help text and README (was stale after the function simplification), and drop the validation-error mention since the new function silently truncates instead. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Move the generic on-disk cache helpers into `powdr-autoprecompiles::staged_cache` and the per-stage runner into `powdr-openvm-riscv::pipeline` as `StagedPipeline`. cli-openvm-riscv's `Pipeline` collapses to a thin clap→library shim. `StagedPipeline` accepts caller-provided cache keys (anything `Hash`) and runs each stage through a `cached` helper, so downstream callers (openvm-eth, sp1) can adopt the same artifacts-dir caching without depending on the CLI.
PR #3763 changes the powdr autoprecompile API. The pre-existing pins target sp1 and openvm-eth commits that use the old API, so CI fails even though my refactor itself is correct. - sp1-benchmarks/Cargo.toml: bump sp1 dep branch from powdr-labs/generate-apcs-api → powdr-labs/split-powdr-config (succinctlabs/sp1#2830 — adopts the new powdr API). - .github/actions/patch-openvm-eth: bump ref from 05ed9cd3 (main) → e716b51a (powdr-labs/openvm-eth#8 tip, adopts the new powdr API). The "Verify openvm-eth ref is on main" check will (intentionally) fail until openvm-eth#8 merges — the comment in pr-tests-with-secrets.yml already accounts for this case, and the actual reth test runs against the compatible code.
The current `PgoConfig` enum bundles the ranking strategy (`PgoType`) with a heavy `ExecutionProfile` payload. The payload is *output* of a guest execution, not configuration — keeping it under the `*Config` name is confusing and forces the type to be unhashable. Pure rename in this commit. The next commit reclaims the `PgoConfig` name for a new type that's actually a config (Hash-able, holds only inputs), and threads it through `StagedPipeline`'s cache keys.
Reclaim the `PgoConfig` name for what's actually configuration:
```rust
pub struct PgoConfig {
pub pgo_type: PgoType,
pub max_columns: Option<usize>,
pub inputs: Vec<u8>, // opaque, hashed verbatim
}
```
`inputs` is the serialized form of whatever the `make_*` closures need
(typically the guest stdin). The closures take `(&Guest, &[u8])` and are
expected to round-trip the bytes back to their typed form — they're
pure functions of their arguments instead of relying on captured state.
`StagedPipeline` now hashes `(guest_hash, generate, pgo_config)` for the
generate stage and `(guest_hash, generate, pgo_config, select)` for
select/setup. Both stages include `pgo_config` — fixes a real bug where
switching `--pgo cell` ↔ `--pgo instruction` with the same select args
returned a stale select-stage blob from the cache (the upstream ranking
would have changed but the select hash didn't see it).
The `input_fp: &impl Hash` parameter is dropped (everything it carried
is now in `pgo_config.inputs`). CLI shim builds `PgoConfig` from clap
args + `serde_cbor::to_vec(&profile_input)`.
`compile_exe` is unaffected — it takes `PgoData` (the renamed enum
carrying the computed `ExecutionProfile`) directly and doesn't go
through `StagedPipeline`.
`Pipeline` was a thin shim around `StagedPipeline` whose only real job was running `compile_openvm(...)` once at construction and stashing `profile_args`. Its three `run_*` methods were pure plumbing from clap args to `StagedPipeline` calls — no policy, no ordering, no error handling. The `profile_args.profile_input` it stored was also reachable on every clap-arg struct the methods already took, so it was redundant. Replace with three free functions (`build_pipeline`, `run_generate_stage`, `run_select_stage`, `run_setup_stage`) and call them directly from `run_command`. Net −22 lines; the call graph is now flat (`run_command → run_X_stage → StagedPipeline::X`) with no intermediate struct.
Three threads, all on top of the upstream API change that made
`StagedPipeline::{select_apcs, setup}` take the closures directly
(instead of recursive `compute_*` callbacks):
1. Trait aliases for the closures (`pipeline.rs`).
The method signatures were dominated by their closure types — five
lines per parameter. Add two trait aliases (`MakePgoProfile`,
`MakeEmpiricalConstraints`) so each closure parameter becomes one
line. Each trait has a blanket impl over the matching `FnOnce`
so call sites are unchanged.
2. Restore cache-hierarchy laziness in `select_apcs` / `setup`.
The flat API was inadvertently eager: `let ranked = self.generate_apcs(...)`
ran *before* `cached(...)` checked the select-stage hit, so a
select-cache hit still paid for a generate-stage blob load + cbor
deserialize (and setup hits paid for both). Move the recursive
`self.generate_apcs(...)` / `self.select_apcs(...)` call *inside*
the `cached(...)` closure with `move`; the closures it captures
are forwarded down only on a miss.
3. CLI `SelectArgs::pipeline_inputs()` helper (`main.rs`).
The four chained call sites (SelectApcs / Setup / Execute / Prove)
all derived `(GenerateConfig, SelectConfig, PgoConfig)` the same
way — three lines of `with_select_defaults` / `pgo_config_from_args`.
Move into a method on `SelectArgs` so each call site is one line.
Without this, external callers of compile_exe (openvm-eth, sp1, ...) get config.apc_candidates = None and the Instruction/None PGO build loop fans out to every eligible block, then select_apcs throws all but `autoprecompiles + skip` away. Factor the existing CLI policy into a shared default_apc_candidates() helper and apply it inside compile_exe so the convenience wrapper preserves pre-split semantics for all callers.
Three call sites of `default_apc_candidates` collapse to one: - `default_apc_candidates`: short-circuit on `autoprecompiles == 0` so it is safe to apply unconditionally inside the library (standalone `generate-apcs` previously triggered `Some(0)` for Instruction/None). - `compile_exe`: drop the in-line default and route through `StagedPipeline` (no caching) — the convenience helper now shares the exact composition CLI / openvm-eth use. - CLI: drop `SelectArgs::set_apc_candidates_default` and pass the real `autoprecompiles` / `skip` to the generate-stage config (the cache key intentionally omits them so Cell-PGO sweeps still reuse the blob). - `InstructionPgo`: drop the stale `autoprecompiles == 0` short-circuit — `apc_candidates` is the load-bearing cap now, not the selection count.
The kitchen-sink `PowdrConfig` bundled fields from every pipeline stage into one struct. Generate-stage code therefore couldn't safely hash the config as a cache key (select-stage flags would invalidate the hash and break `--apc N` sweep reuse), forcing `StagedPipeline` to expose a `cache_key: &impl Hash` argument and pushing fingerprinting policy onto each caller — three callers, three conventions, easy to miss a field. This commit splits `PowdrConfig` along the natural pipeline boundary: * `GenerateConfig` (`autoprecompiles/src/lib.rs`) — fields the generate stage reads. Derives `Hash`; the library hashes it directly. * `SelectConfig` — `autoprecompiles` + `skip`. Derives `Hash`. * `degree_bound` keeps being passed plain into `setup`. * `GenerateConfig::with_select_defaults(pgo, select)` is the single policy definition for the `apc_candidates` cap default; callers chain it in one line when they know both halves. * `InstructionPgo` / `NonePgo` short-circuit on `apc_candidates == Some(0)` (mirroring Cell), set by `with_select_defaults` when `select.autoprecompiles == 0`. The stale `autoprecompiles == 0` field check in InstructionPgo is gone. `StagedPipeline`'s API drops `cache_key`; callers supply only an `input_fp: &impl Hash` covering anything hidden behind the closures (PGO stdin contents, RPC chain id, ...). The library hashes `(guest_hash, gen, pgo, max_columns, input_fp)` for generate and `(guest_hash, gen, select, input_fp)` for select/setup. Generate's hash omits `select` by construction, so a Cell-PGO `--apc N` sweep reuses the candidate ranking automatically (Cell's `with_select_defaults` returns `apc_candidates = None` regardless of selection size). `compile_exe` becomes a thin wrapper over `StagedPipeline` (no caching). `PowdrConfig` and `default_powdr_openvm_config` are deleted; a new `default_powdr_openvm_configs` returns the split pair. Downstream consumers (CLI shim, openvm-eth, sp1) migrate in follow-up commits.
428e741 to
89672a1
Compare
The hand-inlined load/save/move dance was just `cached(...)` written out by hand because `setup` consumes `self` (it has to move `self.guest` into `customize_exe::setup`). That's fine inside a `move` closure as long as we pull `artifacts_dir` and `degree_bound` out first. Behavior is unchanged; the body is now uniform with `generate_apcs` / `select_apcs`.
`compile_exe` is now only referenced by tests in `openvm-riscv/src/lib.rs` itself (production callers — CLI, openvm-eth, sp1 — all use `StagedPipeline` directly or, in sp1's case, the lower-level `powdr_autoprecompiles` API). The function no longer needs to be `pub`, so move it into the `mod tests` block. Two simplifications fall out: - The `empirical_constraints: EmpiricalConstraints` parameter was always passed as `EmpiricalConstraints::default()` by every call site (the only non-default usage lives in the CLI's `maybe_compute_empirical_constraints`, which goes through `StagedPipeline`, not `compile_exe`). Drop the parameter; hardcode the default inside. - Route through `StagedPipeline::new(.., None)` so the test path exercises the same code as production. Saves having two near-copies of the generate → select → setup chain. Also: skip the `make_pgo_profile` closure call for `PgoType::None` inside `StagedPipeline::generate_apcs`. The profile is discarded for None anyway, and skipping the closure means `compile_exe` can pass an "unreachable for None" closure without needing `ExecutionProfile: Default`.
georgwiese
commented
Jun 2, 2026
| /// Three modes for profiler guided optimization with different cost functions to sort the basic blocks by descending cost and select the most costly ones to accelerate. | ||
| #[derive(Default)] | ||
| pub enum PgoConfig { | ||
| pub enum PgoData { |
Collaborator
Author
There was a problem hiding this comment.
Renamed, because it's more than just a config (contains ExecutionProfile). There is a new struct called PgoConfig, which contains the information needed to compute the PGO data.
8e3fcc1 to
67c1bf5
Compare
georgwiese
commented
Jun 2, 2026
Comment on lines
+255
to
+256
| make_pgo_profile, | ||
| make_empirical_constraints, |
Collaborator
Author
There was a problem hiding this comment.
Having to repeat these here many times is annoying and can be fixed, see #3764. I#m not sure if that should be part of this PR though, because it causes a significant diff in the tests of openvm-riscv/src/lib.rs
…on fix openvm-eth split-powdr-config-adopt @ 9e1b613 restores the `POWDR_APC_CANDIDATES_DIR` wiring the StagedPipeline refactor had dropped, so the reth bench writes `apc_candidates.json` again (the previous nightly failed its `mv apcs/apc_candidates.json` step). Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
openvm-eth split-powdr-config-adopt @ 9ffde2e adds the GenerateConfig import that makes precompute_prover_data compile after the latest tidy-up. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Large guests emit thousands of per-block `apc_candidate_*` snapshots into
the shared candidates_dir (e.g. the ~7k-block guest). `rm -f dir/apc_candidate_*`
expands the glob into one argv and overflows ARG_MAX on the bench runner
("Argument list too long", exit 126), failing the whole guest-bench job
after proving has already succeeded.
Use `find -maxdepth 1 -name 'apc_candidate_*' -delete`, which deletes
in-process without building a giant argument list. `apc_candidates.json`
doesn't match the pattern, so it's still preserved.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
StagedPipeline only orchestrated ISA-generic primitives (generate_apcs / select_apcs / setup) yet lived in powdr-openvm-riscv hardcoded to RiscvISA. Move it down into powdr-openvm as StagedPipeline<ISA: OpenVmISA> so it works for any OpenVM target, and add make_default_empirical_constraints there as a shared "no optimistic precompiles" MakeEmpiricalConstraints impl (now used by the openvm-riscv tests and openvm-eth). powdr-openvm-riscv no longer re-exports it; the CLI imports it from powdr-openvm directly. openvm-eth adopts the new path (ref bumped in the patch-openvm-eth action). Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
The sp1 adoption (succinctlabs/sp1#2830, branch powdr-labs/split-powdr-config) was slimmed to the irreducible GenerateConfig/SelectConfig migration — the staged artifact cache was dropped (no sp1 consumer used it). This empty commit re-runs CI so sp1-benchmarks builds against the new sp1 tip 676a63c8. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Summary
This PR makes the staged APC pipeline (
generate-apcs → select-apcs → setup)land on
mainas a single, self-contained change, and reworks the config typesthat flow through it. It supersedes #3752 — that PR introduced the stage
split against a still-shared
PowdrConfig; this one folds in the stage splitand splits the config along the same boundary, so #3752 should be closed in
favor of this one. No separate merge of #3752 is needed.
Context. #3750 (merged) introduced the per-stage subcommands and
--artifacts-dircaching, but the pipeline + cache logic lived entirely incli-openvm-riscv/src/main.rsas a CLI-onlyPipeline— unusable by othercallers. #3752 then split the fused
select-apcsintogenerate-apcs+select-apcs, still against the CLI-local pipeline and a sharedPowdrConfig.This PR lifts the pipeline into the library (
StagedPipelineinopenvm-riscv/src/pipeline.rs) and splits the config so it can be hashed as acache key, which is what lets openvm-eth reuse the same staged cache
instead of re-implementing it. The CLI becomes a thin shim over the shared
StagedPipeline. (sp1 adopts only the config split, not the cache — seeRelated PRs.)
There are three parts:
generate → select → setuprunner and its on-disk cache, previously aCLI-local
Pipeline(powdr-openvm-riscv: per-stage CLI subcommands +--artifacts-dircaching #3750) split into two stages (powdr-openvm-riscv: splitgenerate-apcsout ofselect-apcs#3752), now live in thepowdr-openvm-riscvlibrary asStagedPipeline. The CLI and openvm-ethshare one implementation; sp1 keeps its own minimal build path and does not
adopt the cache (see Related PRs).
PowdrConfigwithper-stage
GenerateConfig+SelectConfig+PgoConfig, which lets thelibrary hash the config as a cache key itself and drops the error-prone
cache_keyargument fromStagedPipeline.generate-apcsout ofselect-apcs#3752): the one-shotselect-apcsbecomes a cached
generate-apcs(build + rank candidates) feeding a cachedselect-apcs(trim the ranking), so an--autoprecompilessweep under--pgo cellreuses one generate-cache entry instead of rebuilding everycandidate per run.
Part 1 — the staged pipeline
select-apcsused to build APC candidates and pick the top N in one shot.This PR splits those into two cached stages:
generate-apcsselect-apcs--autoprecompiles(after--skip)setupexecuteprove--recursion(compression)--pgoand--max-columnsshape the ranking, so they live on the generatestage;
--autoprecompilesand--skipare the trim, so they live on select.A new
--apc-candidates <N>caps how many candidatesgenerate-apcsbuilds.--apc-candidatesresolution:--pgo cellalways builds every eligible candidate (its density rankingneeds each candidate's post-opt cost), so a user-set
--apc-candidatesiswarned-about and ignored. Crucially, the cap is not a function of
--autoprecompiles, so the generate hash is identical across a sweep.--pgo instruction|none: standalonegenerate-apcsbuilds all by default;in a fused pipeline (
select-apcsand beyond) the cap defaults to--autoprecompiles + --skip. Pin it explicitly to over-build once and sweepselection cheaply.
Cache hierarchy (
generate → select → setup): each stage hashes only itsown args plus a fingerprint of the transpiled guest
VmExe. A later-stage flag(
--input,--mock,--recursion,--metrics) never invalidates anearlier-stage cache; editing the guest invalidates everything downstream of it.
The cache is opt-in via the global
--artifacts-dir <DIR>; with no dir, everystage runs inline.
See
cli-openvm-riscv/README.mdfor the fullsurface and
--profile-inputvs--inputsemantics.Part 2 — the config split
PowdrConfigbundled fields from every pipeline stage into one struct. Thatforced generate-stage code to share a hash space with select-stage flags —
making it impossible for the library to safely hash the config as a cache key,
which in turn forced
StagedPipelineto expose a stringly-typedcache_key: &impl Hashargument that each external caller (CLI, openvm-eth,sp1) fingerprinted on its own. Three callers, three conventions — easy to drop
a field and silently serve stale cache hits.
This PR splits
PowdrConfigalong the natural pipeline boundary:GenerateConfig(autoprecompiles/src/lib.rs) holds only the fields thegenerate stage reads (
apc_candidates,superblock_max_bb_count,apc_max_instructions,apc_exec_count_cutoff,apc_candidates_dir_path,should_use_optimistic_precompiles,degree_bound). DerivesHash.SelectConfigholdsautoprecompiles+skip. DerivesHash.PgoConfigholdspgo_type,max_columns, and the opaqueinputsbyte string (serialized stdin). Derives
Hash, so it participates in boththe generate and select/setup hashes.
degree_boundkeeps being passed plain intosetup(no new wrapper).GenerateConfig::with_select_defaults(pgo, select)is the single policydefinition for the
apc_candidatescap default. Callers chain it in one linewhen they know both halves —
compile_exe, the CLI shim, openvm-eth, sp1 allshare it.
InstructionPgo/NonePgoshort-circuit onapc_candidates == Some(0)(mirroring Cell), set bywith_select_defaultswhen
select.autoprecompiles == 0. The staleautoprecompiles == 0fieldcheck is gone.
StagedPipeline's API dropscache_key; callers supply only aninput_fp: &impl Hashcovering anything hidden behind the closures (PGO stdincontents, RPC chain id, …). The library hashes
(guest_hash, generate, pgo)for generate and
(guest_hash, generate, pgo, select)for select/setup.Generate's hash omits
selectby construction, so a Cell-PGO--autoprecompiles Nsweep automatically reuses the candidate ranking — Cell'swith_select_defaultsreturnsapc_candidates = Noneregardless of selectionsize, so the generate hash is identical across the sweep.
compile_exebecomes a thin wrapper over the staged generate/select/setupfunctions (no caching).
PowdrConfiganddefault_powdr_openvm_configaredeleted.
Verifying the cache
All commands below use the
setupsubcommand (runsgenerate → select → setupwith no proving, so they're fast) and
RUST_LOG=infoto surface the cache loglines.
cached()logscache hit: <stage>/<hash>on a hit and runs the stageinline on a miss. The hashes are toolchain/guest-specific — what matters is
which stage logs a hit. Substitute any guest (e.g.
guest-keccak) forheavier, more realistic candidate counts. The
cargo runinvocation is spelledout in full on each line (rather than stashed in a
$BINvariable) so theblocks paste cleanly into both bash and zsh.
Because the cache is hierarchical and short-circuits at the outermost hit,
an identical rerun logs only
cache hit: setup/<h>(select and generate arenever consulted), whereas a run that only changes selection logs
cache hit: generate/<h>and re-runs select + setup.1. Cell PGO:
--autoprecompilessweep reuses one generate entryRun 1 logs
Generating autoprecompiles for all N blocks in parallel(coldbuild). Run 2 logs:
A third identical rerun of run 2 instead logs a single
cache hit: setup/<h>— the whole pipeline short-circuits.2. Instruction PGO: sweeping rebuilds unless
--apc-candidatesis pinnedUnder
--pgo instruction|nonethe build cap defaults to--autoprecompiles + --skip, so it lives in the generate hash and a naivesweep rebuilds every iteration:
rm -rf /tmp/powdr-cache-instr # ap=2 then ap=5, no --apc-candidates → generate hash differs each time. RUST_LOG=info cargo run -r --bin powdr_openvm_riscv -- --artifacts-dir /tmp/powdr-cache-instr \ setup guest --profile-input 10 --autoprecompiles 2 --pgo instruction RUST_LOG=info cargo run -r --bin powdr_openvm_riscv -- --artifacts-dir /tmp/powdr-cache-instr \ setup guest --profile-input 10 --autoprecompiles 5 --pgo instructionBoth runs log
Generating up to N autoprecompiles in parallel— nocache hit: generate. Pin--apc-candidatesto the top of the sweep to shareone generate entry:
rm -rf /tmp/powdr-cache-instr2 RUST_LOG=info cargo run -r --bin powdr_openvm_riscv -- --artifacts-dir /tmp/powdr-cache-instr2 \ setup guest --profile-input 10 --autoprecompiles 2 --apc-candidates 5 --pgo instruction RUST_LOG=info cargo run -r --bin powdr_openvm_riscv -- --artifacts-dir /tmp/powdr-cache-instr2 \ setup guest --profile-input 10 --autoprecompiles 5 --apc-candidates 5 --pgo instructionNow the second run logs
cache hit: generate/<h>.3. Later-stage flags don't invalidate earlier stages
--input/--mockare not part of any stage hash, so this logscache hit: setup/<h>and proceeds straight to the (mock) prover.Related PRs
--artifacts-dircaching #3750 (merged): introduced the per-stage subcommands +--artifacts-dircaching as a CLI-localPipeline.powdr-openvm-riscv: per-stage CLI subcommands +
--artifacts-dircaching #3750generate-apcsout ofselect-apcs#3752 (superseded by this PR): splitgenerate-apcsout ofselect-apcs. powdr-openvm-riscv: splitgenerate-apcsout ofselect-apcs#3752Downstream adopters of the split configs:
Adopt powdr's split GenerateConfig + SelectConfig + StagedPipeline. Also
adopts the now-library
StagedPipeline(openvm-eth depends onpowdr-openvm).Adopt powdr's split GenerateConfig + SelectConfig — the minimal
migration. Deliberately does not adopt the staged cache: every sp1
consumer builds APCs once per config, so only the dev CLI would have
benefited. sp1 doesn't depend on
powdr-openvm, so it couldn't share thelibrary
StagedPipelineanyway; the cache can be re-added later as anadditive layer.
The
.github/actions/patch-openvm-ethref is pinned to the matching openvm-ethbranch; the sp1 fork pin is updated in CI. Both compile against this branch.
Test plan
cargo build --release --bin powdr_openvm_riscvcleancargo clippy -p powdr-autoprecompiles -p powdr-openvm -p powdr-openvm-riscv -p cli-openvm-riscv --all-targets --features metrics -- -D warningscleancargo test --release -p powdr-autoprecompiles --lib(33 passed)commands — Cell sweep reuses
generate, instruction sweep needs--apc-candidatespinned, identical rerun / later-stage flags hitsetupguest_machine_pgo_modes,keccak_machine_pgo_modesforall 3 PGO modes (ranking unchanged by the split) — covered by CI
🤖 Generated with Claude Code