Skip to content

Split PowdrConfig into per-stage GenerateConfig + SelectConfig#3763

Open
georgwiese wants to merge 52 commits into
mainfrom
split-powdr-config
Open

Split PowdrConfig into per-stage GenerateConfig + SelectConfig#3763
georgwiese wants to merge 52 commits into
mainfrom
split-powdr-config

Conversation

@georgwiese
Copy link
Copy Markdown
Collaborator

@georgwiese georgwiese commented Jun 2, 2026

Summary

This PR makes the staged APC pipeline (generate-apcs → select-apcs → setup)
land on main as a single, self-contained change, and reworks the config types
that flow through it. It supersedes #3752 — that PR introduced the stage
split against a still-shared PowdrConfig; this one folds in the stage split
and splits the config along the same boundary, so #3752 should be closed in
favor of this one. No separate merge of #3752 is needed.

Context. #3750 (merged) introduced the per-stage subcommands and
--artifacts-dir caching, but the pipeline + cache logic lived entirely in
cli-openvm-riscv/src/main.rs as a CLI-only Pipeline — unusable by other
callers. #3752 then split the fused select-apcs into generate-apcs +
select-apcs, still against the CLI-local pipeline and a shared PowdrConfig.
This PR lifts the pipeline into the library (StagedPipeline in
openvm-riscv/src/pipeline.rs) and splits the config so it can be hashed as a
cache key, which is what lets openvm-eth reuse the same staged cache
instead of re-implementing it. The CLI becomes a thin shim over the shared
StagedPipeline. (sp1 adopts only the config split, not the cache — see
Related PRs.)

There are three parts:

  1. The staged pipeline → library (the point of this PR): the
    generate → select → setup runner and its on-disk cache, previously a
    CLI-local Pipeline (powdr-openvm-riscv: per-stage CLI subcommands + --artifacts-dir caching #3750) split into two stages (powdr-openvm-riscv: split generate-apcs out of select-apcs #3752), now live in the
    powdr-openvm-riscv library as StagedPipeline. The CLI and openvm-eth
    share one implementation; sp1 keeps its own minimal build path and does not
    adopt the cache (see Related PRs).
  2. The config split (new here): replace the monolithic PowdrConfig with
    per-stage GenerateConfig + SelectConfig + PgoConfig, which lets the
    library hash the config as a cache key itself and drops the error-prone
    cache_key argument from StagedPipeline.
  3. The stage split (carried over from powdr-openvm-riscv: split generate-apcs out of select-apcs #3752): the one-shot select-apcs
    becomes a cached generate-apcs (build + rank candidates) feeding a cached
    select-apcs (trim the ranking), so an --autoprecompiles sweep under
    --pgo cell reuses one generate-cache entry instead of rebuilding every
    candidate per run.

Part 1 — the staged pipeline

select-apcs used to build APC candidates and pick the top N in one shot.
This PR splits those into two cached stages:

Command Stages run
generate-apcs Profile + build & rank APC candidates
select-apcs … + trim the ranking to --autoprecompiles (after --skip)
setup … + assemble the program with selected APCs
execute … + run the guest in interpreted mode
prove … + STARK proof, optionally with --recursion (compression)

--pgo and --max-columns shape the ranking, so they live on the generate
stage; --autoprecompiles and --skip are the trim, so they live on select.
A new --apc-candidates <N> caps how many candidates generate-apcs builds.

--apc-candidates resolution:

  • --pgo cell always builds every eligible candidate (its density ranking
    needs each candidate's post-opt cost), so a user-set --apc-candidates is
    warned-about and ignored. Crucially, the cap is not a function of
    --autoprecompiles, so the generate hash is identical across a sweep.
  • --pgo instruction|none: standalone generate-apcs builds all by default;
    in a fused pipeline (select-apcs and beyond) the cap defaults to
    --autoprecompiles + --skip. Pin it explicitly to over-build once and sweep
    selection cheaply.

Cache hierarchy (generate → select → setup): each stage hashes only its
own args plus a fingerprint of the transpiled guest VmExe. A later-stage flag
(--input, --mock, --recursion, --metrics) never invalidates an
earlier-stage cache; editing the guest invalidates everything downstream of it.
The cache is opt-in via the global --artifacts-dir <DIR>; with no dir, every
stage runs inline.

See cli-openvm-riscv/README.md for the full
surface and --profile-input vs --input semantics.


Part 2 — the config split

PowdrConfig bundled fields from every pipeline stage into one struct. That
forced generate-stage code to share a hash space with select-stage flags —
making it impossible for the library to safely hash the config as a cache key,
which in turn forced StagedPipeline to expose a stringly-typed
cache_key: &impl Hash argument that each external caller (CLI, openvm-eth,
sp1) fingerprinted on its own. Three callers, three conventions — easy to drop
a field and silently serve stale cache hits.

This PR splits PowdrConfig along the natural pipeline boundary:

  • GenerateConfig (autoprecompiles/src/lib.rs) holds only the fields the
    generate stage reads (apc_candidates, superblock_max_bb_count,
    apc_max_instructions, apc_exec_count_cutoff, apc_candidates_dir_path,
    should_use_optimistic_precompiles, degree_bound). Derives Hash.
  • SelectConfig holds autoprecompiles + skip. Derives Hash.
  • PgoConfig holds pgo_type, max_columns, and the opaque inputs
    byte string (serialized stdin). Derives Hash, so it participates in both
    the generate and select/setup hashes.
  • degree_bound keeps being passed plain into setup (no new wrapper).
  • GenerateConfig::with_select_defaults(pgo, select) is the single policy
    definition for the apc_candidates cap default. Callers chain it in one line
    when they know both halves — compile_exe, the CLI shim, openvm-eth, sp1 all
    share it.
  • InstructionPgo / NonePgo short-circuit on
    apc_candidates == Some(0) (mirroring Cell), set by with_select_defaults
    when select.autoprecompiles == 0. The stale autoprecompiles == 0 field
    check is gone.

StagedPipeline's API drops cache_key; callers supply only an
input_fp: &impl Hash covering anything hidden behind the closures (PGO stdin
contents, RPC chain id, …). The library hashes (guest_hash, generate, pgo)
for generate and (guest_hash, generate, pgo, select) for select/setup.
Generate's hash omits select by construction, so a Cell-PGO --autoprecompiles N sweep automatically reuses the candidate ranking — Cell's
with_select_defaults returns apc_candidates = None regardless of selection
size, so the generate hash is identical across the sweep.

compile_exe becomes a thin wrapper over the staged generate/select/setup
functions (no caching). PowdrConfig and default_powdr_openvm_config are
deleted.


Verifying the cache

All commands below use the setup subcommand (runs generate → select → setup
with no proving, so they're fast) and RUST_LOG=info to surface the cache log
lines. cached() logs cache hit: <stage>/<hash> on a hit and runs the stage
inline on a miss. The hashes are toolchain/guest-specific — what matters is
which stage logs a hit. Substitute any guest (e.g. guest-keccak) for
heavier, more realistic candidate counts. The cargo run invocation is spelled
out in full on each line (rather than stashed in a $BIN variable) so the
blocks paste cleanly into both bash and zsh.

Because the cache is hierarchical and short-circuits at the outermost hit,
an identical rerun logs only cache hit: setup/<h> (select and generate are
never consulted), whereas a run that only changes selection logs
cache hit: generate/<h> and re-runs select + setup.

1. Cell PGO: --autoprecompiles sweep reuses one generate entry

rm -rf /tmp/powdr-cache

# Cold: builds & ranks every eligible candidate.
RUST_LOG=info cargo run -r --bin powdr_openvm_riscv -- --artifacts-dir /tmp/powdr-cache \
    setup guest --profile-input 10 --autoprecompiles 2 --pgo cell

# Different --autoprecompiles: generate is reused, only select + setup re-run.
RUST_LOG=info cargo run -r --bin powdr_openvm_riscv -- --artifacts-dir /tmp/powdr-cache \
    setup guest --profile-input 10 --autoprecompiles 10 --pgo cell

Run 1 logs Generating autoprecompiles for all N blocks in parallel (cold
build). Run 2 logs:

INFO cache hit: generate/<h>      <- ranking reused across the sweep
INFO Adjust the program ...       <- select + setup re-run (cheap)
INFO Setup completed.

A third identical rerun of run 2 instead logs a single
cache hit: setup/<h> — the whole pipeline short-circuits.

2. Instruction PGO: sweeping rebuilds unless --apc-candidates is pinned

Under --pgo instruction|none the build cap defaults to
--autoprecompiles + --skip, so it lives in the generate hash and a naive
sweep rebuilds every iteration:

rm -rf /tmp/powdr-cache-instr
# ap=2 then ap=5, no --apc-candidates → generate hash differs each time.
RUST_LOG=info cargo run -r --bin powdr_openvm_riscv -- --artifacts-dir /tmp/powdr-cache-instr \
    setup guest --profile-input 10 --autoprecompiles 2 --pgo instruction
RUST_LOG=info cargo run -r --bin powdr_openvm_riscv -- --artifacts-dir /tmp/powdr-cache-instr \
    setup guest --profile-input 10 --autoprecompiles 5 --pgo instruction

Both runs log Generating up to N autoprecompiles in parallelno
cache hit: generate. Pin --apc-candidates to the top of the sweep to share
one generate entry:

rm -rf /tmp/powdr-cache-instr2
RUST_LOG=info cargo run -r --bin powdr_openvm_riscv -- --artifacts-dir /tmp/powdr-cache-instr2 \
    setup guest --profile-input 10 --autoprecompiles 2 --apc-candidates 5 --pgo instruction
RUST_LOG=info cargo run -r --bin powdr_openvm_riscv -- --artifacts-dir /tmp/powdr-cache-instr2 \
    setup guest --profile-input 10 --autoprecompiles 5 --apc-candidates 5 --pgo instruction

Now the second run logs cache hit: generate/<h>.

3. Later-stage flags don't invalidate earlier stages

# After any cached setup run above, re-proving with a new --input / --mock
# hits the setup cache rather than rebuilding the program:
RUST_LOG=info cargo run -r --bin powdr_openvm_riscv -- --artifacts-dir /tmp/powdr-cache \
    prove guest --profile-input 10 --autoprecompiles 10 --pgo cell --input 123 --mock

--input / --mock are not part of any stage hash, so this logs
cache hit: setup/<h> and proceeds straight to the (mock) prover.

All three scenarios above were run locally on this branch (release build,
guest crate) and produced exactly the logged behavior described.


Related PRs

Downstream adopters of the split configs:

  • powdr-labs/openvm-eth#8
    Adopt powdr's split GenerateConfig + SelectConfig + StagedPipeline. Also
    adopts the now-library StagedPipeline (openvm-eth depends on
    powdr-openvm).
  • succinctlabs/sp1#2830
    Adopt powdr's split GenerateConfig + SelectConfig — the minimal
    migration. Deliberately does not adopt the staged cache: every sp1
    consumer builds APCs once per config, so only the dev CLI would have
    benefited. sp1 doesn't depend on powdr-openvm, so it couldn't share the
    library StagedPipeline anyway; the cache can be re-added later as an
    additive layer.

The .github/actions/patch-openvm-eth ref is pinned to the matching openvm-eth
branch; the sp1 fork pin is updated in CI. Both compile against this branch.

Test plan

  • cargo build --release --bin powdr_openvm_riscv clean
  • cargo clippy -p powdr-autoprecompiles -p powdr-openvm -p powdr-openvm-riscv -p cli-openvm-riscv --all-targets --features metrics -- -D warnings clean
  • cargo test --release -p powdr-autoprecompiles --lib (33 passed)
  • Cache behavior verified locally via the Verifying the cache
    commands — Cell sweep reuses generate, instruction sweep needs
    --apc-candidates pinned, identical rerun / later-stage flags hit setup
  • openvm-eth + sp1 compile against this branch (path-patch verified locally)
  • Snapshot tests guest_machine_pgo_modes, keccak_machine_pgo_modes for
    all 3 PGO modes (ranking unchanged by the split) — covered by CI
  • CI green

🤖 Generated with Claude Code

georgwiese and others added 30 commits May 19, 2026 11:23
`PgoAdapter` gains `generate_apcs` (build + rank); the trim becomes a
free `select_apcs(ranked, n, skip)`. `filter_blocks_and_create_apcs_with_pgo`
stays as a default-impl one-shot for SP1's call site (auto-fills
`apc_candidates = autoprecompiles + skip` for instruction/none; Cell
overrides to skip the auto-fill).

CLI gains a `generate-apcs` subcommand that caches under `generate/<h>`.
`--pgo` and `--max-columns` move to `GenerateApcsArgs` (they shape the
ranking, not the trim); `--apc-candidates <N>` caps the build size.
`select-apcs` chains through `generate-apcs` and auto-fills
`--apc-candidates = --autoprecompiles + --skip` for instruction/none so
the common case stays ergonomic.

Headline win: sweeping `--autoprecompiles` under `--pgo cell` against
`--artifacts-dir` reuses a single generate-cache entry across iterations.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Cell PGO now always builds every eligible candidate. `--apc-candidates`
is ignored with a warning (in both standalone and fused contexts) — the
dynamic density ranking needs the full post-opt cost of every candidate.
The pre-build heuristic that previously kicked in when the user set
`--apc-candidates` is gone.

For instruction/none in a fused pipeline (`select-apcs` and beyond),
unset `--apc-candidates` now defaults to `--autoprecompiles` (was
`--autoprecompiles + --skip`). Validation: error if
`--autoprecompiles + --skip > --apc-candidates`.

Standalone `generate-apcs <guest>` keeps "unset = build all" for
instruction/none — there's no fused selection to derive a default from.

The CLI auto-fill is applied before computing each stage's cache hash,
so the resolved value lives in the cache key. Sweeping `--autoprecompiles`
under `--pgo cell` still shares a single `generate/<h>` entry regardless
of whether the user also passes `--apc-candidates`.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Too noisy for the main README — the basic examples cover the common
cases. Sweep workflows can be a doc/follow-up if needed.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Its only caller was openvm-riscv::compile_exe; inline generate_apcs +
select_apcs there directly.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
When `--autoprecompiles 0` is selected downstream, the CLI sets
`apc_candidates = Some(0)` for Cell so the library short-circuits
without building anything. Positive caps are still ignored with a
warning (Cell's density ranking needs every candidate).

Restores the early-exit that the previous fused `create_apcs_with_pgo`
had for `autoprecompiles == 0`, expressed through `apc_candidates`
since the build stage no longer sees the selection count directly.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
The trim to (skip, autoprecompiles) now lives in `select_apcs`, so the
greedy loop always runs to the column budget (or until candidates are
exhausted). `max_selected` was always `usize::MAX` from the only caller.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
The trait one-shot is no longer used inside powdr — openvm composes
detect_blocks + generate_apcs + select_apcs directly. The internal
helper run_filter_generate_select goes away with it.

The SP1 fork at `powdr-labs/v6.0.2-apc-base` still calls this method.
That fork needs an equivalent update (compose the three steps directly,
or wrap them in a fork-local helper) before `cargo clippy --workspace`
can pass.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Each guest section runs `run_bench` with multiple `apcs` values
(apc000, apc003, apc010, apc030, …) against the same guest source and
profile input. With cell PGO (the default), the generate stage doesn't
depend on `--autoprecompiles`, so adding a shared `--artifacts-dir`
per (guest, profile-input) lets the second-and-later sweeps cache-hit on
the expensive build step. Only the cheap select+setup re-run each time.

`--apc-candidates-dir` has to be shared too because it's part of the
generate-stage cache key (it's a per-stage arg, not a global). Set it
per (guest, profile-input) under the same cache root.

`apc_candidates.json` is now in the shared candidates dir; per-run
effectiveness plots read from there. Per-block `apc_candidate_*` files
are still cleaned up after each run, but they only get written on the
first cache-miss anyway — subsequent cache-hits don't re-create them.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
succinctlabs/sp1#2805 follows #3752's PgoAdapter split.
Tracking that branch unblocks `cargo clippy --workspace`.

Once SP1 merges #2805 into `powdr-labs/v6.0.2-apc-base`, this pin can
revert.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Document the rule in code, restore the `--autoprecompiles + --skip`
default in the help text and README (was stale after the function
simplification), and drop the validation-error mention since the new
function silently truncates instead.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Move the generic on-disk cache helpers into
`powdr-autoprecompiles::staged_cache` and the per-stage runner into
`powdr-openvm-riscv::pipeline` as `StagedPipeline`. cli-openvm-riscv's
`Pipeline` collapses to a thin clap→library shim.

`StagedPipeline` accepts caller-provided cache keys (anything `Hash`) and
runs each stage through a `cached` helper, so downstream callers
(openvm-eth, sp1) can adopt the same artifacts-dir caching without
depending on the CLI.
PR #3763 changes the powdr autoprecompile API. The pre-existing pins
target sp1 and openvm-eth commits that use the old API, so CI fails
even though my refactor itself is correct.

- sp1-benchmarks/Cargo.toml: bump sp1 dep branch from
  powdr-labs/generate-apcs-api → powdr-labs/split-powdr-config
  (succinctlabs/sp1#2830 — adopts the new powdr API).
- .github/actions/patch-openvm-eth: bump ref from
  05ed9cd3 (main) → e716b51a (powdr-labs/openvm-eth#8 tip, adopts
  the new powdr API).

The "Verify openvm-eth ref is on main" check will (intentionally) fail
until openvm-eth#8 merges — the comment in pr-tests-with-secrets.yml
already accounts for this case, and the actual reth test runs against
the compatible code.
The current `PgoConfig` enum bundles the ranking strategy (`PgoType`) with
a heavy `ExecutionProfile` payload. The payload is *output* of a guest
execution, not configuration — keeping it under the `*Config` name is
confusing and forces the type to be unhashable.

Pure rename in this commit. The next commit reclaims the `PgoConfig`
name for a new type that's actually a config (Hash-able, holds only
inputs), and threads it through `StagedPipeline`'s cache keys.
Reclaim the `PgoConfig` name for what's actually configuration:

```rust
pub struct PgoConfig {
    pub pgo_type: PgoType,
    pub max_columns: Option<usize>,
    pub inputs: Vec<u8>,  // opaque, hashed verbatim
}
```

`inputs` is the serialized form of whatever the `make_*` closures need
(typically the guest stdin). The closures take `(&Guest, &[u8])` and are
expected to round-trip the bytes back to their typed form — they're
pure functions of their arguments instead of relying on captured state.

`StagedPipeline` now hashes `(guest_hash, generate, pgo_config)` for the
generate stage and `(guest_hash, generate, pgo_config, select)` for
select/setup. Both stages include `pgo_config` — fixes a real bug where
switching `--pgo cell` ↔ `--pgo instruction` with the same select args
returned a stale select-stage blob from the cache (the upstream ranking
would have changed but the select hash didn't see it).

The `input_fp: &impl Hash` parameter is dropped (everything it carried
is now in `pgo_config.inputs`). CLI shim builds `PgoConfig` from clap
args + `serde_cbor::to_vec(&profile_input)`.

`compile_exe` is unaffected — it takes `PgoData` (the renamed enum
carrying the computed `ExecutionProfile`) directly and doesn't go
through `StagedPipeline`.
`Pipeline` was a thin shim around `StagedPipeline` whose only real job
was running `compile_openvm(...)` once at construction and stashing
`profile_args`. Its three `run_*` methods were pure plumbing from clap
args to `StagedPipeline` calls — no policy, no ordering, no error
handling. The `profile_args.profile_input` it stored was also reachable
on every clap-arg struct the methods already took, so it was redundant.

Replace with three free functions (`build_pipeline`,
`run_generate_stage`, `run_select_stage`, `run_setup_stage`) and call
them directly from `run_command`. Net −22 lines; the call graph is now
flat (`run_command → run_X_stage → StagedPipeline::X`) with no
intermediate struct.
Three threads, all on top of the upstream API change that made
`StagedPipeline::{select_apcs, setup}` take the closures directly
(instead of recursive `compute_*` callbacks):

1. Trait aliases for the closures (`pipeline.rs`).
   The method signatures were dominated by their closure types — five
   lines per parameter. Add two trait aliases (`MakePgoProfile`,
   `MakeEmpiricalConstraints`) so each closure parameter becomes one
   line. Each trait has a blanket impl over the matching `FnOnce`
   so call sites are unchanged.

2. Restore cache-hierarchy laziness in `select_apcs` / `setup`.
   The flat API was inadvertently eager: `let ranked = self.generate_apcs(...)`
   ran *before* `cached(...)` checked the select-stage hit, so a
   select-cache hit still paid for a generate-stage blob load + cbor
   deserialize (and setup hits paid for both). Move the recursive
   `self.generate_apcs(...)` / `self.select_apcs(...)` call *inside*
   the `cached(...)` closure with `move`; the closures it captures
   are forwarded down only on a miss.

3. CLI `SelectArgs::pipeline_inputs()` helper (`main.rs`).
   The four chained call sites (SelectApcs / Setup / Execute / Prove)
   all derived `(GenerateConfig, SelectConfig, PgoConfig)` the same
   way — three lines of `with_select_defaults` / `pgo_config_from_args`.
   Move into a method on `SelectArgs` so each call site is one line.
Without this, external callers of compile_exe (openvm-eth, sp1, ...) get
config.apc_candidates = None and the Instruction/None PGO build loop fans
out to every eligible block, then select_apcs throws all but `autoprecompiles
+ skip` away. Factor the existing CLI policy into a shared
default_apc_candidates() helper and apply it inside compile_exe so the
convenience wrapper preserves pre-split semantics for all callers.
Three call sites of `default_apc_candidates` collapse to one:

- `default_apc_candidates`: short-circuit on `autoprecompiles == 0` so it
  is safe to apply unconditionally inside the library (standalone
  `generate-apcs` previously triggered `Some(0)` for Instruction/None).
- `compile_exe`: drop the in-line default and route through
  `StagedPipeline` (no caching) — the convenience helper now shares the
  exact composition CLI / openvm-eth use.
- CLI: drop `SelectArgs::set_apc_candidates_default` and pass the real
  `autoprecompiles` / `skip` to the generate-stage config (the cache key
  intentionally omits them so Cell-PGO sweeps still reuse the blob).
- `InstructionPgo`: drop the stale `autoprecompiles == 0` short-circuit
  — `apc_candidates` is the load-bearing cap now, not the selection
  count.
The kitchen-sink `PowdrConfig` bundled fields from every pipeline stage
into one struct. Generate-stage code therefore couldn't safely hash the
config as a cache key (select-stage flags would invalidate the hash and
break `--apc N` sweep reuse), forcing `StagedPipeline` to expose a
`cache_key: &impl Hash` argument and pushing fingerprinting policy onto
each caller — three callers, three conventions, easy to miss a field.

This commit splits `PowdrConfig` along the natural pipeline boundary:

* `GenerateConfig` (`autoprecompiles/src/lib.rs`) — fields the generate
  stage reads. Derives `Hash`; the library hashes it directly.
* `SelectConfig` — `autoprecompiles` + `skip`. Derives `Hash`.
* `degree_bound` keeps being passed plain into `setup`.
* `GenerateConfig::with_select_defaults(pgo, select)` is the single
  policy definition for the `apc_candidates` cap default; callers chain
  it in one line when they know both halves.
* `InstructionPgo` / `NonePgo` short-circuit on
  `apc_candidates == Some(0)` (mirroring Cell), set by
  `with_select_defaults` when `select.autoprecompiles == 0`. The stale
  `autoprecompiles == 0` field check in InstructionPgo is gone.

`StagedPipeline`'s API drops `cache_key`; callers supply only an
`input_fp: &impl Hash` covering anything hidden behind the closures
(PGO stdin contents, RPC chain id, ...). The library hashes
`(guest_hash, gen, pgo, max_columns, input_fp)` for generate and
`(guest_hash, gen, select, input_fp)` for select/setup. Generate's hash
omits `select` by construction, so a Cell-PGO `--apc N` sweep reuses
the candidate ranking automatically (Cell's `with_select_defaults`
returns `apc_candidates = None` regardless of selection size).

`compile_exe` becomes a thin wrapper over `StagedPipeline` (no
caching). `PowdrConfig` and `default_powdr_openvm_config` are deleted;
a new `default_powdr_openvm_configs` returns the split pair.

Downstream consumers (CLI shim, openvm-eth, sp1) migrate in follow-up
commits.
@georgwiese georgwiese force-pushed the split-powdr-config branch 2 times, most recently from 428e741 to 89672a1 Compare June 2, 2026 15:49
The hand-inlined load/save/move dance was just `cached(...)` written
out by hand because `setup` consumes `self` (it has to move `self.guest`
into `customize_exe::setup`). That's fine inside a `move` closure as
long as we pull `artifacts_dir` and `degree_bound` out first.

Behavior is unchanged; the body is now uniform with `generate_apcs` /
`select_apcs`.
`compile_exe` is now only referenced by tests in `openvm-riscv/src/lib.rs`
itself (production callers — CLI, openvm-eth, sp1 — all use
`StagedPipeline` directly or, in sp1's case, the lower-level
`powdr_autoprecompiles` API). The function no longer needs to be `pub`,
so move it into the `mod tests` block.

Two simplifications fall out:

- The `empirical_constraints: EmpiricalConstraints` parameter was
  always passed as `EmpiricalConstraints::default()` by every call
  site (the only non-default usage lives in the CLI's
  `maybe_compute_empirical_constraints`, which goes through
  `StagedPipeline`, not `compile_exe`). Drop the parameter; hardcode
  the default inside.
- Route through `StagedPipeline::new(.., None)` so the test path
  exercises the same code as production. Saves having two near-copies
  of the generate → select → setup chain.

Also: skip the `make_pgo_profile` closure call for `PgoType::None`
inside `StagedPipeline::generate_apcs`. The profile is discarded for
None anyway, and skipping the closure means `compile_exe` can pass an
"unreachable for None" closure without needing
`ExecutionProfile: Default`.
/// Three modes for profiler guided optimization with different cost functions to sort the basic blocks by descending cost and select the most costly ones to accelerate.
#[derive(Default)]
pub enum PgoConfig {
pub enum PgoData {
Copy link
Copy Markdown
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Renamed, because it's more than just a config (contains ExecutionProfile). There is a new struct called PgoConfig, which contains the information needed to compute the PGO data.

@georgwiese georgwiese force-pushed the split-powdr-config branch from 8e3fcc1 to 67c1bf5 Compare June 2, 2026 20:40
Comment on lines +255 to +256
make_pgo_profile,
make_empirical_constraints,
Copy link
Copy Markdown
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Having to repeat these here many times is annoying and can be fixed, see #3764. I#m not sure if that should be part of this PR though, because it causes a significant diff in the tests of openvm-riscv/src/lib.rs

georgwiese and others added 8 commits June 3, 2026 11:27
…on fix

openvm-eth split-powdr-config-adopt @ 9e1b613 restores the
`POWDR_APC_CANDIDATES_DIR` wiring the StagedPipeline refactor had dropped,
so the reth bench writes `apc_candidates.json` again (the previous nightly
failed its `mv apcs/apc_candidates.json` step).

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
openvm-eth split-powdr-config-adopt @ 9ffde2e adds the GenerateConfig
import that makes precompute_prover_data compile after the latest tidy-up.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Large guests emit thousands of per-block `apc_candidate_*` snapshots into
the shared candidates_dir (e.g. the ~7k-block guest). `rm -f dir/apc_candidate_*`
expands the glob into one argv and overflows ARG_MAX on the bench runner
("Argument list too long", exit 126), failing the whole guest-bench job
after proving has already succeeded.

Use `find -maxdepth 1 -name 'apc_candidate_*' -delete`, which deletes
in-process without building a giant argument list. `apc_candidates.json`
doesn't match the pattern, so it's still preserved.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
StagedPipeline only orchestrated ISA-generic primitives (generate_apcs /
select_apcs / setup) yet lived in powdr-openvm-riscv hardcoded to RiscvISA.
Move it down into powdr-openvm as StagedPipeline<ISA: OpenVmISA> so it works
for any OpenVM target, and add make_default_empirical_constraints there as a
shared "no optimistic precompiles" MakeEmpiricalConstraints impl (now used by
the openvm-riscv tests and openvm-eth).

powdr-openvm-riscv no longer re-exports it; the CLI imports it from
powdr-openvm directly. openvm-eth adopts the new path (ref bumped in the
patch-openvm-eth action).

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
The sp1 adoption (succinctlabs/sp1#2830, branch powdr-labs/split-powdr-config)
was slimmed to the irreducible GenerateConfig/SelectConfig migration — the
staged artifact cache was dropped (no sp1 consumer used it). This empty commit
re-runs CI so sp1-benchmarks builds against the new sp1 tip 676a63c8.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant