BreadBoard E4 configs are now version-pinned to explicit upstream harness snapshots via:
config/e4_target_freeze_manifest.yaml
Target harnesses (Codex CLI, Claude Code, OpenCode, and related variants) evolve quickly.
Without explicit pinning, E4 can drift silently and parity assertions become ambiguous.
This manifest ensures each E4 config maps to:
- The upstream harness repo + commit snapshot.
- A release label used in parity reports.
- A concrete calibration anchor (tmux capture or replay session dump evidence).
Current enforced scope is every file matching:
agent_configs/*e4*.yaml
Each such config must have a corresponding manifest entry.
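
The scope rule above can be sketched in a few lines. This is a minimal illustration, not the logic of scripts/check_e4_target_freeze_manifest.py: it assumes manifest entries are keyed by config file stem, and the config names in the example are hypothetical.

```python
from fnmatch import fnmatch
from pathlib import PurePath


def missing_manifest_entries(config_paths, manifest_keys):
    """Return sorted stems of E4 configs that have no manifest entry.

    Assumption: a manifest entry is keyed by the config file stem.
    """
    e4_stems = {
        PurePath(path).stem
        for path in config_paths
        if fnmatch(PurePath(path).name, "*e4*.yaml")
    }
    return sorted(e4_stems - set(manifest_keys))


# Hypothetical config names, for illustration only.
configs = [
    "agent_configs/codex_cli_e4_example.yaml",
    "agent_configs/opencode_e4_example.yaml",
    "agent_configs/unrelated_base.yaml",
]
manifest_keys = {"codex_cli_e4_example"}

print(missing_manifest_entries(configs, manifest_keys))
# ['opencode_e4_example']
```

An empty result means every in-scope config is covered; files that do not match the *e4*.yaml pattern are ignored.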
Run:
make e4-target-manifest
Equivalent direct check:
python scripts/check_e4_target_freeze_manifest.py --json
Optional strict evidence check:
python scripts/check_e4_target_freeze_manifest.py --strict-evidence --json
Optional freshness check (fails stale calibration anchors):
python scripts/check_e4_target_freeze_manifest.py --strict-evidence --max-evidence-age-days 45 --json
Snapshot coverage check:
make e4-snapshot-coverage
Equivalent direct command:
python scripts/research/parity/check_e4_snapshot_coverage.py --json
Generate a dry-run update plan from local harness clones in ../other_harness_refs:
make e4-target-refresh-plan
Equivalent direct command:
python scripts/update_e4_target_freeze_manifest.py --check --json-out artifacts/conformance/e4_target_refresh_plan.json
Apply the refresh in place:
python scripts/update_e4_target_freeze_manifest.py --write
Create side-by-side versioned E4 config snapshots (preserve old files/rows):
python scripts/create_versioned_e4_snapshot_configs.py --snapshot-tag <tag>
Example:
python scripts/create_versioned_e4_snapshot_configs.py --snapshot-tag codex_cli_0_105_0_20260304
More explicit multi-harness example (recommended):
python scripts/create_versioned_e4_snapshot_configs.py --snapshot-tag codex0_1050_claude2_0_72_opencode1_2_6_20260304

A nightly workflow (.github/workflows/e4-target-drift-audit-nightly.yml) checks manifest-pinned commits against upstream remote HEADs and uploads:
- artifacts/e4_target_drift_live_head_report.json
- artifacts/e4_target_drift_snapshot_report.json (when snapshot JSON is available)
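
The comparison behind a drift audit of this kind can be sketched as follows. This is an assumption-laden illustration rather than the workflow's actual code: the harness names, SHAs, status labels, and report shape are hypothetical, and the remote HEADs are assumed to come from `git ls-remote <url> HEAD`.

```python
def parse_ls_remote_head(output):
    """Extract the commit SHA from `git ls-remote <url> HEAD` output.

    Each output line has the form "<sha>\tHEAD"; returns None if absent.
    """
    for line in output.splitlines():
        sha, _, ref = line.partition("\t")
        if ref.strip() == "HEAD":
            return sha.strip()
    return None


def drift_report(pinned, remote_heads):
    """Compare pinned commits with remote HEADs, one row per harness."""
    report = []
    for harness, pinned_sha in sorted(pinned.items()):
        head = remote_heads.get(harness)
        if head is None:
            status = "unknown"  # remote could not be resolved
        elif head == pinned_sha:
            status = "in_sync"
        else:
            status = "drifted"
        report.append(
            {"harness": harness, "pinned": pinned_sha, "head": head, "status": status}
        )
    return report


# Hypothetical harness names and placeholder SHAs.
pinned = {"codex_cli": "aaa111", "opencode": "bbb222"}
heads = {
    "codex_cli": parse_ls_remote_head("aaa111\tHEAD\n"),
    "opencode": parse_ls_remote_head("ccc333\tHEAD\n"),
}

for row in drift_report(pinned, heads):
    print(row["harness"], row["status"])
```

Keeping the comparison separate from the network call makes the report logic testable without access to the upstream remotes.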
Local equivalent of the nightly audit:
make e4-target-drift-audit

- Pull/update upstream harness reference repositories.
- Generate refresh plan and inspect drift:
  make e4-target-refresh-plan
  make e4-target-drift-audit
- Capture new evidence:
  - tmux provider scenario captures for interactive parity lanes.
  - replay session dumps for OpenCode/other replay lanes.
  - use manual recalibration (.github/workflows/e4-recalibration-snapshot.yml, workflow_dispatch) for heavy refresh.
- Add/adjust manifest rows with:
  - upstream commit + date,
  - release label,
  - updated evidence paths.
- Verify snapshot coverage:
  make e4-snapshot-coverage
- Update E4 config behavior only as required for parity.
- Re-run E4 checks and conformance runs.
- Record the bump in release notes / parity docs.
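
Before committing a bumped manifest row, a quick sanity check on the fields listed above can catch omissions. The sketch below is illustrative only: the field names are hypothetical (the real manifest schema may differ), values are assumed to be strings or lists, and the example paths and labels are placeholders.

```python
from datetime import date

# Hypothetical field names; the actual manifest schema may differ.
REQUIRED_FIELDS = (
    "upstream_commit",
    "upstream_commit_date",
    "release_label",
    "evidence_paths",
)


def row_problems(row):
    """Return a list of human-readable problems for one manifest row."""
    problems = [f"missing {name}" for name in REQUIRED_FIELDS if not row.get(name)]
    commit_date = row.get("upstream_commit_date")
    if commit_date:
        try:
            date.fromisoformat(commit_date)
        except ValueError:
            problems.append("upstream_commit_date is not an ISO date")
    return problems


# Placeholder rows, for illustration only.
good_row = {
    "upstream_commit": "aaa111",
    "upstream_commit_date": "2026-03-04",
    "release_label": "example-label",
    "evidence_paths": ["artifacts/example/evidence.json"],
}
bad_row = {
    "upstream_commit": "bbb222",
    "upstream_commit_date": "04/03/2026",
    "evidence_paths": [],
}

print(row_problems(good_row))  # []
print(row_problems(bad_row))
```

The repository's own manifest checks remain the authoritative gate; a helper like this only shortens the edit loop while rows are being adjusted.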
After repository restore and fixture reindexing, we established an explicit strict replay probe baseline for current Codex/Claude/OpenCode parity surfaces.
Canonical strict probe run:
make e4-postrestore-strict-probe
Equivalent explicit command:
python scripts/run_parity_replays.py --strict \
--scenario claude_e4_refresh_ping_replay_20260304 \
--scenario opencode_patch_todo_sentinel_replay \
--scenario opencode_glob_grep_sentinel_replay \
--scenario opencode_toolcall_repair_sentinel_replay \
--scenario codex_cli_mvi_patch_v2_replay \
--scenario codex_cli_subagent_sync_replay \
--scenario codex_cli_subagent_async_replay \
--parity-run-id e4_postrestore_strict_probe_<utc_timestamp>
Baseline semantics:
- Codex modern lanes target bitwise_trace (0.105.0 event schema).
- OpenCode patch/todo sentinel targets normalized_trace (deterministic replay with restored golden workspace snapshot).
- OpenCode glob/grep + toolcall-repair sentinels target bitwise_trace.
- Claude refresh ping replay lane targets normalized_trace for low-spend deterministic post-restore probes when older protofs/phase8 replay fixtures are unavailable in the restored tree.
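
To make the difference between the two targets concrete, here is a rough sketch. It rests on assumptions that the text above does not spell out: that bitwise_trace means the serialized trace lines are identical, and that normalized_trace means traces are equal once volatile fields are dropped. The volatile key names and the event shape are hypothetical.

```python
import json

# Hypothetical volatile fields that a normalized comparison might ignore.
VOLATILE_KEYS = {"timestamp", "session_id"}


def _normalize(line):
    """Parse one JSON trace line and drop volatile keys."""
    event = json.loads(line)
    return {key: value for key, value in event.items() if key not in VOLATILE_KEYS}


def traces_match(expected_lines, actual_lines, target):
    """Compare two traces under the given parity target."""
    if target == "bitwise_trace":
        # Exact equality of the serialized lines.
        return expected_lines == actual_lines
    if target == "normalized_trace":
        expected = [_normalize(line) for line in expected_lines]
        actual = [_normalize(line) for line in actual_lines]
        return expected == actual
    raise ValueError(f"unknown target: {target}")


golden = ['{"type": "tool_call", "name": "grep", "timestamp": 1}']
replay = ['{"type": "tool_call", "name": "grep", "timestamp": 2}']

print(traces_match(golden, replay, "bitwise_trace"))     # False
print(traces_match(golden, replay, "normalized_trace"))  # True
```

Under these assumptions a lane that passes bitwise_trace also passes normalized_trace, which is why the normalized target is the looser of the two.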
Primary evidence references from this tranche:
- artifacts/parity_runs/codex_capture_refresh_20260304_postfix/parity_summary.json
- artifacts/parity_runs/claude_opencode_replay_probe_strict_20260304_v2/parity_summary.json
This baseline is the current "go/no-go" strict replay probe set for post-restore E4 confidence and should be rerun whenever target harness version snapshots are bumped.
Each E4 config should include:
# e4_target_key: <config_stem>
# e4_target_manifest: config/e4_target_freeze_manifest.yaml
This is non-functional metadata for maintainers and reviewers.
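
Although the header is non-functional, reviewers may want a quick way to confirm it is present and consistent. The following is a small sketch under stated assumptions: the helper names are hypothetical, the example config name is a placeholder, and no existing repository script is claimed to perform this check.

```python
import re
from pathlib import PurePath

MANIFEST_PATH = "config/e4_target_freeze_manifest.yaml"
HEADER_RE = re.compile(r"^#\s*(e4_target_key|e4_target_manifest):\s*(\S+)\s*$")


def read_e4_header(text):
    """Collect the first occurrence of each E4 header comment."""
    found = {}
    for line in text.splitlines():
        match = HEADER_RE.match(line)
        if match:
            found.setdefault(match.group(1), match.group(2))
    return found


def header_problems(config_path, text):
    """Check that the header names the config stem and the freeze manifest."""
    header = read_e4_header(text)
    stem = PurePath(config_path).stem
    problems = []
    if header.get("e4_target_key") != stem:
        problems.append(f"e4_target_key should be {stem}")
    if header.get("e4_target_manifest") != MANIFEST_PATH:
        problems.append(f"e4_target_manifest should be {MANIFEST_PATH}")
    return problems


# Placeholder config name and contents.
sample = (
    "# e4_target_key: codex_cli_e4_example\n"
    "# e4_target_manifest: config/e4_target_freeze_manifest.yaml\n"
    "name: example\n"
)

print(header_problems("agent_configs/codex_cli_e4_example.yaml", sample))  # []
```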
- Scheduled CI should remain lightweight (drift visibility only).
- Do not add scheduled heavy recalibration/provider replay capture loops to GitHub Actions.