Scope
- Shared infrastructure for adversarial interpretability across competitive games (chess, diplomacy, etc.).
- Common evals, visualization, data reports, model wrappers, and commands to run training jobs.
Non-goals
- Forcing similar model architectures or training code across all experiments.
- Over-abstracting before concrete use cases exist.
- Standardised analysis for all experiments; we only want some consistency in the final presentation (plots/tables).
Directory layout
- docs/
- environments/
  - chess_probe/
- libs/
  - evals/
  - visualization/
- configs/
  - examples/
- scripts/
Shared libraries
- evals/: engine-eval delta, Elo, deception metrics (precision/recall), cost tracking.
- visualization/: plotting helpers and experiment dashboards.
- probes/: soft-token and residual injection modules with small, clear APIs.
- engines/: thin wrappers for Stockfish/Lc0 or other evaluators.
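For illustration, a libs/engines wrapper for the engine-eval delta could look roughly like the sketch below. It assumes python-chess is installed and a Stockfish binary is on PATH; the function names are illustrative, not the actual libs/ API.

```python
# Sketch only: assumes python-chess + a local Stockfish binary ("stockfish" on PATH).
# Function names are hypothetical, not the real libs/engines interface.
import chess
import chess.engine


def centipawn_eval(board: chess.Board, engine: chess.engine.SimpleEngine,
                   depth: int = 12) -> int:
    """Stockfish evaluation of the position in centipawns, from White's view."""
    info = engine.analyse(board, chess.engine.Limit(depth=depth))
    return info["score"].white().score(mate_score=100_000)


def eval_delta(board: chess.Board, move: chess.Move,
               engine: chess.engine.SimpleEngine) -> int:
    """Engine-eval delta: change in evaluation caused by playing `move`."""
    before = centipawn_eval(board, engine)
    board.push(move)
    after = centipawn_eval(board, engine)
    board.pop()
    return after - before


if __name__ == "__main__":
    with chess.engine.SimpleEngine.popen_uci("stockfish") as engine:
        board = chess.Board()
        print(eval_delta(board, chess.Move.from_uci("e2e4"), engine), "cp")
```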
Runners
- TRL PPO (single-turn for chess-probe) with probe-only optimization.
- Verifier- or agent-tooling adapters for multi-step environments.
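A minimal sketch of what probe-only optimization means here: freeze the base policy and give the optimizer only the probe's parameters. The SoftTokenProbe class and attribute names below are hypothetical and independent of TRL; the actual probes/ modules and PPO wiring may differ.

```python
# Hypothetical soft-token probe; only its parameters are trained.
import torch
from torch import nn


class SoftTokenProbe(nn.Module):
    """A few trainable soft-token embeddings prepended to the input."""

    def __init__(self, num_tokens: int, hidden_size: int):
        super().__init__()
        self.soft_tokens = nn.Parameter(torch.randn(num_tokens, hidden_size) * 0.02)

    def forward(self, input_embeds: torch.Tensor) -> torch.Tensor:
        # input_embeds: (batch, seq, hidden) -> (batch, num_tokens + seq, hidden)
        prefix = self.soft_tokens.unsqueeze(0).expand(input_embeds.shape[0], -1, -1)
        return torch.cat([prefix, input_embeds], dim=1)


def probe_only_optimizer(base_model: nn.Module, probe: nn.Module,
                         lr: float = 1e-4) -> torch.optim.Optimizer:
    # Freeze every base-model parameter so PPO gradients only update the probe.
    for p in base_model.parameters():
        p.requires_grad_(False)
    return torch.optim.AdamW(probe.parameters(), lr=lr)
```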
Results and experiment tracking
- Location: write all outputs under results/<env>/<experiment_name>/<YYYYMMDD_HHMMSS>-<run_id>/
  - Example: results/chess_probe/probe_ablation/20250115_142530-a1b2c3/
- Contents inside a run directory:
  - config.yaml (or .toml): exact configuration used for the run (copied from --config or auto-dumped resolved config)
  - metadata.json: immutable run metadata, including git commit, branch, dirty flag; user, host; Python/CUDA versions; random seeds; full invocation (command, args, PYTHONPATH); environment name; library versions (optionally pip freeze)
  - logs/: captured stdout/stderr/wandb
  - plots/: generated figures for quick inspection
  - artifacts/: model/probe checkpoints and large outputs (consider symlinks or pointer files if we need to store them elsewhere)
  - samples/: qualitative samples (games, traces, prompts/responses)
  - metrics/: summary metrics from the experiment
 
- Script conventions (strongly recommended; see the sketch after this list):
  - --config path/to/config.yaml and --experiment-name <slug>
  - --output-dir results/ (default) so scripts create the full run path automatically
  - --notes "short freeform note" saved in metadata.json
  - On startup: create the run directory, copy the config, write metadata.json
  - During training/eval: append metrics to metrics.jsonl, write plots and artifacts under the run directory
 
- Remote trackers: optionally mirror metrics to W&B or MLflow, but the filesystem record above is the source of truth for reproducibility.
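As a rough illustration of the conventions above, a training or eval script might set up its run directory along these lines (a minimal sketch; the real helpers live in gamescope.libs.run_utils, and the function names here are stand-ins):

```python
# Illustrative sketch of the run-directory conventions; not the actual run_utils code.
import json
import shutil
import subprocess
import sys
import uuid
from datetime import datetime
from pathlib import Path


def create_run_dir(output_dir: str, env: str, experiment_name: str) -> Path:
    stamp = datetime.now().strftime("%Y%m%d_%H%M%S")
    run_id = uuid.uuid4().hex[:6]  # placeholder run id
    run_dir = Path(output_dir) / env / experiment_name / f"{stamp}-{run_id}"
    for sub in ("logs", "plots", "artifacts", "samples", "metrics"):
        (run_dir / sub).mkdir(parents=True, exist_ok=True)
    return run_dir


def write_metadata(run_dir: Path, config_path: str, notes: str = "") -> None:
    # Copy the exact config used and record immutable run metadata.
    shutil.copy(config_path, run_dir / Path(config_path).name)
    commit = subprocess.run(["git", "rev-parse", "HEAD"],
                            capture_output=True, text=True).stdout.strip()
    meta = {"git_commit": commit, "argv": sys.argv, "python": sys.version,
            "notes": notes, "created_at": datetime.now().isoformat()}
    (run_dir / "metadata.json").write_text(json.dumps(meta, indent=2))


def append_metrics(run_dir: Path, step: int, **metrics) -> None:
    # Append one JSON line per logging step to metrics/metrics.jsonl.
    with open(run_dir / "metrics" / "metrics.jsonl", "a") as f:
        f.write(json.dumps({"step": step, **metrics}) + "\n")
```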
Index and discovery
- An append-only index is maintained at results/index/runs_index.jsonl for fast discovery.
- New runs are auto-indexed:
  - On entry via gamescope.libs.run_utils.capture_metadata() or the run_context(...) context manager (writes a start event)
  - On exit via gamescope.libs.run_utils.mark_status() (writes an end event with exit reason)
- Artifact usage can be logged to surface interesting runs:
  - Call gamescope.libs.run_utils.mark_artifact_used(path_to_artifact, reason="...")
  - This writes <run_dir>/artifacts/USED_BY.jsonl and an artifact_used event in the index
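A usage sketch for these hooks (the exact signatures live in gamescope.libs.run_utils; the keyword argument and the artifact path below are assumptions for illustration):

```python
from gamescope.libs import run_utils

# Entering the context writes a `start` event to results/index/runs_index.jsonl;
# exiting writes an `end` event with the exit reason.
with run_utils.run_context(results_root="results"):  # kwargs assumed, check run_utils
    # ... training / evaluation work ...

    # Record that a previously produced artifact was consumed by this run.
    run_utils.mark_artifact_used(
        "results/chess_probe/probe_ablation/20250115_142530-a1b2c3/artifacts/probe.pt",
        reason="probe checkpoint reused for ablation",  # probe.pt path is illustrative
    )
```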
CLI helpers
- List runs (non-junk by default, grouped by script, newest first; includes duration and usage counts):
  uv run python scripts/find_run.py --results-root results
- Backfill the index for existing runs:
  uv run python scripts/reindex_runs.py --results-root results
- Run any experiment from a YAML file; a fresh run directory is created and the full config is recorded:
  uv run python scripts/config_runner.py --config configs/examples/my_eval.yaml
YAML shape:
command: environments/chess_probe/scripts/eval_qwen_bc.py
args:
  model_name_or_path: Qwen/Qwen3-8B-Base
  num_eval_data: 200
  results_dir: results/chess_probe
  save_jsonl: true
The runner injects run_dir for downstream scripts (available as --run_dir if supported, otherwise in the environment as RUN_DIR).
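A downstream script might resolve its run directory like this (a sketch of the behavior described above; the helper name is illustrative):

```python
# Resolve the run directory injected by config_runner: prefer --run_dir, fall back to RUN_DIR.
import argparse
import os
from pathlib import Path


def resolve_run_dir() -> Path:
    parser = argparse.ArgumentParser()
    parser.add_argument("--run_dir", default=None)
    args, _ = parser.parse_known_args()
    run_dir = args.run_dir or os.environ.get("RUN_DIR")
    if run_dir is None:
        raise RuntimeError("No run directory: pass --run_dir or set RUN_DIR")
    return Path(run_dir)
```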
Add a new environment
- Create environments/<env_name>/ with a README.md describing assumptions and dependencies.
- Reuse libs/ components where possible and avoid environment-specific logic in libs/ (the exception: evals that are relevant to multiple experiments belong in libs/).
- Provide example configs under configs/examples/ to run your experiments.
- Add/modify scripts under scripts/ to run your experiment and collect results.
Licensing
- Preserve third-party licenses and headers. See THIRD_PARTY_NOTICES.md.
Setup
- Install uv (Linux/macOS):
  curl -LsSf https://astral.sh/uv/install.sh | sh
- Install dependencies:
  uv sync
  This creates a local virtual environment (e.g., .venv/) and installs the base project dependencies.
- Run scripts using the synced environment:
  uv run python scripts/your_script.py --help
- If you define optional extras for your environment, include them at run time:
  uv run --with '.[your_extra]' python scripts/your_script.py ...
Notes
- uv sync is only needed after changing dependencies or on first setup. For ephemeral runs without a full sync, you may also use uv run, which will resolve and execute in a temporary environment.