secure-agent-runner is standalone. Integrations should translate their native
tool, gateway, verifier, or agent concepts into RunJobRequest, then consume
RunJobResult. The Rust library, CLI, and HTTP service must not require any
integration in order to work.
An MCP server can expose sandbox/run as a thin tool wrapper around the runner
HTTP API:
MCP client
-> tools/call sandbox/run
-> adapter builds RunJobRequest
-> POST /v1/jobs
-> GET /v1/jobs/{job_id}
-> tool result contains RunJobResult or queued job id
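
The submit-then-poll flow above could be sketched as follows. This is a minimal sketch under assumptions, not the runner's client: the `transport` callable, the `job_id` and `status` response fields, and the `queued`/`running` status names are hypothetical and should be checked against the actual runner schema. Only `POST /v1/jobs` and `GET /v1/jobs/{job_id}` come from the flow itself.

```python
import time
from typing import Any, Callable, Dict, Optional

# Hypothetical transport: (method, path, json_body_or_None) -> decoded JSON dict.
# A real adapter would wrap its HTTP client of choice behind this signature.
Transport = Callable[[str, str, Optional[Dict[str, Any]]], Dict[str, Any]]

# Assumption: these status names mean "not finished yet".
PENDING_STATUSES = {"queued", "running"}


def run_job(
    transport: Transport,
    request: Dict[str, Any],
    poll_interval_s: float = 0.5,
    max_polls: int = 60,
) -> Dict[str, Any]:
    """Submit a RunJobRequest, then poll until the job leaves a pending state."""
    result = transport("POST", "/v1/jobs", request)
    # Assumption: the POST response carries the job id under "job_id".
    job_id = result.get("job_id")
    for _ in range(max_polls):
        if job_id is None or result.get("status") not in PENDING_STATUSES:
            break
        time.sleep(poll_interval_s)
        result = transport("GET", f"/v1/jobs/{job_id}", None)
    # Either a finished RunJobResult or the still-pending job record,
    # which the tool can return as a "queued job id" response.
    return result
```

Keeping the transport injectable lets the MCP server reuse its own HTTP client and makes the adapter easy to exercise without a running service.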
Recommended tool arguments:
- workspace_path: local workspace root approved by the MCP host;
- argv: non-empty argv array;
- cwd: relative working directory;
- env: explicit env additions;
- policy: command/env/artifact/backend policy chosen by the MCP server;
- trace: optional caller metadata.
The MCP adapter should reject shell strings before constructing the request unless the operator has explicitly enabled shell execution. The runner still performs its own validation and policy enforcement after the adapter check.
See ../examples/mcp-sandbox-run.json for a JSON-RPC-shaped example.
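
The adapter-side check could look roughly like the sketch below. It is illustrative only: the request is modeled as a plain dict keyed by the tool argument names above, which may not match the real RunJobRequest shape, and `ToolArgumentError`, `SHELL_BINARIES`, and the `allow_shell` switch are hypothetical names. Treating `sh -c ...` style invocations as shell execution is also an assumption. Because the policy is chosen by the MCP server, the sketch takes it as a server-side parameter rather than from the caller's arguments.

```python
from typing import Any, Dict


class ToolArgumentError(ValueError):
    """Raised when sandbox/run arguments fail the adapter-side check."""


# Assumption: interpreters the adapter treats as shell execution.
SHELL_BINARIES = {"sh", "bash", "zsh", "dash"}


def build_run_job_request(
    args: Dict[str, Any],
    policy: Dict[str, Any],
    allow_shell: bool = False,
) -> Dict[str, Any]:
    """Validate tool arguments and build a request dict for POST /v1/jobs."""
    argv = args.get("argv")
    if isinstance(argv, str):
        raise ToolArgumentError("argv must be an array, not a shell string")
    if (
        not isinstance(argv, list)
        or not argv
        or not all(isinstance(a, str) for a in argv)
    ):
        raise ToolArgumentError("argv must be a non-empty array of strings")

    # Reject shell interpreters unless the operator enabled them explicitly.
    program = argv[0].rsplit("/", 1)[-1]
    if program in SHELL_BINARIES and not allow_shell:
        raise ToolArgumentError("shell execution is not enabled")

    cwd = args.get("cwd", ".")
    if not isinstance(cwd, str) or cwd.startswith("/") or ".." in cwd.split("/"):
        raise ToolArgumentError("cwd must be a relative path inside the workspace")

    return {
        "workspace_path": args["workspace_path"],
        "argv": argv,
        "cwd": cwd,
        "env": dict(args.get("env", {})),
        "policy": policy,  # chosen by the MCP server, not by the caller
        "trace": dict(args.get("trace", {})),
    }
```

This check is a convenience for early, readable errors. It does not replace the runner's own validation and policy enforcement.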
A gateway should own ingress policy before a request reaches the runner:
- authentication and caller identity;
- route and method authorization;
- tool allow, deny, or approval-required decisions;
- protected path checks;
- request body limits and timeouts;
- budget or rate-limit decisions;
- audit and trace correlation.
The runner should still own execution policy:
- exact command allowlist;
- shell denial;
- workspace snapshot boundaries;
- child environment allowlist;
- output and artifact caps;
- timeout;
- backend selection and backend-specific constraints.
A gateway handoff can be as simple as forwarding approved POST /v1/jobs
requests to agent-runner:
[[routes]]
id = "agent-runner-jobs"
kind = "http"
match.path_prefix = "/sandbox/jobs"
match.methods = ["POST"]
upstreams = ["http://127.0.0.1:3000"]
strip_prefix = "/sandbox"
timeout_ms = 30000
body_limit_bytes = 1048576

Gateway decisions should be copied into trace metadata, for example:
{
"trace_id": "tr_gateway_001",
"run_id": "run_001",
"gateway_policy_action": "allow",
"gateway_policy_id": "agent-tools-v1",
"gateway_request_id": "gw_req_001"
}

That metadata is recorded with the request and result but does not alter the core runner schema.
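
Copying a gateway decision into trace metadata could be sketched as below. The helper name `with_gateway_trace` and the shape of the `decision` dict are hypothetical, and the sketch assumes the request carries its metadata under a top-level `trace` key. The three `gateway_*` keys are the ones shown in the example.

```python
from typing import Any, Dict

# Keys taken from the trace example; extend only with other trace fields.
GATEWAY_TRACE_KEYS = (
    "gateway_policy_action",
    "gateway_policy_id",
    "gateway_request_id",
)


def with_gateway_trace(
    request: Dict[str, Any], decision: Dict[str, Any]
) -> Dict[str, Any]:
    """Return a copy of the request with gateway decision fields in its trace."""
    trace = dict(request.get("trace", {}))
    for key in GATEWAY_TRACE_KEYS:
        if key in decision:
            trace[key] = decision[key]
    # Only the trace object changes; every other request field is untouched.
    return {**request, "trace": trace}
```

Returning a copy keeps the caller's original request intact, which helps when the gateway also logs the request as received.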
cl-agent already has an injectable CommandRunner protocol. A runner-backed
adapter should implement that protocol without changing cl-agent's verifier
model:
PythonRepoVerifier
-> CommandRunner.run(command, cwd, timeout, env)
-> build RunJobRequest
-> call agent-runner HTTP API or CLI
-> map RunJobResult to CommandResult
Mapping:
- RunJobResult.exit_code -> CommandResult.returncode;
- RunJobResult.stdout.text -> CommandResult.stdout;
- RunJobResult.stderr.text -> CommandResult.stderr;
- status = "timed_out" -> raise subprocess.TimeoutExpired;
- status = "policy_denied" or "backend_unavailable" -> return a non-zero command result with the stable error JSON in stderr, unless the caller wants those states to fail the verifier run immediately.
See ../examples/cl_agent_runner_stub.py for a minimal HTTP adapter sketch.
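
The mapping can be expressed as a small pure function. This is a sketch under assumptions rather than the stub referenced above: `CommandResult` here is a stand-in dataclass, not cl-agent's real type, the result is modeled as a decoded JSON dict, and the `error` key holding the stable error JSON is hypothetical. The exit code of 1 for denied or unavailable jobs is likewise just one possible non-zero choice.

```python
import json
import subprocess
from dataclasses import dataclass
from typing import Any, Dict, Sequence


@dataclass
class CommandResult:
    """Stand-in for cl-agent's result type; the real class may differ."""

    returncode: int
    stdout: str
    stderr: str


def to_command_result(
    result: Dict[str, Any], command: Sequence[str], timeout: float
) -> CommandResult:
    """Map a decoded RunJobResult dict onto a CommandResult."""
    status = result.get("status")
    stdout = (result.get("stdout") or {}).get("text", "")
    stderr = (result.get("stderr") or {}).get("text", "")

    if status == "timed_out":
        # Mirror what a local subprocess-based runner would raise.
        raise subprocess.TimeoutExpired(
            command, timeout, output=stdout, stderr=stderr
        )

    if status in ("policy_denied", "backend_unavailable"):
        # Assumption: the stable error JSON lives under "error".
        error = result.get("error", {"status": status})
        return CommandResult(returncode=1, stdout=stdout, stderr=json.dumps(error))

    return CommandResult(
        returncode=result["exit_code"], stdout=stdout, stderr=stderr
    )
```

A caller that prefers denied or unavailable jobs to abort the verifier run can raise its own exception in that branch instead of returning a result.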
Ferrum-specific data belongs in generic trace metadata, not in the public
runner schema:
{
"trace": {
"trace_id": "tr_ferrum_001",
"run_id": "run_001",
"session_id": "sess_001",
"task_id": "task_001",
"episode_id": "ep_001",
"agent_id": "forge-agent",
"workspace_id": "repo_ferrum_demo",
"ferrum_stack_layer": "tooling"
}
}

Ferrum, gateway, and cl-agent repositories can keep their own adapters near their native code. This crate should not depend on Ferrum types, gateway crates, MCP server crates, or cl-agent Python modules.