Summary
Add config support for per-job memory limits on evaluator build and verify jobs.
Today we can set CRS container memory via crs_compose.<crs>.mem_limit and trial/runtime memory via resources.memory_per_trial, but evaluator-side build/verify jobs only expose concurrency and CPU controls.
Problem
In configs like experiment-configs/crsbench-except-afc-final-bugfinding/gce-atlantis-multilang-except-afc-final.yaml, the evaluator: block supports:
- jobs
- cores_per_job
- build_jobs
- build_cores_per_job
- verify_jobs
- verify_cores_per_job
- idle_timeout
- cpu_tag
There is no memory counterpart for build or verify jobs.
Current code paths also reflect that:
- EvaluatorConfig has no memory fields and uses extra="forbid"
- CI job enqueue only writes experiment_name / cpu_tag / scheduler ownership metadata
- the CI supervisor allocates CPUs and creates cpuset cgroups, but does not assign memory limits to build/verify jobs
- crs_compose.<crs>.mem_limit only applies to CRS compose services, not evaluator build/verify job envelopes
This makes it hard to protect evaluator hosts from OOM pressure when build jobs or verification jobs are memory-heavy.
Requested Behavior
Support explicit memory controls for evaluator jobs, ideally with split overrides similar to CPU knobs, for example:
evaluator:
  build_jobs: 14
  build_cores_per_job: 16
  build_memory_per_job: "64G"
  verify_jobs: 14
  verify_cores_per_job: 16
  verify_memory_per_job: "32G"
Reasonable fallback semantics would also help, for example:
- optional shared default like memory_per_job
- split overrides taking precedence over the shared default
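One way the fallback could resolve is sketched below. This is only an illustration: the three field names come from this request, while the EvaluatorMemory dataclass and the resolve_memory helper are hypothetical and do not exist in the current code.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class EvaluatorMemory:
    """Hypothetical subset of the evaluator: block, memory knobs only."""
    memory_per_job: Optional[str] = None         # shared default
    build_memory_per_job: Optional[str] = None   # build-only override
    verify_memory_per_job: Optional[str] = None  # verify-only override


def resolve_memory(cfg: EvaluatorMemory, job_kind: str) -> Optional[str]:
    """Return the limit string for a job kind, or None when no limit is set."""
    if job_kind == "build":
        override = cfg.build_memory_per_job
    elif job_kind == "verify":
        override = cfg.verify_memory_per_job
    else:
        raise ValueError(f"unknown job kind: {job_kind!r}")
    # A split override wins; otherwise fall back to the shared default.
    return override if override is not None else cfg.memory_per_job
```

With memory_per_job: "32G" and build_memory_per_job: "64G", build jobs would resolve to "64G" and verify jobs to "32G". With none of the three set, the result is None, which keeps today's unlimited behavior.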
Expected Scope
- Schema support for evaluator memory-per-job fields
- Configless/cloud metadata transport if needed
- Enqueue and supervisor plumbing so build/verify jobs receive memory metadata
- Cgroup enforcement for evaluator build/verify jobs, not just cpuset assignment
- Docs and examples for the new knobs
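For the cgroup enforcement item, a minimal sketch of the supervisor side might look like the following. It assumes cgroup v2, where a limit is set by writing a byte count (or the literal max) to the memory.max file of the job's cgroup. The parse_memory and write_memory_max helpers are hypothetical, the cgroup directory layout is not taken from the current supervisor, and reading "64G" as binary units (GiB) is an assumption that the schema would need to pin down.

```python
import re
from pathlib import Path
from typing import Optional

# Assumption: suffixes are binary multiples, so "64G" means 64 * 1024**3 bytes.
_UNITS = {"": 1, "K": 1024, "M": 1024**2, "G": 1024**3, "T": 1024**4}


def parse_memory(value: str) -> int:
    """Convert a string such as "64G" or "512M" into a byte count."""
    match = re.fullmatch(r"(\d+)([KMGT]?)", value.strip().upper())
    if match is None:
        raise ValueError(f"unsupported memory value: {value!r}")
    number, unit = match.groups()
    return int(number) * _UNITS[unit]


def write_memory_max(cgroup_dir: Path, limit: Optional[str]) -> None:
    """Write the job's limit to memory.max inside an existing cgroup v2 directory.

    A limit of None writes "max", which cgroup v2 treats as unlimited.
    """
    content = "max" if limit is None else str(parse_memory(limit))
    (cgroup_dir / "memory.max").write_text(content + "\n")
```

The supervisor would call write_memory_max on the per-job cgroup it already creates for cpuset assignment, which also requires the memory controller to be enabled for that part of the hierarchy.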
Notes
This is a tracking request only, not an immediate implementation request.