feat(evaluator): support memory limits for build and verify jobs #194

@fuyu0425

Summary

Add config support for per-job memory limits on evaluator build and verify jobs.

Today we can set CRS container memory via crs_compose.<crs>.mem_limit and trial/runtime memory via resources.memory_per_trial, but evaluator-side build/verify jobs only expose concurrency and CPU controls.

Problem

In configs like experiment-configs/crsbench-except-afc-final-bugfinding/gce-atlantis-multilang-except-afc-final.yaml, the evaluator: block supports:

  • jobs
  • cores_per_job
  • build_jobs
  • build_cores_per_job
  • verify_jobs
  • verify_cores_per_job
  • idle_timeout
  • cpu_tag

There is no memory counterpart for build or verify jobs.

Current code paths also reflect that:

  • EvaluatorConfig has no memory fields and uses extra="forbid"
  • CI job enqueue only writes experiment_name / cpu_tag / scheduler ownership metadata
  • the CI supervisor allocates CPUs and creates cpuset cgroups, but does not assign memory limits to build/verify jobs
  • crs_compose.<crs>.mem_limit only applies to CRS compose services, not evaluator build/verify job envelopes

This makes it hard to protect evaluator hosts from OOM pressure when build jobs or verification jobs are memory-heavy.

Requested Behavior

Support explicit memory controls for evaluator jobs, ideally with split overrides similar to CPU knobs, for example:

evaluator:
  build_jobs: 14
  build_cores_per_job: 16
  build_memory_per_job: "64G"
  verify_jobs: 14
  verify_cores_per_job: 16
  verify_memory_per_job: "32G"

Reasonable fallback semantics would also help, for example:

  • optional shared default like memory_per_job
  • split overrides taking precedence over the shared default
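As an illustration of the fallback semantics above, here is a minimal sketch in plain Python. It is not taken from the codebase: `EvaluatorMemory`, `parse_memory`, and `limit_bytes` are hypothetical names, the field names mirror the ones proposed in this issue, and treating "G" as a binary unit (GiB) is an assumption. The real `EvaluatorConfig` would presumably carry these fields directly.

```python
from dataclasses import dataclass
from typing import Optional

# Assumption: suffixes are binary units (1G = 1024**3 bytes).
_UNITS = {"K": 1024, "M": 1024**2, "G": 1024**3, "T": 1024**4}


def parse_memory(value: str) -> int:
    """Convert a size string such as "64G" into a byte count."""
    text = value.strip().upper()
    if not text:
        raise ValueError("empty memory value")
    if text[-1] in _UNITS:
        number, factor = text[:-1], _UNITS[text[-1]]
    else:
        number, factor = text, 1  # bare number means bytes
    if not number.isdigit():
        raise ValueError(f"invalid memory value: {value!r}")
    return int(number) * factor


@dataclass(frozen=True)
class EvaluatorMemory:
    """Hypothetical holder for the proposed evaluator memory knobs."""

    memory_per_job: Optional[str] = None         # shared default
    build_memory_per_job: Optional[str] = None   # build override
    verify_memory_per_job: Optional[str] = None  # verify override

    def limit_bytes(self, job_kind: str) -> Optional[int]:
        """Return the limit for "build" or "verify", or None if unset."""
        if job_kind not in ("build", "verify"):
            raise ValueError(f"unknown job kind: {job_kind!r}")
        override = getattr(self, f"{job_kind}_memory_per_job")
        # The split override wins; otherwise fall back to the shared default.
        chosen = override if override is not None else self.memory_per_job
        return None if chosen is None else parse_memory(chosen)
```

With `memory_per_job="16G"` and `build_memory_per_job="64G"`, build jobs would resolve to 64G and verify jobs would fall back to 16G. Leaving all three unset yields `None`, which keeps today's behavior of no memory limit.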

Expected Scope

  • Schema support for evaluator memory-per-job fields
  • Configless/cloud metadata transport if needed
  • Enqueue and supervisor plumbing so build/verify jobs receive memory metadata
  • Cgroup enforcement for evaluator build/verify jobs, not just cpuset assignment
  • Docs and examples for the new knobs
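For the cgroup enforcement item, a rough sketch of what the supervisor side might do, assuming cgroup v2 and that each job already gets its own cgroup directory alongside the cpuset setup. `apply_memory_limit` is a hypothetical helper, not an existing function in the supervisor. `memory.max` is the standard cgroup v2 interface file, and it only exists in a job cgroup when the memory controller is enabled in the parent's `cgroup.subtree_control`.

```python
from pathlib import Path
from typing import Optional


def apply_memory_limit(cgroup_dir: Path, limit_bytes: Optional[int]) -> str:
    """Write a cgroup v2 memory.max value for one job cgroup.

    limit_bytes=None writes "max", which means no limit.
    Returns the value that was written.
    """
    if limit_bytes is not None and limit_bytes <= 0:
        raise ValueError("memory limit must be a positive byte count")
    value = "max" if limit_bytes is None else str(limit_bytes)
    # On a real host this needs write access to the job's cgroup directory.
    (cgroup_dir / "memory.max").write_text(value + "\n")
    return value
```

Keeping the unset case as an explicit "max" write means jobs without configured memory behave as they do today, while configured jobs get a hard cap enforced by the kernel.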

Notes

This is a tracking request only, not an immediate implementation request.
