
Conversation

@tgross
Member

@tgross tgross commented Jan 13, 2026

Implement driver-side support in the exec and docker drivers for resource.memory_max = -1, which allows a reserve-only memory request without a hard limit for clusters with oversubscription enabled. This was already allowed by the server, but undocumented and unevenly supported by the built-in drivers.

Fixes: #25939
Ref: https://hashicorp.atlassian.net/browse/NMD-911
Docs: hashicorp/web-unified-docs#1629
Ref: hashicorp/nomad-driver-podman#488
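
For reference, the reserve-only request looks like this in a job spec's resources block (a minimal
sketch; the full job specs used for testing are in the comments below):

resources {
  cpu        = 100
  memory     = 100  # soft reservation (MB); becomes the cgroup's memory.low when memory_max is set
  memory_max = -1   # no hard limit; leaves the cgroup's memory.max at "max"
}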

Testing & Reproduction steps

Configure the server to allow oversubscription:

$ nomad operator scheduler set-config -memory-oversubscription=true
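
Alternatively (a sketch of an equivalent setup, not required for these steps), memory
oversubscription can be enabled in the server agent configuration. Note that this block only sets
the scheduler defaults when a cluster is first bootstrapped, so the operator command above is the
simpler path for an existing cluster:

server {
  enabled = true

  default_scheduler_config {
    # equivalent to `nomad operator scheduler set-config -memory-oversubscription=true`
    memory_oversubscription_enabled = true
  }
}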

Run the tests shown below:

The values shown for the resulting cgroups are described in the kernel docs, but in brief (a quick way to read them on a client node is sketched after this list):

  • memory.max: the hard limit at which the task will get OOM'd
  • memory.low: the "best-effort" soft limit
  • memory.high: the throttling limit (not supported today in Nomad)
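
On a client node you can find a task's scope and dump all three values at once. A rough sketch,
assuming cgroup v2 and the default nomad.slice layout used by the exec and raw_exec drivers (the
"httpd" task name is from the exec example below and is only illustrative; docker tasks land under
system.slice instead, as shown in the docker comment):

$ cd $(find /sys/fs/cgroup/nomad.slice -type d -name '*.httpd.scope')
$ grep . memory.max memory.low memory.high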

Contributor Checklist

  • Changelog Entry If this PR changes user-facing behavior, please generate and add a
    changelog entry using the make cl command.
  • Testing Please add tests to cover any new functionality or to demonstrate bug fixes and
    ensure regressions will be caught.
  • Documentation If the change impacts user-facing functionality such as the CLI, API, UI,
    and job configuration, please update the Nomad product documentation, which is stored in the
    web-unified-docs repo. Refer to the web-unified-docs contributor guide for docs guidelines.
    Please also consider whether the change requires notes within the upgrade guide. If you would
    like help with the docs, tag the nomad-docs team in this PR.

Reviewer Checklist

  • Backport Labels Please add the correct backport labels as described by the internal
    backporting document.
  • Commit Type Ensure the correct merge method is selected, which should be "squash and merge"
    in the majority of situations. The main exceptions are long-lived feature branches or merges where
    history should be preserved.
  • Enterprise PRs If this is an enterprise only PR, please add any required changelog entry
    within the public repository.
  • If a change needs to be reverted, we will roll out an update to the code within 7 days.

Changes to Security Controls

Are there any changes to security controls (access controls, encryption, logging) in this pull request? If so, explain.

@tgross
Member Author

tgross commented Jan 14, 2026

docker

docker job spec
job "example" {

  group "group" {

    task "task" {

      driver = "docker"
      user   = "www-data"

      config {
        image   = "busybox:1"
        command = "httpd"
        args    = ["-vv", "-f", "-p", "8001", "-h", "/local"]
      }

      resources {
        cpu    = 100
        memory = 100
      }

    }
  }
}

Inspect the results, showing a hard limit but no soft reservation.

$ docker inspect f2516aa5564c | jq '.[0].HostConfig' | grep Memory
  "Memory": 104857600,
  "MemoryReservation": 0,
  "MemorySwap": 104857600,
  "MemorySwappiness": null,

$ cd /sys/fs/cgroup/system.slice/docker-f2516aa5564ca6c713dbabb21129aca9e2e3a5a708c1c129336ef7e2ac8b673d.scope
$ cat memory.max
104857600
$ cat memory.low
0
$ cat memory.high
max

Update resources to:

resources {
  cpu        = 100
  memory     = 100
  memory_max = 200
}

Inspect the results, showing both a reservation and a hard limit.

$ docker inspect bcc3287ca1cd | jq '.[0].HostConfig' | grep Memory
  "Memory": 209715200,
  "MemoryReservation": 104857600,
  "MemorySwap": 209715200,
  "MemorySwappiness": null,

$ cd /sys/fs/cgroup/system.slice/docker-bcc3287ca1cd9b2cb578921c5d6f249eb26ba766e73603b68a615a20ef16d11e.scope
$ cat memory.max
209715200
$ cat memory.low
104857600
$ cat memory.high
max

Update resources to:

resources {
  cpu        = 100
  memory     = 100
  memory_max = -1
}

Inspect the results, showing a soft-only reservation and no hard limit.

$ docker inspect 54c0dfa2d141 | jq '.[0].HostConfig' | grep Memory
  "Memory": 0,
  "MemoryReservation": 104857600,
  "MemorySwap": 0,
  "MemorySwappiness": null,

$ cd /sys/fs/cgroup/system.slice/docker-54c0dfa2d14170370a0a95d4243d2817c25bd9840e9102324d92070a232fe7cf.scope
$ cat memory.max
max
$ cat memory.low
104857600
$ cat memory.high
max
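
As an extra spot-check (a sketch, not part of the steps above): with cgroup v2 and Docker's
default private cgroup namespace, the same value can be read from inside the container, which
avoids hunting for the docker-*.scope path on the host:

$ docker exec 54c0dfa2d141 cat /sys/fs/cgroup/memory.max
max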

@tgross
Member Author

tgross commented Jan 14, 2026

exec

exec job spec
job "example" {

  group "web" {

    task "httpd" {

      driver = "exec"
      user   = "tim" # use your user, obviously =)

      config {
        command = "busybox"
        args    = ["httpd", "-vv", "-f", "-p", "8001", "-h", "/local"]
      }

      resources {
        cpu    = 100
        memory = 100
      }
    }
  }
}

Inspect the results, showing a hard limit but no soft reservation.

$ cd /sys/fs/cgroup/nomad.slice/share.slice/d529991e-e492-bacd-b566-8820243688b2.httpd.scope
$ cat memory.max
104857600
$ cat memory.low
0
$ cat memory.high
max

Update resources to:

resources {
  cpu        = 100
  memory     = 100
  memory_max = 200
}

Inspect the results, showing both a reservation and a hard limit.

$ cd /sys/fs/cgroup/nomad.slice/share.slice/934ec3ca-5b11-748b-941a-886f5d550640.httpd.scope
$ cat memory.max
209715200
$ cat memory.low
104857600
$ cat memory.high
max

Update resources to:

resources {
  cpu        = 100
  memory     = 100
  memory_max = -1
}

Inspect the results, showing a soft-only reservation and no hard limit.

$ cd /sys/fs/cgroup/nomad.slice/share.slice/5e07e188-42ca-8491-f021-3bf87801e4f3.httpd.scope
$ cat memory.max
max
$ cat memory.low
104857600
$ cat memory.high
max

@tgross
Member Author

tgross commented Jan 14, 2026

raw_exec

This demonstrates that raw_exec's existing (already correct) behavior is left intact.

raw_exec job spec
job "example" {

  group "group" {

    task "task" {

      driver = "raw_exec"

      config {
        command = "/usr/bin/busybox"
        args    = ["httpd", "-vv", "-f", "-p", "8001", "-h", "/srv"]
      }

      resources {
        cpu    = 100
        memory = 100
      }

    }
  }
}

Inspect the results, showing a hard limit but no soft reservation.

$ cd /sys/fs/cgroup/nomad.slice/share.slice/ce08db07-23d1-5e3b-44bd-f1594ead35d4.http.scope
$ cat memory.max
104857600
$ cat memory.low
0
$ cat memory.high
max

Update resources to:

resources {
  cpu        = 100
  memory     = 100
  memory_max = 200
}

Inspect the results, showing both a reservation and a hard limit.

$ cd /sys/fs/cgroup/nomad.slice/share.slice/9b6fbc68-7806-9d0a-320c-48169034e128.http.scope
$ cat memory.max
209715200
$ cat memory.low
104857600
$ cat memory.high
max

Update resources to:

resources {
  cpu        = 100
  memory     = 100
  memory_max = -1
}

Inspect the results, showing a soft-only reservation and no hard limit.

$ cd /sys/fs/cgroup/nomad.slice/share.slice/0d7c91ba-ddad-d777-0e4a-35ae81da2b83.http.scope
$ cat memory.max
max
$ cat memory.low
104857600
$ cat memory.high
max

@tgross tgross force-pushed the NMD911-soft-only-mem-oversubscription branch from 20f390f to fd06ab0 on January 14, 2026 19:43
@tgross tgross added the backport/1.11.x (backport to 1.11.x release line) label on Jan 14, 2026
tgross added a commit to hashicorp/web-unified-docs that referenced this pull request Jan 14, 2026
Soft-limit-only memory oversubscription has been supported in the Nomad control
plane for a long time, but wasn't documented or plumbed through to several task
drivers. We've added support for this. Update the documentation for memory
oversubscription to note this is possible, and just generally clarify the
difference between the hard and soft limit.

Ref: https://hashicorp.atlassian.net/browse/NMD-911
Ref: hashicorp/nomad#27354
@tgross tgross force-pushed the NMD911-soft-only-mem-oversubscription branch from fd06ab0 to 53666dc on January 14, 2026 20:26
@tgross tgross force-pushed the NMD911-soft-only-mem-oversubscription branch from 53666dc to 738fe1e on January 14, 2026 20:34
tgross added a commit to hashicorp/nomad-driver-podman that referenced this pull request Jan 14, 2026
Implement driver-side support in the exec and docker drivers for
`resource.memory_max = -1`, which allows a soft-only memory limit for clusters
with oversubscription enabled. This was already allowed by the server, but
undocumented and unevenly supported by drivers.

Ref: hashicorp/nomad#27354
Ref: https://hashicorp.atlassian.net/browse/NMD-911
Ref: hashicorp/web-unified-docs#1629
@tgross tgross marked this pull request as ready for review January 14, 2026 21:25
@tgross tgross requested a review from a team as a code owner January 14, 2026 21:25
@tgross
Member Author

tgross commented Jan 15, 2026

The failing semgrep check is from existing issue #19833, but that'll get resolved soon.

Implement driver-side support in the `exec` and `docker` drivers for
`resource.memory_max = -1`, which allows a reserve-only memory request without a
hard limit for clusters with oversubscription enabled. This was already allowed
by the server, but undocumented and unevenly supported by the built-in drivers.

Fixes: #25939
Ref: https://hashicorp.atlassian.net/browse/NMD-911
@tgross tgross force-pushed the NMD911-soft-only-mem-oversubscription branch from 738fe1e to bbfb18c on January 16, 2026 02:50
@tgross tgross changed the title from "drivers: support soft-only memory oversubscription" to "drivers: support reserve-only memory oversubscription" on Jan 16, 2026