RFC 28: add shrink key to resource acquisition response #447

Merged (2 commits), Feb 18, 2025
24 changes: 15 additions & 9 deletions spec_28.rst
@@ -55,9 +55,6 @@ module to the scheduler. The responses to this RPC define the resource
set available for scheduling, and mark targets *up* or *down* as
availability changes.

-Version 1 of this protocol supports a static resource set per Flux instance.
-Resource *grow* and *shrink* are to be handled by a future protocol revision.
-
Design Criteria
***************

@@ -142,6 +139,13 @@ down
for scheduling. The idset only contains targets that are transitioning,
not the full set of unavailable targets.

+shrink
+(string) RFC 22 idset of execution targets that have been removed from
+the instance and therefore should no longer be considered available
+for scheduling. For backwards compatibility, targets in the ``shrink``
+key SHALL also appear in the ``down`` key of the same response. If a
+scheduler supports ``shrink`` then the ``shrink`` key SHALL take precedence.
+
property-add
(object) RFC 20 conforming properties object containing properties that
should be added to the specified execution targets. When present, this
@@ -167,15 +171,17 @@ Example:

{
"up": "3-6",
-"down": "2"
+"down": "2,6",
+"shrink": "6",
"property-add": { "foo": "0-1" },
-"property-remove" { "bar": "3" }
+"property-remove": { "bar": "3" }
}

-If down resources are assigned to a job, the scheduler SHALL NOT raise an
-exception on the job. The execution system takes the active role in handling
-failures in this case. Eventually the scheduler will receive a ``sched.free``
-request for the offline resources.
+If removed (``shrink``) or down resources are assigned to a job, the
+scheduler SHALL NOT raise an exception on the job. The execution system
+takes the active role in handling failures in this case. Eventually the
+scheduler will receive a ``sched.free`` request for the offline or removed
+resources.

.. note::
*down* encompasses both crashed and drained execution targets.
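
As a reading aid, the ``shrink`` semantics added by this change could be sketched roughly as follows. This is an illustration under stated assumptions, not Flux code: ``ToyScheduler``, its attributes, and ``decode_idset`` are hypothetical names, the idset decoder handles only bare ids and ``lo-hi`` ranges rather than the full RFC 22 syntax, and the order in which the keys are applied (``up``, then ``down``, then ``shrink``) is an assumption that the text above does not state.

```python
def decode_idset(text):
    """Decode a simplified idset string such as "2,6" or "3-6" into a set of ints.

    Assumption: only bare ids and "lo-hi" ranges separated by commas are handled.
    """
    ids = set()
    for part in text.split(","):
        part = part.strip()
        if not part:
            continue
        if "-" in part:
            lo, hi = part.split("-", 1)
            ids.update(range(int(lo), int(hi) + 1))
        else:
            ids.add(int(part))
    return ids


class ToyScheduler:
    """Hypothetical scheduler state; not part of any Flux API."""

    def __init__(self, targets, shrink_aware=True):
        self.known = set(targets)  # targets that belong to the instance
        self.up = set()            # targets currently available for scheduling
        self.allocated = set()     # targets assigned to running jobs
        self.shrink_aware = shrink_aware

    def apply_response(self, response):
        """Apply one resource acquisition response (a dict of idset strings)."""
        up = decode_idset(response.get("up", ""))
        down = decode_idset(response.get("down", ""))
        shrink = decode_idset(response.get("shrink", ""))

        # Assumed order: mark targets up, then down.
        self.up |= up & self.known
        self.up -= down

        if self.shrink_aware:
            # shrink takes precedence over down: the targets are removed
            # from the instance rather than merely marked unavailable.
            self.known -= shrink
            self.up -= shrink
        # A scheduler that ignores shrink still sees these targets in
        # "down", so they are at least excluded from scheduling.

        # Allocations are deliberately left alone: no exception is raised
        # on jobs holding down or removed targets. They are released only
        # when free() is called, standing in for a sched.free request.

    def free(self, targets):
        """Release targets, as a later sched.free request would."""
        self.allocated -= set(targets)
```

With a response such as ``{"down": "2,6", "shrink": "6"}``, a shrink-aware instance of this sketch drops target 6 from ``known`` while keeping target 2 as a known but unavailable target. An instance created with ``shrink_aware=False`` keeps both targets in ``known`` and only marks them down, which is the backwards-compatible behavior the ``down`` duplication is meant to preserve.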