Commit b556f06

Vacant2333 authored and Monokaix committed
proposal-for-scheduling-suspension
Signed-off-by: Vacant2333 <[email protected]>
1 parent 93a68eb commit b556f06

1 file changed, +169 -0 lines changed
  • docs/proposals/scheduling-suspension

@@ -0,0 +1,169 @@
---
title: Support for resource scheduling suspend and resume capabilities
authors:
- "@Vacant2333"
reviewers:
- TBD
approvers:
- TBD

creation-date: 2024-10-13

---

# Support for resource scheduling suspend and resume capabilities

## Summary

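<!--
Karmada already lets users pause and resume the propagation of resources through Suspension, which makes Karmada more flexible.

However, I think the current Suspension support does not go far enough. Resource distribution in Karmada can be understood as two stages: scheduling and propagation.
Users can currently pause and resume the propagation stage fairly flexibly, but they cannot intervene in the scheduling stage.

For Pod/Job resources, Pod.Spec.SchedulingGates/Job.Spec.Suspend provide the ability to pause scheduling. Users can easily pause workload scheduling with these fields,
then resume workloads one by one from a custom controller that implements priority/queue capabilities. This is not possible in Karmada: the Karmada Scheduler currently has no queue,
and all workloads are scheduled first come, first served. We want to implement queuing outside Karmada by suspending scheduling, so that a cluster's limited resources go to high-priority workloads and tasks first.

This document provides a way to suspend resource scheduling, building on Suspension.
-->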

**Karmada** currently allows users to pause and resume resource propagation through **Suspension**, which makes Karmada more flexible.

However, I believe the current Suspension support does not go far enough. Resource distribution in Karmada can be broken into two stages: scheduling and propagation. Users can pause and resume the propagation stage flexibly, but they cannot intervene in the scheduling stage.

For **Pod/Job** resources, `Pod.Spec.SchedulingGates` and `Job.Spec.Suspend` provide the ability to pause scheduling. Users can pause workload scheduling through these fields and, with a custom controller that implements priority/queue capabilities, resume workloads one by one. This is not possible in Karmada today: the **Karmada Scheduler** has no queuing capability, and all workloads are scheduled on a first-come, first-served basis. We hope to implement queuing outside of Karmada by pausing scheduling, so that a cluster's limited resources go to high-priority workloads and tasks first.

This proposal describes a way to pause resource scheduling, building on Suspension.

## Motivation

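<!--
Karmada currently offers no queuing or priority capabilities for allocating limited resources to high-priority tasks/workloads. Users with this need can only create
PropagationPolicy objects manually, in order, to control what is scheduled first. That is fine with few workloads and tasks, but when a federated system has a large number of Job-type workloads, doing it by hand is clearly inefficient and imprecise.

Different users also have different requirements, such as multi-tenant/multi-queue/priority-queue scheduling. Once ResourceBinding supports suspending scheduling, users can implement their own Controller and
design their own queuing system.
-->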

**Karmada** currently lacks queuing and prioritization capabilities for allocating limited resources to high-priority tasks/workloads. Users who need this can only create `PropagationPolicy` objects manually, in order, to control which workloads are scheduled first. That works when there are few workloads and tasks, but it is inefficient and imprecise when a federated system runs a large number of Job-type workloads.

Different users also have different requirements, such as multi-tenant, multi-queue, and priority-queue scheduling. Once `ResourceBinding` supports suspending scheduling, users can implement their own Controllers and design their own queuing systems.

### Goals

- Provide the capability to **pause resource scheduling**
- Provide the capability to **resume resource scheduling**

## Proposal

### User Stories

#### Story 1

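<!--
As a manager, I hope Karmada lets us implement queuing externally in some way, so that we can limit each department's capacity and priority when scheduling their Jobs.
When resources are insufficient, Karmada should schedule high-priority tasks first, which helps the cluster make better use of its limited resources.
-->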

As a manager, I hope that **Karmada** lets us implement queuing externally, so that we can limit the capacity and set the priority of each department's Jobs when they are scheduled. When resources are insufficient, Karmada should schedule higher-priority tasks first. This will help the cluster make better use of its limited resources.

#### Story 2

<!--
As a user, I want scheduling not to start right after I create a workload and its PropagationPolicy, until I lift the scheduling suspension.
-->

As a user, I want scheduling not to start immediately after I create a workload and its corresponding **PropagationPolicy**; it should start only once I lift the scheduling suspension.

## Design Details

<!--
We extend PropagationPolicy/ClusterPropagationPolicy.Spec.Suspension by introducing a SuspendScheduling field in that structure to indicate that scheduling is suspended.
The field is also passed on to the ResourceBinding/ClusterResourceBinding resources, and the Karmada Scheduler component uses SuspendScheduling to decide
whether to start scheduling the corresponding ResourceBinding/ClusterResourceBinding.
-->

This proposal extends `PropagationPolicy/ClusterPropagationPolicy.Spec.Suspension` with a `SuspendScheduling` field that indicates scheduling is suspended. The field is also passed on to the `ResourceBinding/ClusterResourceBinding` resources. The **Karmada Scheduler** component uses `SuspendScheduling` to decide whether to start scheduling the corresponding `ResourceBinding/ClusterResourceBinding`.

### API Change

```go
// Suspension defines the policy for suspending different aspects of propagation.
type Suspension struct {
	...

	// SuspendScheduling controls whether scheduling should be suspended.
	// +optional
	SuspendScheduling *bool `json:"suspendScheduling,omitempty"`
}
```

### User usage example

#### Story 1 example

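<!--
Users can implement the queuing and multi-tenancy capabilities they want with a custom Webhook + Controller. The Webhook suspends the ResourceBinding when it is created,
handing it over to the custom queue Controller, so that users' Workloads/Jobs can be scheduled one by one in order, for example by priority.
-->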

Users can implement the queuing and multi-tenancy capabilities they need with a custom **Webhook + Controller**. The Webhook suspends each `ResourceBinding` when it is created and thereby hands it over to the custom queue Controller, so that users' `Workload/Job` resources are scheduled one by one in order, for example by priority.

```yaml
apiVersion: admissionregistration.k8s.io/v1
kind: MutatingWebhookConfiguration
metadata:
  name: volcano-admission-service-resourcebindings-mutate
webhooks:
  - name: mutateresourcebindings.volcano.sh
    admissionReviewVersions:
      - v1
    clientConfig:
      url: https://volcano-global-webhook.volcano-global.svc:443/resourcebindings/mutate
    failurePolicy: Fail
    matchPolicy: Equivalent
    reinvocationPolicy: Never
    rules:
      - operations: ["CREATE"]
        apiGroups: ["work.karmada.io"]
        apiVersions: ["v1alpha2"]
        resources: ["resourcebindings"]
        scope: "Namespaced"
    sideEffects: None
    timeoutSeconds: 3
```
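
The webhook registered above has to answer each `CREATE` request with a patch that suspends the new `ResourceBinding`. The sketch below shows one way the patch itself could be built, as an RFC 6902 JSON Patch. It is an illustration under assumptions: the function name `buildSuspendPatch` is hypothetical, and the path `/spec/suspension/suspendScheduling` assumes the field lands on `ResourceBinding` under the same JSON names as in the policy. HTTP serving and the `AdmissionReview` envelope are left out.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// patchOp is one RFC 6902 JSON Patch operation.
type patchOp struct {
	Op    string      `json:"op"`
	Path  string      `json:"path"`
	Value interface{} `json:"value"`
}

// buildSuspendPatch (hypothetical helper) returns a JSON Patch that sets
// spec.suspension.suspendScheduling=true on the given ResourceBinding JSON.
// ASSUMPTION: the field path on ResourceBinding mirrors the policy field.
func buildSuspendPatch(raw []byte) ([]byte, error) {
	var obj struct {
		Spec struct {
			Suspension map[string]interface{} `json:"suspension"`
		} `json:"spec"`
	}
	if err := json.Unmarshal(raw, &obj); err != nil {
		return nil, fmt.Errorf("decode ResourceBinding: %w", err)
	}

	var ops []patchOp
	if obj.Spec.Suspension == nil {
		// "add" on a nested path fails when the parent object is missing,
		// so create the whole suspension object in that case.
		ops = append(ops, patchOp{
			Op:    "add",
			Path:  "/spec/suspension",
			Value: map[string]bool{"suspendScheduling": true},
		})
	} else {
		// The parent exists: set only the one field and keep its siblings.
		ops = append(ops, patchOp{
			Op:    "add",
			Path:  "/spec/suspension/suspendScheduling",
			Value: true,
		})
	}
	return json.Marshal(ops)
}

func main() {
	inputs := []string{
		`{"spec":{}}`,                // no suspension block yet
		`{"spec":{"suspension":{}}}`, // suspension block already present
	}
	for _, in := range inputs {
		patch, err := buildSuspendPatch([]byte(in))
		if err != nil {
			panic(err)
		}
		fmt.Println(string(patch))
	}
}
```

In a real webhook, these bytes would go into the `patch` field of the `AdmissionReview` response with `patchType: JSONPatch`. The custom queue Controller would later clear the flag on whichever binding it decides to release next.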

#### Story 2 example

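<!--
The user sets scheduling of the Deployment(default/nginx) resource to suspended in the PropagationPolicy, and Karmada will not start scheduling that workload.
No corresponding Work resources will exist either; only the ResourceBinding resource awaiting scheduling is created.
-->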

When a user sets scheduling of the `Deployment(default/nginx)` resource to suspended in the `PropagationPolicy`, **Karmada** does not start scheduling that workload. No corresponding `Work` resources exist; only the `ResourceBinding` awaiting scheduling is created.

```yaml
apiVersion: policy.karmada.io/v1alpha1
kind: PropagationPolicy
metadata:
  name: nginx-propagation
spec:
  resourceSelectors:
    - apiVersion: apps/v1
      kind: Deployment
      name: nginx
  suspension:
    suspendScheduling: true
```
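
To resume scheduling later, the user sets the same field back to `false`. As a minimal sketch, the body of that update can be written as a JSON merge patch (RFC 7386). The helper name `suspendSchedulingPatch` is hypothetical, and the sketch assumes the proposed `suspendScheduling` field has been implemented as shown in the policy above.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// suspendSchedulingPatch (hypothetical helper) builds a JSON merge patch
// that sets spec.suspension.suspendScheduling on a PropagationPolicy.
// A merge patch touches only the keys it names, so any other fields under
// spec.suspension are left as they are.
func suspendSchedulingPatch(suspend bool) ([]byte, error) {
	patch := map[string]interface{}{
		"spec": map[string]interface{}{
			"suspension": map[string]interface{}{
				"suspendScheduling": suspend,
			},
		},
	}
	return json.Marshal(patch)
}

func main() {
	// Passing false lifts the suspension so the scheduler can pick the binding up.
	resume, err := suspendSchedulingPatch(false)
	if err != nil {
		panic(err)
	}
	fmt.Println(string(resume))
}
```

The printed document could then be applied to `nginx-propagation` with a merge-type patch, for example `kubectl patch propagationpolicy nginx-propagation --type merge -p '<patch>'` against the Karmada API server.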

### Test Plan

#### UT

Add unit tests to cover the new functions.

#### E2E

- Test the resource scheduling suspension capability.
- Test the resource scheduling resume capability.
