Description
What would you like to be added:
Background
The Karmada scheduler watches changes in `ResourceBinding` and `Cluster` objects to determine whether to trigger a scheduling process. When watching bindings, the scheduler uses a change in `metadata.generation` as the trigger to initiate scheduling.
Problem
The `metadata.generation` is incremented whenever any field in `spec` changes. Notably, `spec.clusters` (which records the scheduling result) is part of the spec. After a successful scheduling round, the scheduler updates `spec.clusters` with the selected clusters. This update increments `metadata.generation`, which in turn triggers another, unnecessary scheduling process.
karmada/pkg/scheduler/event_handler.go, lines 173 to 180 at eeff07c:
```go
if oldMeta.GetGeneration() == newMeta.GetGeneration() {
	if oldMeta.GetNamespace() != "" {
		klog.V(4).Infof("Ignore update event of resourceBinding %s/%s as specification no change", oldMeta.GetNamespace(), oldMeta.GetName())
	} else {
		klog.V(4).Infof("Ignore update event of clusterResourceBinding %s as specification no change", oldMeta.GetName())
	}
	return
}
```
This issue is particularly impactful for bindings with `ReplicaSchedulingType=Duplicated`, as they proceed through the full scheduling pipeline. In contrast, bindings with `ReplicaSchedulingType=Divided` are filtered out earlier once the scheduler detects that rescheduling is not needed.
karmada/pkg/scheduler/scheduler.go, lines 402 to 439 at eeff07c:
```go
if placementChanged(*rb.Spec.Placement, appliedPlacementStr, rb.Status.SchedulerObservedAffinityName) {
	// policy placement changed, need schedule
	klog.Infof("Start to schedule ResourceBinding(%s/%s) as placement changed", namespace, name)
	err = s.scheduleResourceBinding(rb)
	metrics.BindingSchedule(string(ReconcileSchedule), utilmetrics.DurationInSeconds(start), err)
	return err
}
if util.IsBindingReplicasChanged(&rb.Spec, rb.Spec.Placement.ReplicaScheduling) {
	// binding replicas changed, need reschedule
	klog.Infof("Reschedule ResourceBinding(%s/%s) as replicas scaled down or scaled up", namespace, name)
	err = s.scheduleResourceBinding(rb)
	metrics.BindingSchedule(string(ScaleSchedule), utilmetrics.DurationInSeconds(start), err)
	return err
}
if util.RescheduleRequired(rb.Spec.RescheduleTriggeredAt, rb.Status.LastScheduledTime) {
	// explicitly triggered reschedule
	klog.Infof("Reschedule ResourceBinding(%s/%s) as explicitly triggered reschedule", namespace, name)
	err = s.scheduleResourceBinding(rb)
	metrics.BindingSchedule(string(ReconcileSchedule), utilmetrics.DurationInSeconds(start), err)
	return err
}
if rb.Spec.Replicas == 0 ||
	rb.Spec.Placement.ReplicaSchedulingType() == policyv1alpha1.ReplicaSchedulingTypeDuplicated {
	// Duplicated resources should always be scheduled. Note: non-workload is considered as duplicated
	// even if scheduling type is divided.
	klog.V(3).Infof("Start to schedule ResourceBinding(%s/%s) as scheduling type is duplicated", namespace, name)
	err = s.scheduleResourceBinding(rb)
	metrics.BindingSchedule(string(ReconcileSchedule), utilmetrics.DurationInSeconds(start), err)
	return err
}
// TODO: reschedule binding on cluster change other than cluster deletion, such as cluster labels changed.
if s.HasTerminatingTargetClusters(&rb.Spec) {
	klog.Infof("Reschedule ResourceBinding(%s/%s) as some scheduled clusters are deleted", namespace, name)
	err = s.scheduleResourceBinding(rb)
	metrics.BindingSchedule(string(ReconcileSchedule), utilmetrics.DurationInSeconds(start), err)
	return err
}
klog.V(3).Infof("Don't need to schedule ResourceBinding(%s/%s)", rb.Namespace, rb.Name)
```
Why is `spec.clusters` in `spec` and not `status`?
The `clusters` field resides in `spec` because the binding controller also treats it as part of the desired state: it needs to reconcile the actual workload distribution against this declared intent.
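For illustration only, the sketch below is a simplified view of the binding spec (the real types live in `pkg/apis/work/v1alpha2`; most fields are omitted here). It shows why writing the scheduling result bumps `metadata.generation`: `Clusters` sits inside the spec like any other desired-state field.

```go
// Simplified, illustrative sketch of the ResourceBinding spec; not the
// authoritative definition (see pkg/apis/work/v1alpha2 in the Karmada repo).
type ResourceBindingSpec struct {
	// ... other desired-state fields (Resource, Placement, Replicas, ...) omitted.

	// Clusters records the scheduling result. Because it lives in spec,
	// every scheduler write to it increments metadata.generation.
	Clusters []TargetCluster
}

// TargetCluster is one entry of the scheduling result.
type TargetCluster struct {
	Name     string // member cluster name
	Replicas int32  // replicas assigned to this cluster
}
```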
Goal
To avoid redundant scheduling, we should refine the filtering logic so that scheduling is triggered only by relevant spec changes (e.g., workload updates or propagation policy changes), rather than by any spec change.
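As a rough sketch of one possible direction (not a settled design), the update filter could compare the old and new specs with the scheduling result masked out, so that a generation bump caused solely by the scheduler writing `spec.clusters` does not re-enqueue the binding. The helper name below is hypothetical; it only relies on the existing `ResourceBindingSpec` type and apimachinery's semantic equality.

```go
import (
	"k8s.io/apimachinery/pkg/api/equality"

	workv1alpha2 "github.com/karmada-io/karmada/pkg/apis/work/v1alpha2"
)

// specChangedIgnoringClusters is a hypothetical predicate: it reports whether
// anything in the spec other than the scheduling result has changed. An update
// event whose only difference is spec.clusters could then be ignored instead
// of triggering another scheduling round.
func specChangedIgnoringClusters(oldSpec, newSpec *workv1alpha2.ResourceBindingSpec) bool {
	oldCopy := oldSpec.DeepCopy()
	newCopy := newSpec.DeepCopy()
	// Mask out the scheduling result before comparing.
	oldCopy.Clusters = nil
	newCopy.Clusters = nil
	return !equality.Semantic.DeepEqual(oldCopy, newCopy)
}
```

Whether such a field-level comparison covers every scheduling-relevant change, and what it costs compared to the cheap generation check, would need to be evaluated as part of this issue.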
This issue tracks:
- Proposing a solution to eliminate redundant scheduling.
- Assessing potential negative impacts on the existing scheduling flow.
- Evaluating the effectiveness, e.g., improvement in scheduling QPS.