Reduce redundant scheduling #6756

@zhzhuang-zju

Description

What would you like to be added:

Background

The Karmada scheduler watches changes in ResourceBinding and Cluster objects to determine whether to trigger a scheduling process. When watching bindings, the scheduler uses a change in metadata.generation as the trigger to initiate scheduling.

Problem

The metadata.generation is incremented whenever any field in spec changes. Notably, spec.clusters — which records the scheduling result — is part of the spec. After a successful scheduling round, the scheduler updates spec.clusters with the selected clusters. This update increments metadata.generation, which in turn triggers another, unnecessary scheduling pass. The scheduler's update-event filter relies solely on this generation comparison:

if oldMeta.GetGeneration() == newMeta.GetGeneration() {
	if oldMeta.GetNamespace() != "" {
		klog.V(4).Infof("Ignore update event of resourceBinding %s/%s as specification no change", oldMeta.GetNamespace(), oldMeta.GetName())
	} else {
		klog.V(4).Infof("Ignore update event of clusterResourceBinding %s as specification no change", oldMeta.GetName())
	}
	return
}

This issue is particularly impactful for bindings with ReplicaSchedulingType=Duplicated, because they always proceed through the full scheduling pipeline. In contrast, bindings with ReplicaSchedulingType=Divided are filtered out earlier once the scheduler determines that rescheduling is not needed, as the scheduling entry point below shows:

if placementChanged(*rb.Spec.Placement, appliedPlacementStr, rb.Status.SchedulerObservedAffinityName) {
	// policy placement changed, need schedule
	klog.Infof("Start to schedule ResourceBinding(%s/%s) as placement changed", namespace, name)
	err = s.scheduleResourceBinding(rb)
	metrics.BindingSchedule(string(ReconcileSchedule), utilmetrics.DurationInSeconds(start), err)
	return err
}
if util.IsBindingReplicasChanged(&rb.Spec, rb.Spec.Placement.ReplicaScheduling) {
	// binding replicas changed, need reschedule
	klog.Infof("Reschedule ResourceBinding(%s/%s) as replicas scaled down or scaled up", namespace, name)
	err = s.scheduleResourceBinding(rb)
	metrics.BindingSchedule(string(ScaleSchedule), utilmetrics.DurationInSeconds(start), err)
	return err
}
if util.RescheduleRequired(rb.Spec.RescheduleTriggeredAt, rb.Status.LastScheduledTime) {
	// explicitly triggered reschedule
	klog.Infof("Reschedule ResourceBinding(%s/%s) as explicitly triggered reschedule", namespace, name)
	err = s.scheduleResourceBinding(rb)
	metrics.BindingSchedule(string(ReconcileSchedule), utilmetrics.DurationInSeconds(start), err)
	return err
}
if rb.Spec.Replicas == 0 ||
	rb.Spec.Placement.ReplicaSchedulingType() == policyv1alpha1.ReplicaSchedulingTypeDuplicated {
	// Duplicated resources should always be scheduled. Note: non-workload is considered as duplicated
	// even if scheduling type is divided.
	klog.V(3).Infof("Start to schedule ResourceBinding(%s/%s) as scheduling type is duplicated", namespace, name)
	err = s.scheduleResourceBinding(rb)
	metrics.BindingSchedule(string(ReconcileSchedule), utilmetrics.DurationInSeconds(start), err)
	return err
}
// TODO: reschedule binding on cluster change other than cluster deletion, such as cluster labels changed.
if s.HasTerminatingTargetClusters(&rb.Spec) {
	klog.Infof("Reschedule ResourceBinding(%s/%s) as some scheduled clusters are deleted", namespace, name)
	err = s.scheduleResourceBinding(rb)
	metrics.BindingSchedule(string(ReconcileSchedule), utilmetrics.DurationInSeconds(start), err)
	return err
}
klog.V(3).Infof("Don't need to schedule ResourceBinding(%s/%s)", rb.Namespace, rb.Name)

Why is spec.clusters in spec and not status?

The clusters field resides in spec because the binding controller also treats it as part of the desired state — it needs to reconcile the actual workload distribution against this declared intent.
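
For context, here is a minimal, abridged sketch of the relevant work/v1alpha2 types (fields trimmed and comments added for this discussion; see the upstream API for the authoritative definitions):

type ResourceBindingSpec struct {
	// Resource identifies the workload that this binding propagates (abridged).
	Resource ObjectReference `json:"resource"`
	// Replicas is the desired replica count of the referenced workload.
	Replicas int32 `json:"replicas,omitempty"`
	// Placement carries the scheduling constraints derived from the policy.
	Placement *policyv1alpha1.Placement `json:"placement,omitempty"`
	// Clusters holds the scheduling result written back by the scheduler; the
	// binding controller consumes it as desired state when dispatching work,
	// which is why it lives in spec rather than in status.
	Clusters []TargetCluster `json:"clusters,omitempty"`
}

// TargetCluster records one selected cluster and the replicas assigned to it.
type TargetCluster struct {
	Name     string `json:"name"`
	Replicas int32  `json:"replicas,omitempty"`
}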

Goal

To avoid redundant scheduling, we should refine the filtering logic to trigger scheduling only when scheduling-relevant spec fields change (e.g., the referenced workload or the propagation policy placement), rather than on any spec change. A possible direction is sketched below.
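
A minimal sketch of one such direction, assuming the generation check stays and a spec comparison is layered on top of it; the helper name and the exact set of ignored fields are illustrative, not a final design:

import (
	"reflect"

	workv1alpha2 "github.com/karmada-io/karmada/pkg/apis/work/v1alpha2"
)

// schedulingRelevantSpecChanged is a hypothetical helper for the update-event
// filter: it compares the old and new spec with the scheduler-owned result
// field cleared, so a write-back that only touches spec.clusters no longer
// queues another scheduling pass.
func schedulingRelevantSpecChanged(oldSpec, newSpec *workv1alpha2.ResourceBindingSpec) bool {
	oldCopy := oldSpec.DeepCopy()
	newCopy := newSpec.DeepCopy()
	// Ignore the scheduling result written by the scheduler itself.
	oldCopy.Clusters = nil
	newCopy.Clusters = nil
	return !reflect.DeepEqual(oldCopy, newCopy)
}

Whether other scheduler-written fields should also be ignored, and how this interacts with the Duplicated/Divided paths shown above, is part of what this issue should assess.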

This issue tracks:

  1. Proposing a solution to eliminate redundant scheduling.
  2. Assessing potential negative impacts on the existing scheduling flow.
  3. Evaluating the effectiveness, e.g., improvement in scheduling QPS.

Metadata

Labels

kind/feature: Categorizes issue or PR as related to a new feature.
