Description
What happened:
When the scheduling constraints of a workload (e.g., nodeAffinity, resourceRequests) are modified so that previously assigned clusters no longer satisfy them, the Karmada scheduler may still prioritize those clusters during scaling operations. This happens because the current logic for calculating a cluster's "available replicas" adds in historically assigned replica counts (rbSpec.AssignedReplicasForCluster). For example, if a Deployment initially placed 10 replicas on Cluster A and a constraint change now makes Cluster A unsuitable, the scheduler may still calculate Cluster A's total capacity as 10 (assigned) + 3 (newly estimated) = 13. This can lead to suboptimal scheduling decisions, such as assigning new replicas to Cluster A instead of a more suitable cluster like Cluster B, which may have higher actual capacity under the new constraints. The behavior is particularly impactful with the Aggregated replica scheduling strategy, which favors clusters with higher total perceived capacity.
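To make the arithmetic concrete, here is a minimal, self-contained Go sketch of the pattern described above. The type and function names are illustrative stand-ins, not Karmada's actual scheduler types:

```go
package main

import "fmt"

// clusterInfo is a simplified, hypothetical stand-in for the scheduler's
// per-cluster view; it is not a Karmada type.
type clusterInfo struct {
	name              string
	assignedReplicas  int32 // replicas already placed on the cluster (historical)
	estimatedReplicas int32 // replicas the estimator says fit under the *current* constraints
}

// perceivedCapacity mirrors the problematic pattern: historical assignments
// are added on top of the fresh estimate, so a cluster that no longer
// satisfies the constraints still looks large.
func perceivedCapacity(c clusterInfo) int32 {
	return c.assignedReplicas + c.estimatedReplicas
}

func main() {
	clusterA := clusterInfo{name: "member-a", assignedReplicas: 10, estimatedReplicas: 3} // no longer matches the new nodeAffinity
	clusterB := clusterInfo{name: "member-b", assignedReplicas: 0, estimatedReplicas: 8}  // satisfies the new constraints

	for _, c := range []clusterInfo{clusterA, clusterB} {
		fmt.Printf("%s: perceived capacity = %d\n", c.name, perceivedCapacity(c))
	}
	// member-a: perceived capacity = 13
	// member-b: perceived capacity = 8
	// With the Aggregated division preference, member-a is still preferred even
	// though member-b is the better fit under the updated constraints.
}
```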
What you expected to happen:
When scheduling constraints are updated, the scheduler should primarily rely on the current estimated available replicas (from calAvailableReplicasFunc) under the new constraints, not historical assignment data. Clusters that no longer meet the new constraints should have their historically assigned replicas disregarded or heavily discounted in capacity calculations. The scheduler should prioritize clusters that best satisfy the updated constraints for new replica assignments, ensuring optimal placement and avoiding placement on incompatible clusters.
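Continuing the hypothetical sketch, the expected behavior would look roughly like the following, where historical assignments are only credited when the cluster still satisfies the current constraints. This is an illustration of the expectation, not a proposed patch or an existing Karmada API:

```go
package main

import "fmt"

// expectedCapacity sketches the desired behavior: replicas assigned under the
// old constraints count only if the cluster still satisfies the updated
// constraints; otherwise only the fresh estimate counts. The signature is
// illustrative, not Karmada code.
func expectedCapacity(assigned, estimated int32, satisfiesCurrentConstraints bool) int32 {
	if !satisfiesCurrentConstraints {
		return estimated // disregard stale assignments on an incompatible cluster
	}
	return assigned + estimated
}

func main() {
	// Cluster A held 10 replicas but no longer matches the new nodeAffinity.
	fmt.Println(expectedCapacity(10, 3, false)) // 3
	// Cluster B satisfies the new constraints.
	fmt.Println(expectedCapacity(0, 8, true)) // 8
}
```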
How to reproduce it (as minimally and precisely as possible):
- Deploy a Deployment with 10 replicas using a PropagationPolicy with replicaDivisionPreference: Aggregated. Ensure it schedules all replicas to Cluster A (e.g., using clusterAffinity or labels matching Cluster A).
- Modify the Pod template's nodeAffinity (or other constraints, such as resources.requests) so that Cluster A no longer satisfies the new constraints (e.g., require a node label only present in Cluster B).
- Scale the Deployment to 15 replicas.
- Observe scheduler behavior: instead of prioritizing Cluster B (which now satisfies the constraints), the scheduler may still assign additional replicas to Cluster A due to the inflated "available replica" count from historical assignments.
Anything else we need to know?:
- The problem might be more pronounced when using estimators like the "accurate estimator", if historical data influences their calculations.
Environment:
Karmada version: master (likely affects versions incorporating the historical replica assignment logic)
kubectl-karmada or karmadactl version: NA
Others: The issue might be more observable when using accurate estimation.