The Rancher charts for system-upgrade-controller are being moved into this repo as-is; improvements to the charts can come after they've been moved here.
Opening issues for charts that aren't even in this repo yet is premature. At the moment, the recommended way to deploy the SUC is using the manifest release artifacts, as described in the README and k3s/rke2 docs.
Version
v0.15.2
Platform/Architecture
linux-amd64
Describe the bug
When attempting to upgrade the system-upgrade-controller in a cluster with over 300 pods, the upgrade enters an infinite loop. The ServiceAccount is missing the ownership label and annotations that Helm requires, so Helm cannot manage the resource.
To Reproduce
Deploy system-upgrade-controller ServiceAccount without proper Helm annotations
Try to upgrade using Helm with a command similar to:
helm upgrade --history-max=5 --install=true --labels=catalog.cattle.io/cluster-repo-name=rancher-charts --namespace=cattle-system --reset-values=true --timeout=5m0s --values=/home/shell/helm/values-system-upgrade-controller-106.0.0.yaml --version=106.0.0 --wait=true system-upgrade-controller /home/shell/helm/system-upgrade-controller-106.0.0.tgz
Observe the error and the infinite loop behavior with more than 300 pods
Expected behavior
The ServiceAccount should include the proper Helm annotations to allow Helm to recognize and manage it during upgrades. The upgrade process should complete normally without entering an infinite loop.
Actual behavior
Current ServiceAccount is defined as:
This results in the following error during upgrade:
Error: Unable to continue with install: ServiceAccount "system-upgrade-controller" in namespace "cattle-system" exists and cannot be imported into the current release: invalid ownership metadata; label validation error: missing key "app.kubernetes.io/managed-by": must be set to "Helm"; annotation validation error: missing key "meta.helm.sh/release-name": must be set to "system-upgrade-controller"; annotation validation error: missing key "meta.helm.sh/release-namespace": must be set to "cattle-system"
The system then enters an infinite loop trying to reconcile this situation, which is particularly problematic when there are over 300 pods in the environment.
Correct ServiceAccount should include
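Going by the error message above, the metadata Helm checks for is the label app.kubernetes.io/managed-by=Helm plus the annotations meta.helm.sh/release-name=system-upgrade-controller and meta.helm.sh/release-namespace=cattle-system. A minimal sketch of adding that metadata to the existing ServiceAccount with kubectl follows. It assumes the names from the error message, and the `run` wrapper and the `APPLY` variable are hypothetical helpers for a dry run, not part of any tool. This is one possible workaround, not something the maintainers have confirmed.

```shell
#!/bin/sh
# Values taken from the Helm error message; adjust if your release differs.
SA_NAME="system-upgrade-controller"
RELEASE_NAME="system-upgrade-controller"
RELEASE_NAMESPACE="cattle-system"

# Hypothetical dry-run helper: prints the command unless APPLY=1 is set.
run() {
  if [ "${APPLY:-0}" = "1" ]; then
    "$@"
  else
    printf '%s\n' "$*"
  fi
}

# Add the ownership label and annotations that Helm validates on import.
adopt_serviceaccount() {
  run kubectl label serviceaccount "$SA_NAME" \
    --namespace "$RELEASE_NAMESPACE" --overwrite \
    app.kubernetes.io/managed-by=Helm
  run kubectl annotate serviceaccount "$SA_NAME" \
    --namespace "$RELEASE_NAMESPACE" --overwrite \
    "meta.helm.sh/release-name=$RELEASE_NAME" \
    "meta.helm.sh/release-namespace=$RELEASE_NAMESPACE"
}

adopt_serviceaccount
```

Run as-is, the script only prints the two kubectl commands so they can be reviewed; with APPLY=1 it executes them against the current kubectl context. Other resources that the chart creates may need the same label and annotations if Helm reports them next.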
Additional context
This issue seems to be particularly severe in environments with many pods (300+). The infinite loop appears to be related to Helm's retry mechanism when it cannot properly manage existing resources due to missing annotations. Note that issues on RKE2 charts are currently disabled, so this bug report may need to be submitted through alternative channels.