[SURE-9631] Bundle in an error state with message Resource is current #3342

Open
kkaempf opened this issue Feb 13, 2025 · 5 comments

@kkaempf
Collaborator

kkaempf commented Feb 13, 2025

SURE-9631

Issue description:

The bundle is in an error state with the following message:

NotReady(1) [Cluster fleet-default/c-qtd7d]; helmchart.helm.cattle.io kube-system/cert-manager error] Resource is current; helmchart.helm.cattle.io kube-system/rancher-cis-benchmark error] Resource is current; helmchart.helm.cattle.io kube-system/rancher-cis-benchmark-crd error] Resource is current; helmchart.helm.cattle.io kube-system/rancher-logging error] Resource is current

Business impact:

The bundle is currently shown as being in an error state.

Troubleshooting steps:

Force update the agent for that downstream cluster;
Add ignore to the fleet.yaml
See JIRA for more details

Repro steps:

I was not able to reproduce it.

Workaround:

Is a workaround available and implemented? No
What is the workaround:

Actual behavior:

It is currently in an error state

Expected behavior:

It should be green without any errors

@kkaempf kkaempf added this to the v2.12.0 milestone Feb 13, 2025
@kkaempf kkaempf added this to Fleet Feb 13, 2025
@github-project-automation github-project-automation bot moved this to 🆕 New in Fleet Feb 13, 2025
@manno
Member

manno commented Feb 18, 2025

Related to rancher/rancher#48559 (comment)

Fleet (wrangler, too) does not support a "Failed=False" condition.

status:
  conditions:
  - message: Applying HelmChart using Job kube-system/helm-install-rancher-monitoring-crd
    reason: Job created
    status: "True"
    type: JobCreated
  - status: "False"
    type: Failed
  jobName: helm-install-rancher-monitoring-crd

This condition was added by the k3s helmcontroller in 0.15.2.
Apparently node resources can also have a status like this: longhorn/longhorn#7290 (comment)

I was unable to fix this by changing https://github.com/rancher/fleet/blob/main/internal/cmd/agent/deployer/summary/summarizers.go#L226. It happens in both the status update and drift detection.

However, the ignore statement seems to work: https://github.com/manno/fleet-experiments/blob/main/helmchartlogging/fleet.yaml

@lgruendh

Appending the following to the fleet.yaml seems to work for me.

ignore:
  conditions:
    - type: "Failed"
      status: "False"

Do we need to add this code to every single fleet.yaml, or is a fix coming in the next update?

The only problem is that dependsOn stops working if everything is in an error state.

@manno
Member

manno commented Feb 19, 2025

> Do we need to add this code to every single fleet.yaml, or is a fix coming in the next update?

Yes, this needs to be present in every fleet.yaml that deploys a bundle with resources that can have this status.
A fix won't make it into the next version, but we're scheduling this for a future release (currently on the 2.12 milestone).
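
For reference, here is a minimal sketch of where the workaround could sit in a bundle's fleet.yaml. Only the `ignore` block is taken from this thread; the `defaultNamespace` value is an assumption chosen because the affected HelmChart resources in the report live in kube-system, and a real fleet.yaml will usually carry other settings as well.

```yaml
# fleet.yaml (sketch; only the ignore block comes from this thread)
# Assumption: the bundle deploys its HelmChart resources into kube-system.
defaultNamespace: kube-system

# Do not treat a "Failed" condition with status "False" as an error
# when the bundle status is computed.
ignore:
  conditions:
    - type: "Failed"
      status: "False"
```

As stated above, the block has to be repeated in every fleet.yaml whose bundle deploys resources that can report this condition.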

@lgruendh

We have quite a few Helm charts that are deployed via kind: HelmChart.
Since the upgrade, these are all stuck in Error:

status:
  conditions:
    - message: Applying HelmChart using Job kube-system/helm-install-kyverno
      reason: Job created
      status: 'True'
      type: JobCreated
    - status: 'False'
      type: Failed
  jobName: helm-install-kyverno

Our Fleet setup uses dependsOn to apply the configuration for a HelmChart only after the HelmChart itself has been installed correctly.
This problem can be solved with the ignore.
But isn't it a general problem that all kind: HelmChart resources are permanently stuck in an Error state?
In Rancher 2.9.2 everything was working fine, but now you are telling me that HelmCharts are stuck in Error for Rancher 2.10.x and 2.11.x?

@erSitzt

erSitzt commented Feb 21, 2025

I see these errors in a "naked" cluster, so none of the HelmChart resources are mine. To add to @lgruendh's question: is this only a display issue, or does it have any other implications?

These errors appear in both my downstream cluster and the local Rancher cluster.

