[BUG] Cluster agent tries to authenticate against database on 7.71.2 #42781

@Conacious

Description

After upgrading our agent to 7.71.2, we identified the following errors on the cluster-agent:

| ERROR | (comp/core/autodiscovery/autodiscoveryimpl/configmgr.go:220 in processNewConfig) | Unable to resolve secrets for config 'mysql', dropping check configuration, err: error while decrypting secrets in an instance: an error occurred while resolving 'file@/etc/secret/data': secret does not exist
stream closed EOF for monitoring/datadog-agent-dbm-cluster-agent-7ccc445544-wgcmn (init-volume)

We have delegated these checks to the cluster checks runners, so the secret is deliberately not mounted on the cluster-agent: it should not need it.

Agent Environment
Agent version: 7.71.2 (cluster-agent and cluster checks runner)
Install method: Helm (custom values below)
Cluster checks: Enabled (checks delegated to cluster checks runners)
Site: datadoghq.eu
Cloud: AWS

Describe what happened:
After upgrading to 7.71.2, the cluster-agent started failing to process a Cluster Check for MySQL because it tries to resolve the encrypted secret locally even though the check is marked as a cluster check and the secret is only mounted on the cluster checks runners.

Describe what you expected:
No errors on the cluster-agent since it doesn't need to execute mysql checks.

Steps to reproduce the issue:
Deploy Datadog 7.71.2 via Helm with cluster checks enabled and the MySQL DBM check configured as a cluster check whose password is an `ENC[file@...]` secret mounted only on the cluster checks runners.

Additional environment details (Operating System, Cloud provider, etc):
AWS
EKS-managed RDS MySQL (DBM)
Amazon Linux 2023 amazon-eks-node-al2023-x86_64-standard-1.32-v20251007

Relevant sections of the helm chart:

Cluster-agent:

    clusterAgent = {
      admissionController = {
        enabled = false
      }
      mutateUnlabelled = false
      resources = {
        limits = {
          memory = "512Mi"
        }
        requests = {
          cpu    = "50m"
          memory = "256Mi"
        }
      }
      image = {
        tag = local.datadog_agent_version
      }
      additionalLabels  = local.datadog_cluster_agent_kubernetes_labels
      priorityClassName = "system-cluster-critical"
      nodeSelector = {
        node_type = local.service_node_group_name
      }

      # MySQL monitoring configuration
      confd = {
        "mysql.yaml" = yamlencode({
          cluster_check = true
          init_config   = {}
          instances = [for instance in aws_rds_cluster_instance.cluster_instances : {
            dbm      = true
            host     = instance.endpoint
            port     = 3306
            username = local.dbm_datadog_username
            password = "ENC[file@${local.datadog_db_credentials_mount_path}/data]"
            collect_schemas = {
              enabled = true
            }
            collect_settings = {
              enabled = true
            }
            tags = ["env:${var.metadata.environment}"]
            options = {
              replication = !instance.writer
            }
          }]
        })
      }
    }

clusterChecksRunner:

    clusterChecksRunner = {
      enabled = true
      image = {
        tag = local.datadog_agent_version
      }
      nodeSelector = {
        node_type = local.service_node_group_name
      }
      volumeMounts = [
        {
          name      = local.datadog_db_credentials_secret_name
          mountPath = local.datadog_db_credentials_mount_path
        }
      ]
      volumes = [
        {
          name = local.datadog_db_credentials_secret_name
          secret = {
            secretName = local.datadog_db_credentials_secret_name
          }
        }
      ]
    }
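A possible stopgap, which we have not verified on 7.71.2, would be to mount the same secret on the cluster-agent so that the `file@` lookup succeeds. The sketch below assumes the chart's `clusterAgent.volumes` and `clusterAgent.volumeMounts` values accept the same shape as the `clusterChecksRunner` ones shown above. It also gives the cluster-agent access to the database credentials, which is exactly what we wanted to avoid, so it works around the error rather than fixing the behaviour.

```hcl
# Hypothetical workaround: mirror the clusterChecksRunner secret mount
# on the cluster-agent. Merge these keys into the existing clusterAgent map.
clusterAgent = {
  # ... existing clusterAgent settings (resources, image, confd, ...) ...

  volumeMounts = [
    {
      name      = local.datadog_db_credentials_secret_name
      mountPath = local.datadog_db_credentials_mount_path
      readOnly  = true
    }
  ]
  volumes = [
    {
      name = local.datadog_db_credentials_secret_name
      secret = {
        secretName = local.datadog_db_credentials_secret_name
      }
    }
  ]
}
```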

Labels: pending, team/container-platform
