This guide walks through every breaking change between the v3 and v4 releases of the x509-certificate-exporter. The order below reflects operational priority: recommended migration steps first, then chart values shape, then metrics changes, then the rest.
| Area | v3 | v4 |
|---|---|---|
| Helm chart distribution | Hybrid: https://charts.enix.io index pointing at oci://quay.io/enix/charts/... artifacts |
Same hybrid still works — going full OCI recommended |
| Container image variants | busybox (default), alpine, scratch |
scratch (default, minimal), busybox (alt, with shell) — Alpine retired |
| Image registries | Docker Hub, Quay | Docker Hub, Quay, GHCR |
| Exporter configuration | CLI flags | YAML config file (--config) |
Helm secretTypes shape |
{type, key} |
{type, key, format, pkcs12} — format and pkcs12 are optional |
x509_read_errors metric |
Single gauge per error | Removed — see metrics section |
| Per-source observability | None | x509_source_up, x509_source_bundles, x509_source_errors_total{reason} |
No breaking change here — chart releases have been published as OCI
artifacts for some time already. The classic index at
https://charts.enix.io is a thin hybrid: an HTTP index.yaml whose
urls: entries point straight at oci://quay.io/enix/charts/.... It
keeps working in v4 and will keep receiving releases, so no action is
required.
We do recommend pointing your tooling directly at the OCI reference at your own pace. The hybrid index adds an extra hop with no upside, and OCI is where cosign signatures, SBOM attestations and provenance live — verifying them is straightforward when you pull the artifact directly.
Tip
If your tooling already supports OCI (Helm 3.8+, Argo CD 2.6+, Flux source-controller 1.0+), the switch is a one-line change and unlocks the full supply-chain verification story documented in the hardening guide.
# Remove the legacy repo entry if you had it:
helm repo remove enix
helm install x509-certificate-exporter \
oci://quay.io/enix/charts/x509-certificate-exporter \
--version 4.0.0Argo CD and Flux snippets:
# Argo CD Application
spec:
source:
repoURL: quay.io/enix/charts
chart: x509-certificate-exporter
targetRevision: 4.0.0
# NOTE: Argo CD treats OCI registries differently from HTTP repos —
# `repoURL` must be the registry+namespace, without the `oci://`
# scheme and without the chart name; `chart:` carries the chart name.# Flux HelmRelease
spec:
chart:
spec:
chart: x509-certificate-exporter
version: "4.0.0"
sourceRef:
kind: HelmRepository # type: oci
name: enix-charts
---
apiVersion: source.toolkit.fluxcd.io/v1
kind: HelmRepository
metadata:
name: enix-charts
spec:
type: oci
url: oci://quay.io/enix/chartshelm repo add enix https://charts.enix.io
helm repo update
helm upgrade x509-certificate-exporter enix/x509-certificate-exporter \
--version 4.0.0No other change needed — the chart name and values shape are identical across both distribution channels.
In v3, enix/x509-certificate-exporter was the canonical reference
(implicitly Docker Hub). v4 publishes the same image to three
registries simultaneously, with byte-identical contents and signatures:
quay.io/enix/x509-certificate-exporter(chart default in v4)ghcr.io/enix/x509-certificate-exporterdocker.io/enix/x509-certificate-exporter(enix/x509-certificate-exportershort form)
Important
The chart's default image.registry flips from docker.io to
quay.io in v4. A plain v3 → v4 upgrade therefore changes the
registry the kubelet pulls from. If your cluster relies on
ImagePullSecrets, registry mirrors, network policy egress rules, or
admission webhooks scoped to Docker Hub, override image.registry
back to docker.io (or any of the three) — all three registries are
kept in lockstep and serve byte-identical images and signatures.
# values.yaml — opt back to Docker Hub if your environment requires it
image:
registry: docker.ioPick whichever fits your egress / pull-rate-limit policy. Otherwise no edit is needed.
v3 published three variants via tag suffix (-busybox, -alpine,
-scratch). v4 retires the Alpine variant and flips the
default:
scratch(default — emptytagSuffix). Distroless, no shell, no utilities. Smallest image; usekubectl debugto inspect a running pod when needed.busybox(setimage.tagSuffix: -busybox). Ships/bin/sh,wget,cat, etc., forkubectl execdebugging.
Important
Plain upgrades that don't override image.tagSuffix will move from
the busybox base to scratch. If your operational tooling assumes a
shell in the running pod (init scripts, kubectl exec-based
diagnostics, sidecars with shared command:/args:), set
image.tagSuffix: -busybox to stay on the previous base.
If you currently set image.tagSuffix: -alpine, switch to one of:
image.tagSuffix: ""→scratch. Distroless, smallest, no shell.image.tagSuffix: -busybox→ closest replacement for Alpine: still has/bin/sh,wget,cat, etc.
The simple {type, key} form keeps working unchanged: a v3 user with
# values.yaml — v3 form, still valid in v4
secretTypes:
- type: kubernetes.io/tls
key: tls.crtneeds no edit. The shape was extended to support new sources:
# values.yaml — v4 extended form
secretTypes:
# Plain PEM in any Secret type:
- type: kubernetes.io/tls
key: tls.crt
# Multiple keys via a regex pattern (alternative to a single `key`):
- type: Opaque
keyPatterns:
- '^.*\.crt$'
# PKCS#12 keystore with the passphrase in a sibling key:
- type: Opaque
key: keystore.p12
format: pkcs12 # default is "pem"
pkcs12:
passphraseKey: keystore-passphrase # sibling key in the same Secret
# OR passphraseFile: /path/in/container
# OR passphraseSecretRef: { namespace, name, key }
# OR tryEmptyPassphrase: trueIf you don't use PKCS#12 or regex matching, ignore this — the v3 form is still parsed.
service.headless flips from false (v3) to true (v4). The Service
is now created with clusterIP: None by default, which is the right
shape for a metrics endpoint: a Pod-level discovery anchor, not a
load-balanced application.
In practice, this changes nothing for the dominant scrape paths:
prometheus-operatorServiceMonitorandPodMonitordiscover Pods via Endpoints / EndpointSlices and scrape Pod IPs directly — the Service's ClusterIP is irrelevant.- Prometheus'
kubernetes_sd_configsdoes the same. kubectl port-forward svc/x509-certificate-exporter 9793still works (kubectl picks a backing Pod).
The one case where the change is observable is a Prometheus job using
dns_sd_configs (or a static_configs with the Service DNS name):
the resolver now returns one A record per Pod instead of a single
ClusterIP. If you have replicas > 1 and were relying on the
ClusterIP's round-robin to distribute scrapes, you almost certainly
had a latent bug — only one Pod was scraped per attempt. Headless
fixes that.
To restore the v3 behaviour explicitly:
service:
headless: falseNote
Kubernetes refuses to mutate a Service's clusterIP in place. A naive
helm upgrade from v3 (with an assigned ClusterIP) to v4 (headless,
clusterIP: None) would fail with spec.clusterIPs[0]: may not change once set. The chart ships a pre-upgrade hook that detects a v3 release
via the existing Service's helm.sh/chart label and deletes the old
Service, Deployment, and DaemonSets before the upgrade reconciles, so
helm can recreate them cleanly from the new templates (the v4 label
rework also rotates the immutable Deployment/DaemonSet selectors). A
few seconds of scrape disruption during the upgrade window; Pods
themselves stay up.
Every image block (image, migration.image, rbacProxy.image) now
exposes the same five fields: registry, repository, tag,
digest, pullPolicy (plus tagSuffix on image for the
scratch/busybox flavor switch). When set, digest takes precedence
over tag and produces a registry/repository@sha256:... reference —
the recommended form in production once you have run cosign verify.
One small schema break: migration.image.repository used to bundle the
registry (registry.k8s.io/kubectl); v4 splits it into
registry: registry.k8s.io + repository: kubectl. If you override
the kubectl image in values.yaml, update both keys.
Every component (secretsExporter, hostPathsExporter, migration,
rbacProxy) now ships with the extra fields the Pod Security Admission
restricted profile requires:
podSecurityContext.seccompProfile.type: RuntimeDefaultsecurityContext.allowPrivilegeEscalation: false
The chart was already running as runAsNonRoot: true with drop: [ALL]
capabilities and readOnlyRootFilesystem: true; the two new fields
close the remaining PSA gap. No action required for stock deployments
— modern container runtimes (containerd, CRI-O, Docker) ship a
permissive default seccomp profile that a plain Go HTTP server
runs under unchanged. If you have an unusual cluster that
forbids seccompProfile: RuntimeDefault, override per-component:
secretsExporter:
podSecurityContext:
seccompProfile: nullv4 exposes a small HTML status page at / (cache stats, source health,
process info). It's enabled by default in the chart (web.enableStats: true). Disable it with --set web.enableStats=false if you want strict
metrics-only exposure.
v3 exposed a top-level kubeVersion: "" knob to override the cluster
version detection — useful only with helm template. Helm 3.10+
exposes the --kube-version CLI flag for the same purpose, so v4
drops the value. If you set kubeVersion: in values.yaml, remove it
and pass --kube-version v1.29.0 (or similar) when rendering.
v3 was driven entirely by CLI flags (--watch-dir, --secret-type,
--include-namespace, …). v4 makes a YAML config file the source of
truth (--config /etc/x509-exporter/config.yaml). A subset of the v3
flags (--watch-file, --watch-dir, --watch-kubeconf,
--watch-kube-secrets, --listen-address, --web.config.file,
--debug, --profile) still works as ergonomic shortcuts mapped
onto the YAML schema at parse time, but their use is deprecated and
not recommended — they may be removed in a future release. The
richer flags (--secret-type, --include-namespace, …) have no v4
CLI equivalent: express those rules in YAML.
If you deploy via the Helm chart, you don't see this: the chart
generates the YAML config from the same values.yaml you've always
edited and ships it as a ConfigMap. The break applies only to:
- Custom CLI invocations (systemd units, dev environments running the binary directly).
- Forks of the chart that assemble flags by hand instead of going through the official chart's templates.
For standalone use, see docs/faq.md (the
"non-Kubernetes host" entry) and the exhaustive dev/values.yaml for
every supported source kind.
v3 had a single gauge x509_read_errors that counted failed reads. It's
removed in v4 because the new
x509_source_errors_total{source_kind, source_name, reason} counter
gives strictly more information (per-source, per-reason, monotonically
increasing).
Update your alerts and dashboards:
# v3
x509_read_errors > 0
# v4
increase(x509_source_errors_total[15m]) > 0
The reason label gives a stable error code (e.g.
source_unreachable, parse_failed, passphrase_wrong,
invalid_key_format) — useful for routing alerts.
These v3 metrics are kept verbatim in v4. PromQL queries, alerts and dashboards using them continue to work:
x509_cert_not_afterx509_cert_expired(now gated bymetrics.exposeExpired, on by default — set tofalseif your alerts only consumex509_cert_not_after - time()and you want to halve the per-cert series count)x509_cert_not_before(now gated bymetrics.exposeNotBefore, off by default — turn on to detect certificates issued in the future or clock-skew issues)x509_cert_expires_in_seconds(still gated bymetrics.exposeRelative)x509_cert_valid_since_seconds(still gated bymetrics.exposeRelative)x509_cert_error(still gated bymetrics.exposePerCertError)x509_exporter_build_info
The per-certificate label set (subject_CN, issuer_CN,
serial_number, secret_*, filepath, surfaced Secret labels via
exposeLabels) is unchanged. v4 adds configmap_name,
configmap_namespace for the new ConfigMap source kind.
Important
x509_cert_not_before is off by default in v4. v3 always
emitted it. If you have an alert/dashboard panel relying on it, set
metrics.exposeNotBefore: true (chart: exposeNotBeforeMetric: true)
to keep the series.
These series are entirely new and worth wiring into your dashboards:
| Metric | Type | Default | Use it for |
|---|---|---|---|
x509_source_up{source_kind, source_name} |
gauge | on | Per-source liveness — == 0 means a source has stopped reporting |
x509_source_bundles{source_kind, source_name} |
gauge | on | Number of bundles (Secrets, files, etc.) currently held by each source |
x509_source_errors_total{source_kind, source_name, reason} |
counter | on | Per-source, per-reason error count (replaces x509_read_errors) |
x509_kube_watch_resyncs_total{source_name, resource} |
counter | on | API watch resyncs / 410 Gone events; sustained increase signals an unhealthy watch (network, apiserver) |
x509_scrape_duration_seconds |
histogram | on | Total time to serve a /metrics request |
x509_panic_total{component} |
counter | on | Recovered goroutine panics; should always be 0 in steady state |
x509_kube_request_duration_seconds{verb, resource} |
histogram | gated by metrics.exposeDiagnostics |
client-go API call latency |
x509_kube_informer_scope{source_name, scope} |
gauge | gated by metrics.exposeDiagnostics |
Whether the source is namespace-scoped or cluster-scoped (legacy metric name kept for dashboard compatibility) |
x509_informer_queue_depth{source_name, resource} |
gauge | gated by metrics.exposeDiagnostics |
Real-time event-queue depth — populated by the namespace informer when label-based namespace rules are configured |
x509_parse_duration_seconds{format} |
histogram | gated by metrics.exposeDiagnostics |
Per-format (PEM / PKCS#12) parse latency |
x509_pkcs12_passphrase_failures_total{source_name} |
counter | auto (only if a source declares format: pkcs12) |
Specific to PKCS#12; a sustained increase usually means a Secret was rotated but the passphrase wasn't |
The chart's bundled PrometheusRule already references the new metrics;
re-enable rendering with prometheusRules.create=true to pick up the
defaults.
Note
The four metrics gated by metrics.exposeDiagnostics (chart:
exposeDiagnosticMetrics) are useful when troubleshooting the
exporter itself, not when monitoring certificates. Default off keeps
/metrics lean; flip on whenever you need to look inside.
v4 substantially reduced memory footprint and parsing redundancy. No configuration is required to benefit from the changes; we document them here so you understand what changed if you notice lower memory usage post-upgrade.
- Direct paginated LIST + WATCH. v3 polled the API server on a
fixed cadence; the early v4 prototype used a SharedInformer cache,
which OOM'ed on clusters with many large Helm release secrets
because client-go's
pager.Listaccumulates every page in memory before yielding. v4 ships a direct paginated LIST + WATCH loop that processes each page (default 50 objects, tunable viasecretsExporter.listPageSize) before fetching the next, capping peak sync memory to roughlypageSize × average object size. - Server-side filtering. v4 pushes label and field selectors onto
every LIST and WATCH call. When all secret rules share a single
Type (e.g.
kubernetes.io/tls), afieldSelector=type=...is applied automatically — the API server never returns Helm release secrets, ServiceAccount tokens or docker configs. Combined with namespace include/exclude (by name or by namespace label), this is the lever for clusters with tens of thousands of Secrets. - Adaptive source scope. When a config restricts to a single
literal namespace, v4 scopes the LIST + WATCH to that namespace
rather than going cluster-wide. The
x509_kube_informer_scopemetric (legacy name kept for dashboard compatibility) exposes the decision. - Memoization by
ResourceVersion. v3 re-parsed every Secret on every change event. v4 keeps a per-object hash of the bundle keyed byResourceVersion; a watch event for an unchanged Secret short-circuits through the cache. - Watch bookmarks. Bookmarks let the API server advance the watch resource version without sending object updates, so reconnections resume from a recent point instead of triggering a full re-LIST.
- Namespace label change → immediate re-list. When namespace labels change, v4 short-circuits the resync timer and re-lists secrets and configmaps right away so newly-allowed objects appear without waiting up to 30 minutes.
- File-source poll cache. The new
cache.filePoll.skipUnchangedshort-circuits parsing when(mtime, size, inode)hasn't moved since the last poll — meaningful when watching/etc/letsencrypt/live/...with thousands of certs.
The chart's default resources reflect the new footprint:
| Component | Request | Limit |
|---|---|---|
| Cluster-wide Secrets/ConfigMaps exporter | 20 Mi | 150 Mi |
| Node-local hostPath exporter (per DaemonSet) | 20 Mi | 40 Mi |
If you bumped these in v3 because the exporter was struggling on a large
cluster, lower them back and watch x509_source_bundles — the actual
number of bundles in cache is the load metric to size against.
Beyond the breaking changes above, v4 ships a few additions that are worth mentioning during a migration:
/healthzand/readyzare now first-class endpoints alongside/metrics. The chart wires both into the Pod's probes by default. v3 deployments that disabled probes (because they didn't work) can re-enable them with no extra config.- ConfigMap watching is native. v3 needed
--configmap-keys; v4 treats ConfigMaps as a regular source kind, with the same filtering options as Secrets (label selectors, namespace include/exclude).
If you run the binary on bare metal / VMs / systemd units rather than in Kubernetes, two things change:
-
Prefer a YAML config file. A handful of v3-era CLI shortcuts (
--watch-file,--watch-dir,--watch-kubeconf, …) still work, but they're a deprecated surface: they don't expose every v4 capability (PKCS#12, ConfigMaps, regex key patterns, namespace labels, per-source rate limits, …) and may be removed in a future release. Write a small YAML config and pass it via--configto insulate yourself from future churn and to unlock the full feature set:server: listen: :9793 sources: - kind: file name: host-pki paths: - /etc/ssl/certs/*.pem - /etc/letsencrypt/live/*/fullchain.pem
x509-certificate-exporter --config /etc/x509-exporter.yaml
-
Pre-built binaries are still attached to every GitHub Release. v4 keeps the matrix to Linux, macOS, Windows, FreeBSD, OpenBSD, NetBSD, Illumos and Solaris across
amd64,arm64,armv7andriscv64(with exclusions for non-existent OS/arch combos). The binary name isx509-certificate-exporter(unchanged across forks). SLSA-3 provenance attestations (queryable viagh attestation verify) and SHA256 checksums are published alongside each binary — see the hardening guide for the verification recipe.
This path is supported, but most operational documentation in v4 is
written assuming the Helm chart. If you need a feature that isn't
exposed in the standalone YAML config (e.g. tryEmptyPassphrase on a
file-based PKCS#12 keystore), open an issue — it's likely just missing
documentation rather than a missing feature.