Skip to content

fix(metrics): sanitize custom field names before Prometheus & OTLP validation #1290

Open
ooye-sanket wants to merge 3 commits intofalcosecurity:masterfrom
ooye-sanket:master
Open

fix(metrics): sanitize custom field names before Prometheus & OTLP validation #1290
ooye-sanket wants to merge 3 commits intofalcosecurity:masterfrom
ooye-sanket:master

Conversation

@ooye-sanket
Copy link

What type of PR is this?

/kind bug


Any specific area of the project related to this PR?

/area config
/area outputs


What this PR does / why we need it:

Falco Sidekick currently panics when a custom field contains dots (e.g., k8s.cluster.name).
Prometheus and OTLP both reject such attribute/label names, causing:

  • OTLP metric attribute validation failures
  • Prometheus CounterVec cardinality mismatches (expected X values, got Y)
  • Full runtime panic when pushing metrics

The root cause was that custom fields were validated before sanitization.

✔ This PR fixes the issue by:

  • Sanitizing all custom field names using "." → "_" before validation
  • Applying sanitization consistently across:
    • handlers.go
    • stats_prometheus.go
    • otlp_metrics.go
    • OTLP.Metrics.ExtraAttributesList
  • Ensuring Prometheus and OTLP accept all user-provided fields safely
  • Updating logs to show both original and sanitized keys

This prevents crashes, ensures correct metric cardinality, and keeps behavior consistent across all backends.


Which issue(s) this PR fixes:

Fixes #3748


Special notes for your reviewer:

  • This change is backward-compatible
  • Only invalid-but-sanitizable field names are modified
  • Prometheus and OTLP now behave consistently
  • Helpful log messages added:
    "Custom field 'k8s.cluster.name' (sanitized to 'k8s_cluster_name') is not valid and will be ignored"

cc @maintainer

Replace dots with underscores in custom field names before regex 
validation to prevent label cardinality mismatches.

**Changes:**
- Line ~45-50: Sanitize customfields key using `strings.ReplaceAll(i, ".", "_")`
- Store sanitized name in variable for reuse in validation and append
- Update error log to show both original and sanitized field names

**Why:**
Custom fields with dots (e.g., `k8s.cluster.name`) were failing validation
but still being counted in expected label count, causing panic:
"inconsistent label cardinality: expected 8 label values but got 7"

**Related:**
- Part of fix for issue #3748
- Works in conjunction with handlers.go and otlp_metrics.go fixes

Signed-off-by: Sanket Kalekar <kalekarsanket005@gmail.com>
Replace dots with underscores in custom field names to ensure OTLP 
attribute name compliance and prevent cardinality panics.

Changes:
- Line ~34-40: Sanitize customfields key using `strings.ReplaceAll(i, ".", "_")`
- Store sanitized name in variable for reuse
- Change log level from ErrorLvl to WarningLvl (sanitization is automatic now)
- Update log message to show both original and sanitized field names

Why:
OTLP metric attribute names must match `^[a-zA-Z_:][a-zA-Z0-9_:]*$` 
regex. Dots cause validation failures and label count mismatches.

Related:
- Part of fix for issue #3748
- Complements prometheus and handlers.go fixes

Signed-off-by: Sanket Kalekar <kalekarsanket005@gmail.com>
Replace dots with underscores when building OTLP attribute list 
to ensure attribute name compliance.

Changes:
- Line ~230-237: In `newFalcoPayload()` function
- Sanitize key using `strings.ReplaceAll(key, ".", "_")` before:
  - Regex validation check
  - Creating attribute.String() with sanitized key

Why:
OTLP metric attributes must follow naming conventions. Without 
sanitization, custom fields with dots would be rejected, causing 
similar cardinality issues as Prometheus.

Related:
- Part of fix for issue #3748
- Complements otlp_metrics.go changes
- Prevents OTLP-related panics

Signed-off-by: Sanket Kalekar <kalekarsanket005@gmail.com>
@poiana poiana requested review from Issif and leogr December 4, 2025 10:16
@poiana
Copy link

poiana commented Dec 4, 2025

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by: ooye-sanket
Once this PR has been reviewed and has the lgtm label, please assign leogr for approval. For more information see the Kubernetes Code Review Process.

The full list of commands accepted by this bot can be found here.

Details Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@poiana
Copy link

poiana commented Dec 4, 2025

Welcome @ooye-sanket! It looks like this is your first PR to falcosecurity/falcosidekick 🎉

@poiana poiana added the size/M label Dec 4, 2025
@leogr
Copy link
Member

leogr commented Dec 4, 2025

Hey @ooye-sanket

out of curiosity: is this patch AI generated? 🤔

@poiana
Copy link

poiana commented Mar 4, 2026

Issues go stale after 90d of inactivity.

Mark the issue as fresh with /remove-lifecycle stale.

Stale issues rot after an additional 30d of inactivity and eventually close.

If this issue is safe to close now please do so with /close.

Provide feedback via https://github.com/falcosecurity/community.

/lifecycle stale

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants