Gaps in Cortex Metrics in Grafana Dashboards for Time Intervals longer than 6 Hours

We are running Cortex in a dedicated EKS cluster.
More than 70 other clusters send their metrics to this Cortex instance.
Each cluster’s Grafana is configured to query Cortex for data visualization.

For the past couple of months, we have been observing gaps in Grafana panels for time ranges longer than 6 hours. This has only been observed for our biggest tenant (around 27.7 million series).

<img width="1036" height="487" alt="Image" src="https://github.com/user-attachments/assets/d15cd4bf-fd3a-49b2-848d-87276d802593" />

<img width="1125" height="528" alt="Image" src="https://github.com/user-attachments/assets/75d36bbc-0276-44cb-8006-218c4c49ad6a" />

<img width="1862" height="680" alt="Image" src="https://github.com/user-attachments/assets/d3f0eec0-4aeb-416e-a865-3e79f32dd7ef" />

No metrics are missing: all data is successfully received by Cortex.
The issue appears to be related to the query results cache.
Restarting the Memcached frontend resolves the problem temporarily: after the restart, the gaps disappear.
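One way to check the caching hypothesis without restarting Memcached is to run the same range query twice, once served through the results cache and once bypassing it, and compare where the gaps fall. Below is a minimal Python sketch of that comparison. The function names `find_gaps` and `cache_only_gaps` are hypothetical helpers, not part of Cortex or Grafana, and the inputs are assumed to be the sample timestamps (in seconds) extracted from the two query responses. How the cache is bypassed (for example with a `Cache-Control: no-store` request header, if the deployed query-frontend version honors it) is also an assumption to verify.

```python
from typing import List, Tuple


def find_gaps(timestamps: List[float], step: float,
              tolerance: float = 1.5) -> List[Tuple[float, float]]:
    """Return (last_seen, next_seen) pairs where two consecutive samples
    are more than tolerance * step seconds apart."""
    ordered = sorted(timestamps)
    gaps = []
    for prev, cur in zip(ordered, ordered[1:]):
        if cur - prev > tolerance * step:
            gaps.append((prev, cur))
    return gaps


def cache_only_gaps(cached: List[float], uncached: List[float],
                    step: float) -> List[Tuple[float, float]]:
    """Return gaps that appear in the cached response but not in the
    uncached one, i.e. gaps that the cache itself likely introduced."""
    uncached_gaps = set(find_gaps(uncached, step))
    return [gap for gap in find_gaps(cached, step) if gap not in uncached_gaps]


if __name__ == "__main__":
    # Illustrative timestamps only (60 s step); not data from this cluster.
    uncached = [0, 60, 120, 180, 240, 300]
    cached = [0, 60, 240, 300]
    print(cache_only_gaps(cached, uncached, step=60))
```

If `cache_only_gaps` returns a non-empty list for the affected tenant, the gaps come from cached extents rather than from missing data in storage.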

Memcached-frontend config:
```yaml
    query_range:
      cache_results: true
      results_cache:
        cache:
          memcached_client:
            host: cortex-infra-memcached-frontend.cortex-infra.svc.cluster.local
            timeout: 3s
            max_idle_conns: 200
```

Gaps in Cortex Metrics in Grafana Dashboards for Time Intervals longer than 6 Hours #7045
