[bug]: metric never push after Maximum data points for metric stream exceeded #2636

Open
bobbyliyao opened this issue Feb 10, 2025 · 4 comments
Labels
A-metrics (Area: issues related to metrics), M-exporter-otlp, question (Further information is requested)

Comments

@bobbyliyao

bobbyliyao commented Feb 10, 2025

In version 0.27.1, metric reporting starts failing after the client has been running for some time, with this error: "Maximum data points for metric stream exceeded. Entry added to overflow. Subsequent overflows to same metric until next collect will not be logged." Once this message appears, the OTel Collector no longer receives metrics from the client. The issue was not present in the OpenTelemetry OTLP version I used previously, 0.16. From analyzing the source code, I found that when this happens an excessive number of metrics is reported at once, so the data volume of a single request exceeds 4 MB and the request times out or fails.

@cijothomas
Member

Part of #1065

Yes, this was an unfortunate situation: 0.27 (and some versions before it) implemented cardinality capping but offered no way to customize it.

For 0.28, we removed the cardinality capping feature. We will add it back along with the ability to customize the limit.
(Customizing the limit often requires Views, which is an experimental capability in this repo, so bringing back cardinality capping requires more work because we need to stabilize Views too.)

Please upgrade to 0.28 to mitigate the issue.

@cijothomas
Member

> After this prompt appears, the OTEL collector no longer receives metrics from the client.

Sorry, just seeing this now. Even with cardinality capping, we still expect metrics to flow to the Collector; the only difference is that dimensions are replaced with a special one indicating that overflow has occurred. Can you give more detail on the behavior you are seeing?
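To make the capping behavior described above concrete, here is a minimal, std-only sketch of a capped sum aggregation. It is an illustration under stated assumptions, not the SDK's implementation: the `CappedSum` type, its fields, and the tiny limit of 3 are hypothetical. The only name taken from OpenTelemetry is the overflow attribute `otel.metric.overflow`, which the metrics SDK specification defines for this purpose.

```rust
use std::collections::HashMap;

/// One attribute set, stored as sorted key/value pairs so that
/// the same attributes in a different order map to the same key.
type Attrs = Vec<(&'static str, String)>;

/// Hypothetical stand-in for a single metric stream with a cardinality cap.
/// This is NOT the opentelemetry_sdk implementation.
struct CappedSum {
    limit: usize,
    points: HashMap<Attrs, u64>,
    overflow: Option<u64>,
}

impl CappedSum {
    fn new(limit: usize) -> Self {
        Self { limit, points: HashMap::new(), overflow: None }
    }

    /// Record a measurement. Known attribute sets are updated in place,
    /// new ones are admitted until the cap is reached, and everything
    /// after that is folded into a single overflow bucket.
    fn add(&mut self, mut attrs: Attrs, value: u64) {
        attrs.sort();
        if let Some(sum) = self.points.get_mut(&attrs) {
            *sum += value;
        } else if self.points.len() < self.limit {
            self.points.insert(attrs, value);
        } else {
            *self.overflow.get_or_insert(0) += value;
        }
    }

    /// Produce the data points that would be handed to an exporter.
    /// The overflow bucket is still exported, under a special attribute.
    fn collect(&self) -> Vec<(Attrs, u64)> {
        let mut out: Vec<(Attrs, u64)> =
            self.points.iter().map(|(k, v)| (k.clone(), *v)).collect();
        if let Some(sum) = self.overflow {
            out.push((vec![("otel.metric.overflow", "true".to_string())], sum));
        }
        out
    }
}

fn main() {
    // Illustrative cap of 3; the thread discusses a cap of 2000.
    let mut stream = CappedSum::new(3);
    for user in 0..10 {
        stream.add(vec![("user_id", user.to_string())], 1);
    }
    let points = stream.collect();
    let total: u64 = points.iter().map(|(_, v)| *v).sum();
    println!("{} data points exported, total = {}", points.len(), total);
}
```

In this sketch no measurement is dropped: the total across exported points stays the same, and only the per-attribute breakdown is lost for attribute sets that arrive after the cap is reached.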

@bobbyliyao
Author

Thank you for your response. Below is my error:

level=error target=opentelemetry_sdk message= name=PeriodicReader.ExportFailed message="Failed to export metrics" reason="Metrics exporter otlp failed with the grpc server returns error (Some resource has been exhausted): , detailed error message: grpc: received message larger than max (7928423 vs. 4194304)"

My question is this: my otel_metric_export_interval=1000, and there are approximately 120 metrics. Each cycle reports these 120 metrics, so it quickly exceeds 2000. However, since the size has already been limited to 2000, why does it still exceed Tonic's maximum HTTP packet size of 4 MB?

@cijothomas
Member

This is unrelated to CardinalityCapping. This looks like the payload size is exceeding the maximum allowed by gRPC.

> However, since the size has already been limited to 2000

The 2000 limit is the cardinality cap, not a size limit. (Yes, it does in turn affect size.)
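The difference between a cardinality cap and a byte limit can be shown with some rough arithmetic. The sketch below uses only the numbers quoted in this thread (about 120 metrics, a cap of 2000, and the 7928423 vs. 4194304 bytes in the error) and assumes the cap applies to each metric stream separately, as the wording "Maximum data points for metric stream exceeded" suggests. The function names are hypothetical, and the result is a worst-case bound, not a diagnosis of the reporter's actual payload.

```rust
/// gRPC receive limit and observed message size, both taken from the error above.
const GRPC_MAX_BYTES: u64 = 4_194_304;
const OBSERVED_BYTES: u64 = 7_928_423;

/// Cardinality cap and approximate metric count, as quoted in the thread.
const CARDINALITY_CAP: u64 = 2_000;
const METRIC_COUNT: u64 = 120;

/// Upper bound on data points in one export if every metric stream
/// reaches the cap (ignoring the one extra overflow point per stream).
fn worst_case_points(metrics: u64, cap_per_metric: u64) -> u64 {
    metrics * cap_per_metric
}

/// Average encoded size each data point may have if the whole
/// export must fit into `max_bytes`.
fn byte_budget_per_point(max_bytes: u64, points: u64) -> f64 {
    max_bytes as f64 / points as f64
}

fn main() {
    let points = worst_case_points(METRIC_COUNT, CARDINALITY_CAP);
    let budget = byte_budget_per_point(GRPC_MAX_BYTES, points);
    let ratio = OBSERVED_BYTES as f64 / GRPC_MAX_BYTES as f64;

    println!("worst-case data points per export: {}", points);
    println!("byte budget per point to stay under the limit: {:.1}", budget);
    println!("observed message is {:.2}x the limit", ratio);
}
```

Under these assumptions the cap bounds the number of data points per stream, not the bytes on the wire, so the size of a single export still depends on how many streams exist and how large each encoded point is.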

Can you share a minimal repro app? It takes a ton of metrics to exceed 4MB payload size...

cijothomas added the question, A-metrics, and M-exporter-otlp labels on Feb 12, 2025