[bug]: metrics never pushed after "Maximum data points for metric stream exceeded" #2636
Part of #1065. Yes, it was an unfortunate situation - 0.27 (and some versions before it) implemented Cardinality Capping, but offered no way to customize it. For 0.28, we removed the cardinality capping feature. We will add it back along with the ability to customize the limit. Please upgrade to 0.28 to mitigate the issue.
Sorry, just seeing this now. Even with CardinalityCapping, we still expect metrics to flow to the Collector; the dimensions are just replaced with a special one indicating that overflow has occurred. Can you give more detail on the behavior you are seeing?
Thank you for your response. Below is my error:

level=error target=opentelemetry_sdk message= name=PeriodicReader.ExportFailed message="Failed to export metrics" reason="Metrics exporter otlp failed with the grpc server returns error (Some resource has been exhausted): , detailed error message: grpc: received message larger than max (7928423 vs. 4194304)"

My question: my otel_metric_export_interval=1000, and there are approximately 120 metrics. Each cycle reports these 120 metrics, so it quickly exceeds 2000. However, since the size has already been limited to 2000, why does it still exceed Tonic's maximum gRPC message size of 4MB?
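For what it's worth, the 7928423-byte payload in the error is roughly consistent with the cap multiplying across metrics: the 2000 limit applies per metric stream, not to the whole export, so 120 metrics can each carry up to 2000 series. A back-of-the-envelope sketch (the per-point byte size is an assumption, not a figure from this issue):

```rust
// Rough payload estimate. Assumptions (not from the issue): the 2000
// cardinality cap applies per metric stream, and an encoded OTLP data
// point averages ~33 bytes on the wire.
fn main() {
    let metrics = 120;
    let points_per_metric = 2000; // per-stream cardinality cap
    let bytes_per_point = 33;     // assumed average encoded size
    let total = metrics * points_per_metric * bytes_per_point;
    println!("{total} bytes"); // well above the 4194304-byte gRPC limit
}
```

Under those assumptions a single export is around 7.9 MB, which is in the same ballpark as the 7928423 bytes reported by Tonic.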
This is unrelated to CardinalityCapping. It looks like the payload size is exceeding the maximum allowed by gRPC.
The 2000 limit is the cardinality cap, not a size limit (though it does indirectly affect size). Can you share a minimal repro app? It takes a ton of metrics to exceed a 4MB payload...
When I use opentelemetry-otlp 0.16.0, this error does not occur and the data is reported successfully without any loss. With newer versions, however, the issue arises. I therefore suspect that local metrics are not being cleared after reporting, causing the next report to merge with the previous one before being sent.
I'm running a simulation system where each cycle takes approximately 6 milliseconds. In each cycle, a metric is generated with 200 attributes to record the values that need to be reported for that cycle. However, after reporting, I noticed that the number of data points in the OpenTelemetry (OTel) Collector keeps increasing with each cycle. When I used the same code with opentelemetry-otlp 0.16, the collector processed 100 data points per second. But when I switched to a newer SDK version, the collector initially received 100 data points, then 200+ in the next cycle, then 300+, and the number kept growing. This makes me suspect that the client side is accumulating metrics instead of clearing them between reports.
I found the issue and resolved it by configuring Temporality::Delta. Here's the code: let exporter = MetricExporter::builder()
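The snippet above is cut off in the thread; a minimal sketch of that fix, assuming opentelemetry-otlp 0.27+ with the grpc-tonic feature enabled (builder method names per that version's API):

```rust
use opentelemetry_otlp::MetricExporter;
use opentelemetry_sdk::metrics::Temporality;

// Build an OTLP/gRPC metric exporter that reports deltas instead of
// cumulative totals. With Delta temporality each export carries only
// the data points recorded since the previous export, rather than
// every series ever observed, so the payload stops growing cycle
// over cycle.
let exporter = MetricExporter::builder()
    .with_tonic()
    .with_temporality(Temporality::Delta)
    .build()?;
```

The exporter is then handed to a PeriodicReader/MeterProvider as usual. Note that Delta shifts state-keeping to the backend, so make sure your collector pipeline accepts delta temporality.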
In version 0.27.1, an error occurs when reporting metrics after a certain period of time: "Maximum data points for metric stream exceeded. Entry added to overflow. Subsequent overflows to same metric until next collect will not be logged." After this message appears, the OTel Collector no longer receives metrics from the client. This issue was not present in the earlier opentelemetry-otlp version 0.16. In subsequent source-code analysis, I discovered that when this issue occurs, an excessive number of metrics is reported at once. This causes the data volume in a single request to exceed 4MB, leading to timeouts or failures.