xrate not good enough #18

Open
baryluk opened this issue Feb 12, 2023 · 3 comments

Comments

baryluk commented Feb 12, 2023

prometheus-2.39.1+0.0.3.linux-amd64

Some improvements, but this still needs to be fixed:

image

Scrape interval is 5s

Owner

free commented Feb 13, 2023

It is not entirely clear what you mean. Are you referring to the jittery ramp-up? The few small jitters later on? The fact that there exists a ramp-up at all?

Author

baryluk commented Feb 18, 2023

Hi. Yes, the ramp-up.

I would prefer (in most situations, especially for graphing) to just use fewer points, rather than interpreting a 5m rate as an average over the full 5 minutes with missing data counted as 0. I would rather it mean: smooth over 5m windows, but ignore non-existent points.

I know that with such a definition, variance at the start of a time series might be higher, so maybe there could be a minimum threshold (e.g. 1/4 of the time range), or something.
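To make the proposal concrete, here is a rough Python sketch of "ignore non-existing points, but require a minimum coverage". It is not how xrate or Prometheus is implemented; `smoothed_rate`, `min_coverage` and the sample data are hypothetical names and values chosen for illustration, and counter resets are ignored.

```python
def smoothed_rate(samples, window, min_coverage=0.25):
    """Per-second rate over the span actually covered by samples.

    samples: (timestamp_seconds, counter_value) pairs inside the window,
             oldest first. Hypothetical helper, not a Prometheus API.
    Returns None when the samples cover less than min_coverage * window.
    """
    if len(samples) < 2:
        return None
    (t0, v0), (t1, v1) = samples[0], samples[-1]
    covered = t1 - t0
    if covered < min_coverage * window:
        return None  # too little data: emit no point instead of a noisy one
    return (v1 - v0) / covered


WINDOW = 300  # a 5m range, in seconds

# A counter growing by 2 per second, scraped every 5s (as in the report).
early = [(t, 2.0 * t) for t in range(0, 31, 5)]  # only 30s of data so far
later = [(t, 2.0 * t) for t in range(0, 91, 5)]  # 90s of data

rate_early = smoothed_rate(early, WINDOW)  # below the 1/4 threshold
rate_later = smoothed_rate(later, WINDOW)  # above the 1/4 threshold
```

With these assumptions the series produces no point until a quarter of the window is covered, and from then on it reports the full rate immediately instead of ramping up.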

The current semantics are a bit weird, imho.

Owner

free commented Feb 18, 2023

Consider that increase() is essentially implemented as rate() * interval. With your proposed approach, a metric that appears, increments by 1 per second, and then disappears after a few minutes would produce a total increase of 86400 over 1d, even though the counter actually grew by only a few hundred.
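The arithmetic can be checked with a small Python sketch. It is a simplified model rather than the actual xrate code: it assumes the metric lives for exactly 5 minutes, ignores extrapolation and counter resets, and both helper functions are hypothetical.

```python
SCRAPE_INTERVAL = 5   # seconds, as in the report above
LIFETIME = 5 * 60     # assumption: the metric exists for 5 minutes
DAY = 24 * 60 * 60    # the 1d range, in seconds

# (timestamp, value) samples of a counter that increments by 1 per second.
samples = [(t, float(t)) for t in range(0, LIFETIME + 1, SCRAPE_INTERVAL)]


def rate_over_window(samples, window):
    """Observed increase divided by the whole window (missing data adds nothing)."""
    (_, v0), (_, v1) = samples[0], samples[-1]
    return (v1 - v0) / window


def rate_over_covered_span(samples):
    """Observed increase divided by only the time the samples cover."""
    (t0, v0), (t1, v1) = samples[0], samples[-1]
    return (v1 - v0) / (t1 - t0)


# increase() modeled as rate() * interval, under each definition of rate.
increase_window = rate_over_window(samples, DAY) * DAY
increase_covered = rate_over_covered_span(samples) * DAY

print(increase_window)   # about 300: what the counter really gained
print(increase_covered)  # 86400.0: the peak rate stretched over the whole day
```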

Similarly, if you compute the rate over multiple hours of short-lived metrics, it would look as if they were increasing at their peak rate for the whole duration (e.g. a metric that appears and disappears after 5 minutes would result in a 65-minute line if you used rate(metric[1h]) on it). Another way of looking at it is that one would expect the area under a chart to be proportional to the increase of the counter (the rate is the derivative of the counter; the counter is the integral of the rate). The 65-minute constant rate for a 5-minute counter hugely overestimates the counter increase.
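The area-under-the-chart argument can be illustrated with a rough Python simulation of graphing rate(metric[1h]) for a 5-minute counter. This is a simplified model under stated assumptions: a closed window, a 60s graph step, no extrapolation or reset handling, and hypothetical helper names.

```python
WINDOW = 3600  # the [1h] range, in seconds
STEP = 60      # assumed graph resolution, in seconds

# A counter that exists for 5 minutes, +1 per second, scraped every 5s.
samples = [(t, float(t)) for t in range(0, 301, 5)]
actual_increase = samples[-1][1] - samples[0][1]


def in_window(t):
    """Samples that fall inside the closed window [t - WINDOW, t]."""
    return [(ts, v) for ts, v in samples if t - WINDOW <= ts <= t]


def rate_window(t):
    """Increase divided by the full window length."""
    pts = in_window(t)
    if len(pts) < 2:
        return None
    return (pts[-1][1] - pts[0][1]) / WINDOW


def rate_covered(t):
    """Increase divided by only the span the samples cover."""
    pts = in_window(t)
    if len(pts) < 2:
        return None
    return (pts[-1][1] - pts[0][1]) / (pts[-1][0] - pts[0][0])


def area(rate_fn):
    """Approximate area under the plotted line over two hours."""
    values = [rate_fn(t) for t in range(0, 2 * WINDOW + 1, STEP)]
    return sum(v for v in values if v is not None) * STEP


area_window = area(rate_window)
area_covered = area(rate_covered)
print(actual_increase, area_window, area_covered)
```

In this model the full-window line is low but its area matches the counter's real increase, while the covered-span line sits at the peak rate for a bit over an hour and its area is more than ten times too large.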

Plus, there's the potentially huge variance when the metric appears/disappears.

It's not that one approach is strictly better than the other. But depending on exactly what one wants, one approach might be preferable to the other. And the truth is that switching the current behavior for a new one is likely to surprise/annoy more people and break more alerts/dashboards than retaining it.
