Storage for Aggregating Functions #1390

norberttech · 2025-01-18T03:19:00Z

Currently, all aggregating functions store their aggregated results in memory.
This is not fully aligned with Flow's core philosophy, which says that Flow should be memory efficient first and foremost.

To solve this problem, we should introduce a dedicated storage for the mixed results of aggregating functions, which they can easily read from and write to.

It should be configurable, meaning that the end user should be able to choose whether to aggregate values in memory (super fast but not scalable) or to use one of the available storage strategies.
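To make the idea concrete, here is a minimal sketch of what a pluggable storage could look like. It is written in Python purely for illustration and is not Flow's API: `AggregationStorage`, `InMemoryStorage`, `SqliteStorage` and `Sum` are all hypothetical names, and SQLite only stands in for "some storage that is not process memory".

```python
import pickle
import sqlite3
from typing import Any, Protocol


class AggregationStorage(Protocol):
    """Hypothetical contract every storage strategy would implement."""

    def get(self, key: str, default: Any = None) -> Any: ...

    def set(self, key: str, value: Any) -> None: ...


class InMemoryStorage:
    """Fast strategy: keeps every intermediate result in a dict."""

    def __init__(self) -> None:
        self._data: dict[str, Any] = {}

    def get(self, key: str, default: Any = None) -> Any:
        return self._data.get(key, default)

    def set(self, key: str, value: Any) -> None:
        self._data[key] = value


class SqliteStorage:
    """Slower strategy: keeps results in SQLite (pass a file path to use disk)."""

    def __init__(self, path: str = ":memory:") -> None:
        self._db = sqlite3.connect(path)
        self._db.execute(
            "CREATE TABLE IF NOT EXISTS agg (k TEXT PRIMARY KEY, v BLOB)"
        )

    def get(self, key: str, default: Any = None) -> Any:
        row = self._db.execute(
            "SELECT v FROM agg WHERE k = ?", (key,)
        ).fetchone()
        return default if row is None else pickle.loads(row[0])

    def set(self, key: str, value: Any) -> None:
        self._db.execute(
            "INSERT OR REPLACE INTO agg (k, v) VALUES (?, ?)",
            (key, pickle.dumps(value)),
        )
        self._db.commit()


class Sum:
    """Aggregating function that only talks to the storage contract."""

    def __init__(self, storage: AggregationStorage, key: str = "sum") -> None:
        self._storage = storage
        self._key = key

    def aggregate(self, value: float) -> None:
        current = self._storage.get(self._key, 0)
        self._storage.set(self._key, current + value)

    def result(self) -> float:
        return self._storage.get(self._key, 0)
```

With this shape, the user's choice is just which storage gets passed in: `Sum(InMemoryStorage())` or `Sum(SqliteStorage("aggregations.db"))`. The aggregating function itself does not change.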

We can approach it in a similar way to ~~GroupBy~~ Sort:

when memory consumption < X - use memory
when memory consumption > X - use cache
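The threshold rule above could be sketched roughly as follows. This is a Python illustration under several assumptions, not how Flow's Sort is implemented: `SpillingStorage` and `max_bytes` are hypothetical names, memory consumption is only approximated with `sys.getsizeof` on the stored values, and SQLite stands in for the cache.

```python
import pickle
import sqlite3
import sys
from typing import Any


class SpillingStorage:
    """Keeps results in memory until an estimated size exceeds max_bytes,
    then moves everything to SQLite and stays there."""

    def __init__(self, max_bytes: int, path: str = ":memory:") -> None:
        self._max_bytes = max_bytes  # the "X" threshold
        self._path = path            # a file path would put the spill on disk
        self._memory: dict[str, Any] = {}
        self._bytes = 0              # rough estimate, values only
        self._db: sqlite3.Connection | None = None

    @property
    def spilled(self) -> bool:
        return self._db is not None

    def get(self, key: str, default: Any = None) -> Any:
        if self._db is None:
            return self._memory.get(key, default)
        row = self._db.execute(
            "SELECT v FROM agg WHERE k = ?", (key,)
        ).fetchone()
        return default if row is None else pickle.loads(row[0])

    def set(self, key: str, value: Any) -> None:
        if self._db is not None:
            self._write(key, value)
            return
        # Replace the old value's size with the new one in the estimate.
        if key in self._memory:
            self._bytes -= sys.getsizeof(self._memory[key])
        self._memory[key] = value
        self._bytes += sys.getsizeof(value)
        if self._bytes > self._max_bytes:
            self._spill()

    def _spill(self) -> None:
        """Switch strategy: copy in-memory results out, then free them."""
        self._db = sqlite3.connect(self._path)
        self._db.execute(
            "CREATE TABLE IF NOT EXISTS agg (k TEXT PRIMARY KEY, v BLOB)"
        )
        for key, value in self._memory.items():
            self._write(key, value)
        self._memory.clear()
        self._bytes = 0

    def _write(self, key: str, value: Any) -> None:
        assert self._db is not None
        self._db.execute(
            "INSERT OR REPLACE INTO agg (k, v) VALUES (?, ?)",
            (key, pickle.dumps(value)),
        )
        self._db.commit()
```

The switch is one-way in this sketch: once the threshold is crossed, all reads and writes go to the slower storage. Whether to switch back when usage drops is a separate design decision.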

We might be better off using a dedicated storage rather than Cache, which is optimized for Rows. In this case, Redis-based storage might be a very good solution.

norberttech moved this to Todo in Roadmap Jan 18, 2025

norberttech added this to Roadmap Jan 18, 2025

norberttech added this to the 0.12.0 milestone Jan 18, 2025

norberttech added the performance label Jan 18, 2025

norberttech removed this from the 0.12.0 milestone Jan 26, 2025
