This repository was archived by the owner on Jan 7, 2025. It is now read-only.
Commit 204758e
feat: caching optd stats, 12x speedup on TPC-H SF1 (#132)
**Summary**: This PR caches the stat objects used by `OptCostModel`,
so data only needs to be loaded into DataFusion the first time the
stats are computed.
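The load-once-then-reuse idea can be sketched as below. This is a minimal, std-only illustration and not the code from this PR: `TableStats`, `get_stats`, the `use_cache` flag, and the `table=rows` text encoding are all hypothetical stand-ins (the PR serializes the real stat objects with `serde`).

```rust
use std::collections::HashMap;
use std::fs;
use std::io;
use std::path::Path;

/// Hypothetical stand-in for the cached stats object (just row counts per table).
#[derive(Debug, Clone, PartialEq)]
struct TableStats {
    row_counts: HashMap<String, u64>,
}

impl TableStats {
    /// Toy "table=rows" text encoding, used only to keep this sketch std-only.
    fn encode(&self) -> String {
        let mut lines: Vec<String> = self
            .row_counts
            .iter()
            .map(|(table, rows)| format!("{table}={rows}"))
            .collect();
        lines.sort(); // deterministic output regardless of HashMap order
        lines.join("\n")
    }

    /// Returns None if any line is malformed, so a corrupt cache is ignored.
    fn decode(text: &str) -> Option<Self> {
        let mut row_counts = HashMap::new();
        for line in text.lines() {
            let (table, rows) = line.split_once('=')?;
            row_counts.insert(table.to_string(), rows.parse::<u64>().ok()?);
        }
        Some(TableStats { row_counts })
    }
}

/// Returns cached stats when `use_cache` is set and a readable cache file
/// exists; otherwise runs `compute` (the expensive path) and, if caching is
/// enabled, writes the result for the next run.
fn get_stats(
    cache_path: &Path,
    use_cache: bool,
    compute: impl FnOnce() -> TableStats,
) -> io::Result<TableStats> {
    if use_cache {
        if let Ok(text) = fs::read_to_string(cache_path) {
            if let Some(stats) = TableStats::decode(&text) {
                return Ok(stats);
            }
        }
    }
    let stats = compute();
    if use_cache {
        fs::write(cache_path, stats.encode())?;
    }
    Ok(stats)
}
```

With `use_cache` set to `false` the function always recomputes and never touches the file, which mirrors the opt-in behavior described under Details.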
**Demo**:
Roughly 12x speedup on TPC-H SF1 compared to not caching stats (45.6s vs. 3.9s, about 11.7x).
Caching everything _except_ optd stats takes 45.6s total.

Caching everything, _including_ optd stats, takes 3.9s total.

**Details**:
* This caching is **disabled by default** to avoid accidentally using
stale stats. I added a CLI arg to enable it.
* The main challenge of this PR was making `PerTableStats` a
serializable object for `serde`.
* The serializability refactor will also help down the line when we want
to **put statistics in the catalog**, since that is fundamentally a
serialization problem too. Having `Box<dyn ...>` would make putting
stats in the catalog more difficult.
* This required a significant refactor of how the `MostCommonValues` and
`Distribution` traits are handled in `OptCostModel`. Instead of storing
`Box<dyn ...>` values in `PerColumnStats`, which can hold any object that
implements these traits, I made `PerColumnStats` generic over the concrete types.
* The one downside of this refactor is that we can no longer have a
database which uses _different_ data structures for `Distribution` (like
a t-digest for one column, a histogram for another, etc.). I didn't see
this as a big enough reason to not do the refactor because it seems like
a rare thing to do. Additionally, if we really needed to do this, we
could just make an enum that had both types.

1 parent 3477898 commit 204758e
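The shape of the refactor, and the enum fallback mentioned in the last bullet, can be sketched roughly as follows. This is an illustration under assumptions, not the code in this commit: the trait methods, the field names, and the `ListMcvs`, `Uniform`, `EquiDepthHistogram`, and `AnyDistribution` types are all hypothetical. The `serde` derives are left out so the block compiles with the standard library alone; with concrete type parameters, `#[derive(Serialize, Deserialize)]` can be applied to a struct like this, whereas a `Box<dyn ...>` field has no derivable implementation.

```rust
/// Simplified stand-ins for the two traits named in the PR.
trait MostCommonValues {
    /// Frequency of `value` if it is one of the tracked common values.
    fn freq(&self, value: &str) -> Option<f64>;
}

trait Distribution {
    /// Estimated fraction of rows whose value is <= `value`.
    fn cdf(&self, value: f64) -> f64;
}

/// Generic over the concrete stat types instead of holding Box<dyn ...>.
/// Field names here are assumptions for illustration.
#[derive(Debug, Clone, PartialEq)]
struct PerColumnStats<M: MostCommonValues, D: Distribution> {
    mcvs: M,
    distr: D,
    ndistinct: u64,
}

#[derive(Debug, Clone, PartialEq)]
struct ListMcvs(Vec<(String, f64)>);

impl MostCommonValues for ListMcvs {
    fn freq(&self, value: &str) -> Option<f64> {
        self.0.iter().find(|(v, _)| v == value).map(|(_, f)| *f)
    }
}

#[derive(Debug, Clone, PartialEq)]
struct Uniform {
    min: f64,
    max: f64,
}

impl Distribution for Uniform {
    fn cdf(&self, value: f64) -> f64 {
        if self.max <= self.min {
            // Degenerate range: treat the column as a single point.
            return if value >= self.max { 1.0 } else { 0.0 };
        }
        ((value - self.min) / (self.max - self.min)).clamp(0.0, 1.0)
    }
}

#[derive(Debug, Clone, PartialEq)]
struct EquiDepthHistogram {
    /// Upper bound of each bucket; every bucket holds the same share of rows.
    upper_bounds: Vec<f64>,
}

impl Distribution for EquiDepthHistogram {
    fn cdf(&self, value: f64) -> f64 {
        if self.upper_bounds.is_empty() {
            return 0.0;
        }
        let covered = self.upper_bounds.iter().filter(|b| **b <= value).count();
        covered as f64 / self.upper_bounds.len() as f64
    }
}

/// The enum fallback: one concrete type that can hold either structure,
/// so different columns can still use different distributions.
#[derive(Debug, Clone, PartialEq)]
enum AnyDistribution {
    Uniform(Uniform),
    Histogram(EquiDepthHistogram),
}

impl Distribution for AnyDistribution {
    fn cdf(&self, value: f64) -> f64 {
        match self {
            AnyDistribution::Uniform(d) => d.cdf(value),
            AnyDistribution::Histogram(d) => d.cdf(value),
        }
    }
}

/// Example consumer: selectivity of `col <= value`.
fn selectivity_le<M: MostCommonValues, D: Distribution>(
    stats: &PerColumnStats<M, D>,
    value: f64,
) -> f64 {
    stats.distr.cdf(value)
}
```

The tradeoff is the one stated in the list above: choosing `D` fixes a single distribution type for every column, and an enum such as `AnyDistribution` restores per-column choice at the cost of a `match` on each call.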
File tree

18 files changed, +281 -199 lines changed

- optd-core
  - src
- optd-datafusion-repr/src
  - bin
  - cost
  - plan_nodes
- optd-gungnir
  - src/stats
- optd-perftest
  - src
  - tests