feat: "hello world" selectivity computation #70

wangpatrick57 · 2024-02-16T15:52:59Z

Description: The reason this PR contains only the simplest possible selectivity computation is that computing selectivity at all requires a lot of refactoring, and I want that refactoring reviewed separately from the "core" selectivity computation logic.

Demo: the `test_colref_eq_constint_in_mcv` test passes.
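As a rough illustration of the case the demo test's name suggests (a column reference compared for equality with an integer constant that appears among the column's most common values), here is a minimal sketch. It is not the code from this PR: the `freq` method, the `MapMcvs` type, the `col_eq_const_selectivity` function, and the `DEFAULT_EQ_SEL` placeholder value are all assumptions made for illustration. Only the idea that `MostCommonValues` is a trait comes from the commit list.

```rust
use std::collections::HashMap;

/// Hypothetical stand-in for the MCV statistics interface; the real trait's
/// methods are not shown in this PR page.
trait MostCommonValues {
    /// Fraction of rows holding `value`, if it is tracked as a most common value.
    fn freq(&self, value: i32) -> Option<f64>;
}

/// Hypothetical in-memory implementation backed by a map of value -> frequency.
struct MapMcvs {
    freqs: HashMap<i32, f64>,
}

impl MostCommonValues for MapMcvs {
    fn freq(&self, value: i32) -> Option<f64> {
        self.freqs.get(&value).copied()
    }
}

/// Assumed placeholder used when the constant is not among the MCVs.
const DEFAULT_EQ_SEL: f64 = 0.005;

/// Selectivity of `col = value`: the MCV frequency if known, else the placeholder.
fn col_eq_const_selectivity(mcvs: &dyn MostCommonValues, value: i32) -> f64 {
    mcvs.freq(value).unwrap_or(DEFAULT_EQ_SEL)
}

fn main() {
    let mcvs = MapMcvs {
        freqs: HashMap::from([(1, 0.3), (2, 0.2)]),
    };
    // The constant 1 is in the MCVs, so its recorded frequency is the selectivity.
    let sel = col_eq_const_selectivity(&mcvs, 1);
    assert!((sel - 0.3).abs() < 1e-12);
    println!("selectivity = {sel}");
}
```

Keeping the statistics behind a trait means a test can supply a small hand-built implementation like `MapMcvs` instead of real table statistics.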

optd-datafusion-repr/src/cost/base_cost.rs

"Core" means I am handling all cases where we don't fall back to hardcoded defaults (e.g. missing statistics, "col = col", "col = subquery", etc.). Note the difference between this PR and [hello world selectivity](cmu-db/optd#70). In this PR, the only file that changed is `base_cost.rs`, because all the "infrastructure" was already set up in "hello world selectivity".

I have written 20 unit tests to test this core logic in a variety of scenarios (nulls vs. no nulls, value in MCVs vs. not in MCVs, reversing the order of children, etc.). I chose to write code-based unit tests instead of SQLPlanner-like tests because they allow fine-grained control of the expression tree down to the order of children (and because it would take a lot more work to modify SQLPlanner to output cardinality). I made an effort to make the unit tests less brittle by defining helper functions such as `bin_op()` and `const_i32()` for constructing expression trees. If our expression tree representation changes in the future, it is likely that only these helper functions will need to change for the unit tests to work again.

Some TPC-H queries require "non-core" logic, so this code doesn't currently run with all of TPC-H. To avoid crashing, I simply return `INVALID_SELECTIVITY` (0.001) for any branches that aren't part of the "core". I will handle "non-core" logic in a future PR.

Another case not being handled is comparisons (`<`, `<=`, `>=`, `>`) between `Value` objects. Handling this requires further discussion because not all `Value`s have a meaningful "order". In the meantime, I hardcoded converting all `Value` objects to `i32` to perform comparisons.
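To make the testing approach concrete, here is a small sketch of how constructor helpers can isolate tests from the expression tree representation, and how a test can check that reversing the order of children does not change the result. Everything in it is an assumption for illustration: the `Expr` and `BinOpType` types, the `col_ref()` helper, the `eq_selectivity()` function, and the closure-based statistics lookup are hypothetical, and the signatures of `bin_op()` and `const_i32()` are guessed rather than taken from `base_cost.rs`. It also simplifies by returning `INVALID_SELECTIVITY` for values missing from the MCVs, a case the description treats as part of the "core" logic.

```rust
/// Hypothetical, simplified expression tree (not optd's real representation).
#[derive(Debug, Clone, PartialEq)]
enum Expr {
    ColumnRef(usize),
    ConstI32(i32),
    BinOp { op: BinOpType, left: Box<Expr>, right: Box<Expr> },
}

#[derive(Debug, Clone, Copy, PartialEq)]
enum BinOpType {
    Eq,
    Lt,
}

// Constructor helpers: if the tree representation changes, only these change.
fn col_ref(idx: usize) -> Expr {
    Expr::ColumnRef(idx)
}

fn const_i32(v: i32) -> Expr {
    Expr::ConstI32(v)
}

fn bin_op(op: BinOpType, left: Expr, right: Expr) -> Expr {
    Expr::BinOp { op, left: Box::new(left), right: Box::new(right) }
}

/// Value quoted in the PR description for branches outside the "core".
const INVALID_SELECTIVITY: f64 = 0.001;

/// Selectivity of an equality between a column and an i32 constant, in either
/// child order. `mcv_freq(col, value)` is an assumed statistics lookup.
fn eq_selectivity(expr: &Expr, mcv_freq: impl Fn(usize, i32) -> Option<f64>) -> f64 {
    let Expr::BinOp { op: BinOpType::Eq, left, right } = expr else {
        return INVALID_SELECTIVITY; // not an equality: outside this sketch
    };
    match (&**left, &**right) {
        // Accept both `col = const` and `const = col`.
        (Expr::ColumnRef(c), Expr::ConstI32(v)) | (Expr::ConstI32(v), Expr::ColumnRef(c)) => {
            mcv_freq(*c, *v).unwrap_or(INVALID_SELECTIVITY)
        }
        // e.g. "col = col": fall back to the hardcoded value.
        _ => INVALID_SELECTIVITY,
    }
}

fn main() {
    // Pretend column 0 holds the value 7 in 25% of its rows.
    let freq = |col: usize, v: i32| if col == 0 && v == 7 { Some(0.25) } else { None };

    let forward = bin_op(BinOpType::Eq, col_ref(0), const_i32(7));
    let reversed = bin_op(BinOpType::Eq, const_i32(7), col_ref(0));
    assert_eq!(eq_selectivity(&forward, &freq), 0.25);
    assert_eq!(eq_selectivity(&reversed, &freq), 0.25);

    let less_than = bin_op(BinOpType::Lt, col_ref(0), const_i32(7));
    assert_eq!(eq_selectivity(&less_than, &freq), INVALID_SELECTIVITY);
}
```

Because the tests only call `bin_op()`, `col_ref()`, and `const_i32()`, a change to the underlying enum would be absorbed by the three helpers rather than by every test case.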

wangpatrick57 added 20 commits February 10, 2024 15:37

simple_manual_test.sql

d098c4a

added optimizer arg to compute_cost()

1603e2d

fixed not passing optimizer from adaptive to base

9459779

fixed not passing context from adaptive to base

b9a960c

comments

afd9d1a

Merge branch 'main' into hello-selectivity

cc91873

some prints

0d023ba

get_all_group_physical_bindings -> get_all_group_bindings

53fd631

merge

f6bc8d8

removed some prints

5ac24fe

merge with main

d6c9afb

added children group IDs to context

20c43fa

now getting columnref and expr_tree correctly

e1c9bfe

skeleton for calling get_sel

b1dadc8

added stats interface to repr

e3c5c1e

made MostCommonValues a trait

e923a41

added failing test_colref_eq_constint_in_mcv test

cffbca2

hello world get_selectivity (but not its helpers) done

67aee8e

done with basic get_column_equality_selectivity

d73ba3c

clippy

90149e2

wangpatrick57 changed the title ~~A working "hello world" selectivity computation~~ feat: "hello world" selectivity computation Feb 16, 2024

wangpatrick57 marked this pull request as ready for review February 16, 2024 17:37

wangpatrick57 added 6 commits February 16, 2024 12:41

fmt

e1f449a

sensible selectivity default

0aab774

ci.sh

87d3435

Merge branch 'main' into hello-selectivity

a74b746

refactored LogOp, LogOpType, and LogOpExpr out. merged it with BinOp*

c590994

comments

8d3c8f0

Gun9niR reviewed Feb 16, 2024

optd-datafusion-repr/src/cost/base_cost.rs

Gun9niR reviewed Feb 17, 2024

optd-datafusion-repr/src/cost/base_cost.rs

wangpatrick57 added 2 commits February 17, 2024 16:22

merge

68c8444

merged with main

72f90ec

wangpatrick57 merged commit 755db92 into main Feb 19, 2024

wangpatrick57 deleted the hello-selectivity branch February 19, 2024 23:35

wangpatrick57 mentioned this pull request Feb 20, 2024

feat: core filter selectivity #81

Merged
