This repository was archived by the owner on Jan 7, 2025. It is now read-only.

feat: "hello world" selectivity computation #70

Merged 28 commits on Feb 19, 2024


@wangpatrick57 (Member) commented Feb 16, 2024

Description: This PR contains only the simplest possible selectivity computation because computing selectivity at all requires a lot of refactoring, and I want that refactoring reviewed separately from the "core" selectivity computation logic.

Demo: the `test_colref_eq_constint_in_mcv` test passes.
[Screenshot 2024-02-16 at 12:37:56]
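
Judging only by the test name, the demo case is a `column = integer constant` predicate where the constant appears among the column's most common values (MCVs). A minimal sketch of that lookup follows; `ColumnStats` and `eq_const_selectivity` are hypothetical names for illustration, not the types or functions used in this PR.

```rust
use std::collections::HashMap;

/// Hypothetical per-column statistics: each most common value (MCV)
/// mapped to the fraction of rows that hold it.
struct ColumnStats {
    mcvs: HashMap<i32, f64>,
}

/// Selectivity of `col = value` when `value` is tracked as an MCV.
/// Returns `None` when the value is not an MCV, so the caller decides
/// how to handle that case.
fn eq_const_selectivity(stats: &ColumnStats, value: i32) -> Option<f64> {
    stats.mcvs.get(&value).copied()
}
```

Returning `Option` keeps the "in MCVs" case separate from everything else, which is the only case this sketch covers.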

@wangpatrick57 changed the title from `A working "hello world" selectivity computation` to `feat: "hello world" selectivity computation` on Feb 16, 2024
@wangpatrick57 marked this pull request as ready for review on February 16, 2024 17:37
@wangpatrick57 merged commit 755db92 into main on Feb 19, 2024
@wangpatrick57 deleted the hello-selectivity branch on February 19, 2024 23:35
wangpatrick57 referenced this pull request Feb 20, 2024
"Core" means I am handling all cases where we don't fall back to
hardcoded defaults (cases that do fall back include missing statistics,
"col = col", "col = subquery", etc.).

Note the difference between this PR and [hello world
selectivity](cmu-db/optd#70). In this PR, the
only file that changed is `base_cost.rs`, because all the
"infrastructure" was already set up in "hello world selectivity".

I have written 20 unit tests to test this core logic in a variety of
scenarios (nulls vs no nulls, value in MCVs vs not in MCVs, reversing
the order of children, etc.). I chose to write code-based unit tests
instead of SQLPlanner-like tests because they allow fine-grained control
of the expression tree down to the order of children (and because it
would take a lot more work to modify SQLPlanner to output cardinality). I
made an effort to make the unit tests less brittle by defining helper
functions such as `bin_op()` or `const_i32()` for constructing
expression trees. If our expression tree representation changes in the
future, it is likely that only these helper functions need to be changed
to make the unit tests work again.
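
As a rough illustration of why such helpers keep tests less brittle, here is a sketch built on a simplified, hypothetical `Expr` type. Only the helper names `bin_op()` and `const_i32()` come from the text above; their signatures, the `col_ref()` helper, and the `Expr` and `BinOpType` types are assumptions and do not reflect optd's actual representation.

```rust
/// Hypothetical, simplified expression tree (not optd's real type).
#[derive(Debug, Clone, PartialEq)]
enum Expr {
    ColumnRef(usize),
    ConstI32(i32),
    BinOp {
        op: BinOpType,
        left: Box<Expr>,
        right: Box<Expr>,
    },
}

/// Hypothetical subset of binary operators.
#[derive(Debug, Clone, Copy, PartialEq)]
enum BinOpType {
    Eq,
    Lt,
}

/// Builds a column reference by index (assumed helper).
fn col_ref(idx: usize) -> Expr {
    Expr::ColumnRef(idx)
}

/// Builds an `i32` constant.
fn const_i32(value: i32) -> Expr {
    Expr::ConstI32(value)
}

/// Builds a binary operation; child order is preserved exactly as given.
fn bin_op(op: BinOpType, left: Expr, right: Expr) -> Expr {
    Expr::BinOp {
        op,
        left: Box::new(left),
        right: Box::new(right),
    }
}
```

With helpers like these, a test can build `bin_op(BinOpType::Eq, col_ref(0), const_i32(1))` and its reversed form `bin_op(BinOpType::Eq, const_i32(1), col_ref(0))` without naming `Expr` variants directly, so a change to the tree representation would only touch the helper bodies.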

Some TPC-H queries require "non-core" logic, so this code doesn't
currently run with all of TPC-H. To avoid crashing, I simply return
`INVALID_SELECTIVITY` (0.001) for any branch that isn't part of the
"core". I will handle "non-core" logic in a future PR.

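The fallback described above can be sketched as a catch-all match arm. The constant's name and value are taken from the text; the `Predicate` enum and the `selectivity` function are hypothetical and only illustrate the shape of the fallback, not the code in `base_cost.rs`.

```rust
/// Placeholder selectivity quoted in the description above.
const INVALID_SELECTIVITY: f64 = 0.001;

/// Hypothetical predicate shapes, for illustration only.
enum Predicate {
    /// `col = const`, carrying the constant's MCV frequency if known.
    ColEqConst { mcv_freq: Option<f64> },
    /// `col = col`
    ColEqCol,
    /// `col = (subquery)`
    ColEqSubquery,
}

/// Returns a real estimate for the one handled case and the
/// placeholder for every other branch, so planning never panics.
fn selectivity(pred: &Predicate) -> f64 {
    match pred {
        Predicate::ColEqConst { mcv_freq: Some(freq) } => *freq,
        // Everything else is treated as "non-core" in this sketch.
        _ => INVALID_SELECTIVITY,
    }
}
```

A named constant makes the placeholder easy to find and replace once the remaining branches get real estimates.
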
Another case not being handled is comparisons (`<`, `<=`, `>=`, `>`) between
`Value` objects. Handling this requires further discussion because not
all Values have a meaningful "order". In the meantime, I hardcoded
converting all `Value` objects to `i32` to perform comparisons.
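
To make the stopgap concrete, here is a sketch of comparing values by first converting them to `i32`. The `Value` enum below is a hypothetical stand-in with three assumed variants, and `as_i32` and `compare_values` are illustrative names; optd's real `Value` type and conversion code may look different.

```rust
use std::cmp::Ordering;

/// Hypothetical stand-in for optd's `Value` type.
#[derive(Debug, Clone)]
enum Value {
    Int32(i32),
    Bool(bool),
    String(String),
}

impl Value {
    /// Stopgap conversion to `i32`. Returns `None` for variants that
    /// have no obvious integer form in this sketch.
    fn as_i32(&self) -> Option<i32> {
        match self {
            Value::Int32(v) => Some(*v),
            Value::Bool(b) => Some(*b as i32),
            Value::String(_) => None,
        }
    }
}

/// Compares two values through their `i32` form; `None` means the
/// pair cannot be ordered this way.
fn compare_values(a: &Value, b: &Value) -> Option<Ordering> {
    Some(a.as_i32()?.cmp(&b.as_i32()?))
}
```

Returning `Option<Ordering>` makes the open question visible in the signature: values without a meaningful order produce `None` rather than an arbitrary result.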