This repository has been archived by the owner on Jan 7, 2025. It is now read-only.

docs: add misc
Signed-off-by: Alex Chi Z <[email protected]>
skyzh committed Jan 16, 2024
1 parent 63dd5e6 commit ba22fa1
Showing 2 changed files with 35 additions and 0 deletions.
4 changes: 4 additions & 0 deletions docs/src/SUMMARY.md
@@ -26,3 +26,7 @@

- [SQLPlannerTest](./sqlplannertest.md)
- [Datafusion CLI](./datafusion_cli.md)

# Miscellaneous

- [Miscellaneous](./miscellaneous.md)
31 changes: 31 additions & 0 deletions docs/src/miscellaneous.md
@@ -0,0 +1,31 @@
# Miscellaneous

This note covers things that do not work well in the system right now.

## Type System

Currently, the decimal type is hard-coded to `15, 2` (precision 15, scale 2). Type inference should instead be done as part of schema property inference.

## Expression

optd supports exploring SQL expressions during the optimization process. However, this can be very inefficient: optimizing a plan node (e.g., turning a join into a hash join) usually needs the full binding of an expression tree, and the number of such bindings can grow exponentially with the size of the plan space.
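
To make the blow-up concrete, here is a minimal sketch that counts full bindings in a toy memo. The `Memo`, `Expr`, `chain`, and `count_full_bindings` names are hypothetical and are not optd's actual types; the sketch also assumes the memo is acyclic.

```rust
/// A memo expression, reduced to the ids of its child groups.
/// (Hypothetical type for illustration; not optd's API.)
struct Expr {
    children: Vec<usize>,
}

/// A toy memo: `groups[i]` holds the equivalent expressions of group `i`.
struct Memo {
    groups: Vec<Vec<Expr>>,
}

impl Memo {
    /// Number of distinct full expression trees rooted at `group`.
    /// Assumes the memo has no cycles, otherwise this would not terminate.
    fn count_full_bindings(&self, group: usize) -> u64 {
        self.groups[group]
            .iter()
            .map(|expr| {
                // One tree per combination of child trees; a leaf
                // (no children) contributes the empty product, 1.
                expr.children
                    .iter()
                    .map(|&child| self.count_full_bindings(child))
                    .product::<u64>()
            })
            .sum()
    }
}

/// Builds `depth + 1` groups with `alternatives` expressions each.
/// Group 0 holds leaves; every expression in group `g > 0` has two
/// children, both in group `g - 1`.
fn chain(depth: usize, alternatives: usize) -> Memo {
    let mut groups: Vec<Vec<Expr>> = Vec::new();
    groups.push((0..alternatives).map(|_| Expr { children: vec![] }).collect());
    for g in 1..=depth {
        groups.push(
            (0..alternatives)
                .map(|_| Expr { children: vec![g - 1, g - 1] })
                .collect(),
        );
    }
    Memo { groups }
}
```

With `chain(3, 2)` the memo stores only 8 expressions, yet `count_full_bindings(3)` is 32768, because each level multiplies the number of alternatives by the square of the level below.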

## Bindings

We do not have anything like the binding iterator described in the Cascades paper. Before applying a rule, we generate all bindings of a group up front, which might take a lot of memory. This should be fixed in the future.
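
As a rough illustration of the alternative, the sketch below yields bindings one at a time instead of materializing them all. It only handles a two-level pattern (a root expression plus one expression per child group), and `Memo`, `Expr`, `BindingIter`, and `example_memo` are hypothetical names, not optd's API or the exact design from the Cascades paper.

```rust
/// A memo expression: an operator name plus the ids of its child groups.
/// (Hypothetical type for illustration.)
struct Expr {
    op: &'static str,
    children: Vec<usize>,
}

/// A toy memo: `groups[i]` holds the equivalent expressions of group `i`.
struct Memo {
    groups: Vec<Vec<Expr>>,
}

/// Lazily enumerates bindings of `root` with one expression picked from
/// each child group, holding only a cursor per child (an "odometer").
struct BindingIter<'a> {
    memo: &'a Memo,
    root: &'a Expr,
    cursor: Vec<usize>,
    done: bool,
}

impl<'a> BindingIter<'a> {
    fn new(memo: &'a Memo, root: &'a Expr) -> Self {
        // If any child group is empty there is nothing to bind.
        let done = root.children.iter().any(|&g| memo.groups[g].is_empty());
        BindingIter { memo, root, cursor: vec![0; root.children.len()], done }
    }
}

impl<'a> Iterator for BindingIter<'a> {
    /// The chosen expression for each child of the root, in order.
    type Item = Vec<&'a Expr>;

    fn next(&mut self) -> Option<Self::Item> {
        if self.done {
            return None;
        }
        let (memo, root) = (self.memo, self.root);
        let binding: Vec<&'a Expr> = root
            .children
            .iter()
            .zip(&self.cursor)
            .map(|(&group, &idx)| &memo.groups[group][idx])
            .collect();
        // Advance the odometer: bump the last cursor, carrying leftwards.
        let mut pos = self.cursor.len();
        loop {
            if pos == 0 {
                self.done = true;
                break;
            }
            pos -= 1;
            self.cursor[pos] += 1;
            if self.cursor[pos] < memo.groups[root.children[pos]].len() {
                break;
            }
            self.cursor[pos] = 0;
        }
        Some(binding)
    }
}

/// Group 0 has two alternatives, group 1 has three, group 2 joins them.
fn example_memo() -> Memo {
    let leaf = |op| Expr { op, children: vec![] };
    Memo {
        groups: vec![
            vec![leaf("SeqScanA"), leaf("IndexScanA")],
            vec![leaf("SeqScanB"), leaf("IndexScanB"), leaf("SortedScanB")],
            vec![Expr { op: "Join", children: vec![0, 1] }],
        ],
    }
}
```

A rule that matches on the first binding can stop after a single `next()` call, and the memory held at any time is one cursor per child rather than the full list of bindings.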

## Cycle Detection

Consider the join commute rule.

```
(Join A B) <- group 1
(Projection (Join B A) <expressions list>) <- group 2
(Projection (Projection (Join A B) <expressions list>) <expressions list>) <- group 1 may refer to itself
```

After applying the rule twice, the memo table will contain self-referential groups. Currently, we detect such self-references in the optimize group task. There are probably better ways to do this.
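
One possible standalone check is a reachability search over child-group edges. The sketch below is not how optd currently detects cycles (that happens inside the optimize group task); `Memo`, `Expr`, `child_groups`, and `is_self_referential` are hypothetical names, and the group ids in the usage note are illustrative.

```rust
use std::collections::HashSet;

/// A memo expression, reduced to the ids of its child groups.
/// (Hypothetical type for illustration; not optd's API.)
struct Expr {
    children: Vec<usize>,
}

/// A toy memo: `groups[i]` holds the equivalent expressions of group `i`.
struct Memo {
    groups: Vec<Vec<Expr>>,
}

impl Memo {
    /// All groups referenced as a child by any expression in `group`.
    fn child_groups(&self, group: usize) -> impl Iterator<Item = usize> + '_ {
        self.groups[group]
            .iter()
            .flat_map(|expr| expr.children.iter().copied())
    }

    /// Returns true if `target` can reach itself by following
    /// child-group edges, i.e. the group is self-referential.
    fn is_self_referential(&self, target: usize) -> bool {
        let mut visited = HashSet::new();
        let mut stack: Vec<usize> = self.child_groups(target).collect();
        while let Some(group) = stack.pop() {
            if group == target {
                return true;
            }
            // Only expand each group once so the search terminates
            // even when other groups form cycles.
            if visited.insert(group) {
                stack.extend(self.child_groups(group));
            }
        }
        false
    }
}
```

For example, take groups 0 and 1 as the leaves `A` and `B`, group 2 holding a join over `[0, 1]` and a projection over `[3]`, and group 3 holding a join over `[1, 0]` and a projection over `[2]`. Then `is_self_referential(2)` is true, while it is false for the leaf groups.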

## Partial Exploration

Each iteration only gets slower, because we have to invoke the optimize group tasks again before we can find a group to apply a rule to. Keeping the task stack across runs would probably make this faster.
