Support uses of BACK that cause correlated references: setup decorrelation handling #251

Merged
merged 148 commits into from
Feb 18, 2025
Changes from 139 commits
148 commits
deeb914
Starting function list documentation
knassre-bodo Jan 13, 2025
5d6c513
Adding datetime functions and bad boolean tests
knassre-bodo Jan 13, 2025
170492e
Merge branch 'main' into kian/function_docs
knassre-bodo Jan 13, 2025
8b8c098
Adding remaining functions including agg/window
knassre-bodo Jan 13, 2025
cb83f1f
Adding toc
knassre-bodo Jan 13, 2025
ef39e6e
Adding toc
knassre-bodo Jan 13, 2025
3c79543
Fixing typo [RUN CI]
knassre-bodo Jan 13, 2025
3db13d1
Started DSL documentation
knassre-bodo Jan 13, 2025
8dc17b3
Adding calc, contextless, and back
knassre-bodo Jan 13, 2025
044c217
Added TOC and TODOs
knassre-bodo Jan 13, 2025
0252cee
Added TOC and TODOs
knassre-bodo Jan 13, 2025
98c722b
Changing highlighting
knassre-bodo Jan 13, 2025
aa86d31
Merge branch 'kian/function_docs' into kian/dsl_docs
knassre-bodo Jan 13, 2025
b03a5ec
Addded more examples
knassre-bodo Jan 13, 2025
b569bfb
Added WHERE, ORDER_BY, and TOP_K documentation
knassre-bodo Jan 14, 2025
7607fea
Fixing typo
knassre-bodo Jan 14, 2025
215735b
Added extra example
knassre-bodo Jan 14, 2025
0b8e1fb
Update pydough/unqualified/unqualified_node.py
knassre-bodo Jan 14, 2025
a5190f8
Update documentation/functions.md
knassre-bodo Jan 14, 2025
a0d43f3
Update documentation/functions.md
knassre-bodo Jan 14, 2025
24c1e4a
Update documentation/functions.md
knassre-bodo Jan 14, 2025
8009e05
Update documentation/functions.md
knassre-bodo Jan 14, 2025
5d87b59
Update documentation/functions.md
knassre-bodo Jan 14, 2025
a3ee536
Update documentation/functions.md
knassre-bodo Jan 14, 2025
9945752
Update documentation/functions.md
knassre-bodo Jan 14, 2025
4c6aa79
Update documentation/functions.md
knassre-bodo Jan 14, 2025
e60dcc3
Update documentation/functions.md
knassre-bodo Jan 14, 2025
a23c395
Update documentation/functions.md
knassre-bodo Jan 14, 2025
2515683
Update documentation/functions.md
knassre-bodo Jan 14, 2025
d34c64c
Update documentation/functions.md
knassre-bodo Jan 14, 2025
cd59dc2
Update documentation/functions.md
knassre-bodo Jan 14, 2025
92778e3
Update documentation/functions.md
knassre-bodo Jan 14, 2025
6003ed0
Update documentation/functions.md
knassre-bodo Jan 14, 2025
fb86e3a
Update documentation/functions.md
knassre-bodo Jan 14, 2025
ca6b653
Updating arithmetic documentaiton and LIKE link
knassre-bodo Jan 14, 2025
22dfb32
Updating numerical operator warning
knassre-bodo Jan 14, 2025
e550ae4
Added function list checking test and 3 missing functions
knassre-bodo Jan 14, 2025
f7af1a9
Updated some explanations
knassre-bodo Jan 14, 2025
3e0b62c
Merge branch 'kian/function_docs' into kian/dsl_docs
knassre-bodo Jan 14, 2025
6a66c5f
Started PARTITION docs, still need to add a few more bad examples
knassre-bodo Jan 14, 2025
19cfd47
Added expressions, more bad partition examples, and NEXT/PREV
knassre-bodo Jan 14, 2025
acba2c6
Started BEST documentation, still need to do bad next/prev/best examples
knassre-bodo Jan 14, 2025
3f0b703
[RUN CI]
knassre-bodo Jan 16, 2025
fb25054
Merge branch 'kian/function_docs' into kian/dsl_docs
knassre-bodo Jan 16, 2025
eb81179
Merge branch 'main' into kian/dsl_docs
knassre-bodo Jan 17, 2025
0f5df29
Added bad next/prev examples
knassre-bodo Jan 17, 2025
9752b92
Added bad next/prev examples
knassre-bodo Jan 17, 2025
223c1b5
Adding examples and fixing 911 bugs to AST/hybrid handling [RUN CI]
knassre-bodo Jan 21, 2025
7b142eb
Updated TOC
knassre-bodo Jan 21, 2025
947d405
Merge branch 'main' into kian/dsl_docs
knassre-bodo Jan 21, 2025
df7e1b0
Adding 911 bugfix for partition and corresponding tests [RUN CI]
knassre-bodo Jan 21, 2025
3f167a4
Fixing mkglot examples [RUN CI]
knassre-bodo Jan 21, 2025
09f2bfe
Added extra DSL example comments
knassre-bodo Jan 21, 2025
30536e8
Added extra DSL example comments
knassre-bodo Jan 21, 2025
261f869
Updating alias counter
knassre-bodo Jan 22, 2025
d5b315e
Merge branch 'kian/dsl_docs' into kian/fix_extra_joins
knassre-bodo Jan 22, 2025
3794707
Adding triple partition test
knassre-bodo Jan 22, 2025
c04f34d
Adjiusting triple_partition test [RUN CI]
knassre-bodo Jan 22, 2025
39ce2ac
Fixing unit test [RUN CI]
knassre-bodo Jan 22, 2025
28065c0
Update documentation/dsl.md
knassre-bodo Jan 22, 2025
d88192f
Update documentation/dsl.md
knassre-bodo Jan 22, 2025
fed91e5
Added extra WHERE examples
knassre-bodo Jan 23, 2025
7d9492d
Revisions
knassre-bodo Jan 23, 2025
2ef58e3
Update documentation/dsl.md
knassre-bodo Jan 23, 2025
b904f3a
Update documentation/dsl.md
knassre-bodo Jan 23, 2025
7f69536
Updating capitalization
knassre-bodo Jan 23, 2025
fb19842
Resolving conflicts
knassre-bodo Jan 23, 2025
d1ffdb5
Apply suggestions from code review
knassre-bodo Jan 23, 2025
91e2a74
Extra revisions
knassre-bodo Jan 23, 2025
8b9db1e
More plural fixes
knassre-bodo Jan 23, 2025
2c29a3d
Merge branch 'kian/dsl_docs' into kian/fix_extra_joins
knassre-bodo Jan 23, 2025
e8b0cdb
Refactor agg call handling
knassre-bodo Jan 23, 2025
a5d8c1f
WIP progress on correlated references
knassre-bodo Jan 24, 2025
836adca
Initially working implementaiton of relational handling, still need t…
knassre-bodo Jan 27, 2025
9c72887
Merge branch 'main' into kian/corrleated_backref
knassre-bodo Jan 27, 2025
49f46d8
Rolling back SQLGlot changes and ensuring correl names are only used …
knassre-bodo Jan 27, 2025
81d130e
Added SQLGlot support for correlated references; working only with ex…
knassre-bodo Jan 27, 2025
60fea67
Adding additional tests, have a plan for how to deal with the 4 major…
knassre-bodo Jan 27, 2025
9980a2c
Pulling down changes
knassre-bodo Jan 28, 2025
4c54d57
Pulling out SQLGlot changes into followup PR
knassre-bodo Jan 28, 2025
7a3fd30
Resolving conflicts
knassre-bodo Jan 28, 2025
b519cd0
Added two more correlated backref edge cases
knassre-bodo Jan 30, 2025
4033936
Fixing backref name conflict bug
knassre-bodo Jan 31, 2025
c2bbb59
Merge branch 'kian/corrleated_backref' into kian/correlated_backref_2
knassre-bodo Jan 31, 2025
8fcc609
Confirmed all correl queries except 1, 2, 3, 6, 8 and 9 are working
knassre-bodo Jan 31, 2025
8a8e49f
Added multi-correlate example
knassre-bodo Jan 31, 2025
4eeadd9
Merge branch 'kian/corrleated_backref' into kian/correlated_backref_2
knassre-bodo Jan 31, 2025
9427cce
Finished adding/refining complex correlation tests
knassre-bodo Feb 3, 2025
397f40d
Pulling up testing changes
knassre-bodo Feb 3, 2025
71ca0e2
Merge branch 'kian/corrleated_backref' into kian/correlated_backref_2
knassre-bodo Feb 3, 2025
719eec5
Adding two more correl tests
knassre-bodo Feb 3, 2025
2289aaf
Merge branch 'kian/corrleated_backref' into kian/correlated_backref_2
knassre-bodo Feb 3, 2025
52c2af5
Fixing correl 16
knassre-bodo Feb 3, 2025
19b3628
Fixing correlated test #16
knassre-bodo Feb 3, 2025
600616e
Merge branch 'kian/corrleated_backref' into kian/correlated_backref_2
knassre-bodo Feb 3, 2025
b6c9b85
WIP handling renamings
knassre-bodo Feb 3, 2025
afd241b
Converted qdag conversion tests to be plan file based
knassre-bodo Feb 3, 2025
ceff105
Converting pipeline files to new format
knassre-bodo Feb 3, 2025
64311bc
[RUN CI]
knassre-bodo Feb 3, 2025
26b21a2
Mass updating using PYDOUGH_UPDATE_TESTS
knassre-bodo Feb 3, 2025
8d6878f
Adding comments [RUN CI]
knassre-bodo Feb 3, 2025
06dd69f
Porting correl tests to planner files
knassre-bodo Feb 3, 2025
6949cdd
Revisions [RUN CI]
knassre-bodo Feb 4, 2025
7afe8ad
Resolving conflicts
knassre-bodo Feb 4, 2025
8184a7e
Merge branch 'kian/corrleated_backref' into kian/correlated_backref_2
knassre-bodo Feb 4, 2025
318eb1b
Resolving conflicts
knassre-bodo Feb 6, 2025
45fea5b
Resolving conflicts
knassre-bodo Feb 6, 2025
7bb353f
Resolving conflicts
knassre-bodo Feb 6, 2025
0cf4c11
Adding documentation
knassre-bodo Feb 6, 2025
3a1516e
Adding more comments
knassre-bodo Feb 6, 2025
746eed5
Added more comments [RUN CI]
knassre-bodo Feb 6, 2025
4818c59
Merge branch 'kian/corrleated_backref' into kian/correlated_backref_2
knassre-bodo Feb 6, 2025
17116d4
Initial handling of decorrelation setup
knassre-bodo Feb 6, 2025
9ab9bed
Adding decorrelater file
knassre-bodo Feb 6, 2025
a8f6535
Renaming and added comment
knassre-bodo Feb 6, 2025
cf8cc50
Merge branch 'main' into kian/corrleated_backref
knassre-bodo Feb 6, 2025
7556b5e
Merge branch 'kian/corrleated_backref' into kian/correlated_backref_2
knassre-bodo Feb 6, 2025
37ebdb4
Merge branch 'kian/correlated_backref_2' into kian/correlated_backref_3
knassre-bodo Feb 6, 2025
8ff5157
Implented singular decorrelation handling
knassre-bodo Feb 7, 2025
9191ec4
Fixing aggregation for singular case
knassre-bodo Feb 7, 2025
9bc3564
WIP handling edge cases and aggregation; need to fix tpch q5/q22, and…
knassre-bodo Feb 7, 2025
67a342e
Updating plans for newly working correl queries 6/9/17
knassre-bodo Feb 7, 2025
bb24b30
Bugfixes to decorrelation, compressing helper functions
knassre-bodo Feb 9, 2025
8f83903
Filling vlaue for correl tests 1/2/3
knassre-bodo Feb 9, 2025
b096443
Pulling up testing completion changes
knassre-bodo Feb 9, 2025
e9f2606
Updating refsols
knassre-bodo Feb 9, 2025
9772f61
Merge branch 'kian/corrleated_backref' into kian/correlated_backref_2
knassre-bodo Feb 9, 2025
ec93596
Resolving conflicts
knassre-bodo Feb 9, 2025
0a00cea
Resolving issues for correl #3
knassre-bodo Feb 10, 2025
d1e1520
Fixing correl #3 test output
knassre-bodo Feb 10, 2025
fd37df1
Moving correlated tests to their own file
knassre-bodo Feb 10, 2025
a75bba5
Merge branch 'kian/corrleated_backref' into kian/correlated_backref_2
knassre-bodo Feb 10, 2025
8d75668
Merge branch 'kian/correlated_backref_2' into kian/correlated_backref_3
knassre-bodo Feb 10, 2025
2cfe999
Added documentation
knassre-bodo Feb 10, 2025
5a55ac5
Pulling up downstream test changes
knassre-bodo Feb 10, 2025
ad61d27
Pulling up downstream test changes
knassre-bodo Feb 10, 2025
07d771d
Merge branch 'kian/corrleated_backref' into kian/correlated_backref_2
knassre-bodo Feb 10, 2025
ba095ec
Resolving conflicts
knassre-bodo Feb 10, 2025
7ef0476
Cleanup
knassre-bodo Feb 10, 2025
c83b67e
Removing dead code
knassre-bodo Feb 10, 2025
fdbf909
Revisions [RUN CI]
knassre-bodo Feb 17, 2025
ec77c0d
Merge branch 'kian/correlation_omnibus' into kian/corrleated_backref
knassre-bodo Feb 17, 2025
a5531e3
Revisions
knassre-bodo Feb 17, 2025
4becac1
Merge branch 'kian/corrleated_backref' into kian/correlated_backref_2
knassre-bodo Feb 17, 2025
c4af826
Resolve conflicts
knassre-bodo Feb 17, 2025
f44ba0e
Merge branch 'kian/correlated_backref_2' into kian/correlated_backref_3
knassre-bodo Feb 17, 2025
11ac86d
Merge branch 'kian/correlation_omnibus' into kian/correlated_backref_3
knassre-bodo Feb 17, 2025
2105bc8
Fixing typos
knassre-bodo Feb 18, 2025
330 changes: 330 additions & 0 deletions pydough/conversion/hybrid_decorrelater.py
@@ -0,0 +1,330 @@
"""
Logic for applying de-correlation to hybrid trees before relational conversion
if the correlate is not a semi/anti join.
"""

__all__ = ["run_hybrid_decorrelation"]


import copy

from .hybrid_tree import (
    ConnectionType,
    HybridBackRefExpr,
    HybridCalc,
    HybridChildRefExpr,
    HybridColumnExpr,
    HybridConnection,
    HybridCorrelExpr,
    HybridExpr,
    HybridFilter,
    HybridFunctionExpr,
    HybridLiteralExpr,
    HybridPartition,
    HybridRefExpr,
    HybridTree,
    HybridWindowExpr,
)


class Decorrelater:
    """
    Class that encapsulates the logic used for de-correlation of hybrid trees.
    """

    def make_decorrelate_parent(
        self, hybrid: HybridTree, child_idx: int, required_steps: int
    ) -> HybridTree:
        """
        Creates a snapshot of the ancestry of the hybrid tree that contains
        a correlated child, without any of its children, its descendants, or
        any pipeline operators that do not need to be there.

        Args:
            `hybrid`: The hybrid tree to create a snapshot of in order to aid
            in the de-correlation of a correlated child.
            `child_idx`: The index of the correlated child of hybrid that the
            snapshot is being created to aid in the de-correlation of.
            `required_steps`: The index of the last pipeline operator that
            needs to be included in the snapshot in order for the child to be
            derivable.

        Returns:
            A snapshot of `hybrid` and its ancestry in the hybrid tree, without
            any of its children or pipeline operators that occur during or
            after the derivation of the correlated child, and without any of
            its descendants.
        """
        if isinstance(hybrid.pipeline[0], HybridPartition) and child_idx == 0:
            # Special case: if the correlated child is the data argument of a
            # partition operation, then the parent to snapshot is actually the
            # parent of the level containing the partition operation. In this
            # case, all of the parent's children & pipeline operators should be
            # included in the snapshot.
            assert hybrid.parent is not None
            return self.make_decorrelate_parent(
                hybrid.parent, len(hybrid.parent.children), len(hybrid.pipeline)
            )
        # Temporarily detach the successor of the current level, then create a
        # deep copy of the current level (which will include its ancestors),
        # then reattach the successor back to the original. This ensures that
        # the descendants of the current level are not included when providing
        # the parent to the correlated child as its new ancestor.
        successor: HybridTree | None = hybrid.successor
        hybrid._successor = None
        new_hybrid: HybridTree = copy.deepcopy(hybrid)
        hybrid._successor = successor
        # Ensure the new parent only includes the children & pipeline operators
        # that it has to.
        new_hybrid._children = new_hybrid._children[:child_idx]
        new_hybrid._pipeline = new_hybrid._pipeline[: required_steps + 1]
        return new_hybrid

    def remove_correl_refs(
        self, expr: HybridExpr, parent: HybridTree, child_height: int
    ) -> HybridExpr:
        """
        Recursively & destructively removes correlated references within a
        hybrid expression if they point to a specific correlated ancestor
        hybrid tree, and replaces them with corresponding BACK references.

        Args:
            `expr`: The hybrid expression to remove correlated references from.
            `parent`: The correlated ancestor hybrid tree that the correlated
            references should point to when they are targeted for removal.
            `child_height`: The height of the correlated child within the
            hybrid tree that the correlated references are pointing to. This
            is the number of BACK indices to shift by when replacing the
            correlated reference with a BACK reference.

        Returns:
            The hybrid expression with all correlated references to `parent`
            replaced with corresponding BACK references. The replacement also
            happens in-place.
        """
        match expr:
            case HybridCorrelExpr():
                # If the correlated reference points to the parent, then
                # replace it with a BACK reference. Otherwise, recursively
                # transform its input expression in case it contains another
                # correlated reference.
                if expr.hybrid is parent:
                    result: HybridExpr | None = expr.expr.shift_back(child_height)
                    assert result is not None
                    return result
                else:
                    expr.expr = self.remove_correl_refs(expr.expr, parent, child_height)
                    return expr
            case HybridFunctionExpr():
                # For regular functions, recursively transform all of their
                # arguments.
                for idx, arg in enumerate(expr.args):
                    expr.args[idx] = self.remove_correl_refs(arg, parent, child_height)
                return expr
            case HybridWindowExpr():
                # For window functions, recursively transform all of their
                # arguments, partition keys, and order keys.
                for idx, arg in enumerate(expr.args):
                    expr.args[idx] = self.remove_correl_refs(arg, parent, child_height)
                for idx, arg in enumerate(expr.partition_args):
                    expr.partition_args[idx] = self.remove_correl_refs(
                        arg, parent, child_height
                    )
                for order_arg in expr.order_args:
                    order_arg.expr = self.remove_correl_refs(
                        order_arg.expr, parent, child_height
                    )
                return expr
            case (
                HybridBackRefExpr()
                | HybridRefExpr()
                | HybridChildRefExpr()
                | HybridLiteralExpr()
                | HybridColumnExpr()
            ):
                # All other expression types do not require any transformation
                # to de-correlate since they cannot contain correlations.
                return expr
            case _:
                raise NotImplementedError(
                    f"Unsupported expression type: {expr.__class__.__name__}."
                )

    def correl_ref_purge(
        self,
        level: HybridTree | None,
        old_parent: HybridTree,
        new_parent: HybridTree,
        child_height: int,
    ) -> None:
        """
        The recursive procedure to remove correlated references from the
        expressions of a hybrid tree or any of its ancestors or children if
        they refer to a specific correlated ancestor that is being removed.

        Args:
            `level`: The current level of the hybrid tree to remove correlated
            references from.
            `old_parent`: The correlated ancestor hybrid tree that the correlated
            references should point to when they are targeted for removal.
            `new_parent`: The ancestor of `level` that removal should stop at
            because it is the transposed snapshot of `old_parent`, and
            therefore it & its ancestors cannot contain any more correlated
            references that would be targeted for removal.
            `child_height`: The height of the correlated child within the
            hybrid tree that the correlated references are pointing to. This
            is the number of BACK indices to shift by when replacing the
            correlated reference with a BACK reference.
        """
        while level is not None and level is not new_parent:
            # First, recursively remove any targeted correlated references from
            # the children of the current level.
            for child in level.children:
                self.correl_ref_purge(
                    child.subtree, old_parent, new_parent, child_height
                )
            # Then, remove any correlated references from the pipeline
            # operators of the current level. Usually this just means
            # transforming the terms/orderings/unique keys of the operation,
            # but specific operation types will require special casing if they
            # have additional expressions stored in other fields that need to
            # be transformed.
            for operation in level.pipeline:
                for name, expr in operation.terms.items():
                    operation.terms[name] = self.remove_correl_refs(
                        expr, old_parent, child_height
                    )
                for ordering in operation.orderings:
                    ordering.expr = self.remove_correl_refs(
                        ordering.expr, old_parent, child_height
                    )
                for idx, expr in enumerate(operation.unique_exprs):
                    operation.unique_exprs[idx] = self.remove_correl_refs(
                        expr, old_parent, child_height
                    )
                if isinstance(operation, HybridCalc):
                    # Use `name` rather than `str` as the loop variable so the
                    # builtin `str` is not shadowed.
                    for name, expr in operation.new_expressions.items():
                        operation.new_expressions[name] = self.remove_correl_refs(
                            expr, old_parent, child_height
                        )
                if isinstance(operation, HybridFilter):
                    operation.condition = self.remove_correl_refs(
                        operation.condition, old_parent, child_height
                    )
            # Repeat the process on the ancestor until either loop guard
            # condition is no longer True.
            level = level.parent

    def decorrelate_child(
        self,
        old_parent: HybridTree,
        new_parent: HybridTree,
        child: HybridConnection,
        is_aggregate: bool,
    ) -> None:
        """
        Runs the logic to de-correlate a child of a hybrid tree that contains
        a correlated reference. This involves linking the child to a new parent
        as its ancestor, the parent being a snapshot of the original hybrid
        tree that contained the correlated child as a child. The transformed
        child can now replace correlated references with BACK references that
        point to terms in its newly expanded ancestry, and the original hybrid
        tree can now join onto this child using its uniqueness keys.
        """
        # First, find the height of the child subtree & its top-most level.
        child_root: HybridTree = child.subtree
        child_height: int = 1
        while child_root.parent is not None:
            child_height += 1
            child_root = child_root.parent
        # Link the top level of the child subtree to the new parent.
        new_parent.add_successor(child_root)
        # Replace any correlated references to the original parent with BACK references.
        self.correl_ref_purge(child.subtree, old_parent, new_parent, child_height)
        # Update the join keys to join on the unique keys of all the ancestors.
        new_join_keys: list[tuple[HybridExpr, HybridExpr]] = []
        additional_levels: int = 0
        current_level: HybridTree | None = old_parent
        while current_level is not None:
            for unique_key in current_level.pipeline[0].unique_exprs:
                lhs_key: HybridExpr | None = unique_key.shift_back(additional_levels)
                rhs_key: HybridExpr | None = unique_key.shift_back(
                    additional_levels + child_height
                )
                assert lhs_key is not None and rhs_key is not None
                new_join_keys.append((lhs_key, rhs_key))
            current_level = current_level.parent
            additional_levels += 1
        child.subtree.join_keys = new_join_keys
        # If aggregating, do the same with the aggregation keys.
        if is_aggregate:
            new_agg_keys: list[HybridExpr] = []
            assert child.subtree.join_keys is not None
            for _, rhs_key in child.subtree.join_keys:
                new_agg_keys.append(rhs_key)
            child.subtree.agg_keys = new_agg_keys

    def decorrelate_hybrid_tree(self, hybrid: HybridTree) -> HybridTree:
        """
        Recursively de-correlates a hybrid tree: first its ancestors, then its
        children, then any correlated children of the current level whose
        connection type requires de-correlation.

        Args:
            `hybrid`: The hybrid tree to de-correlate.

        Returns:
            The de-correlated hybrid tree. The transformation is also done
            in-place.
        """
        # Recursively decorrelate the ancestors of the current level of the
        # hybrid tree.
        if hybrid.parent is not None:
            hybrid._parent = self.decorrelate_hybrid_tree(hybrid.parent)
            hybrid._parent._successor = hybrid
        # Iterate across all the children and recursively decorrelate them.
        for child in hybrid.children:
            child.subtree = self.decorrelate_hybrid_tree(child.subtree)
        # Iterate across all the children, identify any that are correlated,
        # and transform any of the correlated ones that require decorrelation
        # due to the type of connection.
        for idx, child in enumerate(hybrid.children):
            if idx not in hybrid.correlated_children:
                continue
            new_parent: HybridTree = self.make_decorrelate_parent(
                hybrid, idx, hybrid.children[idx].required_steps
            )
            match child.connection_type:
                case (
                    ConnectionType.SINGULAR
                    | ConnectionType.SINGULAR_ONLY_MATCH
                    | ConnectionType.AGGREGATION
                    | ConnectionType.AGGREGATION_ONLY_MATCH
                ):
                    self.decorrelate_child(
                        hybrid, new_parent, child, child.connection_type.is_aggregation
                    )
                case ConnectionType.NDISTINCT | ConnectionType.NDISTINCT_ONLY_MATCH:
                    raise NotImplementedError(
                        f"PyDough does not yet support correlated references with the {child.connection_type.name} pattern."
                    )
                case (
                    ConnectionType.SEMI
                    | ConnectionType.ANTI
                    | ConnectionType.NO_MATCH_SINGULAR
                    | ConnectionType.NO_MATCH_AGGREGATION
                    | ConnectionType.NO_MATCH_NDISTINCT
                ):
                    # These patterns do not require decorrelation since they
                    # are supported via correlated SEMI/ANTI joins.
                    continue
Collaborator

Do we need a default `case _:` here?

Contributor Author

@knassre-bodo knassre-bodo Feb 17, 2025

No, because these cases are exhaustive with regard to the enum. If a new enum member is added, mypy will report an error because these branches are no longer exhaustive. So here, we can rely on mypy tooling to ensure quality.

        return hybrid


def run_hybrid_decorrelation(hybrid: HybridTree) -> HybridTree:
    """
    Invokes the procedure to remove correlated references from a hybrid tree
    before relational conversion if those correlated references are invalid
    (e.g. not from a semi/anti join).

    Args:
        `hybrid`: The hybrid tree to remove correlated references from.

    Returns:
        The hybrid tree with all invalid correlated references removed as the
        tree structure is re-written to allow them to be replaced with BACK
        references. The transformation is also done in-place.
    """
    decorr: Decorrelater = Decorrelater()
    return decorr.decorrelate_hybrid_tree(hybrid)
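The detach, deep-copy, reattach step described in the comments of `make_decorrelate_parent` can be tried in isolation. The sketch below is a toy model, not PyDough code: `Node`, `snapshot`, and the string-valued `pipeline` and `children` lists are hypothetical stand-ins for `HybridTree` and its fields. It only illustrates why the successor is detached before `copy.deepcopy`: the copy then includes the ancestors but none of the descendants.

```python
import copy
from dataclasses import dataclass, field


@dataclass(eq=False)
class Node:
    """Hypothetical stand-in for one level of a hybrid tree."""

    name: str
    parent: "Node | None" = None
    successor: "Node | None" = None
    pipeline: list[str] = field(default_factory=list)
    children: list[str] = field(default_factory=list)


def snapshot(node: Node, child_idx: int, required_steps: int) -> Node:
    """Copies `node` and its ancestors, but none of its descendants."""
    successor = node.successor
    node.successor = None  # detach so deepcopy cannot reach descendants
    try:
        clone = copy.deepcopy(node)
    finally:
        node.successor = successor  # always restore the original link
    # Keep only the children before the correlated one, and only the
    # pipeline operators up to and including index `required_steps`.
    clone.children = clone.children[:child_idx]
    clone.pipeline = clone.pipeline[: required_steps + 1]
    return clone


# Build a three-level chain: root -> mid -> leaf.
root = Node("root", pipeline=["scan"])
mid = Node("mid", parent=root, pipeline=["scan", "filter", "calc"], children=["c0", "c1"])
leaf = Node("leaf", parent=mid, pipeline=["scan"])
root.successor = mid
mid.successor = leaf

snap = snapshot(mid, child_idx=1, required_steps=0)
```

After the call, `snap` has a copied `root` as its parent and no successor, while the original `mid` still points at `leaf` and keeps its full pipeline. The `try`/`finally` is an addition in this sketch so the original link is restored even if the copy fails.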