Commit 0d8411f

Consolidate loops in SqlMapRequiredColumnAliasesVisitor (#1678)
This PR consolidates a couple of loops in `SqlMapRequiredColumnAliasesVisitor` and replaces `.add_alias()` calls with `.add_aliases()` to reduce nesting. There are no logic changes in this PR; its main purpose is to keep #1679 a simple change, given the complexity of that class. Other loops in the class could be consolidated as well, but that is left out of scope here.
1 parent 73c2e42 commit 0d8411f

1 file changed

+15
-17
lines changed

metricflow/sql/optimizer/column_pruning/required_column_aliases.py

Lines changed: 15 additions & 17 deletions
@@ -234,26 +234,24 @@ def visit_select_statement_node(self, node: SqlSelectStatementNode) -> None:
             column_aliases=aliases_required_in_parent,
         )
 
-        # For all string columns, assume that they are needed from all sources since we don't have a table alias
-        # in SqlStringExpression.used_columns
+        # Find instances where a column alias is referenced without a table alias. The two cases are:
+        # * String expressions like`col_0 + col_1` where a string is used instead of the corresponding SQL object.
+        # * `SqlColumnAliasReferenceExpression` - e.g. `SELECT col_0` instead of `SELECT table_0.col_0`
+        column_aliases_to_retain: Set[str] = set()
         for string_expr in exprs_used_in_this_node.string_exprs:
             if string_expr.used_columns:
-                for column_alias in string_expr.used_columns:
-                    for node_to_retain_columns in (node.from_source,) + tuple(
-                        join_desc.right_source for join_desc in node.join_descs
-                    ):
-                        self._current_required_column_alias_mapping.add_alias(node_to_retain_columns, column_alias)
-
-        # Same with unqualified column references - it's hard to tell which source it came from, so it's safest to say
-        # it's required from all parents.
-        # An unqualified column reference expression is like `SELECT col_0` whereas a qualified column reference
-        # expression is like `SELECT table_0.col_0`.
+                column_aliases_to_retain.update(column_alias for column_alias in string_expr.used_columns)
+
         for unqualified_column_reference_expr in exprs_used_in_this_node.column_alias_reference_exprs:
-            column_alias = unqualified_column_reference_expr.column_alias
-            for node_to_retain_columns in (node.from_source,) + tuple(
-                join_desc.right_source for join_desc in node.join_descs
-            ):
-                self._current_required_column_alias_mapping.add_alias(node_to_retain_columns, column_alias)
+            column_aliases_to_retain.add(unqualified_column_reference_expr.column_alias)
+
+        # Assume those column aliases are needed from all sources as it may not be possible to know which source it
+        # comes from based on the SQL (e.g. if a query reads from two tables, you would need to know the table schema
+        # to know which table the column resides)
+        for node_to_retain_columns in (node.from_source,) + tuple(
+            join_desc.right_source for join_desc in node.join_descs
+        ):
+            self._current_required_column_alias_mapping.add_aliases(node_to_retain_columns, column_aliases_to_retain)
 
         # Visit recursively.
         self._visit_parents(node)
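
The consolidation in this diff can be illustrated outside of MetricFlow with a minimal sketch. Everything below is hypothetical: `RequiredAliasMapping`, `retain_nested`, and `retain_consolidated` are stand-ins invented for illustration (sources are plain strings rather than SQL nodes), and only the method names `add_alias` / `add_aliases` mirror the calls seen in the diff. It shows why collecting the aliases into one set and then looping over the sources once should yield the same mapping as the nested per-alias loops.

```python
from typing import Dict, Iterable, Sequence, Set


class RequiredAliasMapping:
    """Hypothetical stand-in for the visitor's source -> required-aliases mapping."""

    def __init__(self) -> None:
        self._aliases_by_source: Dict[str, Set[str]] = {}

    def add_alias(self, source: str, column_alias: str) -> None:
        self._aliases_by_source.setdefault(source, set()).add(column_alias)

    def add_aliases(self, source: str, column_aliases: Iterable[str]) -> None:
        self._aliases_by_source.setdefault(source, set()).update(column_aliases)

    def get_aliases(self, source: str) -> Set[str]:
        return set(self._aliases_by_source.get(source, set()))


def retain_nested(
    sources: Sequence[str],
    string_expr_columns: Sequence[Sequence[str]],
    unqualified_aliases: Sequence[str],
) -> RequiredAliasMapping:
    """Shape of the old code: one add_alias() call per (alias, source) pair."""
    mapping = RequiredAliasMapping()
    for used_columns in string_expr_columns:
        for column_alias in used_columns:
            for source in sources:
                mapping.add_alias(source, column_alias)
    for column_alias in unqualified_aliases:
        for source in sources:
            mapping.add_alias(source, column_alias)
    return mapping


def retain_consolidated(
    sources: Sequence[str],
    string_expr_columns: Sequence[Sequence[str]],
    unqualified_aliases: Sequence[str],
) -> RequiredAliasMapping:
    """Shape of the new code: gather aliases first, then one add_aliases() per source."""
    mapping = RequiredAliasMapping()
    column_aliases_to_retain: Set[str] = set()
    for used_columns in string_expr_columns:
        column_aliases_to_retain.update(used_columns)
    column_aliases_to_retain.update(unqualified_aliases)
    for source in sources:
        mapping.add_aliases(source, column_aliases_to_retain)
    return mapping


# Illustrative inputs: a FROM source, one JOIN source, two string expressions
# (one of which uses no columns), and one unqualified column reference.
sources = ["from_source", "join_source_0"]
string_expr_columns = [["col_0", "col_1"], []]
unqualified_aliases = ["col_2"]

nested = retain_nested(sources, string_expr_columns, unqualified_aliases)
consolidated = retain_consolidated(sources, string_expr_columns, unqualified_aliases)
```

Because the mapping stores a set per source, the order in which aliases are added does not matter, which is what lets the two passes be merged without changing the result in this sketch.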
