
Re-enabled null-equal join dynamic filters with an IS NULL predicate. #3

Draft
mdashti wants to merge 6 commits into main from moe/null-equal-dynamic-filter

mdashti commented Jun 23, 2026

Collaborator

Runs the fork's CI for the upstream PR apache#23106. Same branch and commits (the apache#23104 probe-NULL helper plus the null-equal change); opened against the fork's main so the full suite runs here too.

Re-enables hash-join dynamic filter pushdown for null-equal joins (reverting apache#22965's `return false`) by pushing the filter with `OR key IS NULL` over every nullable probe key. A NOT NULL key never widens the filter. Tests: apache#22965's SLT now shows the filter back on the probe with the result unchanged, plus a multi-key case; the reject unit test flips to assert that the pushdown is allowed; and the `preserve_probe_nulls` unit tests cover the nullable and NOT NULL paths.
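As a rough illustration of the widening rule described above, the sketch below builds the pushed predicate as a plain string. `ProbeKey` and `widen_for_null_equal` are hypothetical names invented for this example; they are not DataFusion APIs, and the real change works on physical expressions rather than strings.

```rust
/// Hypothetical stand-in for a probe-side join key.
/// `nullable` mirrors what the probe schema says about the column.
struct ProbeKey {
    name: &'static str,
    nullable: bool,
}

/// Widen `filter` with `<key> IS NULL` for every nullable probe key.
/// NOT NULL keys add nothing, so an all-NOT-NULL join keeps the filter as-is.
fn widen_for_null_equal(filter: &str, keys: &[ProbeKey]) -> String {
    let null_checks: Vec<String> = keys
        .iter()
        .filter(|k| k.nullable)
        .map(|k| format!("{} IS NULL", k.name))
        .collect();

    if null_checks.is_empty() {
        filter.to_string()
    } else {
        // The IS NULL checks go first; the original filter is kept intact in parentheses.
        format!("{} OR ({})", null_checks.join(" OR "), filter)
    }
}
```

With one nullable key `a` and one NOT NULL key `b`, only `a` contributes an `IS NULL` term to the result.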

mdashti added 2 commits June 22, 2026 18:29
The hash-join dynamic filter pushed `key IN build_keys` down to the probe
scan for null-aware anti joins too. That drops the probe-side NULL, but
`NOT IN` three-valued logic needs it to collapse the result to zero rows,
so the join silently returned rows.

OR `probe_key IS NULL` into the pushed predicate. Non-NULL probe rows
still get filtered; only the NULL additionally survives.

Exercises the pushdown path the existing in-memory tests miss: parquet with
row-level filtering, so the pushed dynamic filter actually drops rows. Without
the fix `id NOT IN (SELECT eid ...)` returns 1 and 3 instead of zero rows.
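The failure mode in these two commit messages can be reproduced with a small in-memory model. This is only a sketch: `not_in` and `pushed_filter` are hypothetical helpers, the data in the usage note is made up, and the subquery values play the role of the probe side while the outer `id` values supply the build keys.

```rust
use std::collections::HashSet;

/// `id NOT IN (subquery)` under SQL three-valued logic, used as a row filter.
/// A row is kept only when the predicate is TRUE: no subquery value equals `id`
/// and the subquery holds no NULL (a NULL turns every non-match into UNKNOWN).
fn not_in(ids: &[i64], subquery: &[Option<i64>]) -> Vec<i64> {
    let has_null = subquery.iter().any(|v| v.is_none());
    ids.iter()
        .copied()
        .filter(|id| !has_null && !subquery.contains(&Some(*id)))
        .collect()
}

/// Model of the dynamic filter applied at the probe scan.
/// `keep_nulls = false` is `key IN build_keys`;
/// `keep_nulls = true` is `key IN build_keys OR key IS NULL`.
fn pushed_filter(
    probe: &[Option<i64>],
    build_keys: &HashSet<i64>,
    keep_nulls: bool,
) -> Vec<Option<i64>> {
    probe
        .iter()
        .copied()
        .filter(|v| match v {
            Some(x) => build_keys.contains(x),
            None => keep_nulls,
        })
        .collect()
}
```

Taking outer ids `[1, 2, 3]` and subquery values `[2, NULL, 7]` as an assumed data set, the unwidened filter strips the NULL before the join sees it and `not_in` yields 1 and 3; with `keep_nulls = true` the NULL survives and the result collapses to zero rows.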
mdashti force-pushed the moe/null-equal-dynamic-filter branch from 3721aa9 to 9f4e40c on June 23, 2026 04:23
mdashti marked this pull request as draft June 24, 2026 19:53
mdashti added 4 commits June 24, 2026 13:09
It's cheap, so it short-circuits NULL rows before the costlier filter.
The `debug_assert` pins the single-key invariant the `on_right[0]`
indexing relies on.
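A minimal sketch of the ordering and the invariant, under assumptions: `keep_probe_row` is a hypothetical per-row evaluator, `on_right` is modeled here as a list of column indices, and `costly_filter` stands in for the build-side membership test. The actual code operates on expressions and batches, not single rows.

```rust
/// Evaluate `key IS NULL OR costly_filter(key)` for one probe row.
/// The NULL check runs first, so NULL rows never reach the costlier filter.
fn keep_probe_row(
    on_right: &[usize],
    row: &[Option<i64>],
    costly_filter: &mut dyn FnMut(i64) -> bool,
) -> bool {
    // Pin the single-key invariant that the `on_right[0]` indexing relies on.
    debug_assert_eq!(on_right.len(), 1, "expected exactly one probe key");

    match row[on_right[0]] {
        None => true,                 // cheap path: keep the NULL row
        Some(v) => costly_filter(v),  // only non-NULL keys pay for the filter
    }
}
```

Counting invocations of `costly_filter` over a mix of NULL and non-NULL rows shows that it is called only for the non-NULL ones.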
RESET restores the defaults at the end instead of re-setting explicit
values. A probe can hold several NULLs, so the comments read as plural.

build-side predicate prunes a probe-side NULL that can null-match a build-side
NULL. Push the filter with `OR key IS NULL` over the nullable probe keys
instead, the way apache#23104 does for null-aware anti joins. A NOT NULL key never
widens the filter, so an all-NOT-NULL join keeps full selectivity.

The `unwrap_or(true)` widening on an unresolved nullability check wasn't
obvious. An extra NULL row is safe; dropping a needed one isn't.
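The conservative default can be sketched in isolation. `needs_null_widening` is a hypothetical helper, and the `Result<bool, String>` argument stands in for whatever the real nullability lookup returns; only `Result::unwrap_or` is taken from the commit message.

```rust
/// Decide whether a probe key needs the `OR key IS NULL` widening.
/// `nullable` is the outcome of a schema lookup that may fail to resolve.
fn needs_null_widening(nullable: Result<bool, String>) -> bool {
    // Unresolved lookups are treated as nullable: passing an extra NULL row
    // through the filter is safe, while dropping a needed one changes results.
    nullable.unwrap_or(true)
}
```

Only a key that is positively known to be NOT NULL skips the widening; an error falls back to the safe side.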
mdashti force-pushed the moe/null-equal-dynamic-filter branch from 9f4e40c to 9620b97 on June 24, 2026 20:45