[Improvement](scan) support push down limit to segment iterator#62222
BiteTheDDDDt wants to merge 17 commits into apache:master
Conversation
Thank you for your contribution to Apache Doris. Please clearly describe your PR:
run buildall
Pull request overview
This PR introduces a unified LIMIT pushdown mechanism that lets the storage layer (SegmentIterator) dynamically cap per-batch reads using both a local per-segment limit and a shared cross-scanner remaining-row budget.
Changes:
- Added a unified StorageReadOptions::ReadLimit (local + shared remaining) and wired it through TabletReader/RowsetReader to SegmentIterator.
- Updated scan execution to coordinate a shared scan limit across scanners (early-exit + decrement on produced blocks) and removed scheduler-side quota truncation.
- Added a new regression test suite and a BE unit test for SegmentIterator limit optimization eligibility.
Reviewed changes
Copilot reviewed 16 out of 16 changed files in this pull request and generated 9 comments.
| File | Description |
|---|---|
| regression-test/suites/query_p0/limit/test_unified_limit_pushdown.groovy | New regression coverage for unified limit pushdown scenarios (inverted index + multi-bucket + offset/order-by). |
| be/test/storage/segment/segment_iterator_limit_opt_test.cpp | New unit test targeting _can_opt_limit_reads() logic. |
| be/src/storage/tablet/tablet_reader.h / .cpp | Propagates shared_scan_limit into storage reader context. |
| be/src/storage/rowset/rowset_reader_context.h | Carries shared_scan_limit down to rowset readers. |
| be/src/storage/rowset/beta_rowset_reader.cpp | Maps topn/general limits + shared remaining into StorageReadOptions.read_limit. |
| be/src/storage/iterators.h | Replaces topn_limit with unified read_limit in StorageReadOptions. |
| be/src/storage/segment/segment_iterator.h / .cpp | Renames/extends limit optimization check and applies unified limit cap/exhaustion behavior. |
| be/src/storage/iterator/vcollect_iterator.cpp | Propagates general limit into rowset reader limit path for SegmentIterator I/O reduction. |
| be/src/exec/operator/scan_operator.h / .cpp | Exposes shared_scan_limit_ptr() to scanners. |
| be/src/exec/scan/scanner.cpp | Early-exits and decrements shared limit to coordinate scanners. |
| be/src/exec/scan/scanner_scheduler.cpp | Removes scheduler-side shared limit quota truncate/discard logic. |
| be/src/exec/scan/scanner_context.cpp | Removes “finish when shared limit exhausted” logic and pending-task short-circuit. |
| be/src/exec/scan/olap_scanner.cpp | Propagates shared scan limit pointer to TabletReader params (no-merge path). |
```cpp
// Early exit when the global shared scan limit is exhausted.
// This avoids unnecessary I/O when other scanners have already
// collected enough rows for the SQL LIMIT.
auto* shared_limit = _local_state->shared_scan_limit_ptr();
if (shared_limit && shared_limit->load(std::memory_order_acquire) <= 0) {
    *eof = true;
    return Status::OK();
}
```
shared_scan_limit uses -1 as the sentinel for “no limit” (see ScanOperatorX::_shared_scan_limit docs), but this early-exit treats any value <= 0 as exhausted. If _shared_scan_limit is -1, scanners will immediately return EOF and queries without a SQL LIMIT can return zero rows. Please change the checks to treat negative values as “unlimited” (do not early-exit) and only stop when the counter is actually 0.
```cpp
// satisfied. The counter may go negative when multiple scanners
// subtract concurrently; this is harmless because the operator's
// reached_limit() provides the authoritative truncation.
if (shared_limit && block->rows() > 0) {
    shared_limit->fetch_sub(block->rows(), std::memory_order_acq_rel);
}
```
Using fetch_sub() here can drive the shared limit counter negative (e.g. concurrent scanners), which collides with the existing “-1 means no limit” sentinel and makes later <= 0 checks ambiguous. Consider switching to a CAS-based decrement that clamps at 0 for limited scans (and skips decrement entirely when the counter is negative/unlimited), or use the existing ScannerContext::acquire_limit_quota() logic to preserve the counter’s semantics.
Suggested change:

```cpp
// satisfied. Preserve negative values as the existing "no limit"
// sentinel, and clamp limited scans at 0 so concurrent scanners
// cannot drive the counter into the sentinel range.
if (shared_limit && block->rows() > 0) {
    int64_t current = shared_limit->load(std::memory_order_acquire);
    while (current > 0) {
        const int64_t next = std::max<int64_t>(0, current - block->rows());
        if (shared_limit->compare_exchange_weak(current, next,
                                                std::memory_order_acq_rel,
                                                std::memory_order_acquire)) {
            break;
        }
    }
}
```
```cpp
*eof = *eof || (_limit > 0 && _num_rows_return >= _limit);
// Also stop when the global shared scan limit is exhausted.
*eof = *eof || (shared_limit && shared_limit->load(std::memory_order_acquire) <= 0);
```
This EOF condition also uses <= 0 on the shared limit. With “-1 means no limit”, this will force EOF for unlimited scans. After fixing the counter semantics, this check should only stop when the shared limit is actually exhausted (typically == 0 for the limited case).
```cpp
struct ReadLimit {
    size_t local_limit = 0;
    std::atomic<int64_t>* global_remaining = nullptr;

    bool active() const { return local_limit > 0 || global_remaining != nullptr; }

    // Cap max_rows by the effective limit.
    uint32_t cap(uint32_t max_rows) const {
        uint32_t limit = max_rows;
        if (local_limit > 0 && local_limit < limit) {
            limit = static_cast<uint32_t>(local_limit);
        }
        if (global_remaining != nullptr) {
            int64_t remaining = global_remaining->load(std::memory_order_acquire);
            if (remaining >= 0 && remaining < static_cast<int64_t>(limit)) {
                limit = static_cast<uint32_t>(remaining);
            }
        }
        return limit;
    }

    // True when the global budget is fully exhausted (0 or negative).
    bool exhausted() const {
        return global_remaining != nullptr &&
               global_remaining->load(std::memory_order_acquire) <= 0;
    }
};
```
StorageReadOptions now exposes std::atomic and std::memory_order_* in this header, but <atomic> isn’t included here. Also, ReadLimit::exhausted() currently treats any value <= 0 as exhausted; that conflicts with the established “-1 means no limit” semantics and can cause readers to stop immediately when the pointer is set but unlimited. Please include <atomic> and make exhausted() consistent with the sentinel/unlimited behavior (and ideally keep the counter from going negative for the limited case).
```cpp
} else if (_opts.read_limit.exhausted()) {
    // Even without index optimization eligibility, stop reading when
    // the global limit is fully exhausted — no more rows needed.
```
With the new read_limit.exhausted() branch, nrows_read_limit can be forced to 0 even when the scan is “unlimited” if the shared counter uses the -1 sentinel and exhausted() treats <= 0 as exhausted. Please ensure the shared-limit semantics are consistent end-to-end (negative = unlimited), otherwise SegmentIterator may stop reading rows unexpectedly.
Suggested change:

```cpp
} else if (_opts.read_limit.local_limit >= 0 && _opts.read_limit.exhausted()) {
    // Even without index optimization eligibility, stop reading when
    // the global limit is fully exhausted — no more rows needed.
    // Negative limits are treated as unlimited and must not force reads to stop.
```
```cpp
template <typename Derived>
std::atomic<int64_t>* ScanLocalState<Derived>::shared_scan_limit_ptr() {
    return &_parent->cast<typename Derived::Parent>()._shared_scan_limit;
}
```
shared_scan_limit_ptr() currently returns a non-null pointer unconditionally. Since _shared_scan_limit is documented as “-1 means no limit”, returning a pointer when the value is negative causes downstream code to misinterpret “no limit” as “limit exhausted” (see Scanner/SegmentIterator checks). Consider returning nullptr when _shared_scan_limit is negative (or otherwise exposing an API that makes the “unlimited” case unambiguous).
Suggested change:

```cpp
auto& shared_scan_limit = _parent->cast<typename Derived::Parent>()._shared_scan_limit;
return shared_scan_limit.load() < 0 ? nullptr : &shared_scan_limit;
```
```cpp
// Pointer to the shared remaining-row budget maintained by ScanOperatorX.
// Decremented by ScannerScheduler after each block is produced.
// Storage layer uses it as a dynamic hint to reduce I/O.
// nullptr means no dynamic limit. -1 means no limit.
std::atomic<int64_t>* shared_scan_limit = nullptr;
```
This new field introduces std::atomic<int64_t> in a public header, but the header doesn’t include <atomic>. Also, the comment says the budget is “Decremented by ScannerScheduler”, but the decrement was moved into Scanner::get_block() in this PR; please update the comment so it reflects the current ownership/location of the decrement logic.
```cpp
// Dynamic shared limit pointer from ScanOperatorX.
// See ReaderParams.shared_scan_limit for details.
std::atomic<int64_t>* shared_scan_limit = nullptr;
```
This struct now uses std::atomic<int64_t> but the header does not include <atomic>. Please add the include to keep the header self-contained (and avoid relying on transitive includes).
```cpp
// Build a SegmentIterator with minimal opts for _can_opt_limit_reads() testing.
// The segment pointer is null — only _opts and internal maps are accessed.
std::unique_ptr<SegmentIterator> make_iter() {
    auto iter = std::make_unique<SegmentIterator>(nullptr, _read_schema);
    iter->_opts.tablet_schema = _tablet_schema;
    iter->_opts.stats = &_stats;
    // delete_condition_predicates is default-constructed (empty)
    return iter;
}
```
This test directly accesses SegmentIterator private members (_opts, _column_predicate_index_exec_status) and calls the private method _can_opt_limit_reads(). As written it won’t compile unless you use the existing test pattern (e.g. #define private public before including the header, or declaring the test as a friend), or refactor to test via public APIs.
run buildall
I found 1 issue in this PR.
Critical checkpoint conclusions:
- Goal of the task: The PR aims to push LIMIT into SegmentIterator for lower I/O. The runtime path is implemented end to end, and regression coverage was added, but the new BE unit test currently prevents the test target from compiling, so the task is not complete yet.
- Scope/minimality: The production change is reasonably focused on scan/storage limit propagation.
- Concurrency: Applicable. I checked the new shared-limit flow across ScanOperator, Scanner, ScannerContext, and SegmentIterator; the atomic accesses use acquire/release semantics and I did not find a confirmed lock-order or deadlock regression in the reviewed paths.
- Lifecycle/static initialization: No special lifecycle or static-init issue found in the reviewed change.
- Configuration: No new configuration items.
- Compatibility: No incompatible FE/BE protocol or storage-format change identified.
- Parallel code paths: Applicable. The change updates both the general-limit and order-by/topn-related storage-limit path; I did not find a second missed production path in the reviewed files.
- Special conditional checks: The new limit checks are commented and understandable.
- Test coverage: Applicable. Regression coverage and a BE UT were added, but the new UT is not currently buildable, so the added coverage is not runnable as submitted.
- Observability: No additional observability looked necessary for this optimization.
- Transaction/persistence/data writes: Not applicable for this patch.
- FE-BE variable passing: Applicable. The new shared limit pointer is threaded through the reviewed storage read path consistently.
- Performance: The intended I/O reduction path is clear and localized; I did not confirm an additional performance regression beyond the test/build issue below.
- Other issues: The BE UT compile failure below should be fixed before merging.
Overall opinion: not ready as-is because the added unit test appears to break the BE test build.
```cpp
// The segment pointer is null — only _opts and internal maps are accessed.
std::unique_ptr<SegmentIterator> make_iter() {
    auto iter = std::make_unique<SegmentIterator>(nullptr, _read_schema);
    iter->_opts.tablet_schema = _tablet_schema;
```
This new test reaches directly into SegmentIterator internals (_opts, _column_predicate_index_exec_status, and _can_opt_limit_reads()), but segment_iterator.h does not expose them to the test via friend/FRIEND_TEST or any public test hook. As written, this TU should fail to compile in the BE UT target because all of those members are private. Please add an explicit test hook/friend declaration or rewrite the test to exercise the behavior through a public interface.
run buildall

/review
Found 1 correctness issue.
- Goal of current task. Conclusion: Partially achieved. General LIMIT pushdown is implemented, but the new shared limit is also propagated into the ORDER BY key LIMIT path, where it can change results under parallel scan. Existing tests only cover single-bucket ORDER BY cases and therefore do not validate this path.
- Modification size and focus. Conclusion: The patch is focused, but it couples two semantically different paths: plain LIMIT (where a global remaining-row hint is safe) and per-scanner top-N collection (where it is not).
- Concurrency. Conclusion: The atomic accesses themselves are fine for a best-effort hint, but using the same shared counter as a hard stop for parallel top-N scanners is not concurrency-safe for correctness. A scanner that decrements the counter first can suppress other scanners that still hold better rows for the final top-N merge.
- Lifecycle / static initialization. Conclusion: No special lifecycle or static initialization issue found in the touched code.
- Configuration. Conclusion: No new configuration items were added.
- Compatibility. Conclusion: No storage-format or symbol compatibility issue found.
- Parallel code paths. Conclusion: There are parallel code paths here, and they should not be treated identically. Plain LIMIT pushdown and ORDER BY key LIMIT/top-N pushdown have different correctness constraints; this change should only apply the shared remaining-row budget to the former.
- Special conditional checks. Conclusion: The new _storage_no_merge() gate is not sufficient to make the top-N path safe. The missing condition is whether the path still requires each scanner to independently produce its own local top-N candidate set.
- Test coverage. Conclusion: Coverage is insufficient for the risky path. The new regression file adds multi-bucket tests for plain LIMIT, but the ORDER BY key tests are single-bucket only. No test exercises parallel ORDER BY key LIMIT with multiple scanners/tablets, which is exactly where this regression appears.
- Observability. Conclusion: No additional observability issue required for this review.
- Transaction / persistence. Conclusion: Not applicable; no transaction or persistence logic is changed here.
- Data writes / modifications. Conclusion: Not applicable; the change is read-path only.
- FE-BE variable passing. Conclusion: The new shared_scan_limit pointer is passed through the relevant BE layers consistently for the touched path. The issue is the semantic use of that variable in the top-N path, not missing propagation.
- Performance. Conclusion: The intended I/O reduction is reasonable for plain LIMIT. However, correctness takes priority, and the current top-N integration is unsafe.
- Other issues. Conclusion: No second confirmed bug beyond the incorrect top-N integration.
Overall opinion: the PR should not merge as-is because the shared remaining-row budget is being applied to a path where it can change query results.
```cpp
// iterator can tighten its per-batch row budget based on how many rows
// the whole query still needs, avoiding unnecessary I/O.
if (olap_scan_local_state->_storage_no_merge()) {
    _tablet_reader_params.shared_scan_limit = _local_state->shared_scan_limit_ptr();
```
Applying shared_scan_limit to the read_orderby_key_limit path breaks the correctness contract of parallel top-N scanning. In this branch, each scanner is supposed to produce its own local top-N candidate set (_limit_per_scanner), and only the upper operator merges those candidates into the final global top-N.
After this change, BetaRowsetReader passes the global remaining-row budget down into SegmentIterator::read_limit, so one scanner can exhaust the shared counter and cause other scanners to stop before they have produced their own local top-N rows. That can return the wrong result under parallel scan.
Concrete case: with parallel scan enabled, imagine tablet A contains keys 100..199 and tablet B contains keys 1..99, and the query is SELECT ... ORDER BY k LIMIT 10. If scanner A runs first and returns 10 rows, the shared counter reaches 0. Scanner B can then stop without returning its local top-10 rows (1..10), so the final merge never sees the real best rows.
This shared budget is safe as an I/O hint for plain unordered LIMIT, but it is not safe for the ORDER BY key LIMIT / per-scanner top-N path. The new regression tests do not cover this because the ORDER BY cases use a single-bucket table only.
BE Regression && UT Coverage Report: Increment line coverage, Increment coverage report
run buildall
(2 similar comments)
run buildall

/review
Found 1 issue.
- [high] be/src/exec/scan/olap_scanner.cpp: the refactor only disables _shared_scan_limit when read_orderby_key_limit > 0, but ORDER BY key LIMIT scans that still require merge reads keep the scanner-level shared budget enabled. In that path one scanner can exhaust the global budget before peer scanners emit their local top-N candidates, so the final merge can miss smaller keys from later scanners and return the wrong rows.
Critical checkpoints:
- Goal / correctness: Not fully satisfied. The change aims to unify the topn/general limit paths, but merge-required top-N can now under-produce rows.
- Minimality / clarity: Mostly focused, but the shared-limit condition became broader than the top-N correctness boundary.
- Concurrency: Applicable. The shared atomic coordination itself is race-safe enough here, but it is applied to the wrong top-N path.
- Lifecycle / static init: No issue found.
- Configuration: No new config.
- Compatibility: No storage/protocol compatibility issue found.
- Parallel code paths: Not all applicable paths were updated consistently; no-merge top-N disables shared limit, merge-required top-N does not.
- Special conditions: The new comments correctly say top-N must not use a global budget, but the implementation only enforces that for read_orderby_key_limit.
- Test coverage: New tests cover general limit and no-merge top-N, but miss merge-required ORDER BY key LIMIT.
- Observability: No additional observability issue found.
- Transaction / persistence: Not applicable.
- Data write / atomicity: Not applicable.
- FE-BE variable passing: Applicable only within the BE stack; propagation looks consistent there.
- Performance: No separate performance concern beyond the correctness regression.
- Other issues: None beyond the finding above.
BE UT Coverage Report: Increment line coverage, Increment coverage report
Force-pushed from 9dfec1f to a2f4d42 (Compare)
run buildall
Force-pushed from a2f4d42 to dc43e59 (Compare)
run buildall
(1 similar comment)

/review
OpenCode automated review failed and did not complete. Error: Review step was failure (possibly timeout or cancelled). Please inspect the workflow logs and rerun the review after the underlying issue is resolved.
/review

/review
Automated review completed against the current GitHub PR diff and the existing inline review context. I did not add duplicate inline comments for issues already raised, but the existing unresolved threads still include correctness/blocking concerns that should be addressed before merge.
Critical checkpoint conclusions:
- Goal/test proof: The PR aims to push SegmentIterator filter/LIMIT handling deeper into storage and add a new controlling session variable. Regression and BE unit coverage were added, but existing review threads still identify gaps around SET_VAR alias behavior and order/value-sensitive validation.
- Scope/focus: The change is focused on scan/storage LIMIT pushdown and related session plumbing, though it touches sensitive shared-limit and top-N paths.
- Concurrency: Shared scan-limit coordination is involved. Existing threads already cover liveness/overshoot and top-N shared-budget hazards; I found no additional distinct concurrency issue in the current PR diff.
- Lifecycle/static initialization: No new static initialization or special object lifecycle issue found beyond existing scanner/iterator ownership patterns.
- Configuration/session variables: A new FE/BE query option is added. Existing threads already cover compatibility and statement-scoped SET_VAR revert behavior for the deprecated alias.
- Compatibility/storage format: No storage format compatibility issue found. FE/BE query-option compatibility is already covered by existing comments.
- Parallel code paths: Cloud/non-cloud and top-N/general scan paths were considered; existing threads already cover the distinct MOR-as-DUP and top-N path concerns.
- Conditional checks: Several safety gates were added for runtime filters, top-N, and storage-no-merge. Existing comments cover the cases where those gates are insufficient or need tests.
- Test coverage/results: Tests were added, but existing threads already identify count-only/order-insensitive coverage gaps and alias SET_VAR coverage needs.
- Observability: No additional required observability issue found.
- Transactions/persistence/data writes: Not applicable; this PR does not change transaction persistence or write visibility semantics.
- FE/BE variables: New thrift/session variable plumbing exists; existing comments cover old/new variable mapping and compatibility.
- Performance: The intended optimization reduces storage reads; no additional distinct performance regression was found beyond correctness issues already raised.
User focus: No additional user-provided review focus was supplied.
No new non-duplicate inline comments were submitted in this review; please resolve the already-open blocking threads.
run buildall

run buildall
```cpp
if (remaining == 0) {
    // Skip submitting more pending scanners once the LIMIT budget is
    // exhausted; they would only open and immediately EOF.
    if (_shared_scan_limit->load(std::memory_order_acquire) == 0) {
```
Should this be <= 0, matching the eof check elsewhere in the code: *eof = *eof || (_shared_scan_limit && _shared_scan_limit->load(std::memory_order_acquire) <= 0);?
The _shared_scan_limit here will not be nullptr. When there is no limit, it will be a negative number.
```cpp
bool enable_common_expr_pushdown() const {
    return _query_options.__isset.enable_common_expr_pushdown &&
           _query_options.enable_common_expr_pushdown;
bool enable_segment_filter_and_limit_pushdown() const {
```
This should only control the limit pushdown, not the conjunct filter.
run buildall
```cpp
    return mode == ExprStorageFilterCheckMode::HAS_SEGMENT_EVALUABLE_EXPR ||
           !_is_key_column(slot_ref->expr_name());
}
if (expr->is_virtual_slot_ref()) {
```
No need to check this logic; the virtual slot ref should not be valid here, because an array cannot be a key column.
Cloud UT Coverage Report: Increment line coverage, Increment coverage report

run buildall
TPC-H: Total hot run time: 29675 ms
TPC-DS: Total hot run time: 170954 ms
run buildall |
TPC-H: Total hot run time: 29299 ms
TPC-DS: Total hot run time: 170743 ms
FE UT Coverage Report: Increment line coverage
run buildall
TPC-H: Total hot run time: 29721 ms
TPC-DS: Total hot run time: 173224 ms
This pull request introduces significant improvements to scan operator logic, particularly enhancing the correctness and efficiency of LIMIT and predicate pushdown in OLAP scans. The main changes include a robust mechanism for sharing LIMIT counters among scanners, more precise control over predicate pushdown, and stricter validation for residual predicates. These changes help ensure that queries with LIMIT and TopN semantics return accurate results and avoid incorrect or inefficient execution paths.
Improvements to LIMIT handling and scanner coordination:
- Introduced a shared atomic counter (_shared_scan_limit) for LIMIT queries, allowing all scanners to collectively respect the global row limit. Scanners now check and update this counter to stop early when the LIMIT is reached, preventing over-scanning and improving efficiency. TopN scans bypass this mechanism as required. [1] [2] [3] [4] [5] [6] [7] [8] [9]
- Adjusted scanner and scan local state interfaces to support the new shared LIMIT mechanism, including new virtual methods and atomic counter management. [1] [2] [3]
Predicate pushdown and validation enhancements:
- Refactored _should_push_down_common_expr to require the expression as an argument and added logic to ensure only eligible expressions are pushed down, considering storage merge requirements and key columns. [1] [2] [3] [4]
- Added a validation step (validate_residual_scan_conjuncts) to prevent unsupported residual predicates (like SEARCH or disabled MATCH) and to ensure correctness when using COUNT_ON_INDEX pushdown.

OLAP scan and segment limit logic improvements:
Code cleanup and maintenance:
Dependency and include updates: