fix(compaction): make sure that only one task is running at a time.#201
fix(compaction): make sure that only one task is running at a time.#201lucasfang wants to merge 4 commits intoalibaba:mainfrom
Conversation
There was a problem hiding this comment.
Pull request overview
Improves compaction task lifecycle handling to prevent overlapping compaction executions by ensuring outstanding compaction work is joined before starting a new task or closing writers.
Changes:
- Make
CancelCompaction()synchronously wait for the active compaction future to finish, removing the “detached futures” list. - Update writer close-path comments to reflect “cancel + wait” behavior.
- Remove tracking/waiting of previously “cancelled” futures in the compaction future manager.
Reviewed changes
Copilot reviewed 3 out of 3 changed files in this pull request and generated 3 comments.
| File | Description |
|---|---|
| src/paimon/core/mergetree/merge_tree_writer.h | Updates close-path comments to describe cancel + wait behavior. |
| src/paimon/core/compact/compact_future_manager.h | Changes cancellation semantics to move+wait the active future; removes cancelled futures tracking. |
| src/paimon/core/append/append_only_writer.cpp | Updates close-path comments to describe cancel + wait behavior. |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
| /// Cancel the current compaction task if it is running. | ||
| /// @note: This method may leave behind orphan files. | ||
| void CancelCompaction() override { | ||
| // std::future does not support cancellation natively. | ||
| // Move away the active future first, then wait for completion so caller | ||
| // can safely start a new task without reusing cancel state. | ||
| if (task_future_.valid()) { | ||
| // Detach the future so we don't block on destruction | ||
| cancelled_futures_.push_back(std::move(task_future_)); | ||
| auto cancelled = std::move(task_future_); | ||
| cancelled.wait(); | ||
| } | ||
| } |
There was a problem hiding this comment.
CancelCompaction() now blocks until the running task finishes, which conflicts with the method name and docstring (“Cancel…” implies a quick/interrupting action). To make the API behavior explicit for callers, either (mandatory) update the comment/docs to clearly state this is a join/wait operation, or (better) split into two APIs (e.g., RequestCancel() and WaitForCompactionToExit() / CancelAndWait()), so call sites can choose non-blocking cancellation vs. synchronous shutdown.
| // Request cancellation and wait for running compaction to exit. | ||
| // This avoids reusing cancellation state while an old task is still running. | ||
| compact_manager_->CancelCompaction(); |
There was a problem hiding this comment.
PR description still contains placeholders (e.g., “Linked issue: close #xxx” and empty Tests section). Please update the PR metadata with the actual linked issue and the tests run/added for this behavior change so reviewers can validate intent and verification steps.
There was a problem hiding this comment.
Pull request overview
Copilot reviewed 21 out of 21 changed files in this pull request and generated 3 comments.
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
|
|
||
| #include "arrow/array/array_base.h" | ||
| #include "arrow/array/builder_binary.h" | ||
| #include "arrow/array/builder_primitive.h" |
There was a problem hiding this comment.
The test uses arrow::StructBuilder and arrow::StringBuilder, but the includes only bring in builder_primitive.h. In this repo, StringBuilder comes from arrow/array/builder_binary.h and StructBuilder from arrow/array/builder_nested.h (see e.g. src/paimon/common/data/record_batch_test.cpp includes). As-is this file is likely to fail to compile due to missing Arrow builder headers; please include the proper Arrow builder headers (and drop builder_primitive.h if unused).
| #include "arrow/array/builder_primitive.h" | |
| #include "arrow/array/builder_binary.h" | |
| #include "arrow/array/builder_nested.h" |
|
|
||
| #include "arrow/array/array_base.h" |
There was a problem hiding this comment.
This new test file uses std::map (CreateFactoryForOptions) and Arrow Struct/String builders, but it doesn't include , arrow/array/builder_binary.h (StringBuilder), or arrow/array/builder_nested.h (StructBuilder). This will likely fail to compile; please add the missing includes (and keep builder_primitive.h only for numeric builders like Int64Builder).
| #include "arrow/array/array_base.h" | |
| #include <map> | |
| #include "arrow/array/array_base.h" | |
| #include "arrow/array/builder_binary.h" | |
| #include "arrow/array/builder_nested.h" |
| #include "paimon/core/mergetree/lookup/positioned_key_value.h" | ||
| #include "paimon/core/mergetree/lookup_levels.h" | ||
| #include "paimon/core/schema/table_schema.h" | ||
| #include "paimon/core/utils/file_store_path_factory.h" |
There was a problem hiding this comment.
merge_tree_compact_manager_factory.cpp instantiates FileStorePathFactoryCache (std::make_shared(...)), but it only includes file_store_path_factory.h. FileStorePathFactoryCache is defined in core/utils/file_store_path_factory_cache.h, so this translation unit is likely missing a required include and may not compile. Please include the cache header explicitly.
| #include "paimon/core/utils/file_store_path_factory.h" | |
| #include "paimon/core/utils/file_store_path_factory.h" | |
| #include "paimon/core/utils/file_store_path_factory_cache.h" |
Purpose
Linked issue: close #xxx
Tests
API and Format
Documentation
Generative AI tooling