Skip to content

fix(compaction): make sure that only one task is running at a time.#201

Open
lucasfang wants to merge 4 commits intoalibaba:mainfrom
lucasfang:compaction_dev
Open

fix(compaction): make sure that only one task is running at a time.#201
lucasfang wants to merge 4 commits intoalibaba:mainfrom
lucasfang:compaction_dev

Conversation

@lucasfang
Copy link
Copy Markdown
Collaborator

Purpose

Linked issue: close #xxx

Tests

API and Format

Documentation

Generative AI tooling

Copilot AI review requested due to automatic review settings March 27, 2026 03:27
Copy link
Copy Markdown

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Improves compaction task lifecycle handling to prevent overlapping compaction executions by ensuring outstanding compaction work is joined before starting a new task or closing writers.

Changes:

  • Make CancelCompaction() synchronously wait for the active compaction future to finish, removing the “detached futures” list.
  • Update writer close-path comments to reflect “cancel + wait” behavior.
  • Remove tracking/waiting of previously “cancelled” futures in the compaction future manager.

Reviewed changes

Copilot reviewed 3 out of 3 changed files in this pull request and generated 3 comments.

File Description
src/paimon/core/mergetree/merge_tree_writer.h Updates close-path comments to describe cancel + wait behavior.
src/paimon/core/compact/compact_future_manager.h Changes cancellation semantics to move+wait the active future; removes cancelled futures tracking.
src/paimon/core/append/append_only_writer.cpp Updates close-path comments to describe cancel + wait behavior.

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Comment on lines 33 to 43
/// Cancel the current compaction task if it is running.
/// @note: This method may leave behind orphan files.
void CancelCompaction() override {
// std::future does not support cancellation natively.
// Move away the active future first, then wait for completion so caller
// can safely start a new task without reusing cancel state.
if (task_future_.valid()) {
// Detach the future so we don't block on destruction
cancelled_futures_.push_back(std::move(task_future_));
auto cancelled = std::move(task_future_);
cancelled.wait();
}
}
Copy link

Copilot AI Mar 27, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

CancelCompaction() now blocks until the running task finishes, which conflicts with the method name and docstring (“Cancel…” implies a quick/interrupting action). To make the API behavior explicit for callers, either (mandatory) update the comment/docs to clearly state this is a join/wait operation, or (better) split into two APIs (e.g., RequestCancel() and WaitForCompactionToExit() / CancelAndWait()), so call sites can choose non-blocking cancellation vs. synchronous shutdown.

Copilot uses AI. Check for mistakes.
Comment on lines 87 to 89
// Request cancellation and wait for running compaction to exit.
// This avoids reusing cancellation state while an old task is still running.
compact_manager_->CancelCompaction();
Copy link

Copilot AI Mar 27, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

PR description still contains placeholders (e.g., “Linked issue: close #xxx” and empty Tests section). Please update the PR metadata with the actual linked issue and the tests run/added for this behavior change so reviewers can validate intent and verification steps.

Copilot uses AI. Check for mistakes.
Copy link
Copy Markdown

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Copilot reviewed 21 out of 21 changed files in this pull request and generated 3 comments.


💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.


#include "arrow/array/array_base.h"
#include "arrow/array/builder_binary.h"
#include "arrow/array/builder_primitive.h"
Copy link

Copilot AI Mar 27, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The test uses arrow::StructBuilder and arrow::StringBuilder, but the includes only bring in builder_primitive.h. In this repo, StringBuilder comes from arrow/array/builder_binary.h and StructBuilder from arrow/array/builder_nested.h (see e.g. src/paimon/common/data/record_batch_test.cpp includes). As-is this file is likely to fail to compile due to missing Arrow builder headers; please include the proper Arrow builder headers (and drop builder_primitive.h if unused).

Suggested change
#include "arrow/array/builder_primitive.h"
#include "arrow/array/builder_binary.h"
#include "arrow/array/builder_nested.h"

Copilot uses AI. Check for mistakes.
Comment on lines +18 to +19

#include "arrow/array/array_base.h"
Copy link

Copilot AI Mar 27, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This new test file uses std::map (CreateFactoryForOptions) and Arrow Struct/String builders, but it doesn't include , arrow/array/builder_binary.h (StringBuilder), or arrow/array/builder_nested.h (StructBuilder). This will likely fail to compile; please add the missing includes (and keep builder_primitive.h only for numeric builders like Int64Builder).

Suggested change
#include "arrow/array/array_base.h"
#include <map>
#include "arrow/array/array_base.h"
#include "arrow/array/builder_binary.h"
#include "arrow/array/builder_nested.h"

Copilot uses AI. Check for mistakes.
#include "paimon/core/mergetree/lookup/positioned_key_value.h"
#include "paimon/core/mergetree/lookup_levels.h"
#include "paimon/core/schema/table_schema.h"
#include "paimon/core/utils/file_store_path_factory.h"
Copy link

Copilot AI Mar 27, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

merge_tree_compact_manager_factory.cpp instantiates FileStorePathFactoryCache (std::make_shared(...)), but it only includes file_store_path_factory.h. FileStorePathFactoryCache is defined in core/utils/file_store_path_factory_cache.h, so this translation unit is likely missing a required include and may not compile. Please include the cache header explicitly.

Suggested change
#include "paimon/core/utils/file_store_path_factory.h"
#include "paimon/core/utils/file_store_path_factory.h"
#include "paimon/core/utils/file_store_path_factory_cache.h"

Copilot uses AI. Check for mistakes.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants