Add ObjectStore-backed TempFileFactory / spill example #23170

Draft
alamb wants to merge 14 commits into apache:main from alamb:codex/object-store-spill-example
@alamb (Contributor) commented Jun 24, 2026

Which issue does this PR close?

Rationale for this change

PR #21882 adds custom spill file support.

An example showing how a downstream application can use this new API will help confirm that the API is good enough for our needs.

What changes are included in this PR?

This PR adds a small example showing how users can back spill files with an ObjectStore, using a local object store for a runnable example while keeping the implementation applicable to remote stores.
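To make the idea concrete, here is a minimal, std-only sketch of a store-backed spill file factory. It is not the code in this PR and does not use the API from #21882 or the `object_store` crate: `InMemoryStore`, `SpillFile`, `StoreBackedFactory`, and `spill_round_trip` are all hypothetical names chosen for illustration. It only shows the general shape, assuming a factory that hands out uniquely keyed files and a handle that deletes its object on drop.

```rust
use std::collections::HashMap;
use std::sync::atomic::{AtomicU64, Ordering};
use std::sync::{Arc, Mutex};

/// Hypothetical stand-in for an object store: an in-memory map from key to bytes.
#[derive(Default)]
struct InMemoryStore {
    objects: Mutex<HashMap<String, Vec<u8>>>,
}

impl InMemoryStore {
    fn put(&self, key: &str, data: Vec<u8>) {
        self.objects.lock().unwrap().insert(key.to_string(), data);
    }
    fn get(&self, key: &str) -> Option<Vec<u8>> {
        self.objects.lock().unwrap().get(key).cloned()
    }
    fn delete(&self, key: &str) {
        self.objects.lock().unwrap().remove(key);
    }
    fn object_count(&self) -> usize {
        self.objects.lock().unwrap().len()
    }
}

/// Hypothetical handle to one spill file; it deletes its object when dropped.
struct SpillFile {
    store: Arc<InMemoryStore>,
    key: String,
}

impl SpillFile {
    fn write(&self, data: &[u8]) {
        self.store.put(&self.key, data.to_vec());
    }
    fn read(&self) -> Option<Vec<u8>> {
        self.store.get(&self.key)
    }
}

impl Drop for SpillFile {
    fn drop(&mut self) {
        self.store.delete(&self.key);
    }
}

/// Hypothetical factory that hands out uniquely keyed spill files under a prefix.
struct StoreBackedFactory {
    store: Arc<InMemoryStore>,
    prefix: String,
    next_id: AtomicU64,
}

impl StoreBackedFactory {
    fn new(store: Arc<InMemoryStore>, prefix: &str) -> Self {
        Self { store, prefix: prefix.to_string(), next_id: AtomicU64::new(0) }
    }
    fn create(&self) -> SpillFile {
        let id = self.next_id.fetch_add(1, Ordering::Relaxed);
        let key = format!("{}/spill-{}", self.prefix, id);
        SpillFile { store: Arc::clone(&self.store), key }
    }
}

/// Writes one spill file, reads it back, and returns the bytes plus the
/// object count while the file is live and after it has been dropped.
fn spill_round_trip() -> (Vec<u8>, usize, usize) {
    let store = Arc::new(InMemoryStore::default());
    let factory = StoreBackedFactory::new(Arc::clone(&store), "tmp");
    let file = factory.create();
    file.write(b"batch bytes");
    let data = file.read().expect("object was just written");
    let live = store.object_count();
    drop(file);
    (data, live, store.object_count())
}

fn main() {
    let (data, live, after_drop) = spill_round_trip();
    println!("read {} bytes; live: {}, after drop: {}", data.len(), live, after_drop);
}
```

Tying cleanup to `Drop` is one possible design choice; a remote store would likely need async I/O and error handling that this sketch leaves out.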

Are these changes tested?

Yes, by CI.

Are there any user-facing changes?

Yes. This adds a new example for configuring ObjectStore-backed spill files.

@github-actions github-actions Bot added documentation Improvements or additions to documentation execution Related to the execution crate physical-plan Changes to the physical-plan crate labels Jun 24, 2026
@github-actions

Thank you for opening this pull request!

Reviewer note: cargo-semver-checks reported the current version number is not SemVer-compatible with the changes in this pull request (compared against the base branch).

Details
     Cloning apache/main
    Building datafusion-execution v54.0.0 (current)
       Built [  34.808s] (current)
     Parsing datafusion-execution v54.0.0 (current)
      Parsed [   0.024s] (current)
    Building datafusion-execution v54.0.0 (baseline)
       Built [  28.732s] (baseline)
     Parsing datafusion-execution v54.0.0 (baseline)
      Parsed [   0.025s] (baseline)
    Checking datafusion-execution v54.0.0 -> v54.0.0 (no change; assume patch)
     Checked [   0.228s] 223 checks: 221 pass, 2 fail, 0 warn, 30 skip

--- failure enum_variant_added: enum variant added on exhaustive enum ---

Description:
A publicly-visible enum without #[non_exhaustive] has a new variant.
        ref: https://doc.rust-lang.org/cargo/reference/semver.html#enum-variant-new
       impl: https://github.com/obi1kenobi/cargo-semver-checks/tree/v0.48.0/src/lints/enum_variant_added.ron

Failed in:
  variant DiskManagerMode:Custom in /home/runner/work/datafusion/datafusion/datafusion/execution/src/disk_manager.rs:135

--- failure inherent_method_missing: pub method removed or renamed ---

Description:
A publicly-visible method or associated fn is no longer available under its prior name. It may have been renamed or removed entirely.
        ref: https://doc.rust-lang.org/cargo/reference/semver.html#item-remove
       impl: https://github.com/obi1kenobi/cargo-semver-checks/tree/v0.48.0/src/lints/inherent_method_missing.ron

Failed in:
  RefCountedTempFile::update_disk_usage, previously in file /home/runner/work/datafusion/datafusion/target/semver-checks/git-apache_main/da9c36f68aa514eb3a5b072f3ce858e127317294/datafusion/execution/src/disk_manager.rs:409
  RefCountedTempFile::current_disk_usage, previously in file /home/runner/work/datafusion/datafusion/target/semver-checks/git-apache_main/da9c36f68aa514eb3a5b072f3ce858e127317294/datafusion/execution/src/disk_manager.rs:457

     Summary semver requires new major version: 2 major and 0 minor checks failed
    Finished [  64.959s] datafusion-execution
    Building datafusion-physical-plan v54.0.0 (current)
       Built [  35.371s] (current)
     Parsing datafusion-physical-plan v54.0.0 (current)
      Parsed [   0.130s] (current)
    Building datafusion-physical-plan v54.0.0 (baseline)
       Built [  35.386s] (baseline)
     Parsing datafusion-physical-plan v54.0.0 (baseline)
      Parsed [   0.128s] (baseline)
    Checking datafusion-physical-plan v54.0.0 -> v54.0.0 (no change; assume patch)
     Checked [   0.611s] 223 checks: 222 pass, 1 fail, 0 warn, 30 skip

--- failure function_missing: pub fn removed or renamed ---

Description:
A publicly-visible function cannot be imported by its prior path. A `pub use` may have been removed, or the function itself may have been renamed or removed entirely.
        ref: https://doc.rust-lang.org/cargo/reference/semver.html#item-remove
       impl: https://github.com/obi1kenobi/cargo-semver-checks/tree/v0.48.0/src/lints/function_missing.ron

Failed in:
  function datafusion_physical_plan::spill::spill_record_batch_by_size, previously in file /home/runner/work/datafusion/datafusion/target/semver-checks/git-apache_main/da9c36f68aa514eb3a5b072f3ce858e127317294/datafusion/physical-plan/src/spill/mod.rs:255

     Summary semver requires new major version: 1 major and 0 minor checks failed
    Finished [  72.633s] datafusion-physical-plan
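
For context on the `enum_variant_added` failure above, the following sketch illustrates the usual mitigation: marking a public enum `#[non_exhaustive]` so that downstream crates must include a wildcard arm, which keeps their `match` expressions compiling when a variant is added later. `SpillMode` and `describe` are hypothetical names, not the DataFusion definitions, and the attribute only constrains code in other crates.

```rust
/// Hypothetical public enum. With `#[non_exhaustive]`, code in OTHER crates
/// cannot match on it exhaustively and must include a wildcard arm.
#[non_exhaustive]
#[derive(Debug, Clone, Copy, PartialEq)]
pub enum SpillMode {
    OsTmpDirectory,
    Disabled,
    // A variant added in a later release.
    Custom,
}

/// Written the way a downstream crate would have to write it: the wildcard
/// arm absorbs any variant the caller did not know about.
pub fn describe(mode: SpillMode) -> &'static str {
    match mode {
        SpillMode::OsTmpDirectory => "os tmp directory",
        SpillMode::Disabled => "disabled",
        _ => "other",
    }
}

fn main() {
    println!("{}", describe(SpillMode::Custom));
}
```

Without the attribute, a downstream `match` that lists every variant stops compiling when a variant is added, which is why the lint treats the change as a major version bump.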

@github-actions github-actions Bot added the auto detected api change Auto detected API change label Jun 24, 2026
@alamb alamb force-pushed the codex/object-store-spill-example branch from 4dfdfcb to e97ed5d Compare June 24, 2026 22:16
@alamb alamb changed the title Add ObjectStore-backed spill example Add ObjectStore-backed TempFileFactory / spill example Jun 24, 2026

@alamb alamb left a comment

// specific language governing permissions and limitations
// under the License.

//! See `main.rs` for how to run it.

This is the whole point of this PR: an example of implementing remote spilling.
