
crl-release-26.2: blob: flush a block immediately after a value that fills it #6097

Merged
RaduBerinde merged 1 commit into
cockroachdb:crl-release-26.2 from
RaduBerinde:blob-flush-a-block-immediately-after-a-value-that-26.2
Jun 22, 2026


@RaduBerinde

Member

`FileWriter.EstimatedSize` counts completed blocks at their compressed size
but the pending block at its uncompressed size. The pending block was only
flushed on the *next* `AddValue` (when the flush governor decided it was
full), so a value large enough to fill a block on its own lingered
uncompressed in the estimate until the following value arrived.
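
The lingering estimate described above can be reproduced with a toy model. This is a minimal sketch, not Pebble's `FileWriter`: the `toyWriter` type, its fields, the single `blockSize` threshold, and the fixed 4:1 compression ratio are all assumptions made for illustration.

```go
package main

import "fmt"

// toyWriter is a hypothetical stand-in for a blob file writer. It is not
// Pebble's FileWriter; compression is modelled as a fixed 4:1 ratio.
type toyWriter struct {
	blockSize      int // pending size at which the next add flushes first
	completedBytes int // modelled compressed size of flushed blocks
	pendingBytes   int // uncompressed size of the unfinished block
}

// flush "compresses" the pending block and adds it to the completed total.
func (w *toyWriter) flush() {
	w.completedBytes += w.pendingBytes / 4
	w.pendingBytes = 0
}

// addValue models the old behavior: a full pending block is flushed only
// when a later value arrives.
func (w *toyWriter) addValue(valueLen int) {
	if w.pendingBytes >= w.blockSize {
		w.flush()
	}
	w.pendingBytes += valueLen
}

// estimatedSize counts completed blocks at their compressed size and the
// pending block at its uncompressed size.
func (w *toyWriter) estimatedSize() int {
	return w.completedBytes + w.pendingBytes
}

func main() {
	w := &toyWriter{blockSize: 32 << 10}
	w.addValue(1 << 20) // a 1 MiB value fills the block on its own
	fmt.Println("after large value:", w.estimatedSize())
	w.addValue(100) // only now is the large block flushed and compressed
	fmt.Println("after next value: ", w.estimatedSize())
}
```

In this model the estimate stays at the full 1 MiB until the second value arrives, and only then drops to roughly a quarter of that.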

During flushes and value-separating compactions, `EstimatedSize` feeds the
output splitter's size accounting (via `EstimatedReferenceSize`). The
transient uncompressed overstatement after a large value could push the
estimate over the split threshold and cause premature output splits,
producing many small files.
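
To see how the overstatement can translate into extra files, consider a rough simulation. It assumes a hypothetical splitter that closes the current output as soon as the estimate reaches a target, equal-sized large values, and a fixed 4:1 compression ratio. None of these names or numbers come from Pebble.

```go
package main

import "fmt"

// countFiles returns how many output files a hypothetical splitter produces
// for n equal-sized values when it closes the current file once the estimate
// reaches target. If pendingUncompressed is true, the most recently added
// value is counted at its full size (the overstated estimate); otherwise
// every value is counted at its modelled compressed size.
func countFiles(n, valueLen, target int, pendingUncompressed bool) int {
	const ratio = 4 // assumed compression ratio, for illustration only
	files := 0
	inFile := 0 // values written to the current file
	for i := 0; i < n; i++ {
		inFile++
		est := inFile * (valueLen / ratio)
		if pendingUncompressed {
			est = (inFile-1)*(valueLen/ratio) + valueLen
		}
		if est >= target {
			files++
			inFile = 0
		}
	}
	if inFile > 0 {
		files++ // the last, partially filled file
	}
	return files
}

func main() {
	const n, valueLen, target = 40, 1 << 20, 2 << 20
	fmt.Println("overstated estimate:", countFiles(n, valueLen, target, true))
	fmt.Println("compressed estimate:", countFiles(n, valueLen, target, false))
}
```

With these made-up numbers, the overstated estimate splits after 5 values instead of 8, so the same 40 values end up in 8 files rather than 5.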

`AddValue` now flushes the block immediately when the just-added value is at
least the flush governor's high watermark, so it is compressed before the
next `EstimatedSize` query.
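
The shape of the change can be sketched as follows. This is an illustrative model only: the `eagerWriter` type, its `highWatermark` field, and the 4:1 compression ratio are assumptions, and the real flush governor logic is more involved than a single threshold.

```go
package main

import "fmt"

// eagerWriter is a hypothetical model of the new behavior. It is not
// Pebble's FileWriter; compression is modelled as a fixed 4:1 ratio.
type eagerWriter struct {
	highWatermark  int // stand-in for the flush governor's high watermark
	completedBytes int // modelled compressed size of flushed blocks
	pendingBytes   int // uncompressed size of the unfinished block
}

// flush "compresses" the pending block and adds it to the completed total.
func (w *eagerWriter) flush() {
	w.completedBytes += w.pendingBytes / 4
	w.pendingBytes = 0
}

// addValue flushes a full pending block before adding, as before, and also
// flushes right away when the value just added is at least the high watermark.
func (w *eagerWriter) addValue(valueLen int) {
	if w.pendingBytes >= w.highWatermark {
		w.flush()
	}
	w.pendingBytes += valueLen
	if valueLen >= w.highWatermark {
		w.flush() // the large value is compressed before the next estimate
	}
}

// estimatedSize counts completed blocks compressed and the pending block
// uncompressed.
func (w *eagerWriter) estimatedSize() int {
	return w.completedBytes + w.pendingBytes
}

func main() {
	w := &eagerWriter{highWatermark: 32 << 10}
	w.addValue(1 << 20)
	fmt.Println("after large value:", w.estimatedSize())
	w.addValue(100) // small values still sit uncompressed in the pending block
	fmt.Println("after small value:", w.estimatedSize())
}
```

Here the estimate already reflects the compressed size immediately after the large value, and only the small trailing value is carried uncompressed.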

A small unfinished block of normal-sized values is still carried
uncompressed, but that is negligible relative to the file size.

Co-Authored-By: roachdev-claude <roachdev-claude-bot@cockroachlabs.com>
@RaduBerinde RaduBerinde requested a review from sumeerbhola June 17, 2026 16:21
@RaduBerinde RaduBerinde requested a review from a team as a code owner June 17, 2026 16:21
@cockroach-teamcity

Member

This change is Reviewable

@sumeerbhola sumeerbhola left a comment

Contributor


:lgtm:

@sumeerbhola made 1 comment.
Reviewable status: 0 of 4 files reviewed, all discussions resolved.

@RaduBerinde RaduBerinde merged commit e50b05b into cockroachdb:crl-release-26.2 Jun 22, 2026
7 checks passed
@RaduBerinde RaduBerinde deleted the blob-flush-a-block-immediately-after-a-value-that-26.2 branch June 22, 2026 15:21