crl-release-26.2: blob: flush a block immediately after a value that fills it #6097
Merged
RaduBerinde merged 1 commit on Jun 22, 2026
`FileWriter.EstimatedSize` counts completed blocks at their compressed size but the pending block at its uncompressed size. The pending block was only flushed on the *next* `AddValue` (when the flush governor decided it was full), so a value large enough to fill a block on its own lingered uncompressed in the estimate until the following value arrived.

During flushes and value-separating compactions, `EstimatedSize` feeds the output splitter's size accounting (via `EstimatedReferenceSize`). The transient uncompressed overstatement after a large value could push the estimate over the split threshold and cause premature output splits, producing many small files.

`AddValue` now flushes the block immediately when the just-added value is at least the flush governor's high watermark, so it is compressed before the next `EstimatedSize` query.

A small unfinished block of normal-sized values is still carried uncompressed, but that is negligible relative to the file size.

Co-Authored-By: roachdev-claude <roachdev-claude-bot@cockroachlabs.com>
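To make the accounting issue concrete, here is a minimal toy model of the behavior described above. It is not the actual `FileWriter` code: `toyWriter`, the fixed 4:1 compression ratio, the 4096-byte `highWatermark`, and the simplified "flush when the pending block has reached the watermark" rule are all assumptions made for illustration.

```go
package main

import "fmt"

const (
	highWatermark    = 4096 // assumed flush threshold, in bytes (hypothetical value)
	compressionRatio = 4    // assumed fixed 4:1 compression (hypothetical)
)

// toyWriter is a hypothetical stand-in for a blob file writer. It only models
// size accounting: flushed blocks count at compressed size, the pending block
// at uncompressed size.
type toyWriter struct {
	flushAfterLargeValue bool // true models the behavior after this change
	compressed           int  // total compressed bytes of flushed blocks
	pending              int  // uncompressed bytes in the unfinished block
}

// flush "compresses" the pending block and adds it to the flushed total.
func (w *toyWriter) flush() {
	w.compressed += w.pending / compressionRatio
	w.pending = 0
}

// AddValue appends a value of n bytes. The pending block is flushed lazily on
// the next call once it is full; with flushAfterLargeValue set, a value that
// fills a block on its own is flushed right away.
func (w *toyWriter) AddValue(n int) {
	if w.pending >= highWatermark {
		w.flush()
	}
	w.pending += n
	if w.flushAfterLargeValue && n >= highWatermark {
		w.flush()
	}
}

// EstimatedSize mixes compressed (flushed) and uncompressed (pending) bytes.
func (w *toyWriter) EstimatedSize() int { return w.compressed + w.pending }

func main() {
	for _, immediate := range []bool{false, true} {
		w := &toyWriter{flushAfterLargeValue: immediate}
		w.AddValue(8192) // a single value larger than a block
		afterLarge := w.EstimatedSize()
		w.AddValue(100) // a normal-sized value
		fmt.Printf("immediate=%v afterLarge=%d afterNext=%d\n",
			immediate, afterLarge, w.EstimatedSize())
	}
}
```

In this model both variants agree once the next value arrives (2148 bytes); they differ only in the window right after the large value, where the lazy variant reports 8192 and the immediate variant reports 2048. That window is where a splitter comparing the estimate against a threshold could split too early.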
sumeerbhola approved these changes on Jun 17, 2026
@sumeerbhola made 1 comment.
Reviewable status: 0 of 4 files reviewed, all discussions resolved.