[core] Fix that latestSnapshotOfUser might list snapshot directory to find the earliest #7572

Merged
JingsongLi merged 1 commit into apache:master from yuzelin:fix-latestSnapshotOfUser
Apr 1, 2026

@yuzelin yuzelin commented Apr 1, 2026

Purpose

Previously, latestSnapshotOfUser called earliestSnapshotId() to determine the loop bound before iterating snapshots.

During a checkpoint, if the snapshot that the EARLIEST hint points to has been concurrently expired, the hint lookup misses and earliestSnapshotId() falls back to findByListFiles(), which lists the entire snapshot directory.

This code path runs on every checkpoint via prepareCommit → createWriterCleanChecker → latestCommittedIdentifier → latestSnapshotOfUserFromFilesystem. For jobs with high parallelism and a large number of snapshots (e.g. 10000+), it might issue a massive number of listStatus requests, easily triggering QPS limits on object storage (e.g. OSS QpsLimitExceeded).

The fix removes the earliestSnapshotId() call and instead iterates backward from the latest snapshot, stopping when a FileNotFoundException is encountered (indicating an expired snapshot). Other exceptions are thrown directly.
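
For readers who want the shape of the new loop without opening the diff, here is a minimal, self-contained sketch. It is not the code merged in this PR: `Snap`, `SnapshotReader`, and the class name are hypothetical stand-ins, and the lower bound of 1 for snapshot ids is an assumption made only for illustration. It shows a backward scan from the latest id that treats `FileNotFoundException` as "reached the expired range" and lets every other exception propagate, with no directory listing involved.

```java
import java.io.FileNotFoundException;
import java.io.IOException;
import java.util.Optional;

// Illustrative sketch only; all names here are hypothetical, not Paimon APIs.
final class LatestSnapshotOfUserSketch {

    /** Hypothetical stand-in for a snapshot: just an id and the committing user. */
    static final class Snap {
        final long id;
        final String commitUser;

        Snap(long id, String commitUser) {
            this.id = id;
            this.commitUser = commitUser;
        }
    }

    /** Hypothetical reader; throws FileNotFoundException if the snapshot file is gone. */
    interface SnapshotReader {
        Snap read(long id) throws IOException;
    }

    /**
     * Walks backward from latestId and returns the newest snapshot committed by user.
     * A missing snapshot file is taken to mean everything older has expired as well,
     * so the scan stops there instead of listing the directory to find the earliest id.
     */
    static Optional<Snap> latestSnapshotOfUser(
            long latestId, String user, SnapshotReader reader) throws IOException {
        // Assumption: snapshot ids start at 1.
        for (long id = latestId; id >= 1; id--) {
            Snap snapshot;
            try {
                snapshot = reader.read(id);
            } catch (FileNotFoundException e) {
                // Expired snapshot reached: stop scanning.
                return Optional.empty();
            }
            // Any other IOException propagates to the caller unchanged.
            if (user.equals(snapshot.commitUser)) {
                return Optional.of(snapshot);
            }
        }
        return Optional.empty();
    }
}
```

In this sketch the worst case is one read per live snapshot plus a single failed read at the expiration boundary, and the cost no longer depends on whether the EARLIEST hint is stale.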

Tests

testLatestSnapshotOfUser verifies that the loop can break.

@JingsongLi

+1

@JingsongLi JingsongLi merged commit 1e8a0f1 into apache:master Apr 1, 2026
12 checks passed