Description
Search before asking
- I searched in the issues and found nothing similar.
Paimon version
1.3.1
Compute Engine
flink
Minimal reproduce step
We have an existing primary key table with dynamic bucket / cross-partition upsert enabled, for example:
WITH (
  'bucket' = '-1',
  'dynamic-bucket.initial-buckets' = '64',
  'dynamic-bucket.max-buckets' = '256',
  'deletion-vectors.enabled' = 'true',
  'merge-engine' = 'deduplicate'
)
The table has already been written with data and is used by a running Flink job.
Later, we altered the table option:
ALTER TABLE ... SET ( 'dynamic-bucket.initial-buckets' = '128' );
After restarting or recovering the Flink write job, the checkpoint failed during the cross-partition bootstrap with errors like:
java.io.IOException: Could not perform checkpoint ... for operator cross-partition-bucket-assigner
...
Caused by: java.lang.RuntimeException: Exception in bulkLoad, the most suspicious reason is that your data contains duplicates, please check your sink table. (The likelihood of duplication is that you used multiple jobs to write the same dynamic bucket table, it only supports single write)
...
Caused by: org.rocksdb.RocksDBException: Keys must be added in strict ascending order.
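For context on the innermost exception: RocksDB's SST bulk-load path requires every key to be strictly greater than the previous one, so a duplicate key in otherwise sorted input is enough to trigger it. The following is a minimal sketch of that condition only. It does not use RocksDB or Paimon, and the class and method names are hypothetical.

```java
import java.util.Arrays;
import java.util.List;

// Illustration only: mimics the "strict ascending order" requirement
// with plain string comparison (RocksDB compares raw bytes instead).
class StrictAscendingSketch {

    // Returns the index of the first key that is not strictly greater
    // than its predecessor, or -1 if the whole list is strictly ascending.
    static int firstViolation(List<String> sortedKeys) {
        for (int i = 1; i < sortedKeys.size(); i++) {
            if (sortedKeys.get(i).compareTo(sortedKeys.get(i - 1)) <= 0) {
                return i;
            }
        }
        return -1;
    }

    public static void main(String[] args) {
        // Unique keys: no violation.
        System.out.println(firstViolation(Arrays.asList("k1", "k2", "k3")));       // prints -1
        // A duplicate survives sorting as two equal neighbours: violation at index 2.
        System.out.println(firstViolation(Arrays.asList("k1", "k2", "k2", "k3"))); // prints 2
    }
}
```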
What doesn't meet your expectations?
dynamic-bucket.initial-buckets appears to participate in the assigner/bucket distribution logic for dynamic bucket tables.
Once the table already contains data, altering this option changes the assigner/bucket mapping used on the bootstrap/checkpoint path, which can eventually trigger duplicate-key failures during the bootstrap bulk load.
From the user's perspective, this option behaves like an immutable table option once the table has been created and contains data.
However, it can currently be changed on an existing table, and the failure only surfaces later, when the Flink job runs and checkpoints, which is confusing and dangerous.
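To make the suspected mechanism concrete, here is a toy model. It assumes, purely for illustration, that a key is routed by its hash modulo the configured bucket count. This is not Paimon's actual assigner code, and RoutingShiftSketch, route, and movedKeys are hypothetical names. Under that assumption, changing the count from 64 to 128 sends half of a sample key range to a different target than the one that handled it before.

```java
import java.util.ArrayList;
import java.util.List;

// Toy model only: NOT Paimon's real assigner logic.
class RoutingShiftSketch {

    // Assumed routing rule: key hash modulo the configured bucket count.
    static int route(int keyHash, int configuredBuckets) {
        return Math.floorMod(keyHash, configuredBuckets);
    }

    // Collects the hashes in [0, sampleSize) whose route differs
    // between the old and the new configuration.
    static List<Integer> movedKeys(int sampleSize, int oldBuckets, int newBuckets) {
        List<Integer> moved = new ArrayList<>();
        for (int h = 0; h < sampleSize; h++) {
            if (route(h, oldBuckets) != route(h, newBuckets)) {
                moved.add(h);
            }
        }
        return moved;
    }

    public static void main(String[] args) {
        List<Integer> moved = movedKeys(256, 64, 128);
        // prints "128 of 256 sample hashes change route"
        System.out.println(moved.size() + " of 256 sample hashes change route");
    }
}
```

If something similar happens in the real assigner, the same primary key could be handled by a different assigner after the ALTER, which would be consistent with the duplicate-key symptom in the stack trace.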
Anything else?
I think Paimon should reject altering dynamic-bucket.initial-buckets for existing dynamic bucket tables (bucket = -1), especially for cross-partition upsert scenarios.
Suggested behavior:
- Treat dynamic-bucket.initial-buckets as an immutable option once the table is created.
- Reject ALTER TABLE ... SET ('dynamic-bucket.initial-buckets' = ...) with a clear validation error.
- Ideally, document that this option is only intended for table creation / initial topology setup, not for online adjustment of existing tables.
This would prevent users from accidentally corrupting runtime behavior and getting duplicate-related bootstrap/checkpoint failures later.
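A rough sketch of the kind of check I have in mind is below. The class and method names are hypothetical and do not correspond to existing Paimon APIs. It also simplifies by treating only an explicit 'bucket' = '-1' as a dynamic bucket table. A real implementation would belong in whatever schema-change validation Paimon already performs.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Objects;

// Hypothetical sketch: names are illustrative, not Paimon APIs.
class ImmutableOptionCheckSketch {

    static final String BUCKET = "bucket";
    static final String INITIAL_BUCKETS = "dynamic-bucket.initial-buckets";

    // Rejects a change of INITIAL_BUCKETS on an existing dynamic bucket table.
    // Simplification: only an explicit 'bucket' = '-1' counts as dynamic bucket.
    static void validateAlter(Map<String, String> currentOptions, Map<String, String> changes) {
        boolean dynamicBucket = "-1".equals(currentOptions.get(BUCKET));
        if (!dynamicBucket || !changes.containsKey(INITIAL_BUCKETS)) {
            return;
        }
        String oldValue = currentOptions.get(INITIAL_BUCKETS);
        String newValue = changes.get(INITIAL_BUCKETS);
        if (!Objects.equals(oldValue, newValue)) {
            throw new IllegalArgumentException(
                    "Cannot change '" + INITIAL_BUCKETS + "' from " + oldValue + " to " + newValue
                            + " on an existing dynamic bucket table (bucket = -1).");
        }
    }

    public static void main(String[] args) {
        Map<String, String> current = new HashMap<>();
        current.put(BUCKET, "-1");
        current.put(INITIAL_BUCKETS, "64");

        Map<String, String> changes = new HashMap<>();
        changes.put(INITIAL_BUCKETS, "128");

        try {
            validateAlter(current, changes);
        } catch (IllegalArgumentException e) {
            // The ALTER is rejected up front instead of failing later at checkpoint time.
            System.out.println("Rejected: " + e.getMessage());
        }
    }
}
```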
Are you willing to submit a PR?
- I'm willing to submit a PR!