
[FLINK-36964] Fix potential exception when SchemaChange in parallel w… #3818

Merged (4 commits, Jan 13, 2025)

Conversation

lvyanquan (Contributor) commented Dec 26, 2024

Fix a potential exception when SchemaChange runs in parallel with the Paimon Sink.

This closes FLINK-36964 and FLINK-35888.

yuxiqian (Contributor) commented Dec 27, 2024

Hi @lvyanquan, I have some concerns about how BucketAssignOperator works with schema evolution:

  1. FlushEvent now works as a "pipeline drain indicator": after all data sink writers acknowledge it, there shouldn't be any unhandled / uncommitted events flowing through the pipeline, so SchemaRegistry can evolve the downstream DB safely.

[Diagram: normal FlushEvent semantics]

However, the bucket assigning strategy might break that assumption: data sink writers might receive data change events carrying a stale schema, even after the external schema evolution process has finished.

[Diagram: questionable FlushEvent semantics]
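The "drain" protocol in point 1 can be sketched as follows. This is a hypothetical, self-contained simulation under my own naming (SinkWriter, tryEvolveSchema are illustrative, not actual Flink CDC classes): the registry may only evolve the downstream schema once every writer has flushed its buffer and acknowledged, so no event encoded with the old schema remains in flight.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.List;

public class FlushDrainDemo {
    /** A sink writer that buffers events and flushes them on FlushEvent. */
    static class SinkWriter {
        final ArrayDeque<String> buffer = new ArrayDeque<>();
        final List<String> committed = new ArrayList<>();

        void write(String event) { buffer.add(event); }

        /** Flush everything buffered, then ack; afterwards this writer
         *  holds no events encoded with the pre-evolution schema. */
        boolean flushAndAck() {
            while (!buffer.isEmpty()) committed.add(buffer.poll());
            return true;
        }
    }

    /** The registry evolves the schema only after every writer has acked. */
    static boolean tryEvolveSchema(List<SinkWriter> writers) {
        for (SinkWriter w : writers) {
            if (!w.flushAndAck()) return false;
        }
        return true; // safe: no unhandled events remain in the pipeline
    }

    public static void main(String[] args) {
        List<SinkWriter> writers = List.of(new SinkWriter(), new SinkWriter());
        writers.get(0).write("row-1");
        writers.get(1).write("row-2");
        boolean evolved = tryEvolveSchema(writers);
        // All buffers are drained before evolution is allowed to proceed.
        System.out.println(evolved && writers.stream().allMatch(w -> w.buffer.isEmpty()));
        // prints: true
    }
}
```

The bucket-assignment concern above is exactly that a re-partitioning step after the flush can reintroduce stale-schema events that this protocol assumes are gone.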

  2. Potential schema operator hanging risk with a "distributed" topology.

[Diagram: hanging risks]

Basically the same as described in #3680. In short, if a broadcast / custom partitioning topology is applied, blocking one upstream partition implicitly blocks all downstream partitions from handling events.
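The hanging risk described above can be illustrated with a minimal sketch (hypothetical names; bounded queues stand in for Flink's network buffers): once one downstream partition's buffer fills, say because it is blocked waiting on schema evolution, the broadcasting upstream can emit nothing further, so even an idle partition with free space stops receiving events.

```java
import java.util.List;
import java.util.concurrent.ArrayBlockingQueue;

public class BroadcastBlockDemo {
    /** Broadcast one event to every partition; returns false as soon as
     *  any partition's buffer is full, i.e. the upstream is now stuck. */
    static boolean broadcast(List<ArrayBlockingQueue<String>> partitions, String event) {
        for (ArrayBlockingQueue<String> p : partitions) {
            if (!p.offer(event)) return false; // backpressure blocks the upstream
        }
        return true;
    }

    public static void main(String[] args) {
        // Partition 0 is "blocked on schema evolution": tiny buffer, no consumer.
        ArrayBlockingQueue<String> blockedPartition = new ArrayBlockingQueue<>(1);
        ArrayBlockingQueue<String> idlePartition = new ArrayBlockingQueue<>(16);
        List<ArrayBlockingQueue<String>> partitions = List.of(blockedPartition, idlePartition);

        broadcast(partitions, "e1");                    // fills partition 0's buffer
        boolean delivered = broadcast(partitions, "e2"); // upstream blocks here

        // The idle partition never sees "e2", despite having plenty of space.
        System.out.println(delivered + " " + idlePartition.size());
        // prints: false 1
    }
}
```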

  3. Why do we need another bucket assigner operator?

AFAIK all data change events have already been hashed and distributed in PartitionOperator. Since adding a BucketAssignOperator does not change the parallelism, is there any reason we can't compute the correct bucket partition ID in advance, instead of creating another partitioning topology?
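The in-advance bucket computation suggested above could look roughly like the sketch below. This is illustrative only, not the actual PartitionOperator API: the idea is to derive the Paimon fixed-mode bucket (bucket = |hash(key)| % numBuckets) inside the existing hash-partitioning step, so all records of one bucket are routed to one subtask without a separate BucketAssignOperator stage.

```java
public class BucketAwarePartitioner {
    /** Fixed-bucket assignment, as in Paimon's fixed-bucket mode:
     *  bucket = |hash(key) % numBuckets|. */
    static int bucketFor(String key, int numBuckets) {
        return Math.abs(key.hashCode() % numBuckets);
    }

    /** Route by bucket rather than by raw key hash, so a single subtask
     *  owns every record of a given bucket and no re-partitioning is needed. */
    static int subtaskFor(String key, int numBuckets, int parallelism) {
        return bucketFor(key, numBuckets) % parallelism;
    }

    public static void main(String[] args) {
        // Records sharing a key land in the same bucket and the same subtask.
        System.out.println(subtaskFor("pk-42", 8, 4) == subtaskFor("pk-42", 8, 4));
        // prints: true
    }
}
```

The design point is that bucket assignment is a pure function of the key and the (fixed) bucket count, so it can be folded into the partitioner instead of requiring an extra shuffle.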

yuxiqian (Contributor) commented Dec 27, 2024

A little off topic, but we're really lacking E2e tests against real-life data sinks with complicated writing topologies. #3491 (with much higher testing pressure) might be necessary to expose more issues like this in the future.

@lvyanquan lvyanquan marked this pull request as ready for review January 6, 2025 15:34
lvyanquan (Contributor, Author)

Hi @yuxiqian, could you help review this?

yuxiqian (Contributor) left a review comment:


Thanks for @lvyanquan's great work, just left some comments.

yuxiqian (Contributor) left a review comment:


It would be nice to have corresponding E2e cases verifying the changes in this PR. Could @leonardBang @ruanhang1993 please trigger the CI workflow?

lvyanquan (Contributor, Author)

The last CI run passed; rebased onto master to fix the conflict.

leonardBang (Contributor)

@lvyanquan Would you like to rebase onto the latest master branch, since conflicts have appeared?

lvyanquan (Contributor, Author)

Rebased onto master.

leonardBang (Contributor) left a review comment:


Thanks @lvyanquan for the contribution and @yuxiqian for the review, LGTM

@leonardBang leonardBang merged commit 75b8a0c into apache:master Jan 13, 2025
27 checks passed