-
Notifications
You must be signed in to change notification settings - Fork 701
feat: allow adding nested fields when refreshing schema #23531
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
feat: allow adding nested fields when refreshing schema #23531
Conversation
1619c35 to
eae9396
Compare
| Some(old_col) => { | ||
| // Column exists in both schemas, check compatibility | ||
| if let Err(_reason) = | ||
| is_protobuf_compatible(old_col.data_type(), new_col.data_type()) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It seems there is a problem here: you only check whether the fields are compatible with the Protobuf protocol in this generic method. However, the upstream schema does not guarantee that it is of Protobuf type (ie. for struct fields, it could also be generated by Avro or other possible nested types).
In addition, I believe that protobuf compatibility should not be checked within RisingWave, but rather ensured by external systems. Even if protobuf is not compatible, as long as the message itself and the schema are updated synchronously, RisingWave can still parse and compute it. This is because the essence of the protobuf compatibility issue arises from the uniqueness of binary serialization/deserialization, while RisingWave merely parses upstream messages into internal types. It does not concern itself with whether the schema is compatible over time, as long as the message itself is compatible with the schema.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
cc @BugenZhao
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I actually think the name of the function is misleading here, which is in and of itself a good thing to fix, I've gone ahead and done that to avoid confusion. The function itself basically just checks based on the the internal DataType types and doesn't consider anything protobuf specific, I just had the compatibility semantics of protofuf in mind when first preparing it.
Question is (and I'm really asking, I'm admittedly in some deep waters here) is whether the semantics of the function hold up? Any potential downstream implications that I'm missing?
eae9396 to
c2a78c1
Compare
|
Hi and thanks for your contribution. Just want to share some context before we proceed with this PR: adding nested fields is actually supported in risingwave/src/frontend/src/handler/create_table/col_id_gen.rs Lines 74 to 118 in 995c695
Would you help to check whether we can directly reuse such utility for |
|
Thanks for the added info @BugenZhao! Looks like I'll need to get back to this next week, but happy to look into whether the code you linked can be re-used, at least to some degree. |
I hereby agree to the terms of the RisingWave Labs, Inc. Contributor License Agreement.
What's changed and what's your intention?
This enables adding new fields to nested struct/message types on events and running
ALTER SOURCE ... REFRESH SCHEMA(and related ALTER SOURCE statements). Previously this was only possible if fields were added at the root level of the event structure.To achieve this, the schema compatibility check was re-implemented. An additional benefit is better error messages when a breaking change is made to an individual field.
See this Slack discussion for some more context: https://risingwave-community.slack.com/archives/C03BW71523T/p1760962872810179
This is my first PR here so I'd be quite surprised if it's good on the first attempt, so any feedback would be much appreciated :)
Checklist
Documentation
Release note