
Align metadata propagation through Physical and Logical casts#23169

Open
paleolimbot wants to merge 2 commits into apache:main from paleolimbot:cast-metadata-cleanup


@paleolimbot paleolimbot commented Jun 24, 2026

Member

Which issue does this PR close?

Rationale for this change

The logical Expr::Cast and Expr::TryCast have a FieldRef target that was added in #18136 so that logical casts can express a cast to an extension type. In combination with a SQL type planner ( #20676 ) and an optimizer rule, this enabled casting to/from extension types with custom semantics to actually occur. The ability to do this was reverted by #20836 (which removed the original test) and I am not sure that ability ever made it into a release. When investigating this issue, it became clear the logical and physical cast behaviour had diverged with respect to the target field.

What changes are included in this PR?

This PR makes two changes to how a cast computes its output field. First, it strips specific metadata keys (the extension name and extension metadata) when propagating metadata from the source of a cast to the target, because carrying them over may produce an invalid destination field that consumers could reject. Second, it propagates all metadata from the (logical) cast target field, so that, for example, a cast to an extension type represented by the cast target field will have a to_field() that communicates the extension type.
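As a rough illustration of the rule described above, here is a minimal sketch that uses only the standard library. The function name `cast_output_metadata` is hypothetical and is not part of DataFusion; the two key strings are Arrow's standard extension-type metadata keys. Letting the target win when both sides define the same key is an assumption made for this sketch, not something the PR text states.

```rust
use std::collections::HashMap;

// Arrow's standard keys for extension type name and extension metadata.
const EXTENSION_NAME_KEY: &str = "ARROW:extension:name";
const EXTENSION_METADATA_KEY: &str = "ARROW:extension:metadata";

/// Hypothetical helper (not a DataFusion API): compute the metadata of a
/// cast's output field from the source field and the cast target field.
fn cast_output_metadata(
    source: &HashMap<String, String>,
    target: &HashMap<String, String>,
) -> HashMap<String, String> {
    // Keep source metadata except the extension keys, which describe the
    // source storage type and may be invalid for the destination type.
    let mut out: HashMap<String, String> = source
        .iter()
        .filter(|(k, _)| {
            k.as_str() != EXTENSION_NAME_KEY && k.as_str() != EXTENSION_METADATA_KEY
        })
        .map(|(k, v)| (k.clone(), v.clone()))
        .collect();

    // Propagate everything from the target field, including its extension
    // keys. Assumption: the target overrides the source on duplicate keys.
    out.extend(target.iter().map(|(k, v)| (k.clone(), v.clone())));
    out
}
```

With a source field carrying an extension name plus an unrelated key, and a target field naming a different extension type, the result under these assumptions keeps the unrelated key and reports only the target's extension name.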

For the physical cast, this behaviour is replicated exactly (I hope).

Note that actually casting to an extension type can be implemented with an optimizer rule, planner, or by the mechanism I have in the works in #21071 .

Are these changes tested?

Yes

Are there any user-facing changes?

In practice it was not common to create an Expr::Cast with field metadata internally, so I don't think users will see metadata changes from the inclusion of metadata from the target field. I would be surprised if stripping the extension name/metadata from the source was disruptive (it was more likely to have caused errors).

Supersedes an earlier but similar attempt ( #22162 ).

@github-actions github-actions Bot added logical-expr Logical plan and expressions physical-expr Changes to the physical-expr crates labels Jun 24, 2026
@paleolimbot paleolimbot force-pushed the cast-metadata-cleanup branch from 7662d3a to c74dd77 Compare June 25, 2026 19:58
Comment on lines +67 to +70
/// Whether to preserve non-extension metadata from the source field.
/// When true (default), source metadata is merged with target metadata.
/// When false, only the target field's metadata is used.
preserve_source_metadata: bool,

Member Author


This is a little gross feeling to me, but there's at least one place in the code that is counting on Cast::return_field() not to execute its child's return_field(): the virtual row number rewrite requires that the cast strips metadata and does not validate its child's Column (which is invalid, at least where it's tested). There may be a better way here.
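To make the concern concrete, here is a small standalone sketch of why such a flag lets a cast avoid evaluating its child. All types here (`ChildSketch`, `CastSketch`, and the `return_metadata` methods) are hypothetical stand-ins rather than the PR's actual code; only the field name `preserve_source_metadata` comes from the diff above.

```rust
use std::collections::HashMap;

type Metadata = HashMap<String, String>;

/// Hypothetical stand-in for a child expression whose field lookup can
/// fail, e.g. a column reference that is invalid for the given schema.
struct ChildSketch {
    field_metadata: Result<Metadata, String>,
}

impl ChildSketch {
    fn return_metadata(&self) -> Result<Metadata, String> {
        self.field_metadata.clone()
    }
}

/// Hypothetical stand-in for the physical cast expression.
struct CastSketch {
    child: ChildSketch,
    target_metadata: Metadata,
    preserve_source_metadata: bool,
}

impl CastSketch {
    fn return_metadata(&self) -> Result<Metadata, String> {
        if !self.preserve_source_metadata {
            // The child is never consulted, so an invalid child cannot
            // cause an error on this path.
            return Ok(self.target_metadata.clone());
        }
        // Otherwise the child must resolve; drop its extension keys and
        // then layer the target metadata on top.
        let mut out = self.child.return_metadata()?;
        out.retain(|k, _| {
            k.as_str() != "ARROW:extension:name" && k.as_str() != "ARROW:extension:metadata"
        });
        out.extend(self.target_metadata.clone());
        Ok(out)
    }
}
```

In this sketch, a cast over an unresolvable child returns the target metadata when the flag is false and returns the child's error when the flag is true, which is the distinction the comment above is relying on.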

@paleolimbot paleolimbot marked this pull request as ready for review June 26, 2026 19:37
@paleolimbot paleolimbot changed the title Fix extension type metadata propagation through casts Align metadata propagation through Physical and Logical casts Jun 26, 2026
@paleolimbot

Member Author

@adriangb @tschwarzinger @cyberbeam524 You've all kindly waited for me to finish this...happy to iterate on any of your comments here!


Labels

logical-expr Logical plan and expressions physical-expr Changes to the physical-expr crates


Development

Successfully merging this pull request may close these issues.

Propagation of metadata through casts can strips extension type from the destination field and can result in invalid extension types

1 participant