RFC, feat: Add support for {Expr, Series}.struct.unnest #3523

Open
FBruzzesi wants to merge 12 commits into main from feat/struct-unnest
Conversation

@FBruzzesi
Member

Description

Drafting this as an RFC since I feel it should be possible to simplify the implementation and reduce the number of computation calls. I struggled with a few details at the start, so I might need a fresh set of eyes.

What type of PR is this? (check all applicable)

  • 💾 Refactor
  • ✨ Feature
  • 🐛 Bug Fix
  • 🔧 Optimization
  • 📝 Documentation
  • ✅ Test
  • 🐳 Other

Related issues

Checklist

  • Code follows style guide (ruff)
  • Tests added
  • Documented the changes

@FBruzzesi added the enhancement and nested data labels on Mar 27, 2026
@FBruzzesi mentioned this pull request on Apr 14, 2026 (7 tasks)
@FBruzzesi marked this pull request as ready for review on April 14, 2026 at 21:06
@MarcoGorelli
Member

thanks - rather than Expr.struct.unnest, which returns multiple outputs (I don't think we have any other single input -> multiple output expressions), would it work to add DataFrame.unnest instead (https://docs.pola.rs/api/python/stable/reference/dataframe/api/polars.DataFrame.unnest.html)? @benrutter would that work for your use case?

@benrutter
Contributor

@MarcoGorelli I raised the feature request off the back of a Discord chat, so I don't actually have a specific use case for it, but the person asking about it (Jaw) said:

> My usecases normally consist of flattening JSON data and making transformation over them, in the library is something like "Unnest" in polars? I've done a workaround of doing the select statements on the nested structures to flatten them, but it is not as convenient as doing "Unnest" in polars, is there any other alternatives?

So I think for a simple use case like that (i.e. loading in JSON, and parsing out structs into columns) df.unnest would be fine?

(that said, if someone starts building a library with Narwhals aimed at unpacking JSON, it'd probably only be a matter of time before they run into something which makes them ask for Expr.struct.unnest, but maybe that's a problem for a future date - doesn't seem like anyone needs that currently)
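
For readers following along, here is a minimal pure-Python sketch of what a frame-level unnest does in the JSON-flattening use case described above. The dict-of-lists "frame" and the `unnest` helper are hypothetical stand-ins for illustration only; they are not the Narwhals or Polars API, and the sketch assumes every row of the struct column is a dict.

```python
from typing import Any

# Hypothetical stand-in for a dataframe: column name -> list of values.
Frame = dict[str, list[Any]]


def unnest(frame: Frame, column: str) -> Frame:
    """Replace a struct column (a list of dicts) with one column per field."""
    out: Frame = {}
    for name, values in frame.items():
        if name != column:
            out[name] = values
            continue
        # Collect field names in first-seen order across all rows.
        fields = list(dict.fromkeys(key for row in values for key in row))
        for field in fields:
            # A field missing from a row becomes None.
            out[field] = [row.get(field) for row in values]
    return out


records: Frame = {
    "id": [1, 2],
    "user": [{"name": "a", "age": 30}, {"name": "b", "age": 25}],
}
flat = unnest(records, "user")
```

In this sketch the new columns take the position of the struct column, so `flat` has the columns `id`, `name`, `age` in that order.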

@FBruzzesi
Member Author

Thanks for your feedback @MarcoGorelli

> rather than Expr.struct.unnest, which returns multiple outputs (I don't think we have any other single input -> multiple output expressions), would it work to add DataFrame.unnest instead

I find this a bit unexpected coming from you: we didn't consider {DataFrame, LazyFrame}.cast for the longest time exactly because it was a DataFrame/LazyFrame method rather than an expression one, and it was possible to implement via expressions in the first place.

> DataFrame.unnest

I would expect that for some backends the internals would not be so far off from what is implemented here. E.g. I couldn't find a way to do it directly on a pyarrow table (I didn't search for long, in my defence).

RE:

> (I don't think we have any other single input -> multiple output expressions)

I guess it could be good to have one such case as a reference for any potential new contributor?!

@dangotbanned
Member

> (I don't think we have any other single input -> multiple output expressions)

It makes more sense once you de-sugar it (this is really just a Selector in disguise 😎)
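
To make the de-sugaring point concrete, below is a rough pure-Python sketch in which an expression is modelled as a function that may return several named columns, the way a selector does. Every name here (`Expr`, `col`, `struct_field`, `struct_unnest`, `select`) is hypothetical and only illustrates the idea; none of it is the Narwhals implementation. As far as I can tell, the Polars docs describe `Expr.struct.unnest` as an alias for `Expr.struct.field("*")`, which is the same kind of expansion.

```python
from typing import Any, Callable

Frame = dict[str, list[Any]]
Column = tuple[str, list[Any]]
# Hypothetical model: an expression maps a frame to one or more named columns.
Expr = Callable[[Frame], list[Column]]


def col(name: str) -> Expr:
    """Single input -> single output."""
    return lambda frame: [(name, frame[name])]


def struct_field(name: str, field: str) -> Expr:
    """Pull one field out of a struct column (a list of dicts)."""
    return lambda frame: [(field, [row[field] for row in frame[name]])]


def struct_unnest(name: str) -> Expr:
    """Single input -> one output per field, expanded against the frame."""

    def expand(frame: Frame) -> list[Column]:
        # Assumes a non-empty column whose rows all share the same keys.
        fields = list(frame[name][0])
        return [out for f in fields for out in struct_field(name, f)(frame)]

    return expand


def select(frame: Frame, *exprs: Expr) -> Frame:
    """Evaluate each expression and collect every column it returns."""
    return {n: v for expr in exprs for n, v in expr(frame)}


frame: Frame = {"id": [1, 2], "point": [{"x": 1.0, "y": 2.0}, {"x": 3.0, "y": 4.0}]}
sugared = select(frame, col("id"), struct_unnest("point"))
desugared = select(
    frame, col("id"), struct_field("point", "x"), struct_field("point", "y")
)
```

Under these assumptions `sugared` and `desugared` are the same frame: the unnest call is shorthand for listing one field expression per struct field, with the field names resolved from the data rather than written out by the user.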

@MarcoGorelli
Member

> I find this a bit unexpected coming from you: we didn't consider {DataFrame, LazyFrame}.cast for the longest time exactly because it was a DataFrame/LazyFrame method rather than an expression one, and it was possible to implement via expressions in the first place.

😄 well you're right about that

> It makes more sense once you de-sugar it (this is really just a Selector in disguise 😎)

ah i hadn't thought of it like that, thanks, can we get it to be like the other selectors then?

@dangotbanned
Member

> > It makes more sense once you de-sugar it (this is really just a Selector in disguise 😎)
>
> ah i hadn't thought of it like that, thanks, can we get it to be like the other selectors then?

🥳

One caveat though if we go in that direction:
It would be more like pl.col("1", "2") than cs.by_name("1", "2").

I documented this recently in (d231dd0) - since only the latter would return a Selector to the user

Feel free to ignore if this was obvious 😅

Development

Successfully merging this pull request may close these issues.

[Enh]: Add support for {Expr, Series}.struct.unnest

4 participants