collect_schema() succeeds for arr.* functions on non-array columns #25648

@mcrumiller

Checks

  • I have checked that this issue has not already been reported.
  • I have confirmed this bug exists on the latest version of Polars.

Issue Description

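When an expression from the .arr namespace is applied to a non-Array column, collect_schema() succeeds for many methods even though it should raise an error. Some methods already raise as expected; the rest return a schema, in several cases with an Unknown dtype.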

import polars as pl

lf = pl.LazyFrame({"a": ["a", "b", "c"]})

a = pl.col("a")

# The following raise an exception, as they should.
lf.select(a.arr.contains("z")).collect_schema()             # ERROR - OK
lf.select(a.arr.explode()).collect_schema()                 # ERROR - OK
lf.select(a.arr.agg(pl.element().len())).collect_schema()   # ERROR - OK
lf.select(a.arr.eval(pl.element().len())).collect_schema()  # ERROR - OK
lf.select(a.arr.sum()).collect_schema()                     # ERROR - OK
lf.select(a.arr.to_list()).collect_schema()                 # ERROR - OK
lf.select(a.arr.to_struct()).collect_schema()               # ERROR - OK
lf.select(a.arr.unique()).collect_schema()                  # ERROR - OK

# The following succeed when they should raise an exception.
lf.select(a.arr.all()).collect_schema()                     # Schema({'a': Boolean})
lf.select(a.arr.any()).collect_schema()                     # Schema({'a': Boolean})
lf.select(a.arr.arg_max()).collect_schema()                 # Schema({'a': UInt32})
lf.select(a.arr.arg_min()).collect_schema()                 # Schema({'a': UInt32})
lf.select(a.arr.count_matches("z")).collect_schema()        # Schema({'a': UInt32})
lf.select(a.arr.first()).collect_schema()                   # Schema({'a': Unknown})
lf.select(a.arr.get(0)).collect_schema()                    # Schema({'a': Unknown})
lf.select(a.arr.join("")).collect_schema()                  # Schema({'a': String})
lf.select(a.arr.last()).collect_schema()                    # Schema({'a': Unknown})
lf.select(a.arr.len()).collect_schema()                     # Schema({'a': Unknown})
lf.select(a.arr.max()).collect_schema()                     # Schema({'a': Unknown})
lf.select(a.arr.mean()).collect_schema()                    # Schema({'a': String})
lf.select(a.arr.median()).collect_schema()                  # Schema({'a': String})
lf.select(a.arr.min()).collect_schema()                     # Schema({'a': Unknown})
lf.select(a.arr.n_unique()).collect_schema()                # Schema({'a': UInt32})
lf.select(a.arr.reverse()).collect_schema()                 # Schema({'a': String})
lf.select(a.arr.shift(1)).collect_schema()                  # Schema({'a': String})
lf.select(a.arr.sort()).collect_schema()                    # Schema({'a': String})
lf.select(a.arr.std()).collect_schema()                     # Schema({'a': String})
lf.select(a.arr.var()).collect_schema()                     # Schema({'a': String})
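
The two lists above can also be reproduced programmatically. The following is a minimal sketch, not part of the original report: `probe` and `arr_cases` are hypothetical helper names, and only four of the methods listed above are included for brevity. It records, per method, whether schema resolution raised or returned a schema.

```python
from typing import Callable, Dict


def probe(cases: Dict[str, Callable[[], object]]) -> Dict[str, str]:
    """Run each zero-argument callable and record whether it raised.

    Hypothetical helper for illustration; it does not depend on Polars.
    """
    outcome: Dict[str, str] = {}
    for name, thunk in cases.items():
        try:
            result = thunk()
        except Exception as exc:  # any failure during schema resolution
            outcome[name] = f"raised {type(exc).__name__}"
        else:
            outcome[name] = f"returned {result!r}"
    return outcome


def arr_cases() -> Dict[str, Callable[[], object]]:
    """Build schema-resolution probes for a few .arr methods on a String column.

    Hypothetical helper; the subset of methods is an arbitrary sample
    taken from the lists in this issue.
    """
    import polars as pl  # imported lazily so probe() stays dependency-free

    lf = pl.LazyFrame({"a": ["a", "b", "c"]})
    a = pl.col("a")
    exprs = {
        "arr.sum": a.arr.sum(),
        "arr.all": a.arr.all(),
        "arr.len": a.arr.len(),
        "arr.sort": a.arr.sort(),
    }
    # Bind each expression as a default argument so every lambda keeps its own.
    return {
        name: (lambda e=expr: lf.select(e).collect_schema())
        for name, expr in exprs.items()
    }
```

Calling `probe(arr_cases())` and printing the resulting dict gives one line of status per method. Once the bug is fixed, every entry would be expected to start with "raised".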

Installed versions

main

Labels: bug, needs triage, python