Skip to content

feat: Add {Expr,Series}.str.to_time#3538

Open
camriddell wants to merge 13 commits intonarwhals-dev:mainfrom
camriddell:feat/expr.str.to_time
Open

feat: Add {Expr,Series}.str.to_time#3538
camriddell wants to merge 13 commits intonarwhals-dev:mainfrom
camriddell:feat/expr.str.to_time

Conversation

@camriddell
Copy link
Copy Markdown
Member

Some backends natively allow casting directly from strings to time types, others must step through a datetime (for parsing). These are exposed in Narwhals as the nw.Time() type

supported backends: pyarrow, pandas[pyarrow], duckdb, ibis, polars unsupported backends:

  • pandas & dask: no native time type
  • _spark_like: time type is not widely used, spark has a native time type, but sqlframe does not
import polars as pl
import narwhals as nw
df_native = pl.DataFrame({"a": ["12:59:21", "18:42:12"]})
df = nw.from_native(df_native)

df.select(nw.col("a").str.to_time(format="%H:%M:%S"))
┌──────────────────┐
|Narwhals DataFrame|
|------------------|
|  shape: (2, 1)   |
|  ┌──────────┐    |
|a|
|---|
|time|
|  ╞══════════╡    |
|12:59:21|
|18:42:12|
|  └──────────┘    |
└──────────────────┘

Description

What type of PR is this? (check all applicable)

  • 💾 Refactor
  • ✨ Feature
  • 🐛 Bug Fix
  • 🔧 Optimization
  • 📝 Documentation
  • ✅ Test
  • 🐳 Other

Related issues

Checklist

  • Code follows style guide (ruff)
  • Tests added
  • Documented the changes

camriddell and others added 2 commits April 7, 2026 09:19
Some backends natively allow casting directly from strings to
time types, others must step through a datetime (for parsing).
These are exposed in Narwhals as the nw.Time() type

supported backends: pyarrow, pandas[pyarrow], duckdb, ibis, polars
unsupported backends:
- pandas & dask: no native time type
- _spark_like: time type is not widely used, spark has a native time
  type, but sqlframe does not
@camriddell
Copy link
Copy Markdown
Member Author

camriddell commented Apr 7, 2026

Test Failures

@MarcoGorelli some failures are due to an older version of Polars not auto-inferring the %H:%M format, which was added in pola-rs/polars#22606 (release Polars 1.30.0))

Should we:

  1. Implement our own backwards compat patch
  2. xfail for this particular time format for versions of Polars<=1.30

Edit

I went with option 2. from the above: xfail for this particular time format for versions of Polars<=1.30. Reverting is easy if we want to switch.

@camriddell camriddell marked this pull request as ready for review April 7, 2026 20:03
@dangotbanned

This comment was marked as resolved.

@camriddell
Copy link
Copy Markdown
Member Author

Some backends natively allow casting directly from strings to time types, others must step through a datetime (for parsing).

I think the same trick may work for polars prior to (pola-rs/polars#22606)

On further thought, I think we should just xfail this. The glue-code necessary here is going to be fragile since Polars does the time-format inferral step within the Rust layer making it hard to configure from our Python API. Also auto-inferral of Time formats is a nice-to

For some more details:

  1. Polars can auto-infer HH:MM:SS from String to Time before that PR, however that PR made it possible for Polars to also auto-infer HH:MM to Time.
  2. It doesn't seem that Polars can auto-infer a strict time format via .str.to_datetime either

MREs

Polars <=1.29.0 fails to auto-infer HH:MM (but successfully auto-infers HH:MM:SS)
❯ python << EOF
import polars as pl
print(f'{pl.__version__ = }')
print(pl.Series(['12:59:34']).str.to_time())
print(pl.Series(['12:59']).str.to_time())
EOF

pl.__version__ = '1.29.0'
shape: (1,)
Series: '' [time]
[
        12:59:34
]
Traceback (most recent call last):
  File "<stdin>", line 4, in <module>
  File "/home/cameron/repos/opensource/narwhals-dev/.venv/lib/python3.14/site-packages/polars/series/utils.py", line 106, in wrapper
    return s.to_frame().select_seq(f(*args, **kwargs)).to_series()
           ~~~~~~~~~~~~~~~~~~~~~~~^^^^^^^^^^^^^^^^^^^^
  File "/home/cameron/repos/opensource/narwhals-dev/.venv/lib/python3.14/site-packages/polars/dataframe/frame.py", line 9682, in select_seq
    return self.lazy().select_seq(*exprs, **named_exprs).collect(_eager=True)
           ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~^^^^^^^^^^^^^
  File "/home/cameron/repos/opensource/narwhals-dev/.venv/lib/python3.14/site-packages/polars/_utils/deprecation.py", line 93, in wrapper
    return function(*args, **kwargs)
  File "/home/cameron/repos/opensource/narwhals-dev/.venv/lib/python3.14/site-packages/polars/lazyframe/frame.py", line 2224, in collect
    return wrap_df(ldf.collect(engine, callback))
                   ~~~~~~~~~~~^^^^^^^^^^^^^^^^^^
polars.exceptions.ComputeError: could not find an appropriate format to parse times, please define a format
Polars <=1.29.0 supplying an explicit format is fine
❯ python << EOF
import polars as pl
print(f'{pl.__version__ = }')
print(pl.Series(['12:59:34']).str.to_time())
print(pl.Series(['12:59']).str.to_time(format='%H:%M'))
EOF
pl.__version__ = '1.29.0'
shape: (1,)
Series: '' [time]
[
        12:59:34
]
shape: (1,)
Series: '' [time]
[
        12:59:00
]
Polars >1.29.0 fails to auto-infer HH:MM (but successfully auto-infers HH:MM:SS)

Then upgrading to more recent Polars

❯ uv pip install polars --upgrade
Resolved 2 packages in 119ms
Prepared 1 package in 0.21ms
Uninstalled 1 package in 3ms
Installed 1 package in 4ms
 - polars==1.29.0
 + polars==1.39.3

❯ python << EOF
import polars as pl
print(f'{pl.__version__ = }')
print(pl.Series(['12:59:34']).str.to_time())
print(pl.Series(['12:59']).str.to_time())
EOF

pl.__version__ = '1.39.3'
shape: (1,)
Series: '' [time]
[
        12:59:34
]
shape: (1,)
Series: '' [time]
[
        12:59:00
]
Unfortunately `.str.to_datetime` doesn't auto-infer the time formats either
❯ python << EOF
import polars as pl
print(f'{pl.__version__ = }')
print(pl.Series(['12:59:34']).str.to_datetime())
print(pl.Series(['12:59']).str.to_datetime())
EOF
pl.__version__ = '1.39.3'
Traceback (most recent call last):
  File "<stdin>", line 3, in <module>
  File "/home/cameron/repos/opensource/narwhals-dev/.venv/lib/python3.14/site-packages/polars/series/string.py", line 168, in to_datetime
    self._s.str_to_datetime_infer(
    ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~^
        time_unit,
        ^^^^^^^^^^
    ...<2 lines>...
        ambiguous_s._s,
        ^^^^^^^^^^^^^^^
    )
    ^
polars.exceptions.ComputeError: could not find an appropriate format to parse dates, please define a format

@MarcoGorelli
Copy link
Copy Markdown
Member

On further thought, I think we should just xfail this

yup definitely

_spark_like: time type is not widely used, spark has a native time type, but sqlframe does not

is it ok to implement it for spark but xfail for sqlframe, and raise a feature request to sqlframe?

@camriddell
Copy link
Copy Markdown
Member Author

camriddell commented Apr 10, 2026

_spark_like: time type is not widely used, spark has a native time type, but sqlframe does not

is it ok to implement it for spark but xfail for sqlframe, and raise a feature request to sqlframe?

Just added a commit for sparks new TimeType, I don't think Spark fully supports this internally just yet (see apache/spark@ed92a5c

TIME data type support is not complete, so we need to guard it before it is completed, especially ahead of Spark 4.1 release.

In fact I even had to "opt-in" to Sparks support for our tests so I'm not sure if this is something we want to expose to users just yet since.

@MarcoGorelli
Copy link
Copy Markdown
Member

ah hadn't realised yet, thanks for explaining - ok to keep it unimplemented for spark then

@camriddell
Copy link
Copy Markdown
Member Author

ah hadn't realised yet, thanks for explaining - ok to keep it unimplemented for spark then

Once this lands, I can rebase a different branch that has the pyspark-relevant code that we can hold onto in a draft PR for posterity.

Copy link
Copy Markdown
Member

@FBruzzesi FBruzzesi left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @camriddell 🚀 it looks great! Opened a discussion for what we should do in the pandas-like case and a tiny suggestion for the docstrings

Comment thread narwhals/_pandas_like/series_str.py Outdated
Comment on lines +96 to +103
if not is_dtype_pyarrow(self.native.dtype):
msg = (
"This operation requires a pyarrow-backed series. "
"Please refer to https://narwhals-dev.github.io/narwhals/api-reference/narwhals/#narwhals.maybe_convert_dtypes "
"and ensure you are using dtype_backend='pyarrow'. "
"Additionally, make sure you have pandas version 1.5+ and pyarrow installed. "
)
raise TypeError(msg)
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should we follow the same "pattern" that we use when converting from any pandas-like dtype to one that is supported only with pyarrow instead of directly requiring that the dtype is already pyarrow-backed?

I refer to:

def narwhals_to_native_arrow_dtype(
dtype: IntoDType, implementation: Implementation, version: Version
) -> pd.ArrowDtype:
if is_pandas_or_modin(implementation) and PANDAS_VERSION >= (2, 2):
try:
import pyarrow as pa # ignore-banned-import # noqa: F401
except ImportError as exc: # pragma: no cover
msg = (
f"Unable to convert to {dtype} due to the following exception: {exc.msg}"
)
raise ImportError(msg) from exc
from narwhals._arrow.utils import narwhals_to_native_dtype as _to_arrow_dtype
return pd.ArrowDtype(_to_arrow_dtype(dtype, version))
msg = ( # pragma: no cover

Copy link
Copy Markdown
Member Author

@camriddell camriddell Apr 13, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I was following the pattern in _pandas_like...str.spit.

Shall I change both this instance and the one in my PR to follow the reference in _pandas_like.utils.narwhals_to_native_arrow_dtype?

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for pointing that out! I guess it's some inconsistency!
I would be ok casting in behalf of the user in these cases, but happy to hear what others think about this

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Updated, I let the mechanics of .cast(Time) handle the auto-conversion after stepping through the datetime dtype.

I also added an xfail for testing environments that are running the non-pyarrow pandas-like constructors but do not have pyarrow installed/available.

Comment thread narwhals/expr_str.py Outdated
Comment thread narwhals/series_str.py Outdated
camriddell and others added 2 commits April 13, 2026 10:11
Co-authored-by: Francesco Bruzzesi <42817048+FBruzzesi@users.noreply.github.com>
Co-authored-by: Francesco Bruzzesi <42817048+FBruzzesi@users.noreply.github.com>
@FBruzzesi FBruzzesi changed the title feat: add {expr,Series}.str.to_time(…) feat: Add {Expr,Series}.str.to_time Apr 14, 2026
@camriddell camriddell requested a review from FBruzzesi April 14, 2026 23:37
Copy link
Copy Markdown
Member

@FBruzzesi FBruzzesi left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for adjusting @camriddell - just a minor comment on how to skip tests for old pandas, and I think we can merge once addressed!

Comment thread tests/expr_and_series/str/to_time_test.py Outdated
@camriddell camriddell requested a review from FBruzzesi April 15, 2026 15:08
@camriddell
Copy link
Copy Markdown
Member Author

@MarcoGorelli was just thinking that I could leave the pyspark code in-place since they've already release an API for this feature and we can just xfail that test. Then once pyspark flips the switch to have "Time" fully supported we'll hit (hopefully) an xpass?

Copy link
Copy Markdown
Member

@FBruzzesi FBruzzesi left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for bearing with me @camriddell - It's a +1 from me!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

[Enh]: Support Expr.str.to_time

4 participants