refactor: Simplify _integer_fits_in_decimal; disallow supercasting for Datetime and Duration with different time_unit's#3526

Open
FBruzzesi wants to merge 3 commits into dtypes/supertyping from dtypes/supertyping-feedback-adjustments

Conversation

@FBruzzesi
Member

Description

This PR has two commits:

What type of PR is this? (check all applicable)

  • 💾 Refactor
  • ✨ Feature
  • 🐛 Bug Fix
  • 🔧 Optimization
  • 📝 Documentation
  • ✅ Test
  • 🐳 Other

Checklist

  • Code follows style guide (ruff)
  • Tests added
  • Documented the changes

Member

@dangotbanned dangotbanned left a comment

Thanks @FBruzzesi

Just reviewed the Decimal part for now, thanks for taking that on board

I had an idea for TimeUnit but ran out of time to explain 😂

praise be narwhal

Comment on lines +258 to +263
else:
    bits: int = integer._bits
    if isinstance(integer, SignedIntegerType):
        bits = bits - 1

    value = (1 << bits) - 1
Member

@dangotbanned dangotbanned Apr 1, 2026

Literal math is so cool 😄

Really didn't expect it to work for numbers this large (or bit-shifting tbh)

image
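For reference, a minimal runtime sketch of the same bit arithmetic. The helper name `max_integer_value` and its `signed` flag are hypothetical stand-ins for the `_bits` attribute and the `SignedIntegerType` check in the diff; the real dtype classes are not reproduced here.

```python
def max_integer_value(bits: int, *, signed: bool) -> int:
    """Return the largest value an integer of the given bit width can hold.

    Hypothetical helper: `bits` stands in for `integer._bits`, and `signed`
    stands in for `isinstance(integer, SignedIntegerType)`.
    """
    if signed:
        bits -= 1  # one bit is taken by the sign
    return (1 << bits) - 1


# Python ints are arbitrary precision, so the 128-bit case needs no special handling.
print(max_integer_value(8, signed=True))
print(max_integer_value(128, signed=False))
```

These two calls correspond to the smallest and largest `value` inputs listed further down in this thread.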

Comment on lines 247 to +250
  def _integer_fits_in_decimal(value: int, precision: int, scale: int) -> bool:
      """Scales an integer and checks if it fits the target precision."""
      # !NOTE: Indexing is safe since `scale <= precision <= 38`
-     return (precision == DEC128_MAX_PREC) or (
-         value * POW10_LIST[scale] < POW10_LIST[precision]
-     )
+     return (precision == DEC128_MAX_PREC) or (value * (10**scale) < (10**precision))
Member

@dangotbanned dangotbanned Apr 1, 2026

#3396 (comment)

(1) Could we be lazy-er?

I would prefer if we defer generating this until it is needed.

E.g. I'd expect _integer_supertyping and _primitive_numeric_supertyping to be more commonly used - but even they don't exist at module-import-time

Sorry for not being clear here!
I think you are right to pre-compute these numbers.

Edit: ffs, reading this back really looks like AI.
Gonna leave it as-is, but I promise I wrote that
😭

How big are they?

AFAICT, these are all the possible inputs each parameter can have.

value: Literal[127, ..., 340282366920938463463374607431768211455]
#                   ^^^ 8 hidden values
precision: Literal[1, ..., 38]
scale: Literal[1, ..., 38]  # TIL: polars seems to allow `0`

So this makes the worst-case 😨:

(
    34028236692093846346337460743176821145500000000000000000000000000000000000000
    < 100000000000000000000000000000000000000
)
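A quick way to reproduce that worst case with plain Python integers. This is only a sketch: the constant names are made up for illustration, and the bound of 38 is assumed from the `scale <= precision <= 38` note in the diff.

```python
# Illustrative only: rebuild the worst-case comparison from the listed input ranges.
UINT128_MAX = (1 << 128) - 1  # largest `value` in the list above
MAX_SCALE = 38                # assumed upper bound for `scale`
MAX_PRECISION = 38            # assumed upper bound for `precision`

lhs = UINT128_MAX * 10**MAX_SCALE  # scaled integer
rhs = 10**MAX_PRECISION            # first value that no longer fits

fits = lhs < rhs
print(len(str(lhs)), len(str(rhs)), fits)
```

The left-hand side has 77 decimal digits, which Python's arbitrary-precision `int` handles without overflow, so the comparison is cheap even at the extremes.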

Suggestion

Show _{integer,primitive_numeric}_supertyping

@cache
def _integer_supertyping() -> Mapping[FrozenDTypes, type[Int | Float64]]:
    """Generate the supertype conversion table for all integer data type pairs."""
    tps_int = SignedIntegerType.__subclasses__()
    tps_uint = UnsignedIntegerType.__subclasses__()
    get_bits: attrgetter[_Bits] = attrgetter("_bits")
    ints = (
        (frozen_dtypes(lhs, rhs), max(lhs, rhs, key=get_bits))
        for lhs, rhs in product(tps_int, repeat=2)
    )
    uints = (
        (frozen_dtypes(lhs, rhs), max(lhs, rhs, key=get_bits))
        for lhs, rhs in product(tps_uint, repeat=2)
    )
    # NOTE: `Float64` is here because `mypy` refuses to respect the last overload 😭
    # https://github.com/python/typeshed/blob/a564787bf23386e57338b750bf4733f3c978b701/stdlib/typing.pyi#L776-L781
    ubits_to_int: Mapping[_Bits, type[Int | Float64]] = {8: Int16, 16: Int32, 32: Int64}
    mixed = (
        (
            frozen_dtypes(int_, uint),
            int_ if int_._bits > uint._bits else ubits_to_int.get(uint._bits, Float64),
        )
        for int_, uint in product(tps_int, tps_uint)
    )
    return dict(chain(ints, uints, mixed))


@cache
def _primitive_numeric_supertyping() -> Mapping[FrozenDTypes, type[Float]]:
    """Generate the supertype conversion table for all (integer, float) data type pairs."""
    F32, F64 = Float32, Float64  # noqa: N806
    small_int = (Int8, Int16, UInt8, UInt16)
    small_int_f32 = ((frozen_dtypes(tp, F32), F32) for tp in small_int)
    big_int_f32 = ((frozen_dtypes(tp, F32), F64) for tp in INTEGER.difference(small_int))
    int_f64 = ((frozen_dtypes(tp, F64), F64) for tp in INTEGER)
    return dict(chain(small_int_f32, big_int_f32, int_f64))

I referenced these guys because they do work to generate a dictionary - which may never end up being used.

Iff we need it, the result is cached and then we reuse from there 🙂

So the suggestion was just to move the global dict/list into a function


def _integer_fits_in_decimal(value: int, precision: int, scale: int) -> bool:
    """Scales an integer and checks if it fits the target precision."""
    # !NOTE: Indexing is safe since `scale <= precision <= 38`
Member

@dangotbanned dangotbanned Apr 1, 2026

# !NOTE: Indexing is safe since scale <= precision <= 38

I suppose this comment may be relevant again depending on (#3526 (comment))
