Addition of a Series[int] with a complex returns Series[unknown] #1098

Open
loicdiridollou opened this issue Jan 14, 2025 · 5 comments
Labels
good first issue Series Series data structure

Comments

@loicdiridollou
Contributor

import pandas as pd

c = 1 + 1j
s = pd.Series([1.0, 2.0, 3.0])

check(assert_type(s + c, "pd.Series[complex]"), pd.Series)
# error: "assert_type" mismatch: expected "Series[complex]" but received "Series[unknown]"

Please complete the following information:

OS: Darwin
OS Version: 15.2
python version: 3.12.7
version of type checker: mypy latest
version of installed pandas-stubs: latest
Additional context

version of pandas: 2.2.3
mypy option: strict=False

@Dr-Irv
Collaborator

Dr-Irv commented Jan 14, 2025

Need to update the various operators to handle arguments that are complex
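As a rough illustration of what "handling complex arguments" could look like at the typing level, here is a minimal sketch using a toy generic class. `DemoSeries` is a hypothetical stand-in invented for this example; it is not the pandas-stubs `Series` declaration, and the overload set is only an assumption about one possible shape of the fix. It has not been checked against a specific type checker version.

```python
from __future__ import annotations

from typing import Generic, TypeVar, overload

T = TypeVar("T")


class DemoSeries(Generic[T]):
    """Hypothetical toy container; NOT the real pandas-stubs Series."""

    def __init__(self, data: list[T]) -> None:
        self.data = data

    # Narrower scalar types come first, so a float argument keeps the
    # float result and only a genuinely complex argument widens it.
    @overload
    def __add__(self: DemoSeries[float], other: float) -> DemoSeries[float]: ...
    @overload
    def __add__(self: DemoSeries[float], other: complex) -> DemoSeries[complex]: ...
    def __add__(self, other):
        # Runtime behaviour: plain element-wise addition.
        return DemoSeries([x + other for x in self.data])


s = DemoSeries([1.0, 2.0, 3.0])
c = 1 + 1j
result = s + c  # intended static type under this sketch: DemoSeries[complex]
print(result.data)
```

The point of the sketch is only the ordering of the overloads: without a complex-accepting overload, a checker has nothing to match and falls back to an unknown element type.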

@loicdiridollou
Contributor Author

loicdiridollou commented Jan 14, 2025

There is another issue with the __sub__ and __mul__ operators between two Series[int]:

s = pd.Series([0, 1, -10])
s2 = pd.Series([7, -5, 10])

check(assert_type(s - s2, "pd.Series[int]"), pd.Series, np.integer)
check(assert_type(s * s2, "pd.Series[int]"), pd.Series, np.integer)
check(assert_type(s / s2, "pd.Series[float]"), pd.Series, np.float64)
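
For reference, the runtime dtypes that these assertions are meant to mirror can be inspected directly. This is a small sketch that only looks at what pandas returns at runtime; it says nothing about what the stubs currently infer. The variable names `diff`, `prod`, `quot` and `shifted` are made up for the example.

```python
import numpy as np
import pandas as pd

s = pd.Series([0, 1, -10])
s2 = pd.Series([7, -5, 10])

# Subtraction and multiplication of two integer Series keep an integer dtype.
diff = s - s2
prod = s * s2

# True division produces floats even when both operands are integers.
quot = s / s2

# Adding a Python complex scalar upcasts the result to a complex dtype.
shifted = s + (1 + 1j)

print(diff.dtype, prod.dtype, quot.dtype, shifted.dtype)
```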

@loicdiridollou
Contributor Author

s1 = pd.Series([0, 1, 2, 3])
s2 = pd.Series([-1, 2, -3, 4])
df1 = pd.DataFrame([[0, 1], [-2, 3], [4, -5], [6, 7]])
n1 = np.array([[0, 1], [1, 2], [-1, -1], [2, 0]])
check(assert_type(s1.dot(s2), Scalar), np.int64)
check(assert_type(s1 @ s2, Scalar), np.int64)
check(assert_type(s1.dot(df1), "pd.Series[int]"), pd.Series, np.int64)
check(assert_type(s1.dot(s2), Scalar), np.integer)
check(assert_type(s1 @ s2, Scalar), np.integer)
check(assert_type(s1.dot(df1), "pd.Series[int]"), pd.Series, np.integer)  # should be float here as we don't know the type of the df

@shirzady1934
Contributor

shirzady1934 commented Mar 13, 2025

Hey @Dr-Irv & @loicdiridollou, I fixed this for add, but for mul:

s = pd.Series([0, 1, -10])
s2 = pd.Series([7, -5, 10])

check(assert_type(s * s2, "pd.Series[int]"), pd.Series, np.integer)

We cannot expect check(assert_type(s * s2, "pd.Series[int]"), pd.Series, np.integer) to pass without a type error because, as shown in

check(assert_type(s.apply(square), pd.Series), pd.Series, float)

the test verifies that the square of a Series returns a float. If both operands are treated as integers, this type mismatch will trigger an assertion failure.
Can I remove this line, and possibly other cases like it?

@Dr-Irv
Collaborator

Dr-Irv commented Mar 14, 2025

we cannot expect check(assert_type(s * s2, "pd.Series[int]"), pd.Series, np.integer) to pass without a type error because, as shown in

pandas-stubs/tests/test_series.py

Line 620 in 7328e89

check(assert_type(s.apply(square), pd.Series), pd.Series, float)
the test verifies that the square of a Series returns a float. If both operands are treated as integers, this type mismatch will trigger an assertion failure. Can I remove this line, and possibly other cases like it?

No, you can't.

The square() example is declared as multiplying 2 floats.

But if we have 2 Series[int] and we multiply them, we know we will get Series[int].

Getting this to work right is a big challenge. @loicdiridollou was making progress in #1093, but we also have to make sure that we get the correct results when the subtype of Series is unknown.
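
To make the two requirements concrete (int times int stays int, while an unknown subtype must still type-check), here is a minimal sketch with a toy class. `ToySeries` and its overloads are hypothetical and invented for illustration; they are not the declarations in pandas-stubs or in #1093, and the sketch has not been verified against mypy or pyright.

```python
from __future__ import annotations

from typing import Any, Generic, TypeVar, overload

T = TypeVar("T")


class ToySeries(Generic[T]):
    """Hypothetical stand-in for a typed Series; NOT the pandas-stubs class."""

    def __init__(self, data: list[T]) -> None:
        self.data = data

    # Specific element types first; the last overload is the fallback for
    # a ToySeries whose element type is unknown to the type checker.
    @overload
    def __mul__(self: ToySeries[int], other: ToySeries[int]) -> ToySeries[int]: ...
    @overload
    def __mul__(self: ToySeries[float], other: ToySeries[float]) -> ToySeries[float]: ...
    @overload
    def __mul__(self, other: ToySeries[Any]) -> ToySeries[Any]: ...
    def __mul__(self, other: ToySeries[Any]) -> ToySeries[Any]:
        # Runtime behaviour: plain element-wise multiplication.
        return ToySeries([a * b for a, b in zip(self.data, other.data)])


ints = ToySeries([0, 1, -10]) * ToySeries([7, -5, 10])  # intended: ToySeries[int]
floats = ToySeries([1.5, 2.0]) * ToySeries([2.0, 0.5])  # intended: ToySeries[float]
print(ints.data, floats.data)
```

Under this sketch, a float-typed square() and an int-by-int product resolve to different overloads, so neither test has to be removed; the hard part noted above is making the fallback behave sensibly in the real stubs.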
