GH1180 Clean up all/any methods for Series and DataFrame #1188
I think you should remove the FIXME comments. Otherwise OK.
pandas-stubs/core/frame.pyi
Outdated
@@ -1660,7 +1660,8 @@ class DataFrame(NDFrame, OpsMixin, _GetItemHack):
         bool_only: _bool | None = ...,
         skipna: _bool = ...,
         **kwargs: Any,
-    ) -> _bool: ...
+    ) -> np.bool: ...
+    # FIXME the type below is not correct, should be pd.Series[np.bool]
The way I look at it is that we are using Series[bool] to correspond to whatever bool is stored inside - being a Python one, a numpy one, or even if we use BooleanDtype, so I don't think this comment is necessary.
Actually, the type checker will not accept that, because pd.Series can only be subscripted with S1, which includes bool (the built-in Python boolean) but not np.bool.
Yet at runtime pd.DataFrame.any returns pd.Series[np.bool], which is not accepted since np.bool is not among the types S1 allows.
Let me know if this is not clear.
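To make the objection concrete, here is a minimal sketch of the situation. It assumes S1 behaves like a TypeVar bound to a union of Python types; `S1Sketch` and `SeriesSketch` are hypothetical stand-ins, not the definitions in pandas-stubs, and the real S1 lists many more types.

```python
from typing import Generic, List, TypeVar, Union

import numpy as np

# Hypothetical, trimmed-down stand-in for S1 (assumption: a bound union of types).
S1Sketch = TypeVar("S1Sketch", bound=Union[bool, int, float, str])


class SeriesSketch(Generic[S1Sketch]):
    """Toy generic container used only for illustration; not pandas.Series."""

    def __init__(self, values: List[S1Sketch]) -> None:
        self.values = values


# A static type checker accepts this, because bool falls within the bound.
ok: SeriesSketch[bool] = SeriesSketch([True, False])

# A static type checker is expected to reject the annotation below, because
# np.bool_ is not a subclass of any type in the bound:
# bad: SeriesSketch[np.bool_] = SeriesSketch([np.True_])

# At runtime the NumPy boolean really is unrelated to the built-in bool.
numpy_bool_is_python_bool = issubclass(np.bool_, bool)
```

Nothing here is enforced at runtime; the rejection only shows up when a type checker such as mypy or pyright is run over the commented-out line.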
The type returned from DataFrame.any() and DataFrame.all() should be Series[_bool]. Then the tests should use that.
My issue is that _bool does not contain np.bool, which is the type we get at runtime. Happy to keep the stubs as is, but that would mean the runtime type does not align with the static type.
Or we can add np.bool to _bool; open to suggestions.
I don't think it matters. We have checks like this:
check(assert_type(df.any(), "pd.Series[bool]"), pd.Series, np.bool_)
So even though np.bool_ is in the Series, we call its type Series[bool].
It's similar to this:
check(assert_type(df.value_counts(), "pd.Series[int]"), pd.Series, np.integer)
What's inside the series are numpy integers, but we call that Series[int].
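A rough sketch of what such a check does at runtime may help. `check_sketch` below is a hypothetical, simplified stand-in for the test helper `check` quoted above (its real signature and behaviour may differ); it only verifies the container class and the class of the first element.

```python
from typing import Any, Optional

import numpy as np
import pandas as pd


def check_sketch(actual: Any, klass: type, elem_type: Optional[type] = None) -> Any:
    """Hypothetical stand-in for the test helper: verify container and element classes."""
    if not isinstance(actual, klass):
        raise RuntimeError(f"expected {klass}, got {type(actual)}")
    if elem_type is not None and len(actual) > 0:
        first = actual.iloc[0]
        if not isinstance(first, elem_type):
            raise RuntimeError(f"expected element {elem_type}, got {type(first)}")
    return actual


df = pd.DataFrame({"a": [True, False], "b": [True, True]})
result = df.any()  # one boolean per column

# Runtime side: the elements are NumPy booleans, not built-in bool.
check_sketch(result, pd.Series, np.bool_)
first_is_python_bool = isinstance(result.iloc[0], bool)
```

The static label (Series[bool]) and the runtime element class (np.bool_) are checked separately, which is why the two do not need to match by name.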
Okay I see your point, let me fix the comments, thanks for the more detailed vision on this issue!
Thanks @loicdiridollou
assert_type() to assert the type of any return value
Technically we should fix pd.DataFrame.all, but that would mean having a pd.Series[np.bool], yet np.bool is not one of the types S1 accepts, so I will leave it there for a bit; it has some FIXME statements there for us to know.