DataFrame.drop with columns=set(...) is unspecified #1008

Closed
cmp0xff opened this issue Sep 25, 2024 · 4 comments · Fixed by #1131
Labels
pandas_docs For issues where there is a conflict in behavior with pandas docs and stubs that needs resolution

Comments

@cmp0xff
Contributor

cmp0xff commented Sep 25, 2024

Describe the bug

DataFrame.drop with columns=set(...) is unspecified.

To Reproduce

  1. Provide a minimal runnable pandas example that is not properly checked by the stubs.
import pandas as pd

df = pd.DataFrame({1: [2], 3: [4]})
df = df.drop(columns={1})
  2. I am using mypy type checker.
  3. The error message received from that type checker.
df_drop.py:4:6: error: No overload variant of "drop" of "NDFrame" matches argument type "set[int]"  [call-overload]
df_drop.py:4:6: note: Possible overload variants:
df_drop.py:4:6: note:     def drop(self, labels: None = ..., *, axis: Literal['index', 0] | Literal['columns', 1] = ..., index: Hashable | Sequence[Hashable] | Index[Any] = ..., columns: Hashable | Sequence[Hashable] | Index[Any], level: Hashable | int | None = ..., inplace: Literal[True], errors: Literal['ignore', 'raise'] = ...) -> None
df_drop.py:4:6: note:     def drop(self, labels: None = ..., *, axis: Literal['index', 0] | Literal['columns', 1] = ..., index: Hashable | Sequence[Hashable] | Index[Any], columns: Hashable | Sequence[Hashable] | Index[Any] = ..., level: Hashable | int | None = ..., inplace: Literal[True], errors: Literal['ignore', 'raise'] = ...) -> None
df_drop.py:4:6: note:     def drop(self, labels: Hashable | Sequence[Hashable] | Index[Any], *, axis: Literal['index', 0] | Literal['columns', 1] = ..., index: None = ..., columns: None = ..., level: Hashable | int | None = ..., inplace: Literal[True], errors: Literal['ignore', 'raise'] = ...) -> None
df_drop.py:4:6: note:     def drop(self, labels: None = ..., *, axis: Literal['index', 0] | Literal['columns', 1] = ..., index: Hashable | Sequence[Hashable] | Index[Any] = ..., columns: Hashable | Sequence[Hashable] | Index[Any], level: Hashable | int | None = ..., inplace: Literal[False] = ..., errors: Literal['ignore', 'raise'] = ...) -> DataFrame
df_drop.py:4:6: note:     def drop(self, labels: None = ..., *, axis: Literal['index', 0] | Literal['columns', 1] = ..., index: Hashable | Sequence[Hashable] | Index[Any], columns: Hashable | Sequence[Hashable] | Index[Any] = ..., level: Hashable | int | None = ..., inplace: Literal[False] = ..., errors: Literal['ignore', 'raise'] = ...) -> DataFrame
df_drop.py:4:6: note:     def drop(self, labels: Hashable | Sequence[Hashable] | Index[Any], *, axis: Literal['index', 0] | Literal['columns', 1] = ..., index: None = ..., columns: None = ..., level: Hashable | int | None = ..., inplace: Literal[False] = ..., errors: Literal['ignore', 'raise'] = ...) -> DataFrame
df_drop.py:4:6: note:     def drop(self, labels: None = ..., *, axis: Literal['index', 0] | Literal['columns', 1] = ..., index: Hashable | Sequence[Hashable] | Index[Any] = ..., columns: Hashable | Sequence[Hashable] | Index[Any], level: Hashable | int | None = ..., inplace: bool = ..., errors: Literal['ignore', 'raise'] = ...) -> DataFrame | None
df_drop.py:4:6: note:     def drop(self, labels: None = ..., *, axis: Literal['index', 0] | Literal['columns', 1] = ..., index: Hashable | Sequence[Hashable] | Index[Any], columns: Hashable | Sequence[Hashable] | Index[Any] = ..., level: Hashable | int | None = ..., inplace: bool = ..., errors: Literal['ignore', 'raise'] = ...) -> DataFrame | None
df_drop.py:4:6: note:     def drop(self, labels: Hashable | Sequence[Hashable] | Index[Any], *, axis: Literal['index', 0] | Literal['columns', 1] = ..., index: None = ..., columns: None = ..., level: Hashable | int | None = ..., inplace: bool = ..., errors: Literal['ignore', 'raise'] = ...) -> DataFrame | None
Found 1 error in 1 file (checked 1 source file)

Please complete the following information

  • OS: Windows
  • OS Version:
[System.Environment]::OSVersion.Version
Major  Minor  Build  Revision
-----  -----  -----  --------
10     0      19045  0
  • python version: Python 3.11.9
  • version of type checker: mypy 1.11.2 (compiled: yes)
  • version of installed pandas-stubs: pandas-stubs==2.2.2.240807

Additional context

Nope

@Dr-Irv
Collaborator

Dr-Irv commented Sep 25, 2024

First, your example is not correct. It should be:

import pandas as pd

df = pd.DataFrame({1: [2], 3: [4]})   # Fix is here
df = df.drop(columns={1})

Secondly, the pandas documentation says that the argument for columns is "single label or list-like". While your code works, it is not clear that it should. The stubs follow what is documented, and a set is not "list-like".
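
To make the mismatch concrete, here is a minimal standard-library sketch (it does not call pandas) of how a set relates to the types named in the `columns` annotation from the error output above. The variable names are illustrative only, and the runtime `isinstance` checks are an approximation of what a static type checker decides.

```python
from collections.abc import Hashable, Iterable, Sequence

columns = {1}  # the argument from the report above

# A set is iterable, but it is neither hashable nor a Sequence, so it
# matches neither `Hashable` nor `Sequence[Hashable]` from the annotation.
is_iterable = isinstance(columns, Iterable)
is_hashable = isinstance(columns, Hashable)
is_sequence = isinstance(columns, Sequence)

# Converting the set to a list gives a Sequence, which the annotation
# quoted in the error message does accept.
as_list = sorted(columns)
list_is_sequence = isinstance(as_list, Sequence)

print(is_iterable, is_hashable, is_sequence, list_is_sequence)
# prints: True False False True
```

Under the stubs version in the report, passing a list built from the set (for example `sorted(columns)` or `list(columns)`) should therefore satisfy the `Sequence[Hashable]` branch.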

I've added a reference to a pandas issue pandas-dev/pandas#59890 to see what the pandas developers say there.

@Dr-Irv Dr-Irv added the pandas_docs label Jan 10, 2025
@Dr-Irv
Collaborator

Dr-Irv commented Feb 26, 2025

based on discussion at dev meeting on 2/26/2025: modify docs to say that any Iterable is acceptable so stubs should say the same

@cmp0xff
Contributor Author

cmp0xff commented Feb 26, 2025

> based on discussion at dev meeting on 2/26/2025: modify docs to say that any Iterable is acceptable so stubs should say the same

Hi, thank you for following up on the issue in the dev meeting. However, I believe the stubs also need to be fixed, so that mypy won't raise an error upon encountering an Iterable.

@Dr-Irv
Collaborator

Dr-Irv commented Feb 26, 2025

> > based on discussion at dev meeting on 2/26/2025: modify docs to say that any Iterable is acceptable so stubs should say the same
>
> Hi, thank you for following up on the issue in the dev meeting. However, I believe the stubs also need to be fixed, so that mypy won't raise an error upon encountering an Iterable.

Ooops. Typo on my part. Yes, the stubs need to be modified.
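
As a rough illustration of what "any Iterable is acceptable" means for an annotation, here is a sketch of a hypothetical helper, `drop_columns`, that works on a plain dict of lists. It is not the pandas implementation or the actual stub change; the name, the signature, and the way strings and tuples are handled are assumptions made for the example.

```python
from __future__ import annotations

from collections.abc import Hashable, Iterable


def drop_columns(
    data: dict[Hashable, list],
    columns: Hashable | Iterable[Hashable],
) -> dict[Hashable, list]:
    """Hypothetical helper: return a copy of `data` without the given labels."""
    # Strings are iterable, but this sketch treats them as a single label.
    if isinstance(columns, (str, bytes)) or not isinstance(columns, Iterable):
        to_drop = {columns}
    else:
        # Any other iterable (set, list, tuple, generator) is a collection of labels.
        to_drop = set(columns)
    return {label: values for label, values in data.items() if label not in to_drop}


data = {1: [2], 3: [4]}
from_set = drop_columns(data, {1})    # a set is accepted as an Iterable
from_list = drop_columns(data, [1])   # a list still works
from_label = drop_columns(data, 3)    # a single label still works
```

With `Iterable[Hashable]` in the annotation, the `{1}` call type-checks because `set[int]` is an `Iterable[int]`, which is the case the original report ran into.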

cmp0xff added a commit to cmp0xff/pandas-stubs that referenced this issue Feb 26, 2025
cmp0xff added a commit to cmp0xff/pandas-stubs that referenced this issue Feb 27, 2025
@Dr-Irv Dr-Irv closed this as completed in f7a1b83 Feb 27, 2025