Merged
Changes from 2 commits
1 change: 1 addition & 0 deletions doc/source/whatsnew/v2.0.0.rst
@@ -753,6 +753,7 @@ Indexing
 - Bug in :meth:`DataFrame.loc` raising ``ValueError`` with ``bool`` indexer and :class:`MultiIndex` (:issue:`47687`)
 - Bug in :meth:`DataFrame.__setitem__` raising ``ValueError`` when right hand side is :class:`DataFrame` with :class:`MultiIndex` columns (:issue:`49121`)
 - Bug in :meth:`DataFrame.reindex` casting dtype to ``object`` when :class:`DataFrame` has single extension array column when re-indexing ``columns`` and ``index`` (:issue:`48190`)
+- Bug in :meth:`DataFrame.iloc` raising ``IndexError`` when indexer is a :class:`Series` with numeric extension array dtype (:issue:`49521`)
 - Bug in :func:`~DataFrame.describe` when formatting percentiles in the resulting index showed more decimals than needed (:issue:`46362`)
 - Bug in :meth:`DataFrame.compare` does not recognize differences when comparing ``NA`` with value in nullable dtypes (:issue:`48939`)
 -
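
A minimal reproduction of the `DataFrame.iloc` entry above, as a sketch only: the frame and the `Int64` indexer are illustrative values, and the `try`/`except` is there because releases without this fix are reported to raise `IndexError` at this point.

```python
import pandas as pd

df = pd.DataFrame([[0, 1, 2, 3, 4], [5, 6, 7, 8, 9]])

# A Series backed by a nullable (masked) integer extension array.
cols = pd.Series([0, 1], dtype="Int64")

try:
    # With the fix, this is expected to select the first two columns.
    result = df.iloc[:, cols]
except IndexError as err:
    # Releases without the fix are reported to reject the indexer here.
    result = None
    print(f"IndexError: {err}")
```

On a release that includes the fix, `result` should be a two-column frame; on an older one, the `except` branch records the failure instead.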
5 changes: 4 additions & 1 deletion pandas/core/indexing.py
@@ -1481,7 +1481,10 @@ def _validate_key(self, key, axis: AxisInt):
             # so don't treat a tuple as a valid indexer
             raise IndexingError("Too many indexers")
         elif is_list_like_indexer(key):
-            arr = np.array(key)
+            if isinstance(key, ABCSeries):
+                arr = key._values

Contributor

How about integer arrays with pd.NA?

Member Author

That is validated later.

Contributor

@topper-123, Dec 11, 2022

The errors are different for Series and IntegerArray:

>>> import pandas as pd
>>> df = pd.DataFrame([[0,1,2,3,4],[5,6,7,8,9]])
>>> iarr = pd.array([0,1,2, pd.NA], dtype = pd.Int64Dtype())
>>> df.iloc[:, iarr]
IndexError: .iloc requires numeric indexers, got [0 1 2 <NA>]
>>> df.iloc[:, pd.Series(iarr)]
ValueError: cannot convert to 'int64'-dtype NumPy array with missing values. Specify an appropriate 'na_value' for this dtype.

I think the ValueError should be raised in both cases?

Member Author

I guess you tried the Series case on main? I am getting consistent errors on my PR.

In general, I would prefer a better-suited error that points to the NA in the iloc indexer, but I will probably do that in a follow-up. We will have to check this later, because not all cases reach this code path, and this does not cover the Series.iloc case.

Contributor

@topper-123, Dec 11, 2022

Sorry, wrong snippet. It's the non-NA array:

>>> import pandas as pd
>>> df = pd.DataFrame([[0,1,2,3,4],[5,6,7,8,9]])
>>> iarr = pd.array([0,1,2], dtype = pd.Int64Dtype())
>>> df.iloc[:, iarr]
IndexError: .iloc requires numeric indexers, got [0 1]

Member Author

Ah, got you. Fixed that as well.

+            else:
+                arr = np.array(key)
             len_axis = len(self.obj._get_axis(axis))
 
             # check that the key has a numeric dtype
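
The effect of the new branch can be sketched outside pandas internals. `key_to_array` below is a hypothetical helper, not a pandas function; it mirrors the patched lines, including the private `Series._values` attribute used in the diff. The assumption is that keeping the extension array, rather than converting it with `np.array`, lets the later numeric-dtype check see the `Int64` dtype.

```python
import numpy as np
import pandas as pd
from pandas.api.types import is_numeric_dtype


def key_to_array(key):
    """Hypothetical stand-in for the patched branch of _validate_key."""
    if isinstance(key, pd.Series):
        # Keep the underlying array (private attribute, as in the diff).
        return key._values
    # Any other list-like is converted to a NumPy array, as before.
    return np.array(key)


ea_key = key_to_array(pd.Series([0, 1], dtype="Int64"))
list_key = key_to_array([0, 1])

# Both arrays should pass a numeric-dtype check.
print(type(ea_key).__name__, is_numeric_dtype(ea_key.dtype))
print(type(list_key).__name__, is_numeric_dtype(list_key.dtype))
```

Using the public `Series.array` accessor would likely behave the same for nullable dtypes; the sketch sticks to `_values` only to stay close to the diff.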
7 changes: 7 additions & 0 deletions pandas/tests/frame/indexing/test_indexing.py
@@ -1437,6 +1437,13 @@ def test_loc_rhs_empty_warning(self):
df.loc[:, "a"] = rhs
tm.assert_frame_equal(df, expected)

+    def test_iloc_ea_series_indexer(self):
+        # GH#49521
+        df = DataFrame([[0, 1, 2, 3, 4], [5, 6, 7, 8, 9]])
+        result = df.iloc[Series([1], dtype="Int64"), Series([0, 1], dtype="Int64")]
+        expected = DataFrame([[5, 6]], index=[1])
+        tm.assert_frame_equal(result, expected)

Contributor

I'm pretty sure this would fail if a pd.NA was in the indexers?

Member Author

Yep, it does, which is fine. Will add a test.


@pytest.mark.parametrize("indexer", [True, (True,)])
@pytest.mark.parametrize("dtype", [bool, "boolean"])
def test_loc_bool_multiindex(self, dtype, indexer):
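
The `pd.NA` case raised in the thread above can be checked with a sketch like the following. The exception type is left open on purpose: the snippet earlier in the review shows a `ValueError` for the Series case, but that may differ between versions, so the code only records what was raised. The frame and indexer values are illustrative.

```python
import pandas as pd

df = pd.DataFrame([[0, 1, 2, 3, 4], [5, 6, 7, 8, 9]])

# Nullable integer indexer that contains a missing value.
na_indexer = pd.Series([0, pd.NA], dtype="Int64")

try:
    df.iloc[:, na_indexer]
    outcome = "no error"
except (ValueError, IndexError) as err:
    # Record the exception type rather than assuming a specific one.
    outcome = type(err).__name__

print(outcome)
```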