Resample category data with timedelta index #12169
Comments
Hmm, this does appear a little buggy. You shouldn't need to specify the dtype on aggregations; they are inferred. Here I think there is an embedded exception which is caught instead of actually computing correctly.
I'll look after #11841, as the timedelta resampling is tested a bit more there (but not enough!)
The root cause of this issue is that, when constructing a Series from a dict with a TimedeltaIndex as the key, the values are treated as float64. See pandas/core/series.py, lines 172 to 185:

    try:
        if isinstance(index, DatetimeIndex):
            if len(data):
                # coerce back to datetime objects for lookup
                data = _dict_compat(data)
                data = lib.fast_multiget(data, index.astype('O'),
                                         default=np.nan)
            else:
                data = np.nan
        elif isinstance(index, PeriodIndex):
            data = ([data.get(i, nan) for i in index]
                    if data else np.nan)
        else:
            data = lib.fast_multiget(data, index.values,
                                     default=np.nan)

I believe just change

Before

    In [5]: fxx = d2.resample('10s', how=lambda x: (x.value_counts().index[0]))
    In [6]: fxx
    Out[6]:
              Group_obj Group
    00:00:00          A   NaN
    00:00:10          A   NaN

After

    In [5]: fxx = d2.resample('10s', how=lambda x: (x.value_counts().index[0]))
    In [6]: fxx
    Out[6]:
              Group_obj Group
    00:00:00          A     A
    00:00:10          A     A
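The constructor behaviour described above can be checked directly. This is a minimal sketch, assuming a pandas version that already includes the fix. The two-entry dict and the variable names are made up for illustration and are not taken from the thread.

```python
import pandas as pd

# Hypothetical two-element TimedeltaIndex: 0 s and 10 s.
index = pd.to_timedelta([0, 10], unit="s")

# Dict keyed by the Timedelta objects taken from that index.
data = {index[0]: "A", index[1]: "A"}

# Build the Series from the dict, passing the TimedeltaIndex explicitly.
# With the fix in place the values should be looked up correctly
# instead of being replaced by NaN.
s = pd.Series(data, index=index)

print(s)
```

If the values come back as NaN rather than "A", the lookup against the dict keys has failed in the way the comment describes.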
closes pandas-dev#12169

Author: Bran Yang <[email protected]>

Closes pandas-dev#12271 from BranYang/issue12169 and squashes the following commits:

4a5605f [Bran Yang] add tests to Series/test_constructors; and update whatsnew
7cf1be9 [Bran Yang] Fix pandas-dev#12169 - Resample category data with timedelta index
Hi,
I get very strange behavior when I try to resample categorical data with a timedelta index, as compared to a datetime index.
It seems to me that the aggregated result is always NaN when a timedelta index is used for the category.
Is this expected?
Thx
PS: Is there a way to specify the dtype for the aggregated columns?
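The report does not include the data, so here is a rough sketch of the kind of setup it describes. The frame contents are hypothetical; only the column names `Group_obj` and `Group` and the `10s` rule come from the output quoted in the thread. It also assumes a later pandas release, where the `how=` keyword is no longer available and the aggregation is written as `.resample(...).agg(...)`.

```python
import pandas as pd

# Hypothetical data: 20 rows, one per second, on a TimedeltaIndex.
index = pd.to_timedelta(range(20), unit="s")
d2 = pd.DataFrame(
    {
        "Group_obj": ["A"] * 20,              # plain object/string column
        "Group": pd.Categorical(["A"] * 20),  # categorical column
    },
    index=index,
)


def most_frequent(values):
    """Return the most frequent value in one resampling bin."""
    return values.value_counts().index[0]


# Aggregate each 10-second bin to its most frequent value.
fxx = d2.resample("10s").agg(most_frequent)

print(fxx)
```

On a pandas version with the fix, both columns should show the category label for each bin rather than NaN in the categorical column.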