
ENH: Avoid casting to int64/uint64 for GroupBy.sum and others before calling cython functions #48071


Closed
phofl opened this issue Aug 13, 2022 · 1 comment
Labels
Closing Candidate May be closeable, needs more eyeballs Enhancement Groupby NA - MaskedArrays Related to pd.NA and nullable extension arrays

Comments

@phofl
Member

phofl commented Aug 13, 2022

Currently, we cast arrays with integer dtypes to int64/uint64 before calling the cython functions. This happens because there is no efficient way of compiling the cython files without creating lots of unneeded dtype combinations.

The out array needs to be float64, float32, int64 or uint64, while the input array could keep its dtype. But pairing every input dtype with every output dtype creates unwanted dtype combinations when compiling. If we can avoid that, we can keep the input dtype when calling the cython op, which saves memory for small integer dtypes. cc #48059
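To make the memory argument concrete, here is a rough NumPy sketch of the idea. It is not the pandas cython code: `group_sum_keep_input_dtype` is a hypothetical helper and `np.add.at` only stands in for the compiled kernel. The point it illustrates is that only the out array has to be wide, while the input can stay int8.

```python
import numpy as np


def group_sum_keep_input_dtype(values, codes, ngroups):
    """Sum `values` per group without upcasting the input array first.

    Hypothetical helper for illustration, not pandas internals: only the
    output buffer is wide (int64/uint64), the input keeps its small dtype.
    """
    out_dtype = np.uint64 if values.dtype.kind == "u" else np.int64
    out = np.zeros(ngroups, dtype=out_dtype)
    # np.add.at accumulates in place, so repeated group codes are summed.
    np.add.at(out, codes, values)
    return out


values = np.array([100, 100, 100, 27], dtype=np.int8)
codes = np.array([0, 0, 0, 1])  # group label for each value

result = group_sum_keep_input_dtype(values, codes, ngroups=2)

# What the current approach pays for: a full int64 copy of the input.
upcast_copy = values.astype(np.int64)

print(result, result.dtype)              # group 0 sums to 300, beyond int8 range
print(values.nbytes, upcast_copy.nbytes)  # 4 bytes vs. 32 bytes
```

The wide out array is still required because the first group sums to 300, which does not fit in int8. The saving comes from skipping the int64 copy of the input, which is 8x larger for int8 data.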

One attempt at handling the dtype precisions was made in #48044

@phofl phofl added Enhancement Groupby NA - MaskedArrays Related to pd.NA and nullable extension arrays labels Aug 13, 2022
@jbrockmendel
Member

i think this has been addressed?

@jbrockmendel jbrockmendel added the Closing Candidate May be closeable, needs more eyeballs label Aug 11, 2023