@phofl phofl commented Aug 14, 2022

This saves around 30-35% in the aggregation operation for nullable dtypes with 1/3 missing values, compared to the previous implementation.

cc @jorisvandenbossche
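
For readers unfamiliar with the idea, here is a rough sketch of what "supporting masks" in a grouped variance can look like: the algorithm receives the raw values plus a boolean mask and skips masked entries directly. This is a pure-Python illustration under stated assumptions, not the code changed in this PR; `group_var_masked` and its parameters are hypothetical names, and the running update shown (Welford's method) is chosen for the example only.

```python
from collections import defaultdict


def group_var_masked(values, mask, labels, ddof=1):
    """Per-group variance that skips entries where mask is True.

    Hypothetical helper for illustration; not the pandas implementation.
    """
    counts = defaultdict(int)
    means = defaultdict(float)
    m2 = defaultdict(float)  # running sum of squared deviations per group

    for value, is_missing, label in zip(values, mask, labels):
        if is_missing:
            # Masked entry: its stored value is a placeholder and is ignored.
            continue
        counts[label] += 1
        delta = value - means[label]
        means[label] += delta / counts[label]
        m2[label] += delta * (value - means[label])

    result = {}
    for label in sorted(set(labels)):
        n = counts[label]
        # Too few valid observations: report the group as missing.
        result[label] = m2[label] / (n - ddof) if n > ddof else None
    return result


# One third of the entries are masked as missing.
values = [1.0, 2.0, 0.0, 4.0, 6.0, 0.0]
mask = [False, False, True, False, False, True]
labels = ["a", "a", "a", "b", "b", "b"]
result = group_var_masked(values, mask, labels)
```

Here group `a` is computed from `1.0, 2.0` and group `b` from `4.0, 6.0`; the masked placeholders never enter the arithmetic.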

is_datetimelike=is_datetimelike,
)
elif self.how == "var":
func(

@phofl phofl Aug 14, 2022

This part needs some refactoring once every algo supports masks.

@mroeschke mroeschke added the Groupby, NA - MaskedArrays (Related to pd.NA and nullable extension arrays), and Reduction Operations (sum, mean, min, max, etc.) labels Aug 16, 2022

@rhshadrach rhshadrach left a comment

lgtm; one optional recommendation

@phofl phofl added this to the 1.6 milestone Sep 2, 2022
@phofl phofl merged commit 9c509e2 into pandas-dev:main Sep 2, 2022
@phofl phofl deleted the enh_support_mask_var_mean branch September 2, 2022 19:48

phofl commented Sep 2, 2022

Thx for review @rhshadrach

@mroeschke mroeschke modified the milestones: 1.6, 2.0 Oct 13, 2022
noatamir pushed a commit to noatamir/pandas that referenced this pull request Nov 9, 2022