
ENH: pct_change in a group by return groups #53739

Closed
2 of 3 tasks
marcdelabarrera opened this issue Jun 20, 2023 · 4 comments
marcdelabarrera commented Jun 20, 2023

Feature Type

  • Adding new functionality to pandas

  • Changing existing functionality in pandas

  • Removing existing functionality in pandas

Problem Description

I have a dataset with date, industry, wage and price. I want to compute the percentage increase in wages and prices by industry. I can do something like:

data.set_index('date').groupby('industry')[['wage','price']].pct_change()

But this returns a pandas DataFrame with date as the index and wage and price as columns, losing the industry column.
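To make the behaviour concrete, here is a minimal sketch on a hypothetical toy dataset (the values are invented for illustration; only the column names come from the description above). It shows that the result keeps the date index and the two value columns but has no industry column.

```python
import pandas as pd

# Hypothetical toy data; only the column names come from the issue text.
data = pd.DataFrame({
    "date": pd.to_datetime(["2023-01-01", "2023-02-01"] * 2),
    "industry": ["A", "A", "B", "B"],
    "wage": [100.0, 110.0, 200.0, 210.0],
    "price": [10.0, 11.0, 20.0, 22.0],
})

result = data.set_index("date").groupby("industry")[["wage", "price"]].pct_change()

# The grouping key is gone from both the columns and the index.
print(list(result.columns))
print(result.index.name)
```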

Feature Description

It would be nice to add an option to pct_change(), when applied to a groupby object, to also return the groups, maybe as an index as the agg function does.
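Until such an option exists, one way to get the groups back as an index is to put industry into the index before grouping, since groupby transformations keep the original index. This is a sketch on hypothetical toy data and is not something proposed in this thread.

```python
import pandas as pd

# Hypothetical toy data; only the column names come from the issue text.
data = pd.DataFrame({
    "date": pd.to_datetime(["2023-01-01", "2023-02-01"] * 2),
    "industry": ["A", "A", "B", "B"],
    "wage": [100.0, 110.0, 200.0, 210.0],
    "price": [10.0, 11.0, 20.0, 22.0],
})

# Move both keys into a MultiIndex, then group on the 'industry' level.
indexed = data.set_index(["industry", "date"])
result = indexed.groupby(level="industry")[["wage", "price"]].pct_change()

# The result is indexed by (industry, date), so the group label is preserved.
print(list(result.index.names))
```

Calling `result.reset_index()` afterwards turns industry and date back into ordinary columns if a flat frame is preferred.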

Alternative Solutions

My current approach is:
pd.concat([data.set_index('date')[['industry']], data.set_index('date').groupby(['industry'])[['wage','price']].pct_change()],axis=1)

Not a big deal, but it seems like a common operation: having a grouped DataFrame and wanting to compute percentage changes for several columns.

Additional Context

No response

marcdelabarrera added the Enhancement and Needs Triage labels Jun 20, 2023
@rhshadrach
Member

rhshadrach commented Jun 21, 2023

Does

result = data.set_index('date').groupby('industry')[['wage','price']].pct_change()
result['industry'] = data.set_index('date')['industry']

work? Expanding the applicability of as_index=False/True to transformations is part of #49543
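A runnable sketch of this workaround on hypothetical toy data follows. One thing to watch: column assignment aligns on index labels, so the sketch takes the industry column from the same date-indexed frame that was grouped, which is an assumption about how `data` is laid out.

```python
import pandas as pd

# Hypothetical toy data; only the column names come from the issue text.
data = pd.DataFrame({
    "date": pd.to_datetime(["2023-01-01", "2023-02-01"] * 2),
    "industry": ["A", "A", "B", "B"],
    "wage": [100.0, 110.0, 200.0, 210.0],
    "price": [10.0, 11.0, 20.0, 22.0],
})

# Index by date once so the result and the industry column share an index.
indexed = data.set_index("date")
result = indexed.groupby("industry")[["wage", "price"]].pct_change()

# Both sides have the identical date index, so rows line up one to one.
result["industry"] = indexed["industry"]
print(result)
```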

rhshadrach added the Groupby, Transformations, and Needs Info labels and removed the Needs Triage label Jun 21, 2023
@jesse-sealand

@marcdelabarrera does @rhshadrach's solution work for you?

@marcdelabarrera
Author

Yes, it would. A two-line option but it works. Thx!

@rhshadrach
Member

Using .assign(industry=data.set_index('date')['industry']) would make it one line if you so prefer! Thanks for the response, closing.
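As a sketch of the chained form, assuming hypothetical toy data and a date-indexed frame held in a variable named `indexed` (a name introduced here for illustration), the industry column can be attached in a single expression:

```python
import pandas as pd

# Hypothetical toy data; only the column names come from the issue text.
data = pd.DataFrame({
    "date": pd.to_datetime(["2023-01-01", "2023-02-01"] * 2),
    "industry": ["A", "A", "B", "B"],
    "wage": [100.0, 110.0, 200.0, 210.0],
    "price": [10.0, 11.0, 20.0, 22.0],
})

indexed = data.set_index("date")

# pct_change per industry, then re-attach the group label via assign.
result = (
    indexed.groupby("industry")[["wage", "price"]]
    .pct_change()
    .assign(industry=indexed["industry"])
)
print(result)
```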

rhshadrach removed the Needs Info label Aug 29, 2023