Pandas In, Pandas Out? .inverse_transform() method #41
Comments
Hi Naught101! You can already put pandas dataframes into sklearn pipelines. Just create a pipeline where the first step is the Regarding the proposal "to have equivalent dataframes returned afterwards", do you mean making the pipeline return a pandas DataFrame? Sklearn pipelines usually return numpy arrays, with either classification probabilities for each class (predict_proba), class predictions, or regression values. How could you inverse transform that with the initial I believe you can do the indexing thing you proposed at scikit-learn/scikit-learn#5523 (comment) by just wrapping the numpy array output into a Regarding the reason why all the code is in One issue we have is that the original maintainer of the package (paulgb) is no longer working on it at all, and the second maintainer (Cal Paterson) has been quite unresponsive in the last few months as well. So it's becoming hard to get new code into this repo, and harder to get it into a release. :( |
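A minimal sketch of the wrapping idea mentioned above, assuming the goal is only to carry the input's index over to the predictions. The toy data, the LinearRegression estimator, and the choice of pd.Series as the wrapper are assumptions made for illustration; none of them come from this thread.

```python
import numpy as np
import pandas as pd
from sklearn.linear_model import LinearRegression

# Toy data with a non-default index (made up for this sketch).
X = pd.DataFrame(
    {"a": [1.0, 2.0, 3.0, 4.0], "b": [0.0, 1.0, 0.0, 1.0]},
    index=["w", "x", "y", "z"],
)
y = pd.Series([2.0, 4.0, 6.0, 8.0], index=X.index, name="target")

model = LinearRegression().fit(X, y)

# predict() returns a plain numpy array, so the index is lost here ...
pred = model.predict(X)

# ... and is restored by wrapping the array with the input's index.
pred_series = pd.Series(pred, index=X.index, name=y.name)
```

This only restores the row labels. It does not undo any transformation applied to the columns, which is the harder part discussed in the rest of the thread.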
Aha.. I wasn't thinking clearly, but now I can: DataFrameMappers can also be useful for generating the I guess that would all happen outside the pipeline though.. Has anyone working on the code asked @paulgb for push access? |
@calpaterson got write access to this repo, but he's not answering my mails. :S |
Hrm. Is there any reason you couldn't expand the current behaviour to also map the y dataframe? e.g. the call would be |
Sounds reasonable. Could you come up with some examples where this y transformation would be useful? |
@naught101 I now have write access to this repo, so we can work this out if you come up with useful use cases. :) |
@naught101 you might want something similar to what is discussed in #13 ? |
Yeah, I suspect that #13 is a prerequisite for this issue.. |
Say the transformed dataframe has exactly the same shape as the dataframe before the transformation. Can we pass in the columns to regenerate the predicted results in DataFrame format? |
@ethanluoyc Could you provide a code example of how that feature would work? Not the implementation, but how one would use it. |
I am working on something about basketball, so I will just give an example on this. After the conversion I get something like this, which basically substitutes values based on the position of the keyword (which is the name) I have in a text string, for example,
So the two dataframes actually have the same shape. I don't know whether I can do such an inverse transformation. I checked out #13 and I think the approach can work. However, while going through the sklearn documentation I stumbled over the docs for the active_features_ attribute, so I decided to look into this in more detail once I figure out what the active_features_ attribute does. |
I believe we can do the inverse transformation if we: It shouldn't be too hard to do. Any takers? :) |
Can sklearn-pandas inverse_transform the transformed data right now? |
No, it can't right now. |
The last attempt at this was #56, but it stalled waiting for input from another developer. Perhaps we can pick it up again? |
Am I right that this feature should be something like:
Where
And,
So, basically, the |
@devforfu yes, this is what I understand. To do so we need to keep track of which columns correspond to which features in the transformed output, and then run the transformer inverse on each block. |
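The bookkeeping described above could look roughly like the following sketch. It is not sklearn-pandas code: fit_transform_blocks and inverse_transform_blocks are hypothetical helpers, the toy columns are made up, and the rule that a string selects a 1-d array while a list selects a 2-d array is just a convention chosen for this sketch. It assumes every transformer involved exposes inverse_transform, as StandardScaler and LabelBinarizer do.

```python
import numpy as np
import pandas as pd
from sklearn.preprocessing import LabelBinarizer, StandardScaler


def fit_transform_blocks(df, features):
    """Fit each transformer, stack the outputs, and record each block's slice."""
    blocks, slices, start = [], [], 0
    for columns, transformer in features:
        # A string selects a 1-d array, a list selects a 2-d array.
        out = transformer.fit_transform(df[columns].to_numpy())
        out = out.reshape(len(df), -1)  # make every block 2-d
        blocks.append(out)
        slices.append(slice(start, start + out.shape[1]))
        start += out.shape[1]
    return np.hstack(blocks), slices


def inverse_transform_blocks(array, features, slices, index=None):
    """Run each transformer's inverse_transform on its own block of columns."""
    restored = {}
    for (columns, transformer), block_slice in zip(features, slices):
        original = transformer.inverse_transform(array[:, block_slice])
        original = np.asarray(original).reshape(len(array), -1)
        names = [columns] if isinstance(columns, str) else columns
        for i, name in enumerate(names):
            restored[name] = original[:, i]
    return pd.DataFrame(restored, index=index)


# Hypothetical toy data: one numeric column and one categorical column.
df = pd.DataFrame(
    {"age": [20.0, 30.0, 40.0, 50.0], "pet": ["cat", "dog", "fish", "cat"]},
    index=[10, 11, 12, 13],
)
features = [(["age"], StandardScaler()), ("pet", LabelBinarizer())]

X, slices = fit_transform_blocks(df, features)
restored = inverse_transform_blocks(X, features, slices, index=df.index)
```

Here the scaled age column occupies the first output column and the binarized pet column the next three, so each transformer only ever sees the block it produced. Transformers without an inverse_transform method would need separate handling, which this sketch does not attempt.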
Hi all, I've worked on a fork to create a solution for this problem. It passes the test
which includes a I'd like to improve this solution (I've now included an extra What would be the next steps? I've no idea if somebody else is already working on this, but I'm assuming I'll update my solution, commit it to my fork and then click on 'pull request' in my forked repository on GitHub? Do I need to keep anything else in mind? |
@erikjandevries I guess you only need to run
Or maybe any other edge cases. Then, if everything is fine, you could make a pull request and wait for a review from the repo owners (as well as the response from Circle CI, which could show whether your implementation has any issues). |
interested to see if there's been any progress on this issue. Seems like a pretty major limitation to not be able to recover the original data after transformation. |
Is there any issue with @erikjandevries's code here (commit 1b4edd9)? It looks fine to me but hasn't been accepted. |
I'm very sorry, I've been busy lately with other stuff in my life and haven't
managed to review this... Would any of you be interested in becoming a
project admin with merge rights?
|
I'm sorry to say I've also been very busy. If I'm not mistaken the problem with my code was that I created a new variable |
@dukebody I usually track the |
@devforfu Thanks! I've sent you an invite to become a collaborator with write access to this repo, so you can merge stuff. Do you have an account on PyPI so I can give you access to publish new releases there? |
@dukebody Sure, not a problem! Yes, I've created one, the username is |
@devforfu Added you on PyPI. I guess you should have received some kind of notification about it. Can you take care of managing the next release after working through the existing PRs? |
@dukebody Yes, I received the notification. OK, sure, will do as soon as I finalize the pending changes. |
Hello guys. Any update on this issue? |
I am joining @AlanGanem: Is there any update? I can see some updates in #133 and #182 , but it's already been more than 1 year and nothing was approved and merged. |
Yes, it is a pity, this would be a very useful feature. |
It would be really nice to have the ability to put pandas dataframes into sklearn pipelines, and to have equivalent pandas dataframes returned afterwards. I think that this module would be the place for that - probably all that would be required is a .inverse_transform method on the DataFrameMapper.
Would something like this be wanted in this module? I can make a pull request, if so.
Before I do, why is all the code in __init__.py? Seems like it'll get hard to maintain after a while...