DataFrameMapper fit doesn't take y argument #58
Comments
Hi @vzaretsk. There has already been some discussion of this feature in #54. I feel I may be a bit too defensive here. I believe this could be a good feature to have as long as:
Please submit the PR together with a use case for the change, so we can work out the best way to implement this y-transforming feature.
My use case is the Shelter Animal Outcomes Kaggle competition. Rather than creating over 100 dummy variables for the various cat and dog breeds, I replaced each breed with the average outcome for that breed. The y (outcome) information is used during fit to compute this average for each breed within the CV fold. Other use cases could be supervised clustering, or using LDA to find the direction that best separates the classes.
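The breed-averaging idea above can be sketched roughly as follows. This is a minimal illustration only: `MeanOutcomeEncoder` is a hypothetical name, the data is made up, and the class merely follows the scikit-learn `fit(X, y)` / `transform(X)` convention using the standard library. It is not the code from the competition entry or from sklearn-pandas.

```python
from collections import defaultdict


class MeanOutcomeEncoder:
    """Replace each category with the mean target value seen during fit.

    Hypothetical helper: mimics the scikit-learn fit(X, y) / transform(X)
    convention, but is not part of scikit-learn or sklearn-pandas.
    """

    def fit(self, X, y):
        sums = defaultdict(float)
        counts = defaultdict(int)
        for category, target in zip(X, y):
            sums[category] += target
            counts[category] += 1
        # Per-category mean of the target, learned from the training fold only.
        self.means_ = {c: sums[c] / counts[c] for c in sums}
        # Assumed fallback for categories not seen during fit: the overall mean.
        self.default_ = sum(sums.values()) / sum(counts.values())
        return self

    def transform(self, X):
        return [self.means_.get(c, self.default_) for c in X]


# Made-up data: breed per animal and a binary outcome (1 = adopted).
breeds = ["siamese", "siamese", "beagle", "beagle", "beagle"]
adopted = [1, 0, 1, 1, 1]

encoder = MeanOutcomeEncoder().fit(breeds, adopted)
encoded = encoder.transform(["siamese", "beagle", "unknown breed"])
print(encoded)
```

Because the averages are learned in `fit`, the transformer has to receive y there, which is why it cannot be fitted through a mapper whose `fit` only accepts X.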
Currently, DataFrameMapper's fit method doesn't take a y argument. I have a use case that needs this (I'm doing supervised dimensionality reduction) and made a small modification to enable it. If there is interest, I can submit a pull request with these changes. Additionally, it seems that this would eliminate the need for a custom Pipeline class.
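To make the request concrete, here is a rough sketch of a mapper whose fit forwards y to each per-column transformer. `TinyMapper` and `ClassMidpointCenterer` are hypothetical stand-ins written with the standard library only. They are not the DataFrameMapper implementation and not the modification mentioned above, just an illustration of the idea under those assumptions.

```python
class ClassMidpointCenterer:
    """Toy supervised transformer (hypothetical): needs y during fit.

    Centers values on the midpoint between the two class means.
    """

    def fit(self, X, y):
        pos = [x for x, target in zip(X, y) if target == 1]
        neg = [x for x, target in zip(X, y) if target == 0]
        self.midpoint_ = (sum(pos) / len(pos) + sum(neg) / len(neg)) / 2
        return self

    def transform(self, X):
        return [x - self.midpoint_ for x in X]


class TinyMapper:
    """Hypothetical stand-in for a column mapper; not sklearn-pandas code."""

    def __init__(self, features):
        # features: list of (column_name, transformer) pairs
        self.features = features

    def fit(self, rows, y=None):
        for column, transformer in self.features:
            values = [row[column] for row in rows]
            transformer.fit(values, y)  # y is forwarded instead of dropped
        return self

    def transform(self, rows):
        columns = [
            transformer.transform([row[column] for row in rows])
            for column, transformer in self.features
        ]
        # Turn the list of transformed columns back into a list of rows.
        return [list(row) for row in zip(*columns)]


# Made-up data: one numeric column and a binary target.
rows = [{"age": 1.0}, {"age": 3.0}, {"age": 5.0}, {"age": 7.0}]
y = [0, 0, 1, 1]

mapper = TinyMapper([("age", ClassMidpointCenterer())]).fit(rows, y)
result = mapper.transform(rows)
print(result)
```

With a `fit(X, y=None)` signature like this, the mapper can sit inside a standard pipeline that passes y to every step's fit, which is presumably what would make a custom Pipeline class unnecessary.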