ENH Add Feature Importances to _MultiOutputEstimator for Both Classifier and Regressor #27495

Closed
wants to merge 4 commits into from

@bewygs bewygs commented Sep 28, 2023

This implements the feature_importances_ attribute in _MultiOutputEstimator when the base estimator supports it.

Reference Issues/PRs

To my knowledge, there are no open issues that this directly addresses or closes.

What does this implement/fix? Explain your changes.

This PR adds a feature_importances_ attribute to the _MultiOutputEstimator class to accommodate both classifiers and regressors from sklearn.ensemble.

Any other comments?

I believe it would be more convenient to access feature_importances_ directly from the _MultiOutputEstimator object, rather than looping over the fitted estimators each time, as illustrated below:

feature_importances = [estimator.feature_importances_ for estimator in self.estimators_]

Example Usage:

import numpy as np
from sklearn.multioutput import MultiOutputRegressor, MultiOutputClassifier
from sklearn.ensemble import RandomForestClassifier, RandomForestRegressor

X, y = np.random.rand(100, 3), np.random.rand(100, 2)

# For regressor
rf_regressor = RandomForestRegressor(n_estimators=10)
multi_rf_regressor = MultiOutputRegressor(rf_regressor)
multi_rf_regressor.fit(X, y)
print("Average Feature Importances for Regressor:", multi_rf_regressor.feature_importances_)

# For classifier
y_classifier = np.random.randint(0, 2, size=(100, 2))
rf_classifier = RandomForestClassifier(n_estimators=10)
multi_rf_classifier = MultiOutputClassifier(rf_classifier)
multi_rf_classifier.fit(X, y_classifier)
print("Average Feature Importances for Classifier:", multi_rf_classifier.feature_importances_)
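For comparison, here is a minimal sketch of the same result computed without the proposed attribute, using only the fitted `estimators_` list. It assumes the proposed `feature_importances_` is the plain mean of the per-output importances, which the "Average" wording in the prints above suggests but which the PR diff (not shown here) would need to confirm. The helper name `mean_feature_importances` is hypothetical and not part of scikit-learn.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.multioutput import MultiOutputRegressor


def mean_feature_importances(multi_output_estimator):
    """Hypothetical helper: average feature_importances_ over the per-output estimators.

    Assumes every fitted sub-estimator exposes feature_importances_.
    """
    per_output = np.vstack(
        [est.feature_importances_ for est in multi_output_estimator.estimators_]
    )  # shape: (n_outputs, n_features)
    return per_output.mean(axis=0)


rng = np.random.RandomState(0)
X, y = rng.rand(100, 3), rng.rand(100, 2)

model = MultiOutputRegressor(RandomForestRegressor(n_estimators=10, random_state=0))
model.fit(X, y)

avg_importances = mean_feature_importances(model)
print("Average feature importances:", avg_importances)
```

An unweighted mean treats every output as equally important; whether that is the right aggregation depends on the use case.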

github-actions bot commented Sep 28, 2023

✔️ Linting Passed

All linting checks passed. Your pull request is in excellent shape! ☀️

Generated for commit: 42860be. Link to the linter CI: here

@bewygs bewygs changed the title from "ENH: Add Feature Importances to _MultiOutputEstimator for Both Classifier and Regressor" to "ENH Add Feature Importances to _MultiOutputEstimator for Both Classifier and Regressor" Sep 28, 2023
@adrinjalali
Member

In so many cases people shouldn't be using feature_importances_ anyway, and with all the discussions around this, I don't think we should add this.

cc @glemaitre

@glemaitre
Member

Indeed, I think we should instead invest time in scikit-learn/enhancement_proposals#86. Once we settle on this, then we can decide to implement different feature importances that can be chosen by the user.

@bewygs bewygs deleted the feature_branch branch October 6, 2023 17:39
3 participants