ENH duck-typing scikit-learn estimator instead of inheritance #858

Merged 53 commits on Jan 16, 2022
Changes from 2 commits
Commits
d17b6b5
add duck-type check for KNeighbors-likeness
Sep 2, 2021
379ea7e
removal of KNeighborsMixin type check
Sep 2, 2021
8790628
Added _is_neighbors_object() private validation function
Sep 9, 2021
e997e23
Added PEP 8 blank lines
Sep 9, 2021
94b0725
change isinstance check for SVM estimator to simply clone the estimat…
Sep 10, 2021
9fbf360
remove explicit class-check for KMeans estimator
Sep 13, 2021
f736879
remove explicit class check for KNeighborsClassifier
Sep 13, 2021
fcb118e
remove explicit class check for KNeighborsClassifier in CondensedNear…
Sep 13, 2021
a4e959c
remove explicit class check for ClassifierMixin in InstanceHardnessTh…
Sep 13, 2021
65ae4fd
PEP 8 issue fix
Sep 13, 2021
5b76d49
PEP 8 issue fix - line break before operator
Sep 13, 2021
8284b70
PEP 8 issue fix - no more line break before operator
Sep 13, 2021
e97ae36
Undo changes to _instance_hardness_threshold
Sep 15, 2021
495ec27
revert OneSidedSelection changes
Sep 16, 2021
10456f5
Undo changes to CondensedNearestNeighbour
Sep 24, 2021
93200e1
example NearestNeighbors test
Sep 29, 2021
f104057
Use sklearn.base.clone to validate NN object and throw error
Sep 29, 2021
b82e4d9
undo last commit, and raise nn_object TypeError
Sep 29, 2021
70b6778
remove unused imports
Sep 29, 2021
c67c775
Add test for cuml ADASYN
Oct 4, 2021
010f4d5
Updated check_neighbors_object docstring and error type
Oct 4, 2021
178d0f0
Updated tests
Oct 5, 2021
9868d0f
Merge branch 'master' into ducktype-check_neighbors
NV-jpt Oct 29, 2021
2e1ee17
Merge remote-tracking branch 'origin/master' into pr/NV-jpt/858
glemaitre Dec 7, 2021
8889cfd
duck-typing svm
glemaitre Dec 7, 2021
5e875a0
TST add couple of tests
glemaitre Dec 7, 2021
9545172
better error message with duck-typing
glemaitre Dec 7, 2021
29a414b
iter
glemaitre Dec 7, 2021
12991ba
CI let's try a run on CircleCI with cuML
glemaitre Dec 7, 2021
e24ee06
iter
glemaitre Dec 7, 2021
525002f
iter
glemaitre Dec 7, 2021
189f0e9
iter
glemaitre Dec 7, 2021
2cbe273
iter
glemaitre Dec 7, 2021
cc7fae9
iter
glemaitre Dec 7, 2021
29e4619
ITER
glemaitre Dec 7, 2021
a098e84
iter
glemaitre Dec 7, 2021
0aa328e
iter
glemaitre Dec 7, 2021
8cce474
dbg
glemaitre Dec 7, 2021
8d4ff31
dbg
glemaitre Dec 8, 2021
0ceacfb
MNT move to circleci
glemaitre Dec 8, 2021
ee6b7b0
iter
glemaitre Dec 8, 2021
d089b7b
iter
glemaitre Jan 15, 2022
ac7e00a
Merge remote-tracking branch 'origin/master' into pr/NV-jpt/858
glemaitre Jan 15, 2022
d815e2d
create custom NN class
glemaitre Jan 15, 2022
964d082
add test not dependent on cupy
glemaitre Jan 16, 2022
99d5206
update documentation
glemaitre Jan 16, 2022
48d1fd5
iter
glemaitre Jan 16, 2022
18b6057
iter
glemaitre Jan 16, 2022
76fbd59
revert redirector
glemaitre Jan 16, 2022
8fa97ed
add changelog
glemaitre Jan 16, 2022
615a2bf
remove duplicated test
glemaitre Jan 16, 2022
b75b77d
make testing function private
glemaitre Jan 16, 2022
b627cf1
iter
glemaitre Jan 16, 2022
2 changes: 1 addition & 1 deletion imblearn/utils/_validation.py
@@ -93,7 +93,7 @@ def check_neighbors_object(nn_name, nn_object, additional_neighbor=0):
     """
     if isinstance(nn_object, Integral):
         return NearestNeighbors(n_neighbors=nn_object + additional_neighbor)
-    elif isinstance(nn_object, KNeighborsMixin):
+    elif hasattr(nn_object, 'kneighbors') and hasattr(nn_object, 'kneighbors_graph'):
         return clone(nn_object)
     else:
         raise_isinstance_error(nn_name, [int, KNeighborsMixin], nn_object)
Member

I think we should also change the error message, since we no longer strictly require a KNeighborsMixin but instead any object that exposes both kneighbors and kneighbors_graph.
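
As a rough illustration of the kind of message this suggests, here is a minimal standard-library sketch. The names `REQUIRED_METHODS`, `neighbors_type_error`, `validate_neighbors_argument`, and `OnlyKNeighbors` are hypothetical and are not part of imbalanced-learn; only the two method names come from the diff.

```python
from numbers import Integral

# Method names taken from the diff; everything else here is illustrative.
REQUIRED_METHODS = ("kneighbors", "kneighbors_graph")


def neighbors_type_error(nn_name, nn_object):
    """Return a TypeError whose message names the duck-typed contract."""
    missing = [name for name in REQUIRED_METHODS if not hasattr(nn_object, name)]
    return TypeError(
        f"{nn_name} must be an int or an object exposing "
        f"{' and '.join(REQUIRED_METHODS)}; got {type(nn_object).__name__} "
        f"(missing: {', '.join(missing)})."
    )


def validate_neighbors_argument(nn_name, nn_object):
    """Accept ints and neighbors-like objects; reject anything else."""
    if isinstance(nn_object, Integral):
        return nn_object
    if all(hasattr(nn_object, name) for name in REQUIRED_METHODS):
        return nn_object
    raise neighbors_type_error(nn_name, nn_object)


class OnlyKNeighbors:
    """Hypothetical object that implements only half of the contract."""

    def kneighbors(self, X):
        return X
```

Passing `OnlyKNeighbors()` raises a `TypeError` that reports `kneighbors_graph` as missing rather than mentioning `KNeighborsMixin`, which is the point of the suggestion.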

Contributor Author

I have been debating between two implementations here (my two latest commits [1 2]).

[1] uses sklearn.base.clone to verify that nn_object is a scikit-learn-like estimator that can be cloned. This is more consistent with how the library validates other estimators, such as the KMeans estimator check in KMeansSMOTE, but it does not protect users from passing in some other type of estimator by mistake.

[2] raises a TypeError if nn_object is neither an integer nor an object exposing both kneighbors and kneighbors_graph, so it protects users from that mistake.

Do you prefer one over the other?
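
The difference between the two options can be sketched without scikit-learn installed. Everything below is a simplified assumption: `simple_clone` is a stand-in for sklearn.base.clone that just rebuilds an object from `get_params()`, and `FakeNeighbors`, `FakeClusterer`, `check_option_1`, and `check_option_2` are hypothetical names, not code from either commit.

```python
from numbers import Integral


class FakeNeighbors:
    """Hypothetical neighbors-like estimator (not a real library class)."""

    def __init__(self, n_neighbors=5):
        self.n_neighbors = n_neighbors

    def get_params(self):
        return {"n_neighbors": self.n_neighbors}

    def kneighbors(self, X):
        return X

    def kneighbors_graph(self, X):
        return X


class FakeClusterer:
    """Hypothetical cloneable estimator with no neighbors API."""

    def __init__(self, n_clusters=8):
        self.n_clusters = n_clusters

    def get_params(self):
        return {"n_clusters": self.n_clusters}


def simple_clone(estimator):
    """Simplified stand-in for sklearn.base.clone (assumption, not the real one)."""
    if not hasattr(estimator, "get_params"):
        raise TypeError(f"Cannot clone {type(estimator).__name__}: no get_params().")
    return type(estimator)(**estimator.get_params())


def check_option_1(nn_object):
    """Option [1]: only require that the object can be cloned."""
    if isinstance(nn_object, Integral):
        return FakeNeighbors(n_neighbors=nn_object)
    return simple_clone(nn_object)


def check_option_2(nn_object):
    """Option [2]: require the neighbors methods, otherwise raise TypeError."""
    if isinstance(nn_object, Integral):
        return FakeNeighbors(n_neighbors=nn_object)
    if hasattr(nn_object, "kneighbors") and hasattr(nn_object, "kneighbors_graph"):
        return simple_clone(nn_object)
    raise TypeError(
        "Expected an int or an object exposing kneighbors and kneighbors_graph; "
        f"got {type(nn_object).__name__}."
    )
```

Under these assumptions, `check_option_1(FakeClusterer())` silently returns a clusterer, and the mistake only surfaces later when `kneighbors` is called, whereas `check_option_2(FakeClusterer())` fails immediately with a `TypeError`.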
