Commit 7512820

FIX: SMOTENC should use half of the median of the std. dev. (#491)
1 parent 46f8efa commit 7512820

3 files changed, 18 additions and 4 deletions

doc/over_sampling.rst

Lines changed: 3 additions & 3 deletions
@@ -188,9 +188,9 @@ features or a boolean mask marking these features::
    >>> print(sorted(Counter(y_resampled).items()))
    [(0, 30), (1, 30)]
    >>> print(X_resampled[-5:])
-   [['B' 0.1989993778979113 0]
-    ['A' -0.3657680728116921 1]
-    ['B' 0.8790828729585258 0]
+   [['B' 0.5246469549655818 0]
+    ['A' -0.3657680728116921 0]
+    ['B' 0.9344237230779993 0]
     ['A' 0.3710891618824609 0]
     ['A' 0.3327240726719727 0]]
 

doc/whats_new/v0.0.4.rst

Lines changed: 14 additions & 0 deletions
@@ -1,5 +1,19 @@
 .. _changes_0_4:
 
+Version 0.4.2
+=============
+
+Changelog
+---------
+
+Bug fixes
+.........
+
+- Fix a bug in :class:`imblearn.over_sampling.SMOTENC` in which the median of
+  the standard deviation was used instead of half of the median of the
+  standard deviation.
+  By :user:`Guillaume Lemaitre <glemaitre>` in :issue:`491`.
+
 Version 0.4
 ===========
 

imblearn/over_sampling/_smote.py

Lines changed: 1 addition & 1 deletion
@@ -974,7 +974,7 @@ def _fit_resample(self, X, y):
         # distance is computed between 2 samples, the difference will be equal
         # to the median of the standard deviation as in the original paper.
         X_ohe.data = (np.ones_like(X_ohe.data, dtype=X_ohe.dtype) *
-                      self.median_std_)
+                      self.median_std_ / 2)
         X_encoded = sparse.hstack((X_continuous, X_ohe), format='csr')
 
         X_resampled, y_resampled = super(SMOTENC, self)._fit_resample(
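Illustration (not part of the commit): a minimal sketch of how the value stored in the one-hot entries changes the Euclidean distance between two samples whose categories differ. The helper `categorical_distance` and the value `median_std = 1.5` are hypothetical stand-ins for illustration; the real code works on the sparse matrix `X_ohe` and on `self.median_std_`. Two one-hot rows with different categories differ in two positions, so the distance they contribute is sqrt(2) times the stored value.

```python
import numpy as np


def categorical_distance(scale):
    """Euclidean distance between two one-hot rows of different categories
    when every non-zero entry has been replaced by `scale`."""
    a = np.array([scale, 0.0])  # sample belonging to category 'A'
    b = np.array([0.0, scale])  # sample belonging to category 'B'
    return np.linalg.norm(a - b)


median_std = 1.5  # hypothetical value standing in for self.median_std_

# Entries scaled as before this commit (the full median).
before = categorical_distance(median_std)
# Entries scaled as after this commit (half of the median).
after = categorical_distance(median_std / 2)

print("before:", before)
print("after: ", after)
print("ratio: ", before / after)
```

Halving the stored value halves the categorical contribution to the distance, which changes the nearest neighbours found and is consistent with the changed sample values in the documentation hunk above.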
