Commit 95292a4

Author: ArturoAmorQ
Commit message: Rebase
Parents: 06e6831 + 1f5e71b

File tree

1 file changed: +22 −26 lines changed


python_scripts/trees_sol_01.py

Lines changed: 22 additions & 26 deletions

````diff
@@ -57,9 +57,9 @@
 #
 # ```{warning}
 # At this time, it is not possible to use `response_method="predict_proba"` for
-# multiclass problems. This is a planned feature for a future version of
-# scikit-learn. In the mean time, you can use `response_method="predict"`
-# instead.
+# multiclass problems on a single plot. This is a planned feature for a future
+# version of scikit-learn. In the mean time, you can use
+# `response_method="predict"` instead.
 # ```
 
 # %%
@@ -140,13 +140,15 @@
 # except that for a K-class problem you have K probability outputs for each
 # data point. Visualizing all these on a single plot can quickly become tricky
 # to interpret. It is then common to instead produce K separate plots, one for
-# each class, in a one-vs-rest (or one-vs-all) fashion.
+# each class, in a one-vs-rest (or one-vs-all) fashion. This can be achieved by
+# calling `DecisionBoundaryDisplay` several times, once for each class, and
+# passing the `class_of_interest` parameter to the function.
 #
-# For example, in the plot below, the first plot on the left shows in yellow the
-# certainty on classifying a data point as belonging to the "Adelie" class. In
-# the same plot, the spectre from green to purple represents the certainty of
-# **not** belonging to the "Adelie" class. The same logic applies to the other
-# plots in the figure.
+# For example, in the plot below, the first plot on the left shows the
+# certainty of classifying a data point as belonging to the "Adelie" class. The
+# darker the color, the more certain the model is that a given point in the
+# feature space belongs to a given class the predictions. The same logic
+# applies to the other plots in the figure.
 
 # %% tags=["solution"]
 # import numpy as np
@@ -166,6 +168,7 @@
         ax=ax,
         vmin=0,
         vmax=1,
+        cmap="Blues",
     )
     ax.scatter(
         data_test["Culmen Length (mm)"].loc[target_test == class_of_interest],
@@ -179,24 +182,17 @@
 ax.set_ylabel("Culmen Depth (mm)")
 
 ax = plt.axes([0.15, 0.14, 0.7, 0.05])
-plt.colorbar(
-    cm.ScalarMappable(cmap="viridis"), cax=ax, orientation="horizontal"
-)
-_ = ax.set_title("Class probability")
+plt.colorbar(cm.ScalarMappable(cmap="Blues"), cax=ax, orientation="horizontal")
+_ = ax.set_title("Predicted class membership probability")
 
 # %% [markdown] tags=["solution"]
+#
 # ```{note}
-# You may have noticed that we are no longer using a diverging colormap. Indeed,
-# the chance level for a one-vs-rest binarization of the multi-class
-# classification problem is almost never at predicted probability of 0.5. So
-# using a colormap with a neutral white at 0.5 might give a false impression on
-# the certainty.
+# You may notice that we do not use a diverging colormap (2 color gradients with
+# white in the middle). Indeed, in a multiclass setting, 0.5 is not a
+# meaningful value, hence using white as the center of the colormap is not
+# appropriate. Instead, we use a sequential colormap, where the color intensity
+# indicates the certainty of the classification. The darker the color, the more
+# certain the model is that a given point in the feature space belongs to a
+# given class.
 # ```
-#
-# Since scikit-learn v1.4, `DecisionBoundaryDisplay` supports a
-# `class_of_interest` parameter that allows in particular for a visualization of
-# `predict_proba` in multi-class settings.
-#
-# We also plan to make it possible to visualize the `predict_proba` values for
-# the class with the maximum predicted probability (without having to pass a
-# given a fixed `class_of_interest` value).
````
