
Commit 4c95d02

description and one figure showing logistic regression in episode 4
1 parent b52fdaf commit 4c95d02

File tree

2 files changed: +41 -1 lines changed


content/04-supervised-ML-classification.rst

Lines changed: 41 additions & 1 deletion
@@ -304,10 +304,50 @@ We compute the confusion matrix from the trained model using the KNN algorithm, a
.. figure:: img/confusion-matrix-knn.png
   :align: center
-  :width: 256px
+  :width: 384px

The first row: there are 28 Adelie penguins in the test data, and all of them are identified as Adelie (valid). The second row: there are 20 Chinstrap penguins in the test data, with 2 identified as Adelie (invalid), 18 correctly identified as Chinstrap (valid), and none identified as Gentoo. The third row: there are 19 Gentoo penguins in the test data, and all of them are identified as Gentoo (valid).
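As a small aside (a minimal sketch, not part of the lesson text), the same reading can be done numerically; the matrix values below are taken from the description above:

.. code-block:: python

   import numpy as np

   # confusion matrix from the description above: rows = true species, columns = predicted species
   cm = np.array([[28,  0,  0],   # Adelie
                  [ 2, 18,  0],   # Chinstrap
                  [ 0,  0, 19]])  # Gentoo

   # per-class recall: correct predictions (diagonal) divided by the number of true instances (row sums)
   recall = cm.diagonal() / cm.sum(axis=1)
   print(recall.round(3))   # -> Adelie 1.0, Chinstrap 0.9, Gentoo 1.0
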
Logistic regression
^^^^^^^^^^^^^^^^^^^

**Logistic Regression** is a fundamental classification algorithm for predicting categorical outcomes.
Despite its name, logistic regression is not a regression algorithm but a classification method that predicts the probability of an instance belonging to a particular class.

For binary classification, it uses the logistic (**sigmoid**) function to map a linear combination of input features to a probability between 0 and 1, which is then thresholded (typically at 0.5) to assign a class.
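
As a quick illustration (a minimal sketch, not from the lesson code; the weights, bias, and feature values below are made up), the sigmoid squashes a linear combination of features into a probability, which is then thresholded:

.. code-block:: python

   import numpy as np

   def sigmoid(z):
       # logistic function: maps any real-valued score to a probability in (0, 1)
       return 1.0 / (1.0 + np.exp(-z))

   # linear combination z = w . x + b for a single example (made-up numbers)
   w, b = np.array([0.8, -0.5]), 0.1
   x = np.array([1.2, 0.4])

   p = sigmoid(w @ x + b)    # probability of the positive class
   label = int(p >= 0.5)     # threshold at 0.5 to assign a class
   print(p, label)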

For multiclass classification, logistic regression can be extended using strategies like **one-vs-rest** (OvR) or softmax regression (a short sketch after this list illustrates the softmax step):

- In OvR, a separate binary classifier is trained for each species against all the others.
- **Softmax regression** generalizes the logistic function to compute probabilities across all classes simultaneously, selecting the class with the highest probability.
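
Here is a minimal sketch of the softmax step (the per-class scores below are made up, not taken from the penguin model): softmax turns each vector of class scores into probabilities that sum to 1, and the most probable class is selected:

.. code-block:: python

   import numpy as np

   def softmax(z):
       # subtract the row-wise maximum for numerical stability, then normalize
       e = np.exp(z - np.max(z, axis=-1, keepdims=True))
       return e / e.sum(axis=-1, keepdims=True)

   # made-up class scores for three samples and three classes
   scores = np.array([[ 2.0, 0.5, -1.0],
                      [ 0.1, 1.5,  0.3],
                      [-0.7, 0.2,  2.2]])

   probs = softmax(scores)                  # each row sums to 1
   predicted_class = probs.argmax(axis=1)   # pick the most probable class per sample
   print(probs.round(3))
   print(predicted_class)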

.. figure:: img/logistic-regression-example.png
   :align: center
   :width: 640px

   1) The sigmoid function; 2) the softmax regression process: three input feature vectors passed to the softmax regression model, resulting in three output vectors, each containing the predicted probabilities for the three possible classes; 3) a bar chart of the softmax outputs, in which each group of bars represents the predicted probability distribution over the three classes; 4-6) binary classifiers, each distinguishing one class from the other two using the one-vs-rest approach.

We train a logistic regression classifier on the scaled training data and evaluate it on the test data:

.. code-block:: python

   from sklearn.linear_model import LogisticRegression
   from sklearn.metrics import accuracy_score, classification_report, confusion_matrix

   # train a logistic regression classifier on the scaled training features
   lr_clf = LogisticRegression(random_state=0)
   lr_clf.fit(X_train_scaled, y_train)

   # predict the species for the test data
   y_pred_lr = lr_clf.predict(X_test_scaled)

   # evaluate the predictions
   score_lr = accuracy_score(y_test, y_pred_lr)
   print("Accuracy for Logistic Regression:", score_lr)
   print("\nClassification Report:\n", classification_report(y_test, y_pred_lr))

   # compute and plot confusion matrix
   cm_lr = confusion_matrix(y_test, y_pred_lr)
   plot_confusion_matrix(cm_lr, "Confusion Matrix using Logistic Regression algorithm", "confusion-matrix-lr.png")
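
As an optional extra (not part of this commit; it reuses ``lr_clf`` and ``X_test_scaled`` from the block above), the class probabilities behind these predictions can be inspected with ``predict_proba``:

.. code-block:: python

   # predicted probabilities for the first five test samples;
   # columns follow lr_clf.classes_ and each row sums to 1
   print(lr_clf.classes_)
   print(lr_clf.predict_proba(X_test_scaled[:5]).round(3))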

img/logistic-regression-example.png (211 KB)

0 commit comments
