|
105 | 105 | "In the following task, you are expected to train a regression model predicting the value of `petal_width` from the values of `sepal_length`. You will be training a [`linear_model.LinearRegression()`](https://scikit-learn.org/stable/modules/generated/sklearn.linear_model.LinearRegression.html) model.\n",
|
106 | 106 | "\n",
|
107 | 107 | "You can do this in the following steps:\n",
|
108 |
| - "- select the feature (`sepal_length`) into the variable `X_regression`. As each sample needs to be represented by an array, even if it has a single feature, you can apply `np.array.reshape(-1, 1)` to the selected feature\n", |
109 |
| - "- select the feature to predict (`petal_width`) into the variable `y_regression`\n", |
| 108 | + "- select the first feature (`sepal_length` - column 0) into the variable `X_regression`. As each sample needs to be represented by an array, even if it has a single feature, you can apply `np.array.reshape(-1, 1)` to the selected feature\n", |
| 109 | + "- select the feature to predict (`petal_width` - column 3) into the variable `y_regression`\n", |
110 | 110 | "- separate these into training and testing sets (`X_train, X_test, y_train, y_test`) using [`sklearn.model_selection.train_test_split()`](https://scikit-learn.org/stable/modules/generated/sklearn.model_selection.train_test_split.html), using $20\\%$ of the samples in the testing set\n",
|
111 | 111 | "- initialise the [`linear_model.LinearRegression()`](https://scikit-learn.org/stable/modules/generated/sklearn.linear_model.LinearRegression.html) model and fit it to the training data\n",
|
112 | 112 | "- predict the values for the testing set and store it into `y_predicted`\n",
|
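The steps in this hunk can be sketched roughly as follows. This is only one possible solution, not the notebook's reference answer. It assumes `X` is the feature matrix returned by `sklearn.datasets.load_iris` (columns 0 to 3 being `sepal_length`, `sepal_width`, `petal_length`, `petal_width`), and `random_state=0` is an illustrative addition for reproducibility that the instructions do not mention. Note that `reshape` is called on the selected array itself.

```python
from sklearn import datasets, linear_model
from sklearn.model_selection import train_test_split

# Assumption: the notebook's X matches load_iris() (150 samples, 4 features).
X, y = datasets.load_iris(return_X_y=True)

# Column 0 is sepal_length; reshape turns the 1-D vector into shape (n_samples, 1).
X_regression = X[:, 0].reshape(-1, 1)
# Column 3 is petal_width, the value to predict.
y_regression = X[:, 3]

# Hold out 20% of the samples for testing (random_state is an illustrative choice).
X_train, X_test, y_train, y_test = train_test_split(
    X_regression, y_regression, test_size=0.2, random_state=0
)

# Fit on the training split only, then predict on the held-out samples.
model = linear_model.LinearRegression()
model.fit(X_train, y_train)
y_predicted = model.predict(X_test)
```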
|
209 | 209 | "This exercise is similar to the previous one, except that you are expected to use all other features (`sepal_length`, `sepal_width` and `petal_length`) to predict `petal_width`.\n",
|
210 | 210 | "\n",
|
211 | 211 | "You should perform the following steps:\n",
|
212 |
| - "- select the features (`sepal_length`, `sepal_width` and `petal_length`) into the variable `X_regression`\n", |
213 |
| - "- select the feature to predict (`petal_width`) into the variable `y_regression`\n", |
| 212 | + "- select the features (`sepal_length`, `sepal_width` and `petal_length` - columns 0, 1 and 2) into the variable `X_regression`\n", |
| 213 | + "- select the feature to predict (`petal_width` - column 3) into the variable `y_regression`\n", |
214 | 214 | "- separate these into training and testing sets (`X_train, X_test, y_train, y_test`) using [`sklearn.model_selection.train_test_split()`](https://scikit-learn.org/stable/modules/generated/sklearn.model_selection.train_test_split.html), using $20\\%$ of the samples in the testing set\n",
|
215 | 215 | "- initialise the [`linear_model.LinearRegression()`](https://scikit-learn.org/stable/modules/generated/sklearn.linear_model.LinearRegression.html) model and fit it to the training data\n",
|
216 | 216 | "- predict the values for the testing set and store it into `y_predicted`\n",
|
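A hedged sketch of the multi-feature variant described in this hunk: only the feature selection changes relative to the single-feature case. As before, it assumes `X` comes from `sklearn.datasets.load_iris`, and `random_state=0` is an illustrative addition.

```python
from sklearn import datasets, linear_model
from sklearn.model_selection import train_test_split

# Assumption: the notebook's X matches load_iris() (150 samples, 4 features).
X, y = datasets.load_iris(return_X_y=True)

# Columns 0, 1 and 2 are already a 2-D array, so no reshape is needed here.
X_regression = X[:, :3]
# Column 3 is petal_width, the value to predict.
y_regression = X[:, 3]

# Hold out 20% of the samples for testing (random_state is an illustrative choice).
X_train, X_test, y_train, y_test = train_test_split(
    X_regression, y_regression, test_size=0.2, random_state=0
)

model = linear_model.LinearRegression()
model.fit(X_train, y_train)
y_predicted = model.predict(X_test)
```

With three input features the fitted model exposes one coefficient per column in `model.coef_`.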
|
297 | 297 | "In the following task, you are expected to train a **classification model** predicting the class of the iris flower from the values of `sepal_length` and `sepal_width`. You will be training a [`tree.DecisionTreeClassifier()`](https://scikit-learn.org/stable/modules/generated/sklearn.tree.DecisionTreeClassifier.html) model.\n",
|
298 | 298 | "\n",
|
299 | 299 | "You can do this in the following steps:\n",
|
300 |
| - "- select the first two features (`sepal_length` and `sepal_width`) into the variable `X_classification`\n", |
301 |
| - "- store the labels to predict `y_classification`\n", |
| 300 | + "- select the first two features (`sepal_length` and `sepal_width` - column index 0 and 1) into the variable `X_classification`\n", |
| 301 | + "- store the labels to predict in `y_classification` (this is currently just stored in `y`)\n", |
302 | 302 | "- separate these into training and testing sets (`X_train, X_test, y_train, y_test`) using [`sklearn.model_selection.train_test_split()`](https://scikit-learn.org/stable/modules/generated/sklearn.model_selection.train_test_split.html), using $20\\%$ of the samples in the testing set\n",
|
303 | 303 | "- initialise the [`tree.DecisionTreeClassifier()`](https://scikit-learn.org/stable/modules/generated/sklearn.tree.DecisionTreeClassifier.html) model with `max_depth=5` and fit it to the training data\n",
|
304 | 304 | "- predict the values for the testing set and store it into `y_predicted`\n",
|
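The classification steps in this hunk might look like the sketch below. It assumes `X` and `y` are the features and class labels from `sklearn.datasets.load_iris`. The two `random_state=0` arguments are illustrative additions, not something the instructions ask for.

```python
from sklearn import datasets, tree
from sklearn.model_selection import train_test_split

# Assumption: the notebook's X and y match load_iris().
X, y = datasets.load_iris(return_X_y=True)

# Columns 0 and 1 are sepal_length and sepal_width.
X_classification = X[:, :2]
# The class labels are already stored in y.
y_classification = y

# Hold out 20% of the samples for testing (random_state is an illustrative choice).
X_train, X_test, y_train, y_test = train_test_split(
    X_classification, y_classification, test_size=0.2, random_state=0
)

# Limit the tree depth to 5, as the instructions specify.
model = tree.DecisionTreeClassifier(max_depth=5, random_state=0)
model.fit(X_train, y_train)
y_predicted = model.predict(X_test)
```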
|
412 | 412 | "\n",
|
413 | 413 | "\n",
|
414 | 414 | "You can do this in the following steps:\n",
|
415 |
| - "- store all the dataset features into the variable `X_classification`\n", |
416 |
| - "- store the labels to predict into `y_classification`\n", |
| 415 | + "- store all the dataset features into the variable `X_classification` (these are currently just in `X`)\n", |
| 416 | + "- store the labels to predict into `y_classification` (these are currently just in `y`)\n", |
417 | 417 | "- separate these into training and testing sets (`X_train, X_test, y_train, y_test`) using [`sklearn.model_selection.train_test_split()`](https://scikit-learn.org/stable/modules/generated/sklearn.model_selection.train_test_split.html), using $20\\%$ of the samples in the testing set\n",
|
418 | 418 | "- initialise the [`tree.DecisionTreeClassifier()`](https://scikit-learn.org/stable/modules/generated/sklearn.tree.DecisionTreeClassifier.html) model with `max_depth=5` and fit it to the training data\n",
|
419 | 419 | "- predict the values for the testing set and store it into `y_predicted`\n",
|
| 420 | + "- **Note** that _again_ all the steps but the first (selecting the features) are **exactly the same as in [Exercise 3](#Exercise-3)**. You can copy-paste the rest of your solution for [Exercise 3](#Exercise-3) once you select the features.\n", |
420 | 421 | "\n",
|
421 | 422 | "The code will then evaluate your model, by calculating [`accuracy_score`](https://scikit-learn.org/stable/modules/generated/sklearn.metrics.accuracy_score.html) and [`f1_score`](https://scikit-learn.org/stable/modules/generated/sklearn.metrics.f1_score.html) metrics from your `y_test` and `y_predicted` classes.\n",
|
422 | 423 | "\n",
|
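A possible sketch of the all-features variant together with the evaluation step. It assumes `X` and `y` come from `sklearn.datasets.load_iris`. The notebook's own `f1_score` call is not visible in this diff, so `average="macro"` is an assumption here (a multiclass target needs some averaging mode), and `random_state=0` is again an illustrative addition.

```python
from sklearn import datasets, tree
from sklearn.metrics import accuracy_score, f1_score
from sklearn.model_selection import train_test_split

# Assumption: the notebook's X and y match load_iris().
X, y = datasets.load_iris(return_X_y=True)

# Use every feature column; the labels are taken from y unchanged.
X_classification = X
y_classification = y

# Hold out 20% of the samples for testing (random_state is an illustrative choice).
X_train, X_test, y_train, y_test = train_test_split(
    X_classification, y_classification, test_size=0.2, random_state=0
)

model = tree.DecisionTreeClassifier(max_depth=5, random_state=0)
model.fit(X_train, y_train)
y_predicted = model.predict(X_test)

# Evaluation: both metrics compare the true test labels with the predictions.
accuracy = accuracy_score(y_test, y_predicted)
# average="macro" is an assumption; the notebook may use a different mode.
f1 = f1_score(y_test, y_predicted, average="macro")
print("accuracy on test set: {}".format(accuracy))
print("macro F1 on test set: {}".format(f1))
```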
|
439 | 440 | }
|
440 | 441 | ],
|
441 | 442 | "source": [
|
| 443 | + "###################################\n", |
| 444 | + "#### Insert your solution here ####\n", |
| 445 | + "###################################\n", |
| 446 | + "\n", |
| 447 | + "\n", |
| 448 | + "\n", |
442 | 449 | "\n",
|
443 | 450 | "# Evaluation:\n",
|
444 | 451 | "print(\"accuracy on test set: {}\".format(accuracy_score(y_test, y_predicted)))\n",
|
|