|
37 | 37 | "2. Instantiating AutoMLConfig with new task type \"forecasting\" for timeseries data training, and other timeseries related settings: for this dataset we use the basic one: \"time_column_name\" \n",
|
38 | 38 | "3. Training the Model using local compute\n",
|
39 | 39 | "4. Exploring the results\n",
|
40 |
| - "5. Testing the fitted model" |
| 40 | + "5. Viewing the engineered names for featurized data and featurization summary for all raw features\n", |
| 41 | + "6. Testing the fitted model" |
41 | 42 | ]
|
42 | 43 | },
|
43 | 44 | {
|
|
126 | 127 | "cell_type": "markdown",
|
127 | 128 | "metadata": {},
|
128 | 129 | "source": [
|
129 |
| - "### Split the data to train and test\n", |
| 130 | + "### Get the train data\n", |
130 | 131 | "\n"
|
131 | 132 | ]
|
132 | 133 | },
|
|
172 | 173 | "metadata": {},
|
173 | 174 | "outputs": [],
|
174 | 175 | "source": [
|
175 |
| - "X_train = train[train['timeStamp'] < '2017-01-01']\n", |
176 |
| - "X_valid = train[train['timeStamp'] >= '2017-01-01']\n", |
| 176 | + "X_train = train\n", |
177 | 177 | "y_train = X_train.pop('demand').values\n",
|
178 |
| - "y_valid = X_valid.pop('demand').values\n", |
179 | 178 | "print(X_train.shape)\n",
|
180 |
| - "print(y_train.shape)\n", |
181 |
| - "print(X_valid.shape)\n", |
182 |
| - "print(y_valid.shape)" |
| 179 | + "print(y_train.shape)" |
183 | 180 | ]
|
184 | 181 | },
|
185 | 182 | {
|
|
198 | 195 | "|**iteration_timeout_minutes**|Time limit in minutes for each iteration.|\n",
|
199 | 196 | "|**X**|(sparse) array-like, shape = [n_samples, n_features]|\n",
|
200 | 197 | "|**y**|(sparse) array-like, shape = [n_samples, ], target values.|\n",
|
201 |
| - "|**X_valid**|Data used to evaluate a model in a iteration. (sparse) array-like, shape = [n_samples, n_features]|\n", |
202 |
| - "|**y_valid**|Data used to evaluate a model in a iteration. (sparse) array-like, shape = [n_samples, ], targets values.|\n", |
| 198 | + "|**n_cross_validations**|Number of cross validation splits.|\n", |
203 | 199 | "|**path**|Relative path to the project folder. AutoML stores configuration files for the experiment under this folder. You can specify a new empty folder. "
|
204 | 200 | ]
|
205 | 201 | },
|
|
222 | 218 | " iteration_timeout_minutes = 5,\n",
|
223 | 219 | " X = X_train,\n",
|
224 | 220 | " y = y_train,\n",
|
225 |
| - " X_valid = X_valid,\n", |
226 |
| - " y_valid = y_valid,\n", |
| 221 | + " n_cross_validations = 2,\n", |
227 | 222 | " path=project_folder,\n",
|
228 | 223 | " verbosity = logging.INFO,\n",
|
229 | 224 | " **automl_settings)"
|
|
273 | 268 | "fitted_model.steps"
|
274 | 269 | ]
|
275 | 270 | },
|
| 271 | + { |
| 272 | + "cell_type": "markdown", |
| 273 | + "metadata": {}, |
| 274 | + "source": [ |
| 275 | + "### View the engineered names for featurized data\n", |
| 276 | + "Below we display the engineered feature names generated for the featurized data using the time-series featurization." |
| 277 | + ] |
| 278 | + }, |
| 279 | + { |
| 280 | + "cell_type": "code", |
| 281 | + "execution_count": null, |
| 282 | + "metadata": {}, |
| 283 | + "outputs": [], |
| 284 | + "source": [ |
| 285 | + "fitted_model.named_steps['timeseriestransformer'].get_engineered_feature_names()" |
| 286 | + ] |
| 287 | + }, |
| 288 | + { |
| 289 | + "cell_type": "markdown", |
| 290 | + "metadata": {}, |
| 291 | + "source": [ |
| 292 | + "### View the featurization summary\n", |
| 293 | + "Below we display the featurization that was performed on different raw features in the user data. For each raw feature in the user data, the following information is displayed:\n", |
| 294 | + "- Raw feature name\n", |
| 295 | + "- Number of engineered features formed out of this raw feature\n", |
| 296 | + "- Type detected\n", |
| 297 | + "- If feature was dropped\n", |
| 298 | + "- List of feature transformations for the raw feature" |
| 299 | + ] |
| 300 | + }, |
| 301 | + { |
| 302 | + "cell_type": "code", |
| 303 | + "execution_count": null, |
| 304 | + "metadata": {}, |
| 305 | + "outputs": [], |
| 306 | + "source": [ |
| 307 | + "fitted_model.named_steps['timeseriestransformer'].get_featurization_summary()" |
| 308 | + ] |
| 309 | + }, |
276 | 310 | {
|
277 | 311 | "cell_type": "markdown",
|
278 | 312 | "metadata": {},
|
|
0 commit comments