|
92 | 92 | "\n",
|
93 | 93 | "# choose a name for experiment\n",
|
94 | 94 | "experiment_name = 'automl-classification-ccard'\n",
|
95 |  | - "# project folder\n",
96 |  | - "project_folder = './sample_projects/automl-classification-creditcard'\n",
97 | 95 | "\n",
|
98 | 96 | "experiment=Experiment(ws, experiment_name)\n",
|
99 | 97 | "\n",
|
|
103 | 101 | "output['Workspace'] = ws.name\n",
|
104 | 102 | "output['Resource Group'] = ws.resource_group\n",
|
105 | 103 | "output['Location'] = ws.location\n",
|
106 |  | - "output['Project Directory'] = project_folder\n",
107 | 104 | "output['Experiment Name'] = experiment.name\n",
|
108 | 105 | "pd.set_option('display.max_colwidth', -1)\n",
|
109 | 106 | "outputDf = pd.DataFrame(data = output, index = [''])\n",
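For orientation, the post-change summary cell would read roughly as sketched below. This is a minimal sketch: the `Workspace.from_config()` call, the `output = {}` initialisation, and the final `outputDf.T` display are assumed from the usual AutoML notebook pattern and are not part of this hunk.

```python
# Minimal sketch of the workspace/experiment summary cell after this change.
# Assumes the Azure ML SDK is installed and a local config.json describes the workspace.
import pandas as pd
from azureml.core import Workspace, Experiment

ws = Workspace.from_config()                      # assumed; not shown in this hunk
experiment_name = 'automl-classification-ccard'
experiment = Experiment(ws, experiment_name)

output = {}                                       # assumed initialisation
output['Workspace'] = ws.name
output['Resource Group'] = ws.resource_group
output['Location'] = ws.location
output['Experiment Name'] = experiment.name
pd.set_option('display.max_colwidth', None)       # the notebook passes -1; None is the non-deprecated spelling
outputDf = pd.DataFrame(data=output, index=[''])
outputDf.T                                        # assumed display step
```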
|
|
164 | 161 | "source": [
|
165 | 162 | "# Data\n",
|
166 | 163 | "\n",
|
167 |  | - "Here load the data in the get_data script to be utilized in azure compute. To do this, first load all the necessary libraries and dependencies to set up paths for the data and to create the conda_run_config."
168 |  | - ]
169 |  | - },
170 |  | - {
171 |  | - "cell_type": "code",
172 |  | - "execution_count": null,
173 |  | - "metadata": {},
174 |  | - "outputs": [],
175 |  | - "source": [
176 |  | - "if not os.path.isdir('data'):\n",
177 |  | - " os.mkdir('data')\n",
178 |  | - " \n",
179 |  | - "if not os.path.exists(project_folder):\n",
180 |  | - " os.makedirs(project_folder)"
| 164 | + "Create a run configuration for the remote run." |
181 | 165 | ]
|
182 | 166 | },
|
183 | 167 | {
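The new markdown cell above ("Create a run configuration for the remote run.") replaces the local project-folder setup. One plausible shape for the `conda_run_config` it refers to is sketched below; the `cpu-cluster` compute target name and the package list are illustrative assumptions, not taken from this diff.

```python
# Sketch of a remote run configuration for AutoML; names and packages are illustrative.
from azureml.core.runconfig import RunConfiguration
from azureml.core.conda_dependencies import CondaDependencies

conda_run_config = RunConfiguration(framework="python")
conda_run_config.target = ws.compute_targets['cpu-cluster']   # assumed existing AmlCompute target
conda_run_config.environment.docker.enabled = True

cd = CondaDependencies.create(pip_packages=['azureml-sdk[automl]'],
                              conda_packages=['numpy'])
conda_run_config.environment.python.conda_dependencies = cd
```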
|
|
207 | 191 | "source": [
|
208 | 192 | "### Load Data\n",
|
209 | 193 | "\n",
|
210 |  | - "Here create the script to be run in azure compute for loading the data, load the credit card dataset into cards and store the Class column (y) in the y variable and store the remaining data in the x variable. Next split the data using random_split and return X_train and y_train for training the model."
| 194 | + "Load the credit card dataset into X and y. X contains the features, which are inputs to the model. y contains the labels, which are the expected output of the model. Next split the data using random_split and return X_train and y_train for training the model." |
211 | 195 | ]
|
212 | 196 | },
|
213 | 197 | {
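The reworded Load Data cell above refers to splitting with `random_split`. As one plain pandas/NumPy reading of that step (the CSV path, the 80/20 ratio, and the seed below are illustrative, not taken from the diff):

```python
import numpy as np
import pandas as pd

# Load the credit card dataset; 'Class' is the label column.
cards = pd.read_csv('creditcard.csv')            # path is illustrative
y = cards['Class'].values
X = cards.drop(columns=['Class']).values

# Simple random split standing in for the notebook's random_split step.
rng = np.random.RandomState(42)
train_mask = rng.rand(len(cards)) < 0.8
X_train, y_train = X[train_mask], y[train_mask]
X_test, y_test = X[~train_mask], y[~train_mask]
```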
|
|
241 | 225 | "|**n_cross_validations**|Number of cross validation splits.|\n",
|
242 | 226 | "|**X**|(sparse) array-like, shape = [n_samples, n_features]|\n",
|
243 | 227 | "|**y**|(sparse) array-like, shape = [n_samples, ], Multi-class targets.|\n",
|
244 |  | - "|**path**|Relative path to the project folder. AutoML stores configuration files for the experiment under this folder. You can specify a new empty folder.|\n",
245 | 228 | "\n",
|
246 | 229 | "**_You can find more information about primary metrics_** [here](https://docs.microsoft.com/en-us/azure/machine-learning/service/how-to-configure-auto-train#primary-metric)"
|
247 | 230 | ]
|
|
270 | 253 | "}\n",
|
271 | 254 | "\n",
|
272 | 255 | "automl_config = AutoMLConfig(task = 'classification',\n",
|
273 |  | - " debug_log = 'automl_errors_20190417.log',\n",
274 |  | - " path = project_folder,\n",
 | 256 | + " debug_log = 'automl_errors.log',\n",
275 | 257 | " run_configuration=conda_run_config,\n",
|
276 | 258 | " X = X_train,\n",
|
277 | 259 | " y = y_train,\n",
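The hunk above is truncated mid-call; read together with the parameter table, a complete `AutoMLConfig` along these lines would be plausible. The metric, iteration counts, and timeout below are illustrative values, not the notebook's.

```python
# Hedged sketch of the full AutoMLConfig call this hunk edits; settings values are illustrative.
from azureml.train.automl import AutoMLConfig

automl_settings = {
    "iteration_timeout_minutes": 5,
    "iterations": 10,
    "n_cross_validations": 3,
    "primary_metric": 'AUC_weighted',
}

automl_config = AutoMLConfig(task='classification',
                             debug_log='automl_errors.log',
                             run_configuration=conda_run_config,
                             X=X_train,
                             y=y_train,
                             **automl_settings)

# A remote run would then be submitted in the usual way:
# remote_run = experiment.submit(automl_config, show_output=False)
```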
|
|