Commit c3ce932

committed
version 1.0.23
1 parent a956162 commit c3ce932

File tree

30 files changed

+5162
-4051
lines changed

how-to-use-azureml/automated-machine-learning/README.md

+12
@@ -189,6 +189,11 @@ jupyter notebook
 - Dataset: [Dominick's grocery sales of orange juice](forecasting-b/dominicks_OJ.csv)
 - Example of training an AutoML forecasting model on multiple time-series
 
+- [auto-ml-classification-with-onnx.ipynb](classification-with-onnx/auto-ml-classification-with-onnx.ipynb)
+- Dataset: scikit learn's [digit dataset](http://scikit-learn.org/stable/modules/generated/sklearn.datasets.load_digits.html#sklearn.datasets.load_digits)
+- Simple example of using Auto ML for classification with ONNX models
+- Uses local compute for training
+
 <a name="documentation"></a>
 See [Configure automated machine learning experiments](https://docs.microsoft.com/azure/machine-learning/service/how-to-configure-auto-train) to learn more about the settings and features available for automated machine learning experiments.
 
@@ -233,6 +238,13 @@ If a sample notebook fails with an error that property, method or library does n
 ## Numpy import fails on Windows
 Some Windows environments see an error loading numpy with the latest Python version 3.6.8. If you see this issue, try with Python version 3.6.7.
 
+## Numpy import fails
+Check the tensorflow version in the automated ml conda environment. Supported versions are < 1.13. Uninstall tensorflow from the environment if the version is >= 1.13.
+You may check the version of tensorflow and uninstall it as follows:
+1) Start a command shell and activate the conda environment where the automated ml packages are installed.
+2) Enter `pip freeze` and look for `tensorflow`. If found, the version listed should be < 1.13.
+3) If the listed version is not a supported version, run `pip uninstall tensorflow` in the command shell and enter y for confirmation.
+
 ## Remote run: DsvmCompute.create fails
 There are several reasons why the DsvmCompute.create can fail. The reason is usually in the error message but you have to look at the end of the error message for the detailed reason. Some common reasons are:
 1) `Compute name is invalid, it should start with a letter, be between 2 and 16 character, and only include letters (a-zA-Z), numbers (0-9) and \'-\'.` Note that underscore is not allowed in the name.
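
The tensorflow version check described in the new README section could also be scripted. The following is a minimal sketch, not part of the commit: the helper names `find_tensorflow_version` and `tensorflow_is_supported` are hypothetical, and the only rule assumed is the README's statement that supported versions are < 1.13. It works on the text that `pip freeze` prints and only matches the plain `tensorflow` package, not variants such as `tensorflow-gpu`.

```python
import re
from typing import Optional, Tuple

# Assumption taken from the README text: versions below 1.13 are supported.
FIRST_UNSUPPORTED = (1, 13)


def find_tensorflow_version(freeze_output: str) -> Optional[Tuple[int, int]]:
    """Return (major, minor) for the `tensorflow==X.Y...` line, or None if absent."""
    for line in freeze_output.splitlines():
        # Require `==` directly after the name so that packages such as
        # tensorflow-estimator or tensorflow-gpu are not matched by accident.
        match = re.match(r"^tensorflow==(\d+)\.(\d+)", line.strip(), re.IGNORECASE)
        if match:
            return int(match.group(1)), int(match.group(2))
    return None


def tensorflow_is_supported(freeze_output: str) -> bool:
    """True when tensorflow is not installed or is older than 1.13."""
    version = find_tensorflow_version(freeze_output)
    return version is None or version < FIRST_UNSUPPORTED


# Hypothetical `pip freeze` output used only for illustration.
sample = "numpy==1.16.2\ntensorflow==1.13.1\n"
if not tensorflow_is_supported(sample):
    print("Unsupported tensorflow found; run `pip uninstall tensorflow`.")
```

In practice the input string would come from running `pip freeze` inside the activated conda environment; the sketch deliberately leaves the uninstall step to the user, as the README does.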

how-to-use-azureml/automated-machine-learning/classification-with-deployment/auto-ml-classification-with-deployment.ipynb

+1-2
@@ -139,7 +139,6 @@
 " primary_metric = 'AUC_weighted',\n",
 " iteration_timeout_minutes = 20,\n",
 " iterations = 10,\n",
-" n_cross_validations = 2,\n",
 " verbosity = logging.INFO,\n",
 " X = X_train, \n",
 " y = y_train,\n",
@@ -263,7 +262,7 @@
 "cell_type": "markdown",
 "metadata": {},
 "source": [
-"To ensure the fit results are consistent with the training results, the SDK dependency versions need to be the same as the environment that trains the model. Details about retrieving the versions can be found in notebook [12.auto-ml-retrieve-the-training-sdk-versions](12.auto-ml-retrieve-the-training-sdk-versions.ipynb)."
+"To ensure the fit results are consistent with the training results, the SDK dependency versions need to be the same as the environment that trains the model. The following cells create a file, myenv.yml, which specifies the dependencies from the run."
 ]
 },
 {
@@ -0,0 +1,284 @@
+{
+"cells": [
+{
+"cell_type": "markdown",
+"metadata": {},
+"source": [
+"Copyright (c) Microsoft Corporation. All rights reserved.\n",
+"\n",
+"Licensed under the MIT License."
+]
+},
+{
+"cell_type": "markdown",
+"metadata": {},
+"source": [
+"# Automated Machine Learning\n",
+"_**Classification with Local Compute**_\n",
+"\n",
+"## Contents\n",
+"1. [Introduction](#Introduction)\n",
+"1. [Setup](#Setup)\n",
+"1. [Data](#Data)\n",
+"1. [Train](#Train)\n",
+"1. [Results](#Results)\n",
+"1. [Test](#Test)\n",
+"\n"
+]
+},
+{
+"cell_type": "markdown",
+"metadata": {},
+"source": [
+"## Introduction\n",
+"\n",
+"In this example we use the scikit-learn's [digit dataset](http://scikit-learn.org/stable/datasets/index.html#optical-recognition-of-handwritten-digits-dataset) to showcase how you can use AutoML for a simple classification problem.\n",
+"\n",
+"Make sure you have executed the [configuration](../../../configuration.ipynb) before running this notebook.\n",
+"\n",
+"Please find the ONNX related documentations [here](https://github.com/onnx/onnx).\n",
+"\n",
+"In this notebook you will learn how to:\n",
+"1. Create an `Experiment` in an existing `Workspace`.\n",
+"2. Configure AutoML using `AutoMLConfig`.\n",
+"3. Train the model using local compute with ONNX compatible config on.\n",
+"4. Explore the results and save the ONNX model."
+]
+},
+{
+"cell_type": "markdown",
+"metadata": {},
+"source": [
+"## Setup\n",
+"\n",
+"As part of the setup you have already created an Azure ML `Workspace` object. For AutoML you will need to create an `Experiment` object, which is a named object in a `Workspace` used to run experiments."
+]
+},
+{
+"cell_type": "code",
+"execution_count": null,
+"metadata": {},
+"outputs": [],
+"source": [
+"import logging\n",
+"\n",
+"from matplotlib import pyplot as plt\n",
+"import numpy as np\n",
+"import pandas as pd\n",
+"from sklearn import datasets\n",
+"\n",
+"import azureml.core\n",
+"from azureml.core.experiment import Experiment\n",
+"from azureml.core.workspace import Workspace\n",
+"from azureml.train.automl import AutoMLConfig"
+]
+},
+{
+"cell_type": "code",
+"execution_count": null,
+"metadata": {},
+"outputs": [],
+"source": [
+"ws = Workspace.from_config()\n",
+"\n",
+"# Choose a name for the experiment and specify the project folder.\n",
+"experiment_name = 'automl-classification-onnx'\n",
+"project_folder = './sample_projects/automl-classification-onnx'\n",
+"\n",
+"experiment = Experiment(ws, experiment_name)\n",
+"\n",
+"output = {}\n",
+"output['SDK version'] = azureml.core.VERSION\n",
+"output['Subscription ID'] = ws.subscription_id\n",
+"output['Workspace Name'] = ws.name\n",
+"output['Resource Group'] = ws.resource_group\n",
+"output['Location'] = ws.location\n",
+"output['Project Directory'] = project_folder\n",
+"output['Experiment Name'] = experiment.name\n",
+"pd.set_option('display.max_colwidth', -1)\n",
+"outputDf = pd.DataFrame(data = output, index = [''])\n",
+"outputDf.T"
+]
+},
+{
+"cell_type": "markdown",
+"metadata": {},
+"source": [
+"## Data\n",
+"\n",
+"This uses scikit-learn's [load_digits](http://scikit-learn.org/stable/modules/generated/sklearn.datasets.load_digits.html) method."
+]
+},
+{
+"cell_type": "code",
+"execution_count": null,
+"metadata": {},
+"outputs": [],
+"source": [
+"digits = datasets.load_digits()\n",
+"\n",
+"# Exclude the first 100 rows from training so that they can be used for test.\n",
+"X_train = digits.data[100:,:]\n",
+"y_train = digits.target[100:]"
+]
+},
+{
+"cell_type": "markdown",
+"metadata": {},
+"source": [
+"## Train with enable ONNX compatible models config on\n",
+"\n",
+"Instantiate an `AutoMLConfig` object to specify the settings and data used to run the experiment.\n",
+"\n",
+"Set the parameter enable_onnx_compatible_models=True, if you also want to generate the ONNX compatible models. Please note, the forecasting task and TensorFlow models are not ONNX compatible yet.\n",
+"\n",
+"|Property|Description|\n",
+"|-|-|\n",
+"|**task**|classification or regression|\n",
+"|**primary_metric**|This is the metric that you want to optimize. Classification supports the following primary metrics: <br><i>accuracy</i><br><i>AUC_weighted</i><br><i>average_precision_score_weighted</i><br><i>norm_macro_recall</i><br><i>precision_score_weighted</i>|\n",
+"|**iteration_timeout_minutes**|Time limit in minutes for each iteration.|\n",
+"|**iterations**|Number of iterations. In each iteration AutoML trains a specific pipeline with the data.|\n",
+"|**X**|(sparse) array-like, shape = [n_samples, n_features]|\n",
+"|**y**|(sparse) array-like, shape = [n_samples, ], Multi-class targets.|\n",
+"|**enable_onnx_compatible_models**|Enable the ONNX compatible models in the experiment.|\n",
+"|**path**|Relative path to the project folder. AutoML stores configuration files for the experiment under this folder. You can specify a new empty folder.|"
+]
+},
+{
+"cell_type": "code",
+"execution_count": null,
+"metadata": {},
+"outputs": [],
+"source": [
+"automl_config = AutoMLConfig(task = 'classification',\n",
+" debug_log = 'automl_errors.log',\n",
+" primary_metric = 'AUC_weighted',\n",
+" iteration_timeout_minutes = 60,\n",
+" iterations = 10,\n",
+" verbosity = logging.INFO,\n",
+" X = X_train, \n",
+" y = y_train,\n",
+" enable_onnx_compatible_models=True,\n",
+" path = project_folder)"
+]
+},
+{
+"cell_type": "markdown",
+"metadata": {},
+"source": [
+"Call the `submit` method on the experiment object and pass the run configuration. Execution of local runs is synchronous. Depending on the data and the number of iterations this can run for a while.\n",
+"In this example, we specify `show_output = True` to print currently running iterations to the console."
+]
+},
+{
+"cell_type": "code",
+"execution_count": null,
+"metadata": {},
+"outputs": [],
+"source": [
+"local_run = experiment.submit(automl_config, show_output = True)"
+]
+},
+{
+"cell_type": "code",
+"execution_count": null,
+"metadata": {},
+"outputs": [],
+"source": [
+"local_run"
+]
+},
+{
+"cell_type": "markdown",
+"metadata": {},
+"source": [
+"## Results"
+]
+},
+{
+"cell_type": "markdown",
+"metadata": {},
+"source": [
+"#### Widget for Monitoring Runs\n",
+"\n",
+"The widget will first report a \"loading\" status while running the first iteration. After completing the first iteration, an auto-updating graph and table will be shown. The widget will refresh once per minute, so you should see the graph update as child runs complete.\n",
+"\n",
+"**Note:** The widget displays a link at the bottom. Use this link to open a web interface to explore the individual run details."
+]
+},
+{
+"cell_type": "code",
+"execution_count": null,
+"metadata": {},
+"outputs": [],
+"source": [
+"from azureml.widgets import RunDetails\n",
+"RunDetails(local_run).show() "
+]
+},
+{
+"cell_type": "markdown",
+"metadata": {},
+"source": [
+"### Retrieve the Best ONNX Model\n",
+"\n",
+"Below we select the best pipeline from our iterations. The `get_output` method returns the best run and the fitted model. The Model includes the pipeline and any pre-processing. Overloads on `get_output` allow you to retrieve the best run and fitted model for *any* logged metric or for a particular *iteration*.\n",
+"\n",
+"Set the parameter return_onnx_model=True to retrieve the best ONNX model, instead of the Python model."
+]
+},
+{
+"cell_type": "code",
+"execution_count": null,
+"metadata": {},
+"outputs": [],
+"source": [
+"best_run, onnx_mdl = local_run.get_output(return_onnx_model=True)"
+]
+},
+{
+"cell_type": "markdown",
+"metadata": {},
+"source": [
+"### Save the best ONNX model"
+]
+},
+{
+"cell_type": "code",
+"execution_count": null,
+"metadata": {},
+"outputs": [],
+"source": [
+"from azureml.train.automl._vendor.automl.client.core.common.onnx_convert import OnnxConverter\n",
+"onnx_fl_path = \"./best_model.onnx\"\n",
+"OnnxConverter.save_onnx_model(onnx_mdl, onnx_fl_path)"
+]
+}
+],
+"metadata": {
+"authors": [
+{
+"name": "savitam"
+}
+],
+"kernelspec": {
+"display_name": "Python 3.6",
+"language": "python",
+"name": "python36"
+},
+"language_info": {
+"codemirror_mode": {
+"name": "ipython",
+"version": 3
+},
+"file_extension": ".py",
+"mimetype": "text/x-python",
+"name": "python",
+"nbconvert_exporter": "python",
+"pygments_lexer": "ipython3",
+"version": "3.6.6"
+}
+},
+"nbformat": 4,
+"nbformat_minor": 2
+}
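
The Data cell in the new notebook keeps rows from index 100 onward for training so that the first 100 rows remain available for testing. A minimal sketch of that split on plain Python sequences is shown here; `holdout_split` is a hypothetical helper and the toy rows stand in for `digits.data`, so nothing below depends on scikit-learn or the Azure ML SDK.

```python
from typing import List, Sequence, Tuple


def holdout_split(rows: Sequence, n_test: int) -> Tuple[List, List]:
    """Split rows into (train, test), reserving the first n_test rows for test."""
    test = list(rows[:n_test])    # counterpart of digits.data[:100]
    train = list(rows[n_test:])   # counterpart of digits.data[100:]
    return train, test


# Toy stand-in for the dataset: ten rows, the first three held out.
toy_rows = list(range(10))
train_rows, test_rows = holdout_split(toy_rows, 3)
print(len(train_rows), len(test_rows))
```

The same slice boundaries must be applied to the targets so that features and labels stay aligned.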

how-to-use-azureml/automated-machine-learning/classification-with-whitelisting/auto-ml-classification-with-whitelisting.ipynb

+12-7
@@ -71,11 +71,17 @@
 "import azureml.core\n",
 "from azureml.core.experiment import Experiment\n",
 "from azureml.core.workspace import Workspace\n",
-"try:\n",
-" import tensorflow as tf1\n",
-"except ImportError:\n",
-" from pip._internal import main\n",
-" main(['install', 'tensorflow>=1.10.0,<=1.12.0'])\n",
+"import sys\n",
+"whitelist_models=[\"LightGBM\"]\n",
+"if \"3.7\" != sys.version[0:3]:\n",
+" try:\n",
+" import tensorflow as tf1\n",
+" except ImportError:\n",
+" from pip._internal import main\n",
+" main(['install', 'tensorflow>=1.10.0,<=1.12.0'])\n",
+" logging.getLogger().setLevel(logging.ERROR)\n",
+" whitelist_models=[\"TensorFlowLinearClassifier\", \"TensorFlowDNN\"]\n",
+"\n",
 "from azureml.train.automl import AutoMLConfig"
 ]
 },
@@ -160,12 +166,11 @@
 " primary_metric = 'AUC_weighted',\n",
 " iteration_timeout_minutes = 60,\n",
 " iterations = 10,\n",
-" n_cross_validations = 3,\n",
 " verbosity = logging.INFO,\n",
 " X = X_train, \n",
 " y = y_train,\n",
 " enable_tf=True,\n",
-" whitelist_models=[\"TensorFlowLinearClassifier\", \"TensorFlowDNN\"],\n",
+" whitelist_models=whitelist_models,\n",
 " path = project_folder)"
 ]
 },
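
The whitelisting change above picks the model list from the interpreter version: LightGBM on Python 3.7, the TensorFlow models otherwise. One possible way to express the same decision without slicing the `sys.version` string is sketched below. `choose_whitelist` is a hypothetical helper, not code from the commit, and it leaves out the tensorflow import and install step that the notebook performs.

```python
import sys
from typing import List, Sequence

# Model names copied from the diff above.
TENSORFLOW_MODELS = ["TensorFlowLinearClassifier", "TensorFlowDNN"]
FALLBACK_MODELS = ["LightGBM"]


def choose_whitelist(version_info: Sequence[int] = sys.version_info) -> List[str]:
    """Return the fallback models on Python 3.7 and the TensorFlow models otherwise."""
    # Comparing (major, minor) as integers avoids relying on the first three
    # characters of sys.version, which stop identifying the minor version
    # once it has two digits.
    if tuple(version_info[:2]) == (3, 7):
        return list(FALLBACK_MODELS)
    return list(TENSORFLOW_MODELS)


whitelist_models = choose_whitelist()
print(whitelist_models)
```

The result would then be passed as `whitelist_models=whitelist_models`, as in the second hunk.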

how-to-use-azureml/automated-machine-learning/classification/auto-ml-classification.ipynb

-2
@@ -135,7 +135,6 @@
 "|**primary_metric**|This is the metric that you want to optimize. Classification supports the following primary metrics: <br><i>accuracy</i><br><i>AUC_weighted</i><br><i>average_precision_score_weighted</i><br><i>norm_macro_recall</i><br><i>precision_score_weighted</i>|\n",
 "|**iteration_timeout_minutes**|Time limit in minutes for each iteration.|\n",
 "|**iterations**|Number of iterations. In each iteration AutoML trains a specific pipeline with the data.|\n",
-"|**n_cross_validations**|Number of cross validation splits.|\n",
 "|**X**|(sparse) array-like, shape = [n_samples, n_features]|\n",
 "|**y**|(sparse) array-like, shape = [n_samples, ], Multi-class targets.|\n",
 "|**path**|Relative path to the project folder. AutoML stores configuration files for the experiment under this folder. You can specify a new empty folder.|"
@@ -152,7 +151,6 @@
 " primary_metric = 'AUC_weighted',\n",
 " iteration_timeout_minutes = 60,\n",
 " iterations = 25,\n",
-" n_cross_validations = 3,\n",
 " verbosity = logging.INFO,\n",
 " X = X_train, \n",
 " y = y_train,\n",

how-to-use-azureml/automated-machine-learning/dataprep/auto-ml-dataprep.ipynb

+1-2
@@ -163,8 +163,7 @@
 " \"iterations\" : 2,\n",
 " \"primary_metric\" : 'AUC_weighted',\n",
 " \"preprocess\" : False,\n",
-" \"verbosity\" : logging.INFO,\n",
-" \"n_cross_validations\": 3\n",
+" \"verbosity\" : logging.INFO\n",
 "}"
 ]
 },
