Commit 44a7481

update samples from Release-141 as a part of 1.0.57 SDK release
1 parent 8f418b2 commit 44a7481

158 files changed (+32388 -612 lines changed)

configuration.ipynb

+1 -1
@@ -103,7 +103,7 @@
  "source": [
  "import azureml.core\n",
  "\n",
- "print(\"This notebook was created using version 1.0.55 of the Azure ML SDK\")\n",
+ "print(\"This notebook was created using version 1.0.57 of the Azure ML SDK\")\n",
  "print(\"You are currently using version\", azureml.core.VERSION, \"of the Azure ML SDK\")"
  ]
  },
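
The cell touched by this hunk prints the SDK version the notebook was authored with next to the installed one and leaves the comparison to the reader. A minimal sketch of making that check explicit, assuming plain dotted numeric version strings; `parse_version` and `is_older` are hypothetical helpers that are not part of the Azure ML SDK, and the `installed` value stands in for `azureml.core.VERSION`.

```python
def parse_version(version):
    """Split a dotted numeric version such as '1.0.57' into a tuple of ints."""
    return tuple(int(part) for part in version.split("."))


def is_older(installed, required):
    """Return True when `installed` is a lower version than `required`."""
    return parse_version(installed) < parse_version(required)


NOTEBOOK_VERSION = "1.0.57"  # version named in the updated notebook cell
installed = "1.0.55"         # assumption: stand-in for azureml.core.VERSION

if is_older(installed, NOTEBOOK_VERSION):
    print("Installed SDK", installed, "is older than", NOTEBOOK_VERSION)
```

Comparing integer tuples rather than raw strings matters here: as strings, `"1.0.9"` sorts after `"1.0.57"`, which would hide an outdated install.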

how-to-use-azureml/README.md

+1 -1
@@ -8,7 +8,7 @@ As a pre-requisite, run the [configuration Notebook](../configuration.ipynb) not
  * [train-on-local](./training/train-on-local): Learn how to submit a run to local computer and use Azure ML managed run configuration.
  * [train-on-amlcompute](./training/train-on-amlcompute): Use a 1-n node Azure ML managed compute cluster for remote runs on Azure CPU or GPU infrastructure.
  * [train-on-remote-vm](./training/train-on-remote-vm): Use Data Science Virtual Machine as a target for remote runs.
- * [logging-api](./training/logging-api): Learn about the details of logging metrics to run history.
+ * [logging-api](./track-and-monitor-experiments/logging-api): Learn about the details of logging metrics to run history.
  * [register-model-create-image-deploy-service](./deployment/register-model-create-image-deploy-service): Learn about the details of model management.
  * [production-deploy-to-aks](./deployment/production-deploy-to-aks) Deploy a model to production at scale on Azure Kubernetes Service.
  * [enable-data-collection-for-models-in-aks](./deployment/enable-data-collection-for-models-in-aks) Learn about data collection APIs for deployed model.

how-to-use-azureml/automated-machine-learning/README.md

+10 -10
@@ -155,11 +155,11 @@ jupyter notebook
  - [auto-ml-subsampling-local.ipynb](subsampling/auto-ml-subsampling-local.ipynb)
  - How to enable subsampling

- - [auto-ml-dataprep.ipynb](dataprep/auto-ml-dataprep.ipynb)
- - Using DataPrep for reading data
+ - [auto-ml-dataset.ipynb](dataprep/auto-ml-dataset.ipynb)
+ - Using Dataset for reading data

- - [auto-ml-dataprep-remote-execution.ipynb](dataprep-remote-execution/auto-ml-dataprep-remote-execution.ipynb)
- - Using DataPrep for reading data with remote execution
+ - [auto-ml-dataset-remote-execution.ipynb](dataprep-remote-execution/auto-ml-dataset-remote-execution.ipynb)
+ - Using Dataset for reading data with remote execution

  - [auto-ml-classification-with-whitelisting.ipynb](classification-with-whitelisting/auto-ml-classification-with-whitelisting.ipynb)
  - Dataset: scikit learn's [digit dataset](http://scikit-learn.org/stable/modules/generated/sklearn.datasets.load_digits.html#sklearn.datasets.load_digits)
@@ -229,7 +229,7 @@ The main code of the file must be indented so that it is under this condition.
  2. Check that you have conda 64-bit installed rather than 32-bit. You can check this with the command `conda info`. The `platform` should be `win-64` for Windows or `osx-64` for Mac.
  3. Check that you have conda 4.4.10 or later. You can check the version with the command `conda -V`. If you have a previous version installed, you can update it using the command: `conda update conda`.
  4. On Linux, if the error is `gcc: error trying to exec 'cc1plus': execvp: No such file or directory`, install build essentials using the command `sudo apt-get install build-essential`.
- 5. Pass a new name as the first parameter to automl_setup so that it creates a new conda environment. You can view existing conda environments using `conda env list` and remove them with `conda env remove -n <environmentname>`.
+ 5. Pass a new name as the first parameter to automl_setup so that it creates a new conda environment. You can view existing conda environments using `conda env list` and remove them with `conda env remove -n <environmentname>`.

  ## automl_setup_linux.sh fails
  If automl_setup_linux.sh fails on Ubuntu Linux with the error: `unable to execute 'gcc': No such file or directory`
@@ -264,13 +264,13 @@ Some Windows environments see an error loading numpy with the latest Python vers
  Check the tensorflow version in the automated ml conda environment. Supported versions are < 1.13. Uninstall tensorflow from the environment if version is >= 1.13
  You may check the version of tensorflow and uninstall as follows
  1) start a command shell, activate conda environment where automated ml packages are installed
- 2) enter `pip freeze` and look for `tensorflow` , if found, the version listed should be < 1.13
- 3) If the listed version is a not a supported version, `pip uninstall tensorflow` in the command shell and enter y for confirmation.
+ 2) enter `pip freeze` and look for `tensorflow` , if found, the version listed should be < 1.13
+ 3) If the listed version is a not a supported version, `pip uninstall tensorflow` in the command shell and enter y for confirmation.

- ## Remote run: DsvmCompute.create fails
+ ## Remote run: DsvmCompute.create fails
  There are several reasons why the DsvmCompute.create can fail. The reason is usually in the error message but you have to look at the end of the error message for the detailed reason. Some common reasons are:
  1) `Compute name is invalid, it should start with a letter, be between 2 and 16 character, and only include letters (a-zA-Z), numbers (0-9) and \'-\'.` Note that underscore is not allowed in the name.
- 2) `The requested VM size xxxxx is not available in the current region.` You can select a different region or vm_size.
+ 2) `The requested VM size xxxxx is not available in the current region.` You can select a different region or vm_size.

  ## Remote run: Unable to establish SSH connection
  Automated ML uses the SSH protocol to communicate with remote DSVMs. This defaults to port 22. Possible causes for this error are:
@@ -296,4 +296,4 @@ To resolve this issue, allocate a DSVM with more memory or reduce the value spec

  ## Remote run: Iterations show as "Not Responding" in the RunDetails widget.
  This can be caused by too many concurrent iterations for a remote DSVM. Each concurrent iteration usually takes 100% of a core when it is running. Some iterations can use multiple cores. So, the max_concurrent_iterations setting should always be less than the number of cores of the DSVM.
- To resolve this issue, try reducing the value specified for the max_concurrent_iterations setting.
+ To resolve this issue, try reducing the value specified for the max_concurrent_iterations setting.
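
The README text in this hunk says `max_concurrent_iterations` should always be less than the number of cores on the DSVM. A minimal sketch of applying that rule before submitting a run; `cap_concurrent_iterations` is a hypothetical helper, not an Azure ML API, and the core count is assumed to be known to the caller.

```python
def cap_concurrent_iterations(requested, cores):
    """Clamp a requested concurrency so it stays below the core count.

    On a single-core machine the rule cannot be met, so the result
    falls back to 1, the smallest usable value.
    """
    if requested < 1 or cores < 1:
        raise ValueError("requested and cores must both be at least 1")
    return max(1, min(requested, cores - 1))


# Example: an 8-way request on a 4-core VM is reduced to 3.
safe_value = cap_concurrent_iterations(requested=8, cores=4)
print(safe_value)
```

The returned value could then be passed as the `max_concurrent_iterations` setting. Because the README notes that some iterations use several cores, a lower value may still be needed in practice.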

how-to-use-azureml/automated-machine-learning/automl_env.yml

+4 -1
@@ -13,10 +13,13 @@ dependencies:
  - scikit-learn>=0.19.0,<=0.20.3
  - pandas>=0.22.0,<=0.23.4
  - py-xgboost<=0.80
+ - pyarrow>=0.11.0

  - pip:
  # Required packages for AzureML execution, history, and data preparation.
- - azureml-sdk[automl,explain]
+ - azureml-defaults
+ - azureml-train-automl
  - azureml-widgets
+ - azureml-explain-model
  - pandas_ml

how-to-use-azureml/automated-machine-learning/automl_env_mac.yml

+4 -1
@@ -14,10 +14,13 @@ dependencies:
  - scikit-learn>=0.19.0,<=0.20.3
  - pandas>=0.22.0,<0.23.0
  - py-xgboost<=0.80
+ - pyarrow>=0.11.0

  - pip:
  # Required packages for AzureML execution, history, and data preparation.
- - azureml-sdk[automl,explain]
+ - azureml-defaults
+ - azureml-train-automl
  - azureml-widgets
+ - azureml-explain-model
  - pandas_ml

how-to-use-azureml/automated-machine-learning/classification-bank-marketing/auto-ml-classification-bank-marketing.ipynb

+20 -33
@@ -69,22 +69,17 @@
  "metadata": {},
  "outputs": [],
  "source": [
- "import json\n",
  "import logging\n",
  "\n",
  "from matplotlib import pyplot as plt\n",
- "import numpy as np\n",
  "import pandas as pd\n",
  "import os\n",
- "from sklearn import datasets\n",
- "import azureml.dataprep as dprep\n",
- "from sklearn.model_selection import train_test_split\n",
  "\n",
  "import azureml.core\n",
  "from azureml.core.experiment import Experiment\n",
  "from azureml.core.workspace import Workspace\n",
- "from azureml.train.automl import AutoMLConfig\n",
- "from azureml.train.automl.run import AutoMLRun"
+ "from azureml.core.dataset import Dataset\n",
+ "from azureml.train.automl import AutoMLConfig"
  ]
  },
  {
@@ -155,11 +150,12 @@
  " # Create the cluster.\n",
  " compute_target = ComputeTarget.create(ws, amlcompute_cluster_name, provisioning_config)\n",
  " \n",
- " # Can poll for a minimum number of nodes and for a specific timeout.\n",
- " # If no min_node_count is provided, it will use the scale settings for the cluster.\n",
- " compute_target.wait_for_completion(show_output = True, min_node_count = None, timeout_in_minutes = 20)\n",
+ "print('Checking cluster status...')\n",
+ "# Can poll for a minimum number of nodes and for a specific timeout.\n",
+ "# If no min_node_count is provided, it will use the scale settings for the cluster.\n",
+ "compute_target.wait_for_completion(show_output = True, min_node_count = None, timeout_in_minutes = 20)\n",
  " \n",
- " # For a more detailed view of current AmlCompute status, use get_status()."
+ "# For a more detailed view of current AmlCompute status, use get_status()."
  ]
  },
  {
@@ -200,11 +196,8 @@
  "# Set compute target to AmlCompute\n",
  "conda_run_config.target = compute_target\n",
  "conda_run_config.environment.docker.enabled = True\n",
- "conda_run_config.environment.docker.base_image = azureml.core.runconfig.DEFAULT_CPU_IMAGE\n",
- "\n",
- "dprep_dependency = 'azureml-dataprep==' + pkg_resources.get_distribution(\"azureml-dataprep\").version\n",
  "\n",
- "cd = CondaDependencies.create(pip_packages=['azureml-sdk[automl]', dprep_dependency], conda_packages=['numpy','py-xgboost<=0.80'])\n",
+ "cd = CondaDependencies.create(conda_packages=['numpy','py-xgboost<=0.80'])\n",
  "conda_run_config.environment.python.conda_dependencies = cd"
  ]
  },
@@ -224,11 +217,10 @@
  "outputs": [],
  "source": [
  "data = \"https://automlsamplenotebookdata.blob.core.windows.net/automl-sample-notebook-data/bankmarketing_train.csv\"\n",
- "dflow = dprep.read_csv(data, infer_column_types=True)\n",
- "dflow.get_profile()\n",
- "X_train = dflow.drop_columns(columns=['y'])\n",
- "y_train = dflow.keep_columns(columns=['y'], validate_column_exists=True)\n",
- "dflow.head()"
+ "dataset = Dataset.Tabular.from_delimited_files(data)\n",
+ "X_train = dataset.drop_columns(columns=['y'])\n",
+ "y_train = dataset.keep_columns(columns=['y'], validate=True)\n",
+ "dataset.take(5).to_pandas_dataframe()"
  ]
  },
  {
@@ -406,7 +398,7 @@
  "def run(rawdata):\n",
  " try:\n",
  " data = json.loads(rawdata)['data']\n",
- " data = numpy.array(data)\n",
+ " data = np.array(data)\n",
  " result = model.predict(data)\n",
  " except Exception as e:\n",
  " result = str(e)\n",
@@ -443,7 +435,7 @@
  "metadata": {},
  "outputs": [],
  "source": [
- "for p in ['azureml-train-automl', 'azureml-sdk', 'azureml-core']:\n",
+ "for p in ['azureml-train-automl', 'azureml-core']:\n",
  " print('{}\\t{}'.format(p, dependencies[p]))"
  ]
  },
@@ -453,10 +445,8 @@
  "metadata": {},
  "outputs": [],
  "source": [
- "from azureml.core.conda_dependencies import CondaDependencies\n",
- "\n",
  "myenv = CondaDependencies.create(conda_packages=['numpy','scikit-learn','py-xgboost<=0.80'],\n",
- " pip_packages=['azureml-sdk[automl]'])\n",
+ " pip_packages=['azureml-train-automl'])\n",
  "\n",
  "conda_env_file_name = 'myenv.yml'\n",
  "myenv.save_to_file('.', conda_env_file_name)"
@@ -476,7 +466,7 @@
  " content = cefr.read()\n",
  "\n",
  "with open(conda_env_file_name, 'w') as cefw:\n",
- " cefw.write(content.replace(azureml.core.VERSION, dependencies['azureml-sdk']))\n",
+ " cefw.write(content.replace(azureml.core.VERSION, dependencies['azureml-train-automl']))\n",
  "\n",
  "# Substitute the actual model id in the script file.\n",
  "\n",
@@ -618,8 +608,6 @@
  "outputs": [],
  "source": [
  "# Load the bank marketing datasets.\n",
- "from sklearn.datasets import load_diabetes\n",
- "from sklearn.model_selection import train_test_split\n",
  "from numpy import array"
  ]
  },
@@ -630,11 +618,10 @@
  "outputs": [],
  "source": [
  "data = \"https://automlsamplenotebookdata.blob.core.windows.net/automl-sample-notebook-data/bankmarketing_validate.csv\"\n",
- "dflow = dprep.read_csv(data, infer_column_types=True)\n",
- "dflow.get_profile()\n",
- "X_test = dflow.drop_columns(columns=['y'])\n",
- "y_test = dflow.keep_columns(columns=['y'], validate_column_exists=True)\n",
- "dflow.head()"
+ "dataset = Dataset.Tabular.from_delimited_files(data)\n",
+ "X_test = dataset.drop_columns(columns=['y'])\n",
+ "y_test = dataset.keep_columns(columns=['y'], validate=True)\n",
+ "dataset.take(5).to_pandas_dataframe()"
  ]
  },
  {
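
One of the hunks in this notebook rewrites the saved conda environment file by replacing the installed SDK version string with the version recorded for `azureml-train-automl`. A minimal sketch of that substitution on plain text, with no Azure ML imports; the file contents, the version numbers, and the `pin_training_version` helper are all hypothetical and chosen only for illustration.

```python
def pin_training_version(env_text, installed_version, training_version):
    """Swap every occurrence of the installed version for the training one."""
    return env_text.replace(installed_version, training_version)


# Assumption: a made-up environment file pinned to the locally installed SDK.
env_text = (
    "dependencies:\n"
    "- numpy\n"
    "- pip:\n"
    "  - azureml-train-automl==1.0.57\n"
)

# Assumption: the model was trained with an older SDK than the local one.
pinned = pin_training_version(env_text, "1.0.57", "1.0.55")
print(pinned)
```

`str.replace` rewrites every match, so an unrelated package that happens to share the same version string would be changed as well; a stricter variant could match on the package name too.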

how-to-use-azureml/automated-machine-learning/classification-bank-marketing/auto-ml-classification-bank-marketing.yml

+2
@@ -2,6 +2,8 @@ name: auto-ml-classification-bank-marketing
  dependencies:
  - pip:
  - azureml-sdk
+ - azureml-defaults
+ - azureml-explain-model
  - azureml-train-automl
  - azureml-widgets
  - matplotlib

how-to-use-azureml/automated-machine-learning/classification-credit-card-fraud/auto-ml-classification-credit-card-fraud.ipynb

+16 -21
@@ -74,14 +74,12 @@
  "from matplotlib import pyplot as plt\n",
  "import pandas as pd\n",
  "import os\n",
- "from sklearn.model_selection import train_test_split\n",
- "import azureml.dataprep as dprep\n",
  "\n",
  "import azureml.core\n",
  "from azureml.core.experiment import Experiment\n",
  "from azureml.core.workspace import Workspace\n",
- "from azureml.train.automl import AutoMLConfig\n",
- "from azureml.train.automl.run import AutoMLRun"
+ "from azureml.core.dataset import Dataset\n",
+ "from azureml.train.automl import AutoMLConfig"
  ]
  },
  {
@@ -152,11 +150,12 @@
  " # Create the cluster.\n",
  " compute_target = ComputeTarget.create(ws, amlcompute_cluster_name, provisioning_config)\n",
  " \n",
- " # Can poll for a minimum number of nodes and for a specific timeout.\n",
- " # If no min_node_count is provided, it will use the scale settings for the cluster.\n",
- " compute_target.wait_for_completion(show_output = True, min_node_count = None, timeout_in_minutes = 20)\n",
- " \n",
- " # For a more detailed view of current AmlCompute status, use get_status()."
+ "print('Checking cluster status...')\n",
+ "# Can poll for a minimum number of nodes and for a specific timeout.\n",
+ "# If no min_node_count is provided, it will use the scale settings for the cluster.\n",
+ "compute_target.wait_for_completion(show_output = True, min_node_count = None, timeout_in_minutes = 20)\n",
+ "\n",
+ "# For a more detailed view of current AmlCompute status, use get_status()."
  ]
  },
  {
@@ -197,11 +196,8 @@
  "# Set compute target to AmlCompute\n",
  "conda_run_config.target = compute_target\n",
  "conda_run_config.environment.docker.enabled = True\n",
- "conda_run_config.environment.docker.base_image = azureml.core.runconfig.DEFAULT_CPU_IMAGE\n",
- "\n",
- "dprep_dependency = 'azureml-dataprep==' + pkg_resources.get_distribution(\"azureml-dataprep\").version\n",
  "\n",
- "cd = CondaDependencies.create(pip_packages=['azureml-sdk[automl]', dprep_dependency], conda_packages=['numpy','py-xgboost<=0.80'])\n",
+ "cd = CondaDependencies.create(conda_packages=['numpy','py-xgboost<=0.80'])\n",
  "conda_run_config.environment.python.conda_dependencies = cd"
  ]
  },
@@ -211,7 +207,7 @@
  "source": [
  "### Load Data\n",
  "\n",
- "Here create the script to be run in azure compute for loading the data, load the credit card dataset into cards and store the Class column (y) in the y variable and store the remaining data in the x variable. Next split the data using train_test_split and return X_train and y_train for training the model."
+ "Here create the script to be run in azure compute for loading the data, load the credit card dataset into cards and store the Class column (y) in the y variable and store the remaining data in the x variable. Next split the data using random_split and return X_train and y_train for training the model."
  ]
  },
  {
@@ -221,10 +217,9 @@
  "outputs": [],
  "source": [
  "data = \"https://automlsamplenotebookdata.blob.core.windows.net/automl-sample-notebook-data/creditcard.csv\"\n",
- "dflow = dprep.read_csv(data, infer_column_types=True)\n",
- "dflow.get_profile()\n",
- "X = dflow.drop_columns(columns=['Class'])\n",
- "y = dflow.keep_columns(columns=['Class'], validate_column_exists=True)\n",
+ "dataset = Dataset.Tabular.from_delimited_files(data)\n",
+ "X = dataset.drop_columns(columns=['Class'])\n",
+ "y = dataset.keep_columns(columns=['Class'], validate=True)\n",
  "X_train, X_test = X.random_split(percentage=0.8, seed=223)\n",
  "y_train, y_test = y.random_split(percentage=0.8, seed=223)"
  ]
@@ -447,7 +442,7 @@
  "metadata": {},
  "outputs": [],
  "source": [
- "for p in ['azureml-train-automl', 'azureml-sdk', 'azureml-core']:\n",
+ "for p in ['azureml-train-automl', 'azureml-core']:\n",
  " print('{}\\t{}'.format(p, dependencies[p]))"
  ]
  },
@@ -458,7 +453,7 @@
  "outputs": [],
  "source": [
  "myenv = CondaDependencies.create(conda_packages=['numpy','scikit-learn','py-xgboost<=0.80'],\n",
- " pip_packages=['azureml-sdk[automl]'])\n",
+ " pip_packages=['azureml-train-automl'])\n",
  "\n",
  "conda_env_file_name = 'myenv.yml'\n",
  "myenv.save_to_file('.', conda_env_file_name)"
@@ -478,7 +473,7 @@
  " content = cefr.read()\n",
  "\n",
  "with open(conda_env_file_name, 'w') as cefw:\n",
- " cefw.write(content.replace(azureml.core.VERSION, dependencies['azureml-sdk']))\n",
+ " cefw.write(content.replace(azureml.core.VERSION, dependencies['azureml-train-automl']))\n",
  "\n",
  "# Substitute the actual model id in the script file.\n",
  "\n",

how-to-use-azureml/automated-machine-learning/classification-credit-card-fraud/auto-ml-classification-credit-card-fraud.yml

+2
@@ -2,6 +2,8 @@ name: auto-ml-classification-credit-card-fraud
  dependencies:
  - pip:
  - azureml-sdk
+ - azureml-defaults
+ - azureml-explain-model
  - azureml-train-automl
  - azureml-widgets
  - matplotlib
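
The credit-card fraud notebook in this commit splits the feature columns and the label column separately, passing the same `percentage=0.8` and `seed=223` to both `random_split` calls. A minimal sketch of why the shared seed keeps the two splits row-aligned; the `random_split` function below is a pure-Python stand-in written for illustration and is an assumption, not how the Azure ML Dataset method is implemented.

```python
import random


def random_split(rows, percentage, seed):
    """Send each row to the first part with probability `percentage`.

    The decision sequence depends only on the seed and the row order,
    not on the row contents.
    """
    rng = random.Random(seed)
    first, second = [], []
    for row in rows:
        (first if rng.random() < percentage else second).append(row)
    return first, second


# Made-up data: features carry an id, labels are derived from that id.
X = [(i, i * 0.5) for i in range(100)]
y = [i % 2 for i in range(100)]

X_train, X_test = random_split(X, percentage=0.8, seed=223)
y_train, y_test = random_split(y, percentage=0.8, seed=223)

# Each label still sits next to the features it belongs to.
aligned = all(features[0] % 2 == label for features, label in zip(X_train, y_train))
print(len(X_train), len(y_train), aligned)
```

With different seeds the two calls would pick different rows, and features would be paired with labels from other records.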

how-to-use-azureml/automated-machine-learning/classification-with-deployment/auto-ml-classification-with-deployment.ipynb

+3 -3
@@ -297,7 +297,7 @@
  "metadata": {},
  "outputs": [],
  "source": [
- "for p in ['azureml-train-automl', 'azureml-sdk', 'azureml-core']:\n",
+ "for p in ['azureml-train-automl', 'azureml-core']:\n",
  " print('{}\\t{}'.format(p, dependencies[p]))"
  ]
  },
@@ -310,7 +310,7 @@
  "from azureml.core.conda_dependencies import CondaDependencies\n",
  "\n",
  "myenv = CondaDependencies.create(conda_packages=['numpy','scikit-learn','py-xgboost<=0.80'],\n",
- " pip_packages=['azureml-sdk[automl]'])\n",
+ " pip_packages=['azureml-train-automl'])\n",
  "\n",
  "conda_env_file_name = 'myenv.yml'\n",
  "myenv.save_to_file('.', conda_env_file_name)"
@@ -330,7 +330,7 @@
  " content = cefr.read()\n",
  "\n",
  "with open(conda_env_file_name, 'w') as cefw:\n",
- " cefw.write(content.replace(azureml.core.VERSION, dependencies['azureml-sdk']))\n",
+ " cefw.write(content.replace(azureml.core.VERSION, dependencies['azureml-train-automl']))\n",
  "\n",
  "# Substitute the actual model id in the script file.\n",
  "\n",
