README.md
@@ -12,10 +12,6 @@ pip install azureml-sdk
Read more detailed instructions on [how to set up your environment](./NBSETUP.md) using Azure Notebook service, your own Jupyter notebook server, or Docker.
## How to navigate and use the example notebooks?
This [index](https://github.com/Azure/MachineLearningNotebooks/blob/master/index.md) should assist in navigating the Azure Machine Learning notebook samples and encourage efficient retrieval of topics and content.
If you are using an Azure Machine Learning Notebook VM, you are all set. Otherwise, you should always run the [Configuration](./configuration.ipynb) notebook first when setting up a notebook library on a new machine or in a new environment. It configures your notebook library to connect to an Azure Machine Learning workspace, and sets up your workspace and compute to be used by many of the other examples.
"The [RAPIDS](https://www.developer.nvidia.com/rapids) suite of software libraries from NVIDIA enables the execution of end-to-end data science and analytics pipelines entirely on GPUs. In many machine learning projects, a significant portion of the model training time is spent in setting up the data; this stage of the process is known as Extraction, Transformation and Loading, or ETL. By using the DataFrame API for ETL\u00c3\u201a\u00c2\u00a0and GPU-capable ML algorithms in RAPIDS, data preparation and training models can be done in GPU-accelerated end-to-end pipelines without incurring serialization costs between the pipeline stages. This notebook demonstrates how to use NVIDIA RAPIDS to prepare data and train model\u00c2\u00a0in Azure.\n",
30
+
"The [RAPIDS](https://www.developer.nvidia.com/rapids) suite of software libraries from NVIDIA enables the execution of end-to-end data science and analytics pipelines entirely on GPUs. In many machine learning projects, a significant portion of the model training time is spent in setting up the data; this stage of the process is known as Extraction, Transformation and Loading, or ETL. By using the DataFrame API for ETL\u00c2\u00a0and GPU-capable ML algorithms in RAPIDS, data preparation and training models can be done in GPU-accelerated end-to-end pipelines without incurring serialization costs between the pipeline stages. This notebook demonstrates how to use NVIDIA RAPIDS to prepare data and train model\u00c3\u201a\u00c2\u00a0in Azure.\n",
24
31
"\n",
"In this notebook, we will do the following:\n",
"\n",
@@ -119,8 +126,10 @@
"outputs": [],
"source": [
"ws = Workspace.from_config()\n",
+"\n",
"# if a locally-saved configuration file for the workspace is not available, use the following to load workspace\n",
" if gpu_cluster and type(gpu_cluster) is AmlCompute:\n",
-" print('found compute target. just use it. ' + gpu_cluster_name)\n",
+" print('Found compute target. Will use {0} '.format(gpu_cluster_name))\n",
"else:\n",
" print(\"creating new cluster\")\n",
" # vm_size parameter below could be modified to one of the RAPIDS-supported VM types\n",
@@ -183,7 +192,7 @@
"cell_type": "markdown",
"metadata": {},
"source": [
-"The _process_data.py_ script used in the step below is a slightly modified implementation of [RAPIDS E2E example](https://github.com/rapidsai/notebooks/blob/master/mortgage/E2E.ipynb)."
+"The _process_data.py_ script used in the step below is a slightly modified implementation of the [RAPIDS Mortgage E2E example](https://github.com/rapidsai/notebooks-contrib/blob/master/intermediate_notebooks/E2E/mortgage/mortgage_e2e.ipynb)."
]
},
{
@@ -194,10 +203,7 @@
"source": [
"# copy process_data.py into the script folder\n",
"<font color='red'>Important</font>: Python package progressbar2 is necessary to run the following cell. If it is not available in your environment where this notebook is running, please install it."
-]
-},
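One way to satisfy that requirement from inside the notebook (a sketch; it installs into the interpreter that runs the kernel) is:

```python
import importlib.util
import subprocess
import sys

# Install progressbar2 into the current kernel's environment only if it is missing
if importlib.util.find_spec("progressbar") is None:
    subprocess.check_call([sys.executable, "-m", "pip", "install", "progressbar2"])
```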
{
"cell_type": "code",
"execution_count": null,
@@ -237,7 +236,6 @@
"import tarfile\n",
"import hashlib\n",
"from urllib.request import urlretrieve\n",
-"from progressbar import ProgressBar\n",
"\n",
"def validate_downloaded_data(path):\n",
" if(os.path.isdir(path) and os.path.exists(path + '//names.csv')) :\n",
"The following code shows how to use an existing image from [Docker Hub](https://hub.docker.com/r/rapidsai/rapidsai/) that has a prebuilt conda environment named 'rapids' when creating a RunConfiguration. Note that this conda environment does not include azureml-defaults package that is required for using AML functionality like metrics tracking, model management etc. This package is automatically installed when you use 'Specify package dependencies' option and that is why it is the recommended option to create RunConfiguraiton in AML."
361
+
"The following code shows how to install RAPIDS using conda. The `rapids.yml` file contains the list of packages necessary to run this tutorial. **NOTE:** Initial build of the image might take up to 20 minutes as the service needs to build and cache the new image; once the image is built the subequent runs use the cached image and the overhead is minimal."