
Commit 659fb7a

Merge pull request Azure#619 from Azure/release_update/Release-153
update samples from Release-153 as a part of 1.0.69 SDK release
2 parents 5fcf488 + 2e404cf commit 659fb7a

139 files changed: +1250 -25949 lines


README.md (-4)

@@ -12,10 +12,6 @@ pip install azureml-sdk
 Read more detailed instructions on [how to set up your environment](./NBSETUP.md) using Azure Notebook service, your own Jupyter notebook server, or Docker.
 
 ## How to navigate and use the example notebooks?
-
-This [index](https://github.com/Azure/MachineLearningNotebooks/blob/master/index.md) should assist in navigating the Azure Machine Learning notebook samples and encourage efficient retrieval of topics and content.
-
-
 If you are using an Azure Machine Learning Notebook VM, you are all set. Otherwise, you should always run the [Configuration](./configuration.ipynb) notebook first when setting up a notebook library on a new machine or in a new environment. It configures your notebook library to connect to an Azure Machine Learning workspace, and sets up your workspace and compute to be used by many of the other examples.
 
 If you want to...

configuration.ipynb (+1 -1)

@@ -103,7 +103,7 @@
 "source": [
 "import azureml.core\n",
 "\n",
-"print(\"This notebook was created using version 1.0.65 of the Azure ML SDK\")\n",
+"print(\"This notebook was created using version 1.0.69 of the Azure ML SDK\")\n",
 "print(\"You are currently using version\", azureml.core.VERSION, \"of the Azure ML SDK\")"
 ]
 },
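For context, the cell this hunk touches is the notebook's SDK version check. Reconstructed from the diff above, the updated cell is plain Python (it assumes only that azureml-core is installed):

```python
# SDK version check cell from configuration.ipynb, after this commit
import azureml.core

print("This notebook was created using version 1.0.69 of the Azure ML SDK")
print("You are currently using version", azureml.core.VERSION, "of the Azure ML SDK")
```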

contrib/RAPIDS/azure-ml-with-nvidia-rapids.ipynb (+32 -37)
@@ -9,6 +9,13 @@
 "Licensed under the MIT License."
 ]
 },
+{
+"cell_type": "markdown",
+"metadata": {},
+"source": [
+"![Impressions](https://PixelServer20190423114238.azurewebsites.net/api/impressions/MachineLearningNotebooks/contrib/RAPIDS/azure-ml-with-nvidia-rapids/azure-ml-with-nvidia-rapids.png)"
+]
+},
 {
 "cell_type": "markdown",
 "metadata": {},
@@ -20,7 +27,7 @@
 "cell_type": "markdown",
 "metadata": {},
 "source": [
-"The [RAPIDS](https://www.developer.nvidia.com/rapids) suite of software libraries from NVIDIA enables the execution of end-to-end data science and analytics pipelines entirely on GPUs. In many machine learning projects, a significant portion of the model training time is spent in setting up the data; this stage of the process is known as Extraction, Transformation and Loading, or ETL. By using the DataFrame API for ETL\u00c3\u201a\u00c2\u00a0and GPU-capable ML algorithms in RAPIDS, data preparation and training models can be done in GPU-accelerated end-to-end pipelines without incurring serialization costs between the pipeline stages. This notebook demonstrates how to use NVIDIA RAPIDS to prepare data and train model\u00c2\u00a0in Azure.\n",
+"The [RAPIDS](https://www.developer.nvidia.com/rapids) suite of software libraries from NVIDIA enables the execution of end-to-end data science and analytics pipelines entirely on GPUs. In many machine learning projects, a significant portion of the model training time is spent in setting up the data; this stage of the process is known as Extraction, Transformation and Loading, or ETL. By using the DataFrame API for ETL\u00c2\u00a0and GPU-capable ML algorithms in RAPIDS, data preparation and training models can be done in GPU-accelerated end-to-end pipelines without incurring serialization costs between the pipeline stages. This notebook demonstrates how to use NVIDIA RAPIDS to prepare data and train model\u00c3\u201a\u00c2\u00a0in Azure.\n",
 " \n",
 "In this notebook, we will do the following:\n",
 " \n",
@@ -119,8 +126,10 @@
 "outputs": [],
 "source": [
 "ws = Workspace.from_config()\n",
+"\n",
 "# if a locally-saved configuration file for the workspace is not available, use the following to load workspace\n",
 "# ws = Workspace(subscription_id=subscription_id, resource_group=resource_group, workspace_name=workspace_name)\n",
+"\n",
 "print('Workspace name: ' + ws.name, \n",
 " 'Azure region: ' + ws.location, \n",
 " 'Subscription id: ' + ws.subscription_id, \n",
@@ -161,7 +170,7 @@
 "if gpu_cluster_name in ws.compute_targets:\n",
 " gpu_cluster = ws.compute_targets[gpu_cluster_name]\n",
 " if gpu_cluster and type(gpu_cluster) is AmlCompute:\n",
-" print('found compute target. just use it. ' + gpu_cluster_name)\n",
+" print('Found compute target. Will use {0} '.format(gpu_cluster_name))\n",
 "else:\n",
 " print(\"creating new cluster\")\n",
 " # vm_size parameter below could be modified to one of the RAPIDS-supported VM types\n",
@@ -183,7 +192,7 @@
 "cell_type": "markdown",
 "metadata": {},
 "source": [
-"The _process_data.py_ script used in the step below is a slightly modified implementation of [RAPIDS E2E example](https://github.com/rapidsai/notebooks/blob/master/mortgage/E2E.ipynb)."
+"The _process_data.py_ script used in the step below is a slightly modified implementation of [RAPIDS Mortgage E2E example](https://github.com/rapidsai/notebooks-contrib/blob/master/intermediate_notebooks/E2E/mortgage/mortgage_e2e.ipynb)."
 ]
 },
 {
@@ -194,10 +203,7 @@
 "source": [
 "# copy process_data.py into the script folder\n",
 "import shutil\n",
-"shutil.copy('./process_data.py', os.path.join(scripts_folder, 'process_data.py'))\n",
-"\n",
-"with open(os.path.join(scripts_folder, './process_data.py'), 'r') as process_data_script:\n",
-" print(process_data_script.read())"
+"shutil.copy('./process_data.py', os.path.join(scripts_folder, 'process_data.py'))"
 ]
 },
 {
@@ -221,13 +227,6 @@
 "### Downloading Data"
 ]
 },
-{
-"cell_type": "markdown",
-"metadata": {},
-"source": [
-"<font color='red'>Important</font>: Python package progressbar2 is necessary to run the following cell. If it is not available in your environment where this notebook is running, please install it."
-]
-},
 {
 "cell_type": "code",
 "execution_count": null,
@@ -237,7 +236,6 @@
 "import tarfile\n",
 "import hashlib\n",
 "from urllib.request import urlretrieve\n",
-"from progressbar import ProgressBar\n",
 "\n",
 "def validate_downloaded_data(path):\n",
 " if(os.path.isdir(path) and os.path.exists(path + '//names.csv')) :\n",
@@ -267,7 +265,7 @@
 " url_format = 'http://rapidsai-data.s3-website.us-east-2.amazonaws.com/notebook-mortgage-data/{0}.tgz'\n",
 " url = url_format.format(fileroot)\n",
 " print(\"...Downloading file :{0}\".format(filename))\n",
-" urlretrieve(url, filename,show_progress)\n",
+" urlretrieve(url, filename)\n",
 " pbar.finish()\n",
 " print(\"...File :{0} finished downloading\".format(filename))\n",
 " else:\n",
@@ -282,9 +280,7 @@
 " so_far = 0\n",
 " for member_info in members:\n",
 " tar.extract(member_info,path=path)\n",
-" show_progress(so_far, 1, numFiles)\n",
 " so_far += 1\n",
-" pbar.finish()\n",
 " print(\"...All {0} files have been decompressed\".format(numFiles))\n",
 " tar.close()"
 ]
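With progressbar gone, urlretrieve now runs without progress output. If progress reporting is still wanted, the standard library's reporthook parameter is enough; this is a sketch independent of the commit (url and filename come from the surrounding cell):

```python
from urllib.request import urlretrieve

def report(block_num, block_size, total_size):
    # urlretrieve calls this hook with (blocks so far, block size, total bytes)
    if total_size > 0:
        pct = min(100.0, block_num * block_size * 100.0 / total_size)
        print('\r...{0:.1f}% downloaded'.format(pct), end='')

# urlretrieve(url, filename, reporthook=report)  # url/filename as in the cell above
```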
@@ -324,7 +320,9 @@
 "\n",
 "# download and uncompress data in a local directory before uploading to data store\n",
 "# directory specified in src_dir parameter below should have the acq, perf directories with data and names.csv file\n",
-"ds.upload(src_dir=path, target_path=fileroot, overwrite=True, show_progress=True)\n",
+"\n",
+"# ---->>>> UNCOMMENT THE BELOW LINE TO UPLOAD YOUR DATA IF NOT DONE SO ALREADY <<<<----\n",
+"# ds.upload(src_dir=path, target_path=fileroot, overwrite=True, show_progress=True)\n",
 "\n",
 "# data already uploaded to the datastore\n",
 "data_ref = DataReference(data_reference_name='data', datastore=ds, path_on_datastore=fileroot)"
@@ -360,7 +358,7 @@
 "cell_type": "markdown",
 "metadata": {},
 "source": [
-"The following code shows how to use an existing image from [Docker Hub](https://hub.docker.com/r/rapidsai/rapidsai/) that has a prebuilt conda environment named 'rapids' when creating a RunConfiguration. Note that this conda environment does not include azureml-defaults package that is required for using AML functionality like metrics tracking, model management etc. This package is automatically installed when you use 'Specify package dependencies' option and that is why it is the recommended option to create RunConfiguraiton in AML."
+"The following code shows how to install RAPIDS using conda. The `rapids.yml` file contains the list of packages necessary to run this tutorial. **NOTE:** Initial build of the image might take up to 20 minutes as the service needs to build and cache the new image; once the image is built the subequent runs use the cached image and the overhead is minimal."
 ]
 },
 {
369367
"metadata": {},
370368
"outputs": [],
371369
"source": [
372-
"run_config = RunConfiguration()\n",
370+
"cd = CondaDependencies(conda_dependencies_file_path='rapids.yml')\n",
371+
"run_config = RunConfiguration(conda_dependencies=cd)\n",
373372
"run_config.framework = 'python'\n",
374-
"run_config.environment.python.user_managed_dependencies = True\n",
375-
"run_config.environment.python.interpreter_path = '/conda/envs/rapids/bin/python'\n",
376373
"run_config.target = gpu_cluster_name\n",
377374
"run_config.environment.docker.enabled = True\n",
378375
"run_config.environment.docker.gpu_support = True\n",
379-
"run_config.environment.docker.base_image = \"rapidsai/rapidsai:cuda9.2-runtime-ubuntu18.04\"\n",
380-
"# run_config.environment.docker.base_image_registry.address = '<registry_url>' # not required if the base_image is in Docker hub\n",
381-
"# run_config.environment.docker.base_image_registry.username = '<user_name>' # needed only for private images\n",
382-
"# run_config.environment.docker.base_image_registry.password = '<password>' # needed only for private images\n",
376+
"run_config.environment.docker.base_image = \"mcr.microsoft.com/azureml/base-gpu:intelmpi2018.3-cuda10.0-cudnn7-ubuntu16.04\"\n",
383377
"run_config.environment.spark.precache_packages = False\n",
384378
"run_config.data_references={'data':data_ref.to_config()}"
385379
]
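Put together, the conda-based run configuration after this change reads as follows. This is a straight reconstruction of the hunk above plus the two imports the cell relies on (assumed to come from earlier in the notebook):

```python
from azureml.core.runconfig import RunConfiguration
from azureml.core.conda_dependencies import CondaDependencies

# Build the environment from the conda spec shipped with the notebook
cd = CondaDependencies(conda_dependencies_file_path='rapids.yml')
run_config = RunConfiguration(conda_dependencies=cd)
run_config.framework = 'python'
run_config.target = gpu_cluster_name
run_config.environment.docker.enabled = True
run_config.environment.docker.gpu_support = True
run_config.environment.docker.base_image = "mcr.microsoft.com/azureml/base-gpu:intelmpi2018.3-cuda10.0-cudnn7-ubuntu16.04"
run_config.environment.spark.precache_packages = False
run_config.data_references = {'data': data_ref.to_config()}
```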
@@ -388,14 +382,14 @@
 "cell_type": "markdown",
 "metadata": {},
 "source": [
-"#### Specify package dependencies"
+"#### Using Docker"
 ]
 },
 {
 "cell_type": "markdown",
 "metadata": {},
 "source": [
-"The following code shows how to list package dependencies in a conda environment definition file (rapids.yml) when creating a RunConfiguration"
+"Alternatively, you can specify RAPIDS Docker image."
 ]
 },
 {
@@ -404,16 +398,17 @@
 "metadata": {},
 "outputs": [],
 "source": [
-"# cd = CondaDependencies(conda_dependencies_file_path='rapids.yml')\n",
-"# run_config = RunConfiguration(conda_dependencies=cd)\n",
+"# run_config = RunConfiguration()\n",
 "# run_config.framework = 'python'\n",
+"# run_config.environment.python.user_managed_dependencies = True\n",
+"# run_config.environment.python.interpreter_path = '/conda/envs/rapids/bin/python'\n",
 "# run_config.target = gpu_cluster_name\n",
 "# run_config.environment.docker.enabled = True\n",
 "# run_config.environment.docker.gpu_support = True\n",
-"# run_config.environment.docker.base_image = \"<image>\"\n",
-"# run_config.environment.docker.base_image_registry.address = '<registry_url>' # not required if the base_image is in Docker hub\n",
-"# run_config.environment.docker.base_image_registry.username = '<user_name>' # needed only for private images\n",
-"# run_config.environment.docker.base_image_registry.password = '<password>' # needed only for private images\n",
+"# run_config.environment.docker.base_image = \"rapidsai/rapidsai:cuda9.2-runtime-ubuntu18.04\"\n",
+"# # run_config.environment.docker.base_image_registry.address = '<registry_url>' # not required if the base_image is in Docker hub\n",
+"# # run_config.environment.docker.base_image_registry.username = '<user_name>' # needed only for private images\n",
+"# # run_config.environment.docker.base_image_registry.password = '<password>' # needed only for private images\n",
 "# run_config.environment.spark.precache_packages = False\n",
 "# run_config.data_references={'data':data_ref.to_config()}"
 ]
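Uncommented, the Docker-image alternative from this cell would look like the sketch below; the fields mirror the commented cell exactly, with the registry settings left commented since the rapidsai image is public on Docker Hub:

```python
from azureml.core.runconfig import RunConfiguration

# Alternative: reuse the prebuilt RAPIDS image and its 'rapids' conda env
run_config = RunConfiguration()
run_config.framework = 'python'
run_config.environment.python.user_managed_dependencies = True
run_config.environment.python.interpreter_path = '/conda/envs/rapids/bin/python'
run_config.target = gpu_cluster_name
run_config.environment.docker.enabled = True
run_config.environment.docker.gpu_support = True
run_config.environment.docker.base_image = "rapidsai/rapidsai:cuda9.2-runtime-ubuntu18.04"
# run_config.environment.docker.base_image_registry.address = '<registry_url>'  # not needed for Docker Hub images
# run_config.environment.docker.base_image_registry.username = '<user_name>'    # private images only
# run_config.environment.docker.base_image_registry.password = '<password>'     # private images only
run_config.environment.spark.precache_packages = False
run_config.data_references = {'data': data_ref.to_config()}
```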
@@ -551,9 +546,9 @@
 "name": "python",
 "nbconvert_exporter": "python",
 "pygments_lexer": "ipython3",
-"version": "3.6.6"
+"version": "3.6.8"
 }
 },
 "nbformat": 4,
-"nbformat_minor": 2
+"nbformat_minor": 4
 }
