jupyter-guide
diff --git a/README.md b/README.md
Lines changed: 25 additions & 6 deletions
diff --git a/example1/0-Workflow.ipynb b/example1/0-Workflow.ipynb
Lines changed: 30 additions & 14 deletions
@@ -5,16 +5,35 @@ This repository is an adjunct to the "Ten Simple Rules for Reproducible Research
 The example notebooks demonstrate some of rules. 
 
 ## Example 1
-This example demonstrates a 4-step workflow for predicting the protein fold type using a Machine Learning approach.
+This example demonstrates a reproducible 4-step workflow for predicting a protein fold classification using a Machine Learning approach.
 
-You can launch the top level notebook directly in your web browser: [0-Workflow.ipynb](https://mybinder.org/v2/gh/jupyter-guide/ten-rules-jupyter/master?filepath=example1%2F0-Workflow.ipynb).
+---
 
-Then follow the steps in the notebook to run the 4 steps of the workflow.
+**Rule 8: Prepare Your Notebooks to Be Read, Run, and Explored.** The nbviewer links provide a non-interactive preview of the notebooks, and the ![Binder](https://mybinder.org/badge.svg) buttons launch
+the notebooks in your web browser using the Binder ([mybinder.org](https://mybinder.org/)) server (may be slow!). All notebooks can also be launched directly from the links in the top-level notebook 0-Workflow.ipynb.
+
+---
+
+| Nbviewer | Jupyter Notebook | Jupyter Lab | PDF |
+| ---      | --               | ---         | --- |
+| [0-Workflow.ipynb](https://nbviewer.jupyter.org/github/jupyter-guide/ten-rules-jupyter/blob/master/example1/0-Workflow.ipynb) | [![Binder](https://mybinder.org/badge.svg)](https://mybinder.org/v2/gh/jupyter-guide/ten-rules-jupyter/master?filepath=example1%2F0-Workflow.ipynb) | [![Binder](https://mybinder.org/badge.svg)](https://mybinder.org/v2/gh/jupyter-guide/ten-rules-jupyter/master?urlpath=lab/tree/example1%2F0-Workflow.ipynb) | pdf |
+| [1-CreateDataset.ipynb](https://nbviewer.jupyter.org/github/jupyter-guide/ten-rules-jupyter/blob/master/example1/1-CreateDataset.ipynb) | [![Binder](https://mybinder.org/badge.svg)](https://mybinder.org/v2/gh/jupyter-guide/ten-rules-jupyter/master?filepath=example1%2F1-CreateDataset.ipynb) | [![Binder](https://mybinder.org/badge.svg)](https://mybinder.org/v2/gh/jupyter-guide/ten-rules-jupyter/master?urlpath=lab/tree/example1%2F1-CreateDataset.ipynb) | pdf |
+| [2-CalculateFeatures.ipynb](https://nbviewer.jupyter.org/github/jupyter-guide/ten-rules-jupyter/blob/master/example1/2-CalculateFeatures.ipynb) | [![Binder](https://mybinder.org/badge.svg)](https://mybinder.org/v2/gh/jupyter-guide/ten-rules-jupyter/master?filepath=example1%2F2-CalculateFeatures.ipynb) | [![Binder](https://mybinder.org/badge.svg)](https://mybinder.org/v2/gh/jupyter-guide/ten-rules-jupyter/master?urlpath=lab/tree/example1%2F2-CalculateFeatures.ipynb) | pdf |
+| [3-FitModel.ipynb](https://nbviewer.jupyter.org/github/jupyter-guide/ten-rules-jupyter/blob/master/example1/3-FitModel.ipynb) | [![Binder](https://mybinder.org/badge.svg)](https://mybinder.org/v2/gh/jupyter-guide/ten-rules-jupyter/master?filepath=example1%2F3-FitModel.ipynb) | [![Binder](https://mybinder.org/badge.svg)](https://mybinder.org/v2/gh/jupyter-guide/ten-rules-jupyter/master?urlpath=lab/tree/example1%2F3-FitModel.ipynb) | pdf |
+| [4-Predict.ipynb](https://nbviewer.jupyter.org/github/jupyter-guide/ten-rules-jupyter/blob/master/example1/4-Predict.ipynb) | [![Binder](https://mybinder.org/badge.svg)](https://mybinder.org/v2/gh/jupyter-guide/ten-rules-jupyter/master?filepath=example1%2F4-Predict.ipynb) | [![Binder](https://mybinder.org/badge.svg)](https://mybinder.org/v2/gh/jupyter-guide/ten-rules-jupyter/master?urlpath=lab/tree/example1%2F4-Predict.ipynb)| pdf |
+
+---
+
+**Rule 7: Share Your Data and Explain How to Use It.** To enable reproducibility, we provide an example1/data directory with all data required to run the workflow. A description of the data with download location and download date is [available](./example1/data/Datasets.md).
+
+---
 
 ## Example 2
 
+Example 2 goes here ...
+
 
-## How do I run a Juypter Notebook from this site?
-The Jupyter notebook links on this page launch the notebooks in your web browser without software installation using Binder ([mybinder.org](https://mybinder.org/)), an experimental platform for reproducible research (The Binder servers can be slow or may not be available).
+## How do I run Notebooks from this Site?
+The Launch Binder links on this page launch notebooks in your web browser without software installation using Binder ([mybinder.org](https://mybinder.org/)), an experimental platform for reproducible research (the Binder servers may be slow or intermittently unavailable).
 
-After you click on a notebook link above, you see a spinning Binder logo. Wait until the notebook launches (this may take a few minutes). Then click the Run ">>" button to execute the cells in the notebook.
+After you click on a launch link above, you see a spinning Binder logo. Wait until the notebook launches (this may take a few minutes). Then click the Run ">>" button to execute the cells in the notebook.
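
Reviewer note: the three link columns in the README table above follow a regular pattern, so they could be generated rather than typed by hand, which would avoid dropped `.ipynb` extensions. The sketch below is only an illustration; the helper names (`nbviewer_url`, `binder_notebook_url`, `binder_lab_url`) are hypothetical and not part of this repository. The URL templates are copied from the table rows in this diff.

```python
from urllib.parse import quote

# Repository and branch as they appear in the README links.
REPO = "jupyter-guide/ten-rules-jupyter"
BRANCH = "master"


def nbviewer_url(path):
    """Non-interactive preview link (hypothetical helper)."""
    return f"https://nbviewer.jupyter.org/github/{REPO}/blob/{BRANCH}/{path}"


def binder_notebook_url(path):
    """Binder link that opens the classic Jupyter Notebook interface."""
    # The README encodes the slash in the notebook path as %2F.
    return f"https://mybinder.org/v2/gh/{REPO}/{BRANCH}?filepath={quote(path, safe='')}"


def binder_lab_url(path):
    """Binder link that opens the notebook in JupyterLab."""
    return (
        f"https://mybinder.org/v2/gh/{REPO}/{BRANCH}"
        f"?urlpath=lab/tree/{quote(path, safe='')}"
    )


if __name__ == "__main__":
    for name in ["0-Workflow", "1-CreateDataset", "2-CalculateFeatures",
                 "3-FitModel", "4-Predict"]:
        path = f"example1/{name}.ipynb"
        print(nbviewer_url(path))
        print(binder_notebook_url(path))
        print(binder_lab_url(path))
```
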
@@ -11,15 +11,15 @@
    "cell_type": "markdown",
    "metadata": {},
    "source": [
-    "**The notebooks in this directory were developed to demonstrate the \"Ten Rules for Reproducible Research with Jupyter Notebooks\". Throughout the notebooks we mention the rules we applied.**\n",
+    "**The notebooks in this directory were developed to demonstrate the \"Ten Rules for Reproducible Research with Jupyter Notebooks\". Throughout the notebooks we refer to some of the rules we applied.**\n",
     "\n",
     "**For example, this notebook demonstrates:**\n",
     "\n",
     "---\n",
     "\n",
     "**Rule 1: Tell a Story for a Specific Audience.** This notebook was developed for biologists to learn how to apply a simple machine learning model to protein sequences.\n",
     "\n",
-    "**Rule 3: Document the Entire Workflow.** This top-level notebook links to 3 notebooks that represent the steps of a workflow. This modularity makes it easy to replace one of the steps, for example, use a different method to calculate features or apply a different machine learning model.\n",
+    "**Rule 3: Document the Entire Workflow.** This top-level notebook links to 4 notebooks that represent the steps of a workflow. This modularity makes it easy to replace one of the steps, for example, use a different method to calculate features or apply a different machine learning model.\n",
     "\n",
     "---"
    ]
@@ -47,15 +47,15 @@
     "We can classify proteins into three major fold types based on their predominant secondary structure content\n",
     "* alpha: contains predominantly alpha helices\n",
     "* beta: contains predominantly beta sheets\n",
-    "* alpha+beta: contains both alpha helices and beta sheets"
+    "* alpha+beta: contains alpha helices and beta sheets"
    ]
   },
   {
    "cell_type": "markdown",
    "metadata": {},
    "source": [
     "## Goal\n",
-    "This notebook serves as an example of using machine learning techniques applied to protein sequences. The goal is to create a simple machine learning model to predict the fold type of a protein given its protein sequence. We train the model on a representative set of 3D structure from the Protein Data Bank.\n",
+    "This notebook demonstrates how to create a reproducible record of building a machine learning model. We train a simple model to predict the fold class of a protein from its sequence, using a representative set of 3D structures from the Protein Data Bank.\n",
     "\n",
     "Run the following notebooks to work through this example."
    ]
@@ -71,7 +71,7 @@
    "cell_type": "markdown",
    "metadata": {},
    "source": [
-    "First, we need to create a dataset with protein secondary structure information obtained from 3D protein chains.\n",
+    "First, we create a dataset with protein secondary structure information obtained from 3D protein chains.\n",
     "\n",
     "Run the following notebook to extract secondary structure information from a representative set of protein chains downloaded from the RCSB Protein Data Bank and assign a fold type to each protein chain."
    ]
@@ -87,7 +87,7 @@
    "cell_type": "markdown",
    "metadata": {},
    "source": [
-    "The notebook saves the dataset in the file `secondaryStructure.json`."
+    "The notebook saves the dataset in the file `./intermediate_data/foldClassification.json`."
    ]
   },
   {
@@ -117,7 +117,7 @@
    "cell_type": "markdown",
    "metadata": {},
    "source": [
-    "The notebook saves the dateset in the file `features.json`."
+    "This notebook saves the dataset with feature vectors in the file `./intermediate_data/features.json`."
    ]
   },
   {
@@ -131,7 +131,7 @@
    "cell_type": "markdown",
    "metadata": {},
    "source": [
-    "Next, we fit a 3-state classification model using the feature vectors as inputs and the known fold types from the Protein Data Bank dataset.\n",
+    "Next, we fit a 3-state classification model using the feature vectors and the given fold classification from the Protein Data Bank dataset.\n",
     "\n",
     "Run the following notebook to fit a machine learning model on a training set and evaluate its performance on a test set."
    ]
@@ -143,6 +143,13 @@
     "[3-FitModel.ipynb](./3-FitModel.ipynb)"
    ]
   },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "This notebook saves the classification model in the file `./intermediate_data/classifier`."
+   ]
+  },
   {
    "cell_type": "markdown",
    "metadata": {},
@@ -154,7 +161,7 @@
    "cell_type": "markdown",
    "metadata": {},
    "source": [
-    "Finally, we use the Word2Vec model and the trained classifier to predict the fold class from a protein sequence."
+    "Finally, we use the trained classifier to predict the fold class from a protein sequence."
    ]
   },
   {
@@ -184,19 +191,17 @@
   },
   {
    "cell_type": "code",
-   "execution_count": 2,
+   "execution_count": 1,
    "metadata": {},
    "outputs": [
     {
      "name": "stdout",
      "output_type": "stream",
      "text": [
-      "The watermark extension is already loaded. To reload it, use:\n",
-      "  %reload_ext watermark\n",
       "CPython 3.6.3\n",
       "IPython 6.3.1\n",
       "\n",
-      "gensim 3.6.0\n",
+      "ipywidgets 7.4.0\n",
       "matplotlib 2.2.2\n",
       "numpy 1.14.5\n",
       "pandas 0.22.0\n",
@@ -214,7 +219,18 @@
    ],
    "source": [
     "%load_ext watermark\n",
-    "%watermark -v -m -p gensim,matplotlib,numpy,pandas,sklearn"
+    "%watermark -v -m -p ipywidgets,matplotlib,numpy,pandas,sklearn"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "---\n",
+    "\n",
+    "**Authors:** Peter W. Rose, Shih-Cheng Huang, UC San Diego, October 1, 2018\n",
+    "\n",
+    "---"
    ]
   }
  ],
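
Reviewer note: the notebook describes three fold types (alpha, beta, alpha+beta) defined by predominant secondary structure content. As a rough illustration of what "predominant" could mean in code, here is a minimal sketch. The function name `assign_fold_type` and both cutoff values are assumptions made for this example; the diff does not show the thresholds that 1-CreateDataset.ipynb actually uses.

```python
def assign_fold_type(alpha, beta, threshold=0.25, minimum=0.05):
    """Assign a fold type from secondary structure fractions.

    alpha, beta: fraction of residues in alpha helices / beta sheets (0..1).
    threshold, minimum: hypothetical cutoffs, chosen only for illustration.
    Returns "alpha", "beta", "alpha+beta", or None if no class applies.
    """
    if alpha > threshold and beta < minimum:
        return "alpha"          # predominantly alpha helices
    if beta > threshold and alpha < minimum:
        return "beta"           # predominantly beta sheets
    if alpha > threshold and beta > threshold:
        return "alpha+beta"     # substantial amounts of both
    return None                 # ambiguous chains are left unlabeled


if __name__ == "__main__":
    for a, b in [(0.60, 0.01), (0.02, 0.50), (0.30, 0.30), (0.10, 0.10)]:
        print(a, b, assign_fold_type(a, b))
```

Returning `None` for ambiguous chains, rather than forcing a label, keeps unclear examples out of the training set in this sketch; whether the actual workflow does the same is not visible in this diff.
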