
Commit 747f1ac

Merge pull request #66 from STRIDES/reformat_notebooks_kao
Reformat notebooks to align to common standard
2 parents ea6f571 + ef498ee commit 747f1ac

30 files changed: +763 −8050 lines changed

notebooks/ElasticBLAST/run_elastic_blast.ipynb (+47 −16)
@@ -10,18 +10,41 @@
    },
    {
     "cell_type": "markdown",
-    "id": "aee3b229",
     "metadata": {},
     "source": [
-     "This notebook is based on [this tutorial](https://blast.ncbi.nlm.nih.gov/doc/elastic-blast/quickstart-aws.html). Make sure you select a kernel with Python 3.7 for the Elastic BLAST install. One good option is `conda_mxnet_latest_p37`."
+     "## Overview\n",
+     "This notebook helps you run BLAST in a scalable manner using AWS Batch. The script will spin up and later tear down your cluster to execute the BLAST jobs. This notebook is based on [this tutorial](https://blast.ncbi.nlm.nih.gov/doc/elastic-blast/quickstart-aws.html). Make sure you select a kernel with Python 3.7 for the Elastic BLAST install. One good option is `conda_mxnet_latest_p37`."
     ]
    },
    {
     "cell_type": "markdown",
-    "id": "38dfb579",
     "metadata": {},
     "source": [
-     "### 1) Install elastic blast"
+     "## Prerequisites\n",
+     "You need to make sure you have permissions to use CloudFormation, Batch, and SageMaker."
+    ]
+   },
+   {
+    "cell_type": "markdown",
+    "metadata": {},
+    "source": [
+     "## Learning Objectives\n",
+     "+ Learn to use Batch to scale compute jobs.\n",
+     "+ Learn how to use BLAST in the cloud."
+    ]
+   },
+   {
+    "cell_type": "markdown",
+    "metadata": {},
+    "source": [
+     "## Get Started"
+    ]
+   },
+   {
+    "cell_type": "markdown",
+    "metadata": {},
+    "source": [
+     "### Install packages"
     ]
    },
    {
@@ -31,7 +54,7 @@
     "metadata": {},
     "outputs": [],
     "source": [
-     "!pip3 install elastic-blast"
+     "! pip3 install elastic-blast"
     ]
    },
    {
@@ -49,16 +72,16 @@
     "metadata": {},
     "outputs": [],
     "source": [
-     "!elastic-blast --version\n",
-     "!elastic-blast --help"
+     "! elastic-blast --version\n",
+     "! elastic-blast --help"
     ]
    },
    {
     "cell_type": "markdown",
     "id": "58b59cb0",
     "metadata": {},
     "source": [
-     "### 2) Optionally, create a bucket for this tutorial if one does not yet exist"
+     "### Create a bucket for this tutorial if one does not yet exist; make sure to pick a unique name"
     ]
    },
    {
@@ -68,15 +91,15 @@
     "metadata": {},
     "outputs": [],
     "source": [
-     "!aws s3 mb s3://elasticblast-sagemaker"
+     "! aws s3 mb s3://elasticblast-sagemaker"
     ]
    },
    {
     "cell_type": "markdown",
     "id": "449d7511",
     "metadata": {},
     "source": [
-     "### 3) Create a config file that defines the job parameters"
+     "### Create a config file that defines the job parameters"
     ]
    },
    {
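Editor's note on the bucket step above: S3 bucket names are globally unique, so a fixed name like `elasticblast-sagemaker` may already be taken (hence the commit's "pick a unique name" wording). A minimal sketch of generating a unique name; the timestamp-suffix scheme and the prefix reuse are illustrative only, not part of the notebook:

```shell
# Build a likely-unique bucket name by appending a timestamp suffix.
# "elasticblast-sagemaker" is the tutorial's example prefix; any
# DNS-safe prefix works.
BUCKET="elasticblast-sagemaker-$(date +%s)"
echo "$BUCKET"
# In the notebook you would then run: aws s3 mb "s3://$BUCKET"
```
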
@@ -86,7 +109,7 @@
     "metadata": {},
     "outputs": [],
     "source": [
-     "!touch BDQA.ini"
+     "! touch BDQA.ini"
     ]
    },
    {
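Editor's note: the cell above only creates an empty `BDQA.ini`; the hunk that fills it in is collapsed in this diff. A sketch of what an ElasticBLAST config can look like, with section names following the ElasticBLAST quickstart; every value below (region, node count, database, query and results paths) is a placeholder, not the notebook's actual configuration:

```shell
# Write an illustrative ElasticBLAST config file.
# All values are placeholders; substitute your own region,
# query path, and results bucket before submitting.
cat > BDQA.ini <<'EOF'
[cloud-provider]
aws-region = us-east-1

[cluster]
num-nodes = 2
labels = owner=your-username

[blast]
program = blastn
db = ref_viruses_rep_genomes
queries = s3://your-query-bucket/queries.fa
results = s3://elasticblast-sagemaker/results/BDQA
EOF
```
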
@@ -122,7 +145,7 @@
     "id": "9a9f8192",
     "metadata": {},
     "source": [
-     "### 4) Submit the job"
+     "### Submit the job"
     ]
    },
    {
@@ -132,15 +155,15 @@
     "metadata": {},
     "outputs": [],
     "source": [
-     "!elastic-blast submit --cfg BDQA.ini"
+     "! elastic-blast submit --cfg BDQA.ini"
     ]
    },
    {
     "cell_type": "markdown",
     "id": "9a8e7716",
     "metadata": {},
     "source": [
-     "### 5) Check results and troubleshoot"
+     "### Check results and troubleshoot"
     ]
    },
    {
@@ -153,12 +176,20 @@
     "+ Finally, to view your outputs, look at the files in your S3 output bucket, something like `aws s3 ls s3://elasticblast-sagemaker/results/BDQA/`."
     ]
    },
+   {
+    "cell_type": "markdown",
+    "metadata": {},
+    "source": [
+     "## Conclusions\n",
+     "Here we submitted a parallel BLAST job to an AWS Batch cluster, using CloudFormation to handle provisioning and teardown of resources."
+    ]
+   },
    {
     "cell_type": "markdown",
     "id": "292947f1-5247-4da5-81bd-7fc8fc420ca4",
     "metadata": {},
     "source": [
-     "### 6) Clean up cloud resources"
+     "## Clean Up"
     ]
    },
    {
@@ -168,7 +199,7 @@
     "metadata": {},
     "outputs": [],
     "source": [
-     "!elastic-blast delete --cfg BDQA.ini"
+     "! elastic-blast delete --cfg BDQA.ini"
     ]
    }
   ],
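Editor's note on "Check results and troubleshoot": between `elastic-blast submit` and listing results, you typically poll job status (`elastic-blast status --cfg BDQA.ini` is the subcommand for this). The loop below is a generic polling sketch with the status command injected as a parameter; `fake_status` is a stand-in stub for illustration, not part of elastic-blast:

```shell
# Generic polling loop. In the notebook the real check would be
# `elastic-blast status --cfg BDQA.ini`; here $1 is any command
# that prints DONE when the job has finished.
poll_status() {
  check_cmd=$1
  max_tries=${2:-5}
  i=0
  while [ "$i" -lt "$max_tries" ]; do
    status=$($check_cmd)
    echo "attempt $((i + 1)): $status"
    if [ "$status" = "DONE" ]; then
      return 0
    fi
    i=$((i + 1))
  done
  return 1
}

# Stand-in for the real status command, for illustration only.
fake_status() { echo "DONE"; }
poll_status fake_status
```
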

notebooks/GWAS/GWAS_coat_color.ipynb (+86 −6)
@@ -6,17 +6,46 @@
     "metadata": {},
     "source": [
      "# GWAS in the cloud\n",
+     "## Overview\n",
      "We adapted the NIH CFDE tutorial from [here](https://training.nih-cfde.org/en/latest/Bioinformatic-Analyses/GWAS-in-the-cloud/background/) and fit it to a notebook. We have greatly simplified the instructions, so if you need or want more details, look at the full tutorial to find out more.\n",
-     "Most of this notebook is bash, but expects that you are using a Python kernel, until step 3, plotting, you will need to switch your kernel to R."
+     "\n",
+     "Most of this notebook is written in Bash but expects that you are using a Python kernel; at step 3, plotting, you will need to switch your kernel to R."
+    ]
+   },
+   {
+    "cell_type": "markdown",
+    "id": "3edafe63",
+    "metadata": {},
+    "source": [
+     "## Learning Objectives\n",
+     "The goal is to learn how to execute a GWAS analysis in a cloud environment."
+    ]
+   },
+   {
+    "cell_type": "markdown",
+    "id": "5d7ef396",
+    "metadata": {},
+    "source": [
+     "## Prerequisites\n",
+     "+ You only need access to a SageMaker notebook environment to run this notebook."
+    ]
+   },
+   {
+    "cell_type": "markdown",
+    "id": "39ee9668",
+    "metadata": {},
+    "source": [
+     "## Get Started"
     ]
    },
    {
     "cell_type": "markdown",
     "id": "8fbf6304",
     "metadata": {},
     "source": [
-     "## 1. Setup\n",
-     "### Download the data\n",
+     "### Install packages and set up environment\n",
+     "\n",
+     "#### Download the data\n",
      "Use %%bash to denote a bash block. You can also use '!' to denote a single bash command within a Python notebook."
     ]
    },
@@ -68,7 +97,31 @@
     "tags": []
    },
    "source": [
-     "## 1. Install dependencies"
+     "### Install dependencies"
+    ]
+   },
+   {
+    "cell_type": "code",
+    "execution_count": null,
+    "id": "9f5032d7",
+    "metadata": {},
+    "outputs": [],
+    "source": [
+     "# install mamba\n",
+     "! curl -L -O https://github.com/conda-forge/miniforge/releases/latest/download/Mambaforge-$(uname)-$(uname -m).sh\n",
+     "! bash Mambaforge-$(uname)-$(uname -m).sh -b -p $HOME/mambaforge"
+    ]
+   },
+   {
+    "cell_type": "code",
+    "execution_count": null,
+    "id": "1a5bd340",
+    "metadata": {},
+    "outputs": [],
+    "source": [
+     "# add to your path\n",
+     "import os\n",
+     "os.environ[\"PATH\"] += os.pathsep + os.environ[\"HOME\"] + \"/mambaforge/bin\""
     ]
    },
    {
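Editor's note on the `os.environ["PATH"]` cell above: it updates PATH in the Python kernel process, which `!` and `%%bash` subshells then inherit. The equivalent in plain shell (as you would write it inside a `%%bash` cell or a terminal; `$HOME/mambaforge` is the install prefix chosen in the preceding cell) is a simple PATH prepend:

```shell
# Prepend the mambaforge bin directory to PATH so `mamba`, and later
# `plink`/`vcftools`, resolve without full paths.
export PATH="$HOME/mambaforge/bin:$PATH"
# Show the first PATH entry to confirm the prepend took effect.
echo "$PATH" | tr ':' '\n' | head -n 1
```
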
@@ -78,6 +131,7 @@
     "metadata": {},
     "outputs": [],
     "source": [
+     "# install everything else\n",
      "! mamba install -y -c bioconda plink vcftools"
     ]
    },
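Editor's note: after a conda/mamba install it is worth verifying that the tools actually resolve on PATH before starting the analysis. A small check loop over the tool names from the cell above (this verification step is an addition, not part of the notebook):

```shell
# Report whether each required tool is on PATH and count the missing ones.
missing=0
for tool in plink vcftools; do
  if command -v "$tool" >/dev/null 2>&1; then
    echo "$tool: found"
  else
    echo "$tool: missing"
    missing=$((missing + 1))
  fi
done
echo "missing: $missing"
```
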
@@ -86,7 +140,7 @@
     "id": "3de2fc4c",
     "metadata": {},
     "source": [
-     "## 2. Analyze"
+     "## Analyze"
     ]
    },
    {
@@ -266,7 +320,7 @@
     "id": "1f52e97c",
     "metadata": {},
     "source": [
-     "## 3. Plotting\n",
+     "## Plotting\n",
      "In this tutorial, plotting is done in R, so at this point you can change your kernel to R in the top right. Wait for it to say 'idle' in the bottom left, then continue. You could also plot using Python native packages and maintain the Python notebook kernel."
     ]
    },
@@ -359,6 +413,32 @@
      "\n",
      "The top associated mutation is a nonsense SNP in the gene MC1R known to control pigment production. The MC1R allele encoding yellow coat color contains a single base change (from C to T) at the 916th nucleotide."
     ]
+   },
+   {
+    "cell_type": "markdown",
+    "id": "2f6e1ef6",
+    "metadata": {},
+    "source": [
+     "### Conclusion\n",
+     "Here we learned how to run a simple GWAS analysis in the cloud."
+    ]
+   },
+   {
+    "cell_type": "markdown",
+    "id": "044a04d8",
+    "metadata": {},
+    "source": [
+     "## Clean up\n",
+     "Make sure you shut down this VM, or delete it if you don't plan to use it further.\n",
+     "\n",
+     "You can also [delete the buckets](https://docs.aws.amazon.com/AmazonS3/latest/userguide/delete-bucket.html) if you don't want to pay for the data: `aws s3 rb s3://bucket-name --force`"
+    ]
+   },
+   {
+    "cell_type": "markdown",
+    "id": "c1e7be16",
+    "metadata": {},
+    "source": []
    }
   ],
   "metadata": {
