layout | title | tagline |
---|---|---|
page |
Create a Custom Application |
Containerize the executable |
If an image of your executable already exists, and was created by a trusted source, consider using that rather than building your own. You may find existing images on hubs such as Docker Hub or BioContainers.
This tutorial is a quick and dirty summary of how to build your own Docker image as if there is not one available for your executable. This is not meant to replace the full Docker documentation.
We will continue with the example of FastQC from the previous page.
#### Choose a source image
Prerequisite: You should have a Docker ID and
docker
should be installed on your local machine
The only dependency for FastQC
is a reasonably recent Java Runtime Environment.
Thus, most modern Linux OS-es should suffice. The SD2E Docker hub provides base
images for this. In fact, the default Dockerfile generated by apps-init
pulls
an ubuntu17
base image, which will work for our purposes. Pull it manually now
so it is in your environment:
% docker pull sd2e/base:ubuntu17
A list of available base images can be found here.
Also, now is a good time to prepare the src/
directory in the application
bundle. If your code is a python script, for example, this is where you want to
put it:
% cd ~/fastqc-app/
% mkdir src/
#### Install and test interactively
The installation process for fastqc
is extremely simple. It is good practice to
test the installation interactively, and record the steps for your Dockerfile:
# Start an interactive docker session
% docker run --rm -it sd2e/base:ubuntu17
# Update and install necessary packages
[docker] % apt-get update
[docker] % apt-get upgrade -y
[docker] % apt-get install wget -y
[docker] % apt-get install zip -y
[docker] % apt-get install default-jre -y
# Install FastQC
[docker] % wget https://www.bioinformatics.babraham.ac.uk/projects/fastqc/fastqc_v0.11.7.zip
[docker] % unzip fastqc_v0.11.7.zip
[docker] % rm fastqc_v0.11.7.zip
[docker] % chmod +x /FastQC/fastqc
After a bit of trial and error, the commands above are a reasonably short path
to installing the fastqc
executable. You can test it from within the docker
image to make sure it is working by, for example:
[docker] % /FastQC/fastqc -h
FastQC - A high throughput sequence QC analysis tool
SYNOPSIS
fastqc seqfile1 seqfile2 .. seqfileN
fastqc [-o output dir] [--(no)extract] [-f fastq|bam|sam]
[-c contaminant file] seqfile1 .. seqfileN
... etc.
#### Note on source code and mounting directories
In this instance, we could have downloaded the source zip file for FastQC
directly to the src/
directory of our app bundle, then mounted that directory
within the image, e.g.:
% cd ~/fastqc-app/src/
% wget https://www.bioinformatics.babraham.ac.uk/projects/fastqc/fastqc_v0.11.7.zip
% docker run --rm -it -v $PWD:/opt/src sd2e/base:ubuntu17
... etc.
That route is perfectly reasonable and can be followed here. In fact, if your app is a standalone python script, for example, this is the best method for including it in your Docker image.
However, some packages have very large zip
or tar.gz
files (100s of MB), and
would be cumbersome to keep in this fastqc
app bundle folder. It is up to the
app developer to find the balance between completeness of source files and
responsible disk usage.
Here, we decide to not download the source permanently. Instead, we make a record of where the source came from. For example:
% cd ~/fastqc-app/src/
% echo "Source: https://www.bioinformatics.babraham.ac.uk/projects/fastqc/fastqc_v0.11.7.zip" \
>> README.md
#### Write the Dockerfile
Next, translate the steps required to install your software package into a
resonable Dockerfile
. The Dockerfile
should be located at the root directory,
~/fastqc-app/Dockerfile
:
FROM sd2e/base:ubuntu17
RUN apt-get update \
&& apt-get upgrade -y \
&& apt-get install wget -y \
&& apt-get install zip -y \
&& apt-get install default-jre -y
RUN wget https://www.bioinformatics.babraham.ac.uk/projects/fastqc/fastqc_v0.11.7.zip \
&& unzip fastqc_v0.11.7.zip \
&& rm fastqc_v0.11.7.zip \
&& chmod +x FastQC/fastqc
ENV PATH "/FastQC/:$PATH"
# ADD config.yml /config.yml
# ADD src /opt/src
The commented lines at the end of the script were automatically added to the
template Dockerfile by the apps-init
command. They are used to add
config.yml
inside the image, and mount your src/
directory inside the image,
if required. We do not need them for this app, so they can be deleted or remain
commented out.
#### Build and test the image
Navigate to the top of the app directory, ~/fastqc-app/
, and the command to
build a new Docker image is:
% docker build -f Dockerfile --force-rm -t fastqc:0.11.7 ./
Once built, test the new image with an example command:
% docker run --rm fastqc:0.11.7 fastqc -h
- or -
% docker run --rm fastqc:0.11.7 perl /FastQC/fastqc -h
Note: Calling the complete path to executables is sometimes safer than relying on PATH environment variables
If you see the FastQC help text, the installation likely was successful. At this
time, it might be prudent to test with real data as well. Download some test
data into the ~/fastqc-app/tests/
directory:
% cd ~/fastqc-app/tests/
# Download random sample data or provide your own
% wget https://molb7621.github.io/workshop/_downloads/SP1.fq
Next, run the FastQC pipeline on the example data:
% docker run -v $PWD:/data fastqc:0.11.7 perl /FastQC/fastqc /data/SP1.fq
If successful, you should find the output files SP1_fastqc.html
and SP1_fastqc.zip
in the ~/fastqc-app/tests/
directory.
#### Put run commands in `runner_template.sh`
The final step is to put instructions in runner_template.sh
on how the app
should be run. In general, these are the same commands we used for testing above.
Most of the lines in the default file should be left alone. See the last three lines of this file for what should be added:
# Import Agave runtime extensions
. _lib/extend-runtime.sh
# Allow CONTAINER_IMAGE over-ride via local file
if [ -z "${CONTAINER_IMAGE}" ]
then
if [ -f "./_lib/CONTAINER_IMAGE" ]; then
CONTAINER_IMAGE=$(cat ./_lib/CONTAINER_IMAGE)
fi
if [ -z "${CONTAINER_IMAGE}" ]; then
echo "CONTAINER_IMAGE was not set via the app or CONTAINER_IMAGE file"
CONTAINER_IMAGE="sd2e/base:ubuntu17"
fi
fi
# Usage: container_exec IMAGE COMMAND OPTIONS
# Example: docker run centos:7 uname -a
# container_exec centos:7 uname -a
COMMAND="perl /FastQC/fastqc"
PARAMS="${fastq}"
container_exec ${CONTAINER_IMAGE} ${COMMAND} ${PARAMS}
Proceed to Deploy and Submit a Test Job
Go back to Create Custom Applications
Return to the API Documentation Overview