-# Distributed-Something
-Run encapsulated docker containers that do... something in the Amazon Web Services (AWS) infrastructure.
-We are interested in scientific image analysis so we have used it for [CellProfiler](https://github.com/CellProfiler/Distributed-CellProfiler), [Fiji](https://github.com/CellProfiler/Distributed-Fiji), and [BioFormats2Raw](https://github.com/CellProfiler/Distributed-OmeZarrMaker).
-You can use it for whatever you want!
-
-## Documentation
-Full documentation is available on our [Documentation Website](https://distributedscience.github.io/Distributed-Something).
-
-## Overview
-
-This code is an example of how to use AWS distributed infrastructure for running anything Dockerized.
-The configuration of the AWS resources is done using boto3 and the AWS CLI.
-The worker is written in Python and is encapsulated in a Docker container.
-There are four AWS components that are minimally needed to run distributed jobs:
-
-
-1. An SQS queue
-2. An ECS cluster
-3. An S3 bucket
-4. A spot fleet of EC2 instances
-
-
-All of them can be managed individually through the AWS Management Console.
-However, this code helps you get started quickly and run a job autonomously once the configuration is correct.
-The code runs a script that links all these components and prepares the infrastructure to run a distributed job.
-When the job is completed, the code is also able to stop resources and clean up components.
-It also adds logging and alarms via CloudWatch, helping the user troubleshoot runs and destroy stuck machines.
-
-## Running the code
-
-### Step 1
-Edit the config.py file with all the relevant information for your job.
-Then, start creating the basic AWS resources by running the following script:
+# Distributed-HelloWorld

-
-    $ python3 run.py setup
-
-This script initializes the resources in AWS.
-Notice that the docker registry is built separately and you can modify the worker code to build your own.
-Any time you modify the worker code, you need to update the docker registry using the Makefile script inside the worker directory.
-
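Step 1 asks you to edit config.py; the exact fields depend on the project, but a hypothetical sketch of the kind of values it holds might look like this (every name and value below is illustrative, not the actual Distributed-Something configuration):

```python
# Hypothetical config.py sketch -- field names are assumptions.
APP_NAME = "HelloWorld"          # used to name the queue, cluster, etc.
DOCKERHUB_TAG = "user/distributed-helloworld:latest"  # worker image
AWS_REGION = "us-east-1"
AWS_BUCKET = "my-hello-bucket"   # S3 bucket for inputs and outputs
CLUSTER_MACHINES = 3             # number of spot instances to request
MACHINE_TYPE = ["m5.large"]      # instance types for the spot fleet
```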
-### Step 2
-After the first script runs successfully, the job can be submitted with the following command:
-
-    $ python3 run.py submitJob files/exampleJob.json
-
-Running the script uploads the tasks that are configured in the json file.
-You have to customize the exampleJob.json file with information that makes sense for your project.
-You'll want to figure out which information is generic and which is the information that makes each job unique.
-
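The generic-versus-unique split described above can be sketched as a job spec plus an expansion into one queue message per task (field names are illustrative, not the actual exampleJob.json schema):

```python
# Sketch of a job spec: shared settings plus a list of per-task groups.
# Field names are assumptions for illustration only.
job = {
    "greeting": "Hello",            # generic: the same for every task
    "output_bucket": "my-hello-bucket",
    "groups": [                     # unique: one queue message per entry
        {"name": "world"},
        {"name": "moon"},
    ],
}

def to_messages(job):
    """Expand the spec into one message body per task group."""
    shared = {k: v for k, v in job.items() if k != "groups"}
    return [{**shared, **group} for group in job["groups"]]
```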
-### Step 3
-After submitting the job to the queue, we can add computing power to process all tasks in AWS.
-This code starts a fleet of spot EC2 instances which will run the worker code.
-The worker code is encapsulated in Docker containers, and the code uses ECS services to place them on the EC2 instances.
-All this is automated with the following command:
-
-    $ python3 run.py startCluster files/exampleFleet.json
-
-After the cluster is ready, the code informs you that everything is set up, and saves the spot fleet identifier in a file for further reference.
-
-### Step 4
-When the cluster is up and running, you can monitor progress using the following command:
-
-    $ python3 run.py monitor files/APP_NAMESpotFleetRequestId.json
+[Distributed-Something](https://github.com/DistributedScience/Distributed-Something) is an app to run encapsulated docker containers that do... something in the Amazon Web Services (AWS) infrastructure.
+We are interested in scientific image analysis so we have used it for [CellProfiler](https://github.com/DistributedScience/Distributed-CellProfiler), [Fiji](https://github.com/DistributedScience/Distributed-Fiji), and [BioFormats2Raw](https://github.com/DistributedScience/Distributed-OmeZarrMaker).
+You can use it for whatever you want!

-The file APP_NAMESpotFleetRequestId.json is created after the cluster is set up in step 3.
-It is important to keep this monitor running if you want to automatically shut down computing resources when there are no more tasks in the queue (recommended).
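The monitor's recommended auto-shutdown behavior can be sketched as a polling loop over the queue's message counts that cancels the spot fleet once the queue drains (a simplified illustration with injected clients, not the actual run.py monitor code):

```python
import time

def monitor(sqs, ec2, queue_url, fleet_id, poll_seconds=60, sleeper=time.sleep):
    """Poll the queue; cancel the spot fleet once no tasks remain.

    `sqs` and `ec2` are boto3 clients (or test stubs).
    """
    while True:
        attrs = sqs.get_queue_attributes(
            QueueUrl=queue_url,
            AttributeNames=[
                "ApproximateNumberOfMessages",
                "ApproximateNumberOfMessagesNotVisible",
            ],
        )["Attributes"]
        visible = int(attrs["ApproximateNumberOfMessages"])
        in_flight = int(attrs["ApproximateNumberOfMessagesNotVisible"])
        if visible == 0 and in_flight == 0:
            # Queue drained: release the compute so it stops billing.
            ec2.cancel_spot_fleet_requests(
                SpotFleetRequestIds=[fleet_id], TerminateInstances=True
            )
            return
        sleeper(poll_seconds)
```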
+Here, as an example, we have used it to make an app that lets you say hello to the world, as well as list some of your favorite things. The full code changes are available [here](https://github.com/DistributedScience/Distributed-HelloWorld/pull/1/files).

-See our [full documentation](https://distributedscience.github.io/Distributed-Something) for more information about each step of the process.
+Happy Distributing!

-
+## Documentation
+Full documentation is available on our [Documentation Website](https://distributedscience.github.io/Distributed-Something).