Ansible Controller and Restructuring (#14)
* [WIP] ansible controller: untested implementation of setup and config; need to solve repo forking first before testing

* Fixing setup and SSH keygen. Still WIP

* replace deprecated ec2 role with ec2_instance

* WIP bugfixing aws-controller

* WIP bugfixing aws-controller, updating git/aws key generation

* Fix regression bugs from rebase (mainly the tagging of instances)

* Fix awscli, boto3, and ansible installation

* Add untested roles to install seal and abc

* Restructure repo as discussed in #7

* Removed deprecated examples

* Fix demo example

* Update repotemplate for project dir environment variable

* Fix result storage

* Remove pps specific roles

* Fix minor bugs from restructuring

* Fix bug from rebase

Co-authored-by: Miro Haller <[email protected]>
Miro-H and Miro-H authored Oct 8, 2021
1 parent d1008ac commit bbf139a
Showing 71 changed files with 658 additions and 96 deletions.
5 changes: 1 addition & 4 deletions .gitignore
@@ -1,9 +1,6 @@
results/
experiments/state/

.ipynb_checkpoints

inventory/aws_ec2.yml
src/inventory/aws_ec2.yml

### Ansible ###
*.retry
2 changes: 1 addition & 1 deletion Pipfile
@@ -18,4 +18,4 @@ pyinputplus = "*"
argparse = "*"

[requires]
python_version = "3.8"
python_version = "3.9"
7 changes: 4 additions & 3 deletions README.md
@@ -185,7 +185,7 @@ After completing the getting started section, it should be possible to run the [
7. Run the repository initialization helper script and configure the experiment suite and the example host types `client` and `server`.
This prompts user input to perform variable substitution in the `group_vars/*/main.yml.j2` variable templates for the groups [all](group_vars/all/main.yml.j2), [client](group_vars/all/client.yml.j2), and [server](group_vars/all/server.yml.j2).
This prompts user input to perform variable substitution using the `resources/repotemplate/group_vars/*/main.yml.j2` templates. By default, it creates four groups: `all`, `server`, `client`, and `ansible_controller`.
When unsure, set the unique `project id` and the AWS `key name` from the prerequisites and otherwise use the default options.
@@ -392,8 +392,7 @@ pipenv run ansible-playbook experiment.yml -e "suite=example id=last"
### Cleaning up AWS
By default, after an experiment is complete, all resources created on AWS are terminated.
By default, after an experiment is complete, all _experiment_ resources created on AWS are terminated.
To deactivate this default behavior, provide the flag: `awsclean=false`.
Creating resources on AWS and setting up the environment takes a considerable amount of time. So, for debugging and short experiments, it can make sense not to terminate the instances. If you use this flag, be sure to check that instances are terminated when you are done.
@@ -408,6 +407,8 @@ Furthermore, we also provide a playbook to terminate all AWS resources:
pipenv run ansible-playbook clear.yml
```
:warning: The ansible controller instance, if used, is not removed. It is intended to be left running and trigger individual experiment runs. To remove it, use the flag `awscleanall=true`.
### Experimental Results
The experiment suite creates a matching folder structure on the localhost and the remote EC2 instances.
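The cleanup flags described in the README changes above can be read as a small decision table. The following is a minimal sketch of one such reading, not the suite's implementation: the function name and the resource labels are hypothetical, and it assumes that `awscleanall=true` also covers the experiment resources.

```python
def resources_to_terminate(awsclean: bool = True, awscleanall: bool = False) -> set[str]:
    """Return the resource groups that would be terminated.

    Hypothetical helper: "experiment" and "ansible_controller" are
    illustrative labels, not identifiers used by the suite.
    """
    targets: set[str] = set()
    if awsclean:
        # Default behavior: experiment resources are terminated after a run.
        targets.add("experiment")
    if awscleanall:
        # Assumption: awscleanall implies the experiment cleanup and
        # additionally removes the long-running controller instance.
        targets.update({"experiment", "ansible_controller"})
    return targets
```

Under this reading, the controller survives every cleanup unless `awscleanall` is set explicitly.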
4 changes: 2 additions & 2 deletions ansible.cfg
@@ -1,10 +1,10 @@
[defaults]
INVENTORY = inventory
INVENTORY = src/inventory
roles_path = ${PWD}/src/roles:${DOES_PROJECT_DIR}/does_config/roles
ANSIBLE_CALLBACKS_ENABLED = community.general.selective
inventory_ignore_extensions = ~, .orig, .bak, .ini, .cfg, .retry, .pyc, .pyo, .j2

ansible_ssh_common_args = '-o StrictHostKeyChecking=no -o ForwardAgent=yes'


# TODO: activate / deactivate to only show the pretty log output
stdout_callback = community.general.selective
File renamed without changes.
File renamed without changes.
File renamed without changes.
File renamed without changes.
File renamed without changes.
File renamed without changes.
7 changes: 7 additions & 0 deletions demo_project/does_config/group_vars/.gitignore
@@ -0,0 +1,7 @@
# The following files are ignored because they are generated by the repotemplate.py script.
# For your own experiments, remove this .gitignore file to commit hand-written experiment configurations.
all
ansible_controller
client
server
small
File renamed without changes.
File renamed without changes.
File renamed without changes.
File renamed without changes.
File renamed without changes.
File renamed without changes.
File renamed without changes.
File renamed without changes.
2 changes: 1 addition & 1 deletion docs/troubleshooting.md
@@ -7,7 +7,7 @@ This document contains some errors that may occur and possible reasons for them.
>
### Possible reason
Note that every host type defines `ec2_image` in the [group_vars](../group_vars). This instance ID is different in different AWS regions. It will also change with image updates (e.g., when Amazon updates the Ubuntu version of their standard image).
Note that every host type defines `ec2_image_id` in the [group_vars](../group_vars). This instance ID is different in different AWS regions. It will also change with image updates (e.g., when Amazon updates the Ubuntu version of their standard image).

Amazon describes [here](https://docs.aws.amazon.com/AWSEC2/latest/UserGuide/finding-an-ami.html#finding-an-ami-console) how to find the AMI ID for your desired region, OS, architecture, etc.

4 changes: 0 additions & 4 deletions group_vars/.gitignore

This file was deleted.

7 changes: 0 additions & 7 deletions group_vars/single/main.yml

This file was deleted.

61 changes: 61 additions & 0 deletions src/ansible-controller.yml
@@ -0,0 +1,61 @@
---

##################################################################################
# This playbook sets up an ansible controller on AWS. This instance is able to #
# run the experiment playbook and gather the results. It is intended to be used #
# as a long-running instance where GitHub actions can send commands to, to #
# trigger experiment runs. #
##################################################################################

##########################################################################
# Setup Ansible Controller #
##########################################################################
- name: Create ansible controller
  hosts: localhost
  tasks:

    - name: Template dynamic inventory config file
      # we do this because we want to have the project id as an inventory filter (such that only ec2 instances with a corresponding tag are visible)
      template:
        src: resources/inventory/aws_ec2.yml.j2
        dest: inventory/aws_ec2.yml
        mode: 0755
      vars:
        prj_clear: True
        is_ansible_controller: True

    # Load variables
    - name: Load group variables
      include_role:
        name: load-group-vars
      vars:
        groups_to_load:
          - all

    - name: Manually load variables for ansible_controller
      include_vars:
        file: "{{ external_group_vars_dir }}/ansible_controller/main.yml"

    - name: Setup AWS VPC
      include_role:
        name: suite-aws-vpc-create

    - name: Create AWS EC2 instance for ansible controller
      include_role:
        name: ansible-controller-ec2

- name: Setup ansible controller
  hosts: ansible_controller

  tasks:
    - name: Load group variables
      include_role:
        name: load-group-vars
      vars:
        groups_to_load:
          - all
          - ansible_controller

    - name: Configure the EC2 instance for running the AWS Ansible Experiment Suite
      include_role:
        name: ansible-controller-setup
4 changes: 2 additions & 2 deletions clear.yml → src/clear.yml
@@ -11,12 +11,12 @@
mode: 0755
vars:
prj_clear: True
is_ansible_controller: False

- name: Clear EC2 instances
include_role:
name: suite-aws-ec2-delete

- name: Clear VPC
include_role:
name: suite-aws-vpc-delete

14 changes: 14 additions & 0 deletions experiment-suite.yml → src/experiment-suite.yml
@@ -1,5 +1,15 @@
---


##########################################################################
# Load Experiment State and Setup AWS #
##########################################################################
- name: Load group_vars from non-standard location
hosts: localhost

tasks:
- include_vars: "{{ external_group_vars_dir }}/all/main.yml"

##########################################################################
# Load Experiment State and Setup AWS #
##########################################################################
@@ -15,6 +25,7 @@
mode: 0755
vars:
prj_clear: False # controls whether to also filter for the suite or not (for a clear we want to include all instances of the project independent of the suite)
is_ansible_controller: False

# TODO [nku] a design file should be able to contain an experiment design in table form or in list form (could use info on whether list of factor levels is defined)
- name: resolve suite_id, load and validate suite design, fill default values, and prepare variables
@@ -60,6 +71,9 @@
strategy: free
tasks:

- name: Load group_vars from non-standard location
include_vars: "{{ external_group_vars_dir }}/all/main.yml"

- name: Execute init roles (incl. common roles for all hosts)
include_role:
name: "{{ role_name }}"
File renamed without changes.
3 changes: 3 additions & 0 deletions src/group_vars/all/main.yml
@@ -0,0 +1,3 @@
does_project_dir: "{{ lookup('env', 'DOES_PROJECT_DIR') }}"
does_config_dir: "{{ does_project_dir }}/does_config"
external_group_vars_dir: "{{ does_config_dir }}/group_vars"
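The three variables above chain off a single environment variable. As a rough illustration of that derivation outside Ansible, here is a minimal Python sketch. The function name `resolve_does_paths` is hypothetical, and the explicit error on a missing variable is an addition of this sketch: Ansible's `env` lookup returns an empty string by default when the variable is unset.

```python
import os


def resolve_does_paths(env=None) -> dict[str, str]:
    """Mirror the group_vars chain: project dir -> config dir -> group_vars dir.

    `env` defaults to the process environment; passing a dict makes the
    function easy to try out without touching real environment variables.
    """
    env = os.environ if env is None else env
    project_dir = env.get("DOES_PROJECT_DIR", "")
    if not project_dir:
        # Assumption of this sketch: fail early instead of building
        # paths such as "/does_config" from an empty project dir.
        raise ValueError("DOES_PROJECT_DIR is not set")
    config_dir = f"{project_dir}/does_config"
    return {
        "does_project_dir": project_dir,
        "does_config_dir": config_dir,
        "external_group_vars_dir": f"{config_dir}/group_vars",
    }
```

For example, `resolve_does_paths({"DOES_PROJECT_DIR": "/tmp/demo"})` places the external group variables under `/tmp/demo/does_config/group_vars`.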
2 changes: 2 additions & 0 deletions src/inventory/.gitignore
@@ -0,0 +1,2 @@
# This is generated dynamically by ansible
aws_ec2.yml
File renamed without changes.
29 changes: 29 additions & 0 deletions src/resources/awscli/aws_cli_team_gpg_key.pub
@@ -0,0 +1,29 @@
-----BEGIN PGP PUBLIC KEY BLOCK-----

mQINBF2Cr7UBEADJZHcgusOJl7ENSyumXh85z0TRV0xJorM2B/JL0kHOyigQluUG
ZMLhENaG0bYatdrKP+3H91lvK050pXwnO/R7fB/FSTouki4ciIx5OuLlnJZIxSzx
PqGl0mkxImLNbGWoi6Lto0LYxqHN2iQtzlwTVmq9733zd3XfcXrZ3+LblHAgEt5G
TfNxEKJ8soPLyWmwDH6HWCnjZ/aIQRBTIQ05uVeEoYxSh6wOai7ss/KveoSNBbYz
gbdzoqI2Y8cgH2nbfgp3DSasaLZEdCSsIsK1u05CinE7k2qZ7KgKAUIcT/cR/grk
C6VwsnDU0OUCideXcQ8WeHutqvgZH1JgKDbznoIzeQHJD238GEu+eKhRHcz8/jeG
94zkcgJOz3KbZGYMiTh277Fvj9zzvZsbMBCedV1BTg3TqgvdX4bdkhf5cH+7NtWO
lrFj6UwAsGukBTAOxC0l/dnSmZhJ7Z1KmEWilro/gOrjtOxqRQutlIqG22TaqoPG
fYVN+en3Zwbt97kcgZDwqbuykNt64oZWc4XKCa3mprEGC3IbJTBFqglXmZ7l9ywG
EEUJYOlb2XrSuPWml39beWdKM8kzr1OjnlOm6+lpTRCBfo0wa9F8YZRhHPAkwKkX
XDeOGpWRj4ohOx0d2GWkyV5xyN14p2tQOCdOODmz80yUTgRpPVQUtOEhXQARAQAB
tCFBV1MgQ0xJIFRlYW0gPGF3cy1jbGlAYW1hem9uLmNvbT6JAlQEEwEIAD4WIQT7
Xbd/1cEYuAURraimMQrMRnJHXAUCXYKvtQIbAwUJB4TOAAULCQgHAgYVCgkICwIE
FgIDAQIeAQIXgAAKCRCmMQrMRnJHXJIXEAChLUIkg80uPUkGjE3jejvQSA1aWuAM
yzy6fdpdlRUz6M6nmsUhOExjVIvibEJpzK5mhuSZ4lb0vJ2ZUPgCv4zs2nBd7BGJ
MxKiWgBReGvTdqZ0SzyYH4PYCJSE732x/Fw9hfnh1dMTXNcrQXzwOmmFNNegG0Ox
au+VnpcR5Kz3smiTrIwZbRudo1ijhCYPQ7t5CMp9kjC6bObvy1hSIg2xNbMAN/Do
ikebAl36uA6Y/Uczjj3GxZW4ZWeFirMidKbtqvUz2y0UFszobjiBSqZZHCreC34B
hw9bFNpuWC/0SrXgohdsc6vK50pDGdV5kM2qo9tMQ/izsAwTh/d/GzZv8H4lV9eO
tEis+EpR497PaxKKh9tJf0N6Q1YLRHof5xePZtOIlS3gfvsH5hXA3HJ9yIxb8T0H
QYmVr3aIUes20i6meI3fuV36VFupwfrTKaL7VXnsrK2fq5cRvyJLNzXucg0WAjPF
RrAGLzY7nP1xeg1a0aeP+pdsqjqlPJom8OCWc1+6DWbg0jsC74WoesAqgBItODMB
rsal1y/q+bPzpsnWjzHV8+1/EtZmSc8ZUGSJOPkfC7hObnfkl18h+1QtKTjZme4d
H17gsBJr+opwJw/Zio2LMjQBOqlm3K1A4zFTh7wBC7He6KPQea1p2XAMgtvATtNe
YLZATHZKTJyiqA==
=vYOk
-----END PGP PUBLIC KEY BLOCK-----
File renamed without changes
@@ -3,12 +3,15 @@ plugin: aws_ec2
regions:
  - eu-central-1
filters:
{% if is_ansible_controller %} # for the ansible controller, we only filter for controllers but not projects
  tag:Name: ansible_controller
{% else %}
  tag:prj_id: {{ prj_id }}
{% if not prj_clear %}
  instance-state-name: ["running"]
  tag:suite: {{ suite }}
{% endif %}
{% endif %}

# keyed_groups may be used to create custom groups
#leading_separator: False
@@ -17,19 +20,19 @@ keyed_groups:
- prefix: ""
separator: ""
key: tags.prj_id

- prefix: ""
separator: ""
key: tags.suite

- prefix: ""
separator: ""
key: tags.exp_name

- prefix: ""
separator: ""
key: tags.host_type

- prefix: "is_controller"
separator: "_"
key: tags.is_controller
@@ -38,4 +41,6 @@ keyed_groups:
separator: "_"
key: tags.check_status


- prefix: ""
separator: ""
key: tags.Name
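The nested conditionals in the `filters` section of the inventory template are easier to check when written as a plain function. The following sketch mirrors the template's branches as shown in this diff; the function name `inventory_filters` is hypothetical and the function is not part of the suite.

```python
def inventory_filters(is_ansible_controller: bool, prj_clear: bool,
                      prj_id: str = "", suite: str = "") -> dict:
    """Return the EC2 inventory filters the template would render."""
    if is_ansible_controller:
        # Controller mode: match by Name tag only, independent of the project.
        return {"tag:Name": "ansible_controller"}
    filters = {"tag:prj_id": prj_id}
    if not prj_clear:
        # Experiment runs only see running instances of the current suite;
        # a clear sees every instance of the project.
        filters["instance-state-name"] = ["running"]
        filters["tag:suite"] = suite
    return filters
```

Note that `prj_clear` has no effect once `is_ansible_controller` is true, which matches the variable combination set in `src/ansible-controller.yml`.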
@@ -10,16 +10,17 @@ git_remote_repository: <<git_remote_repository>> # TODO: set remote repository (
# The experiments are mostly run concurrently (apart from the setup and cleanup parts). Thus, the experiment with the most jobs defines the
# maximal duration. But as experiments usually use fewer than 'job_n_tries' tries, an experiment with few long-running jobs can be the bottleneck too.
job_n_tries: <<job_n_tries>> # should be max 1000 (otherwise playbook freezes -> unsure why)
# TODO [mh]: must be tested if this is still an issue, I switched to an `until` loop
job_check_wait_time: <<job_check_wait_time>>

remote:
dir: "/home/ubuntu"
results_dir: "/home/ubuntu/results"

exp_code_dir: "{{ remote.dir }}/code"

local:
results_dir: "./results"
results_dir: "{{ does_project_dir }}/does_results"
designs_dir: "{{ does_config_dir }}/designs"

exp_base:
key_name: <<key_name>> # TODO: add key pair name
@@ -31,10 +32,3 @@ exp_base:
vpc_subnet_cidr: 10.100.0.0/24
sg_name: "{{ prj_id }}_sg"
sg_desc: "{{ prj_id }} security group"

separator: '_SEP_'

# This prefix (incl. the capitalization) is chosen by the ec2 plugin
ec2_tag_name_prefix: 'tag_Name_'
ec2_tag_prj_prefix: 'tag_Prj_'
ec2_tag_exp_prefix: 'tag_Exp_'
@@ -0,0 +1,32 @@
---

# The controller doesn't need to have beefy specs, it only distributes tasks.
# However, we need enough space for the results.

instance_type: <<instance_type>> # change if you feel the controller is a bottleneck
ec2_volume_size: <<volume_size>> # choose large enough to store results (at least until they are downloaded)

ec2_image_id: <<ec2_image_id>>
ec2_volume_snapshot: <<snapshot_id>>

ansible_controller_user: controller
ansible_controller_group: controller
ansible_controller_home: "/home/{{ ansible_controller_user }}"

# AWS Ansible Experiment Suite repo options
ansible_exp_suite_git_repo: <<ansible_exp_suite_git_repo>> # TODO: change to the URL of your clone of the ansible experiment suite
ansible_exp_suite_private_repo: yes # Set to false if the repo is public. In that case we don't need to set up SSH keys for the repo.
ansible_controller_does_dir: "{{ ansible_controller_home }}/aws_ansible_experiment_suite" # Path to where AWS Ansible Experiment Suite is cloned to

# SSH options
ansible_controller_ssh_key_dir: "{{ ansible_controller_home }}/.ssh"
ansible_controller_ssh_key_types:
  git: ecdsa
  aws: rsa
ansible_controller_ssh_key_sizes:
  git: 521
  aws: 4096

ansible_controller_ssh_key_paths:
  git: "{{ ansible_controller_ssh_key_dir }}/id_ssh_git"
  aws: "{{ ansible_controller_ssh_key_dir }}/{{ exp_base.key_name }}"
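The three SSH maps above share the keys `git` and `aws`, so each purpose has one type, one size, and one path. As an illustration of how such maps could translate into key generation, here is a minimal sketch that builds `ssh-keygen` argument lists. This is an assumption for illustration only: the `ansible-controller-setup` role is not shown in this diff and may generate keys differently, and the function name `keygen_commands` is hypothetical.

```python
def keygen_commands(key_types: dict, key_sizes: dict, key_paths: dict) -> list:
    """Build one ssh-keygen argument list per key purpose (e.g. git, aws).

    The commands are only constructed, not executed.
    """
    commands = []
    for purpose in sorted(key_types):
        commands.append([
            "ssh-keygen",
            "-t", key_types[purpose],        # key type, e.g. ecdsa or rsa
            "-b", str(key_sizes[purpose]),   # key size in bits
            "-f", key_paths[purpose],        # output file for the private key
            "-N", "",                        # empty passphrase for unattended use
        ])
    return commands
```

Each returned list could be passed to `subprocess.run` on the controller. Keeping the three maps keyed by purpose means adding another key only requires one new entry per map.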
@@ -3,5 +3,5 @@
instance_type: <<instance_type>> # TODO: choose instance type
ec2_volume_size: <<volume_size>> # TODO: choose volume size

ec2_image: <<ec2_image>>
ec2_image_id: <<ec2_image_id>>
ec2_volume_snapshot: <<snapshot_id>>
40 changes: 40 additions & 0 deletions src/roles/ansible-controller-clear/tasks/main.yml
@@ -0,0 +1,40 @@
---

- name: Collect info about running ec2 instances
  community.aws.ec2_instance_info:
    region: "{{ exp_base.aws_region }}"
    filters:
      # AWS tag keys are case-sensitive; the inventory template filters on tag:Name
      "tag:Name": "{{ ansible_controller }}"
  register: ec2_instance_info

# TODO: remove
- debug:
    msg: "ec2_instance_info: {{ ec2_instance_info }}"

- name: Extract instance ids of ec2 instances to remove
  ansible.builtin.set_fact:
    ec2_instance_ids: "{{ ec2_instance_info | json_query('*.instances[*].instance_id') | list | flatten }}"

# TODO: remove
- debug:
    msg: "ec2_instance_ids: {{ ec2_instance_ids }}"

- ansible.builtin.pause:
    seconds: 10
    prompt: |
      "Removing instances with the following ids:
      {{ ec2_instance_ids }}
      If the above instance IDs are wrong, abort now! (CTRL+C followed by 'A')"
  tags: [print_action]

- name: Cleanup AWS
  community.aws.ec2_instance:
    instance_ids: "{{ ec2_instance_ids }}"
    region: "{{ exp_base.aws_region }}"
    state: absent
  when: (ec2_instance_ids | length) > 0

- name: Remove AWS VPC
  ansible.builtin.include_role:
    name: suite-aws-vpc-delete
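The role above follows a collect, extract, guard pattern: gather instance info, reduce it to a list of IDs, and only call the termination step when that list is non-empty. The sketch below shows the extract and guard steps in plain Python. It assumes the registered result has a top-level `instances` list whose entries carry an `instance_id`; this shape and both function names are assumptions of the sketch, and the role itself uses a `json_query` expression instead.

```python
def instance_ids_to_remove(ec2_instance_info: dict) -> list:
    """Extract instance IDs from an (assumed) ec2_instance_info-style result."""
    instances = ec2_instance_info.get("instances") or []
    return [inst["instance_id"] for inst in instances if "instance_id" in inst]


def should_terminate(instance_ids: list) -> bool:
    """Guard corresponding to `when: (ec2_instance_ids | length) > 0`."""
    return len(instance_ids) > 0
```

The guard matters because a termination call with an empty ID list is at best a no-op, so skipping it keeps the playbook output unambiguous.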