feat: use dynamodb instead of ssm for JIT config #4446

Open: wants to merge 2 commits into `main`
108 changes: 108 additions & 0 deletions examples/ephemeral-multiarch-prebuilt/README.md
@@ -0,0 +1,108 @@
# Ephemeral Multi-Architecture Prebuilt Runners

This example demonstrates how to create GitHub Actions runners with the following features:

- **Ephemeral Runners**: Runners are used for one job only and terminated after completion
- **Multi-Architecture Support**: Configures both x64 and ARM64 runners
- **Prebuilt AMIs**: Uses custom prebuilt AMIs for faster startup times
- **DynamoDB Storage**: Uses DynamoDB instead of Parameter Store to avoid rate limiting issues
- **Cleanup for Offline Runners**: Includes a lambda to clean up registered offline runners from the organization

## Usage

Steps for the full setup, such as creating a GitHub App, can be found in the [docs](https://github-aws-runners.github.io/terraform-aws-github-runner/getting-started/). First, download the Lambda releases from GitHub. Alternatively, you can build the lambdas locally with Node or Docker; there is a simple build script in `<root>/.ci/build.sh`. In `main.tf` you can then remove the locations of the lambda zip files, since the default location will work in this case.

> The default example assumes locally built lambdas are available. Ensure you have built the lambdas. Alternatively, you can download them. The version must be set to a GitHub release version, see https://github.com/github-aws-runners/terraform-aws-github-runner/releases

```bash
cd ../lambdas-download
terraform init
terraform apply -var=module_version=<VERSION>
cd -
```


### Packer Images

You will need to build images for both the x64 and ARM64 architectures. This example deployment uses the images in `/images/linux-al2023`. You must first build these images with Packer in your AWS account. Once they are built, provide your owner ID as a variable.

### Deploy

Before running Terraform, ensure the GitHub App is configured. See the [configuration details](https://github-aws-runners.github.io/terraform-aws-github-runner/configuration/#ephemeral-runners) for more information.

```bash
terraform init
terraform apply
```
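
The required `github_app` input has no default, so it has to be supplied before `terraform apply` can succeed. The following is a minimal sketch of a `terraform.tfvars` file, assuming only the inputs listed in the Inputs table of this README; every value is a placeholder. Although `environment` defaults to `null`, `main.tf` interpolates it into resource names, so setting it is assumed to be necessary here.

```hcl
# terraform.tfvars (sketch): all values below are placeholders.
aws_region  = "eu-west-1"           # default shown in the Inputs table
environment = "ephemeral-multiarch" # hypothetical prefix for resource names

github_app = {
  id         = "123456"                       # placeholder GitHub App ID
  key_base64 = "<base64-encoded-private-key>" # placeholder, keep out of version control
}
```

Terraform loads a file named `terraform.tfvars` from the working directory automatically, so no extra flag is needed.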


The module will try to update the GitHub App webhook and secret (Linux/macOS only). You can retrieve the webhook secret by running:

```bash
terraform output webhook_secret
```


## Features

### Ephemeral Runners

Ephemeral runners are used for one job only. Each job requires a fresh instance. This feature should be used in combination with the `workflow_job` event. See GitHub webhook endpoint configuration in the documentation.

### Multi-Architecture Support

This example configures both x64 and ARM64 runners with appropriate labels. The module selects the runner for a workflow job by matching the labels requested by the job against the labels defined in the runner configuration.
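
To make the matching concrete: `main.tf` builds `local.multi_runner_config` by decoding every YAML file under `templates/runner-configs`, keyed by file name without the `.yaml` suffix. The sketch below shows the approximate shape of that map for the two label sets used in this example. The keys `linux-x64` and `linux-arm64` are hypothetical file names, the local name `multi_runner_config_shape` exists only for illustration, and most runner settings are omitted.

```hcl
# Illustrative only: approximate value of local.multi_runner_config.
# The map keys are hypothetical file names; most runner_config keys are omitted.
locals {
  multi_runner_config_shape = {
    "linux-x64" = {
      matcherConfig = {
        exactMatch    = true
        labelMatchers = [["self-hosted", "linux", "x64", "ephemeral"]]
      }
      runner_config = {
        runner_os           = "linux"
        runner_architecture = "x64"
      }
    }
    "linux-arm64" = {
      matcherConfig = {
        exactMatch    = true
        labelMatchers = [["self-hosted", "linux", "arm64", "ephemeral"]]
      }
      runner_config = {
        runner_os           = "linux"
        runner_architecture = "arm64"
      }
    }
  }
}
```

Under this configuration, a job that requests `runs-on: [self-hosted, linux, arm64, ephemeral]` would be expected to land on the ARM64 runner configuration.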

### DynamoDB Storage

This example uses DynamoDB instead of Parameter Store to store runner configuration and state. This helps avoid rate limiting issues that can occur with Parameter Store when managing many runners.
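
The internals of the `modules/dynamodb` module are not described in this README. As a rough illustration of the kind of table such a module might create, here is a hypothetical sketch using the standard `aws_dynamodb_table` resource. The table name, the `instance_id` key and the `ttl` attribute are assumptions made for illustration, not the module's actual schema.

```hcl
# Hypothetical sketch: NOT the actual schema of modules/dynamodb.
resource "aws_dynamodb_table" "runner_config" {
  name         = "example-runner-config" # placeholder table name
  billing_mode = "PAY_PER_REQUEST"       # on-demand capacity, as used in main.tf
  hash_key     = "instance_id"           # assumed partition key

  attribute {
    name = "instance_id"
    type = "S"
  }

  # Assumed TTL attribute so that stale entries expire on their own.
  ttl {
    attribute_name = "ttl"
    enabled        = true
  }

  tags = {
    Environment = "example"
  }
}
```

On-demand billing avoids provisioning read and write capacity up front, which suits bursty runner launches.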

### Cleanup for Offline Runners

The example includes a lambda function that periodically checks for and removes registered offline runners from the organization. This is particularly useful for handling cases where spot instances are terminated by AWS while still running a job.
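
`main.tf` passes `var.cleanup_org_runners` to the runners module, but the Inputs table of this README does not list it. A declaration along the following lines would be needed; the `bool` type, the default and the description are assumptions.

```hcl
# Hypothetical declaration; type, default and description are assumed.
variable "cleanup_org_runners" {
  description = "Whether to deploy the lambda that removes registered offline runners from the organization."
  type        = bool
  default     = false
}
```

With such a declaration, the cleanup could be switched on by setting `cleanup_org_runners = true` in a variables file.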

<!-- BEGIN_TF_DOCS -->
## Requirements

| Name | Version |
|------|---------|
| <a name="requirement_terraform"></a> [terraform](#requirement\_terraform) | >= 1.3.0 |
| <a name="requirement_aws"></a> [aws](#requirement\_aws) | ~> 5.27 |
| <a name="requirement_local"></a> [local](#requirement\_local) | ~> 2.0 |
| <a name="requirement_random"></a> [random](#requirement\_random) | ~> 3.0 |

## Providers

| Name | Version |
|------|---------|
| <a name="provider_random"></a> [random](#provider\_random) | 3.6.3 |

## Modules

| Name | Source | Version |
|------|--------|---------|
| <a name="module_base"></a> [base](#module\_base) | ../base | n/a |
| <a name="module_runners"></a> [runners](#module\_runners) | ../../modules/multi-runner | n/a |
| <a name="module_webhook_github_app"></a> [webhook\_github\_app](#module\_webhook\_github\_app) | ../../modules/webhook-github-app | n/a |

## Resources

| Name | Type |
|------|------|
| [random_id.random](https://registry.terraform.io/providers/hashicorp/random/latest/docs/resources/id) | resource |

## Inputs

| Name | Description | Type | Default | Required |
|------|-------------|------|---------|:--------:|
| <a name="input_aws_region"></a> [aws\_region](#input\_aws\_region) | AWS region to deploy to | `string` | `"eu-west-1"` | no |
| <a name="input_environment"></a> [environment](#input\_environment) | Environment name, used as prefix | `string` | `null` | no |
| <a name="input_github_app"></a> [github\_app](#input\_github\_app) | GitHub for API usages. | <pre>object({<br/> id = string<br/> key_base64 = string<br/> })</pre> | n/a | yes |

## Outputs

| Name | Description |
|------|-------------|
| <a name="output_webhook_endpoint"></a> [webhook\_endpoint](#output\_webhook\_endpoint) | n/a |
| <a name="output_webhook_secret"></a> [webhook\_secret](#output\_webhook\_secret) | n/a |
<!-- END_TF_DOCS -->
103 changes: 103 additions & 0 deletions examples/ephemeral-multiarch-prebuilt/main.tf
@@ -0,0 +1,103 @@
locals {
  webhook_secret = random_id.random.hex

  multi_runner_config = { for c in fileset("${path.module}/templates/runner-configs", "*.yaml") : trimsuffix(c, ".yaml") => yamldecode(file("${path.module}/templates/runner-configs/${c}")) }
}

resource "random_id" "random" {
  byte_length = 20
}

module "vpc" {
  source  = "terraform-aws-modules/vpc/aws"
  version = "5.0.0"

  name = "${var.environment}-vpc"
  cidr = "10.0.0.0/16"

  azs             = ["${var.aws_region}a", "${var.aws_region}b"]
  private_subnets = ["10.0.1.0/24", "10.0.2.0/24"]
  public_subnets  = ["10.0.101.0/24", "10.0.102.0/24"]

  enable_dns_hostnames    = true
  enable_nat_gateway      = false
  map_public_ip_on_launch = true

  tags = {
    Environment = var.environment
  }
}

module "dynamodb" {
  source = "../../modules/dynamodb"

  table_name   = "${var.environment}-runner-config"
  billing_mode = "PAY_PER_REQUEST"
  tags = {
    Environment = var.environment
  }
}

module "runners" {
  source                            = "../../modules/multi-runner"
  aws_region                        = var.aws_region
  multi_runner_config               = local.multi_runner_config
  vpc_id                            = module.vpc.vpc_id
  subnet_ids                        = module.vpc.public_subnets
  runners_scale_up_lambda_timeout   = 60
  runners_scale_down_lambda_timeout = 60
  cleanup_org_runners               = var.cleanup_org_runners
  prefix                            = var.environment
  dynamodb_arn                      = module.dynamodb.table_arn
  dynamodb_table_name               = module.dynamodb.table_name
  tags = {
    Environment = var.environment
  }
  github_app = {
    key_base64     = var.github_app.key_base64
    id             = var.github_app.id
    webhook_secret = local.webhook_secret
  }

  logging_retention_in_days = 7

  # Deploy the webhook using EventBridge
  eventbridge = {
    enable = true
    # Adjust accept_events to allow only specific events, such as workflow_job
    accept_events = ["workflow_job"]
  }

  webhook_lambda_zip = "../../lambda_output/webhook.zip"
  runners_lambda_zip = "../../lambda_output/runners.zip"

  instance_termination_watcher = {
    enable = true
  }

  runners_ssm_housekeeper = {
    state  = "DISABLED"
    config = {}
  }

  metrics = {
    enable = true
    metric = {
      enable_github_app_rate_limit    = true
      enable_job_retry                = true
      enable_spot_termination_warning = true
    }
  }
}

module "webhook_github_app" {
  source     = "../../modules/webhook-github-app"
  depends_on = [module.runners]

  github_app = {
    key_base64     = var.github_app.key_base64
    id             = var.github_app.id
    webhook_secret = local.webhook_secret
  }
  webhook_endpoint = module.runners.webhook.endpoint
}
8 changes: 8 additions & 0 deletions examples/ephemeral-multiarch-prebuilt/outputs.tf
@@ -0,0 +1,8 @@
output "webhook_endpoint" {
  value = module.runners.webhook.endpoint
}

output "webhook_secret" {
  sensitive = true
  value     = random_id.random.hex
}
9 changes: 9 additions & 0 deletions examples/ephemeral-multiarch-prebuilt/providers.tf
@@ -0,0 +1,9 @@
provider "aws" {
  region = var.aws_region

  default_tags {
    tags = {
      Environment = var.environment
    }
  }
}
@@ -0,0 +1,64 @@
matcherConfig:
  exactMatch: true
  labelMatchers:
    - [self-hosted, linux, x64, ephemeral]
fifo: true
redrive_build_queue:
  enabled: true
  maxReceiveCount: 3
runner_config:
  runner_os: linux
  runner_architecture: x64
  runner_run_as: ubuntu
  runner_name_prefix: ubuntu-2204-amd64_
  enable_ssm_on_runners: true
  credit_specification: standard
  instance_types:
    - m7a.large
    - m7i.large
    - m7i-flex.large
    - m6a.large
    - m6i.large
  runners_maximum_count: 256
  delay_webhook_event: 0
  scale_down_schedule_expression: cron(* * * * ? *)
  userdata_template: ./templates/user-data.sh
  enable_userdata: true
  ami_owners:
    - "self"
  ami_filter:
    name:
      - github-runner-ubuntu-jammy-amd64-*
    state:
      - available
  enable_organization_runners: true
  enable_ephemeral_runners: true
  enable_job_queued_check: true
  minimum_running_time_in_minutes: 2
  enable_runner_binaries_syncer: false
  create_service_linked_role_spot: true
  scale_up_reserved_concurrent_executions: 12
  lambda_architecture: arm64
  job_retry:
    enabled: true
    max_attempts: 3
    delay_in_seconds: 180
  block_device_mappings:
    - device_name: /dev/xvda
      delete_on_termination: true
      volume_type: gp3
      volume_size: 40
      encrypted: true
      iops: null
      throughput: null
      kms_key_id: null
      snapshot_id: null
  runner_log_files:
    - log_group_name: syslog
      prefix_log_group: true
      file_path: /var/log/syslog
      log_stream_name: "{instance_id}"
  runner_hook_job_started: |
    echo "Running pre job hook as $(whoami)"
  runner_hook_job_completed: |
    echo "Running post job hook as $(whoami)"
@@ -0,0 +1,62 @@
matcherConfig:
  exactMatch: true
  labelMatchers:
    - [self-hosted, linux, arm64, ephemeral]
fifo: true
redrive_build_queue:
  enabled: true
  maxReceiveCount: 3
runner_config:
  runner_os: linux
  runner_architecture: arm64
  runner_run_as: ubuntu
  runner_name_prefix: ubuntu-2204-arm64_
  enable_ssm_on_runners: true
  credit_specification: standard
  instance_types:
    - m8g.large
    - m7g.large
    - m6g.large
  runners_maximum_count: 256
  delay_webhook_event: 0
  scale_down_schedule_expression: cron(* * * * ? *)
  userdata_template: ./templates/user-data.sh
  enable_userdata: true
  ami_owners:
    - "self"
  ami_filter:
    name:
      - github-runner-ubuntu-jammy-arm64-*
    state:
      - available
  enable_organization_runners: true
  enable_ephemeral_runners: true
  enable_job_queued_check: true
  minimum_running_time_in_minutes: 2
  enable_runner_binaries_syncer: false
  create_service_linked_role_spot: true
  scale_up_reserved_concurrent_executions: 12
  lambda_architecture: arm64
  job_retry:
    enabled: true
    max_attempts: 3
    delay_in_seconds: 180
  block_device_mappings:
    - device_name: /dev/xvda
      delete_on_termination: true
      volume_type: gp3
      volume_size: 40
      encrypted: true
      iops: null
      throughput: null
      kms_key_id: null
      snapshot_id: null
  runner_log_files:
    - log_group_name: syslog
      prefix_log_group: true
      file_path: /var/log/syslog
      log_stream_name: "{instance_id}"
  runner_hook_job_started: |
    echo "Running pre job hook as $(whoami)"
  runner_hook_job_completed: |
    echo "Running post job hook as $(whoami)"
31 changes: 31 additions & 0 deletions examples/ephemeral-multiarch-prebuilt/templates/user-data.sh
@@ -0,0 +1,31 @@
#!/bin/bash
exec > >(tee /var/log/user-data.log | logger -t user-data -s 2>/dev/console) 2>&1


# AWS suggests creating a log for debugging purposes, see https://aws.amazon.com/premiumsupport/knowledge-center/ec2-linux-log-user-data/
# As a side effect, all commands will be logged; set +x disables debugging explicitly.
#
# An alternative for masking tokens could be: exec > >(sed 's/--token\ [^ ]* /--token\ *** /g' > /var/log/user-data.log) 2>&1
set +x

%{ if enable_debug_logging }
set -x
%{ endif }

cd /opt/actions-runner

%{ if hook_job_started != "" }
cat > /opt/actions-runner/hook_job_started.sh <<'EOF'
${hook_job_started}
EOF
echo ACTIONS_RUNNER_HOOK_JOB_STARTED=/opt/actions-runner/hook_job_started.sh | tee -a /opt/actions-runner/.env
%{ endif }

%{ if hook_job_completed != "" }
cat > /opt/actions-runner/hook_job_completed.sh <<'EOF'
${hook_job_completed}
EOF
echo ACTIONS_RUNNER_HOOK_JOB_COMPLETED=/opt/actions-runner/hook_job_completed.sh | tee -a /opt/actions-runner/.env
%{ endif }

${start_runner}