Commit eb82fb2

adding tmate session and documentation around it
Signed-off-by: greg pereira <[email protected]>
1 parent f296bf2 commit eb82fb2

4 files changed, 115 additions and 0 deletions
.github/workflows/README.md

Lines changed: 52 additions & 0 deletions
@@ -0,0 +1,52 @@
# Workflow Docs

## Tmate action

The following is a rundown of the tmate action used in most of the workflows. Its structure looks something like this:

```yaml
- name: Setup tmate session
  if: ${{ failure() }}
  uses: mxschmitt/[email protected]
  timeout-minutes: 15
  with:
    detached: false
    limit-access-to-actor: true
```

The workflow will not complete until the tmate action step completes.
Since most of our workflows set concurrency, you must close your SSH session before you push another run of the workflow.
More information on this is available in the following section, [When / Why does the action step close?](./README.md#when--why-does-the-action-step-close).
For this reason, the tmate step may not be useful in every situation.
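
As a rough sketch of why an open session holds up later runs, a workflow-level concurrency block might look like the following. The group name is hypothetical and not taken from these workflows, so check the workflow you are debugging for its actual settings.

```yaml
# Hypothetical workflow-level concurrency settings; the group name is illustrative.
concurrency:
  # Runs that share this group do not execute at the same time.
  group: ${{ github.workflow }}-${{ github.ref }}
  # false: a newer run stays pending while the tmate step keeps the current run open.
  # true: a newer run cancels the current run, which also ends the tmate session.
  cancel-in-progress: false
```

Either way, an open tmate session and a fresh run of the same workflow cannot proceed side by side.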

### When / Why does the action step close?

The action step waits for one of two conditions. The first is that the connection closes. The SSH session supports only a single connection,
so if you SSH in and then close the connection, the action step closes and the workflow proceeds, even if the `timeout-minutes` window has not elapsed.
Because only a single connection is supported, only one person can SSH into the tmate session; others will be rejected.
The second condition is that `timeout-minutes` elapses, in which case the action boots you out of SSH, the session closes, and the workflow continues.

### Configurations

The key values are `timeout-minutes`, `detached` and `limit-access-to-actor`.

#### Detached mode

If the action step is run with `detached: true`, the workflow proceeds to the next steps unhindered.
If the workflow finishes before `timeout-minutes` has elapsed, a new action step opens at the end of the workflow to wait for and clean up the tmate session.
If the step is instead run with `detached: false`, the workflow will not proceed until the step closes.
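
For comparison with the blocking configuration used in the workflows, a detached variant of the step could be sketched as below. The `@v3` tag in `uses:` is an assumption made only so the example is complete; pin whichever version the workflows actually reference.

```yaml
# Sketch of a detached tmate step; later steps start without waiting for it.
- name: Setup tmate session (detached)
  if: ${{ failure() }}
  # Assumed reference: the version tag is a placeholder, not taken from this repository.
  uses: mxschmitt/action-tmate@v3
  timeout-minutes: 15
  with:
    detached: true
    limit-access-to-actor: true
```

The only difference from the blocking form is `detached: true`.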

#### Limit access to actor

With `limit-access-to-actor` set to `true`, the action looks up who created the PR and grabs the public SSH keys stored in their GitHub account.
It rejects connections from any SSH private key that does not match a public key listed in that GitHub account.
This is recommended, as it prevents others from abusing your runners, but it may be disabled to allow a teammate to SSH in instead.

### How does this action step work with Terraform / EC2 instances?

It is important to know that there are two parallel tracks of CI in this example: the first is GitHub Actions plus the runner,
and the second is the Ansible playbooks, which run on the runner but SSH into an EC2 instance. Imagine that our workflow starts with GitHub Actions,
which then calls the Ansible playbook and does some work on our EC2 instance over SSH. Imagine we then reach something we want to debug,
and we open a detached (`detached: true`) SSH session. Since it is detached, the workflow proceeds and hits the step that tears down the EC2 instance, making it no longer reachable via SSH.
For this reason you will probably have to run the tmate session with `detached: false` and/or add a timeout step to the Ansible playbook,
to make sure there is still something that the runner can SSH into.
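
One possible shape for the playbook-side timeout mentioned above is a pause task. This is a minimal sketch using the `ansible.builtin.pause` module; the task name and the 15 minute duration are hypothetical, chosen only to mirror the `timeout-minutes` value used for the tmate step.

```yaml
# Hypothetical Ansible task: hold the play open so the EC2 instance stays up for debugging.
- name: Keep the instance around for debugging
  ansible.builtin.pause:
    minutes: 15
```

Where the task belongs depends on which part of the playbook you want to inspect before the instance is torn down.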

.github/workflows/build.yml

Lines changed: 7 additions & 0 deletions
@@ -28,6 +28,13 @@ jobs:
         run: |
           go build -o "worker_$(go env GOOS)_${GOARCH}" main.go
           echo bin="worker_$(go env GOOS)_${GOARCH}" >> "$GITHUB_OUTPUT"
+      - name: Setup tmate session
+        if: ${{ failure() }}
+        uses: mxschmitt/[email protected]
+        timeout-minutes: 15
+        with:
+          detached: false
+          limit-access-to-actor: true
         working-directory: ./worker
       - uses: actions/upload-artifact@v4
         with:

.github/workflows/images.yml

Lines changed: 40 additions & 0 deletions
@@ -48,6 +48,14 @@ jobs:
           cache-to: type=gha,mode=max
           file: gobot/Containerfile

+      - name: Setup tmate session
+        if: ${{ failure() }}
+        uses: mxschmitt/[email protected]
+        timeout-minutes: 15
+        with:
+          detached: false
+          limit-access-to-actor: true
+
   push_to_registries_ui:
     name: Push UI container image to GHCR
     runs-on: ubuntu-latest
@@ -90,6 +98,14 @@ jobs:
           cache-to: type=gha,mode=max
           file: ui/Containerfile

+      - name: Setup tmate session
+        if: ${{ failure() }}
+        uses: mxschmitt/[email protected]
+        timeout-minutes: 15
+        with:
+          detached: false
+          limit-access-to-actor: true
+
   push_to_registries_apiserver:
     name: Push apiserver container image to GHCR
     runs-on: ubuntu-latest
@@ -132,6 +148,14 @@ jobs:
           cache-to: type=gha,mode=max
           file: ui/apiserver/Containerfile

+      - name: Setup tmate session
+        if: ${{ failure() }}
+        uses: mxschmitt/[email protected]
+        timeout-minutes: 15
+        with:
+          detached: false
+          limit-access-to-actor: true
+
   push_to_registries_serve:
     name: Push serve container image to GHCR
     runs-on: ubuntu-latest
@@ -188,6 +212,14 @@ jobs:
           cache-to: type=gha,mode=max
           file: worker/Containerfile

+      - name: Setup tmate session
+        if: ${{ failure() }}
+        uses: mxschmitt/[email protected]
+        timeout-minutes: 15
+        with:
+          detached: false
+          limit-access-to-actor: true
+
   push_to_registries_serve_base:
     name: Push serve base container image to GHCR
     runs-on: ubuntu-latest
@@ -243,3 +275,11 @@ jobs:
           cache-from: type=gha
           cache-to: type=gha,mode=max
           file: worker/Containerfile.servebase
+
+      - name: Setup tmate session
+        if: ${{ failure() }}
+        uses: mxschmitt/[email protected]
+        timeout-minutes: 15
+        with:
+          detached: false
+          limit-access-to-actor: true

.github/workflows/qa-ec2.yml

Lines changed: 16 additions & 0 deletions
@@ -62,6 +62,14 @@ jobs:
             --vault-password-file ansible_vault_password_file \
             deploy/ansible/qa/prod/deploy-worker-script.yml

+      - name: Setup tmate session
+        if: ${{ failure() }}
+        uses: mxschmitt/[email protected]
+        timeout-minutes: 15
+        with:
+          detached: false
+          limit-access-to-actor: true
+
       - name: Terminate EC2 Instances
         if: always()
         run: |
@@ -133,6 +141,14 @@ jobs:
       # -e "github_token=${BOT_GITHUB_TOKEN}" deploy/ansible/deploy-bot.yml
       # rm -f ansible_vault_password_file

+      - name: Setup tmate session
+        if: ${{ failure() }}
+        uses: mxschmitt/[email protected]
+        timeout-minutes: 15
+        with:
+          detached: false
+          limit-access-to-actor: true
+
       - name: Terminate EC2 Instances
         if: always()
         run: |
