Commit

adding tmate session and documentation around it
Signed-off-by: greg pereira <[email protected]>
Gregory-Pereira committed May 17, 2024
1 parent f296bf2 commit eb82fb2
Showing 4 changed files with 115 additions and 0 deletions.
52 changes: 52 additions & 0 deletions .github/workflows/README.md
@@ -0,0 +1,52 @@
# Workflow Docs

## Tmate action

The following is a rundown of the tmate action used in most of the workflows. Its structure looks something like this:

```github-action
- name: Setup tmate session
  if: ${{ failure() }}
  uses: mxschmitt/[email protected]
  timeout-minutes: 15
  with:
    detached: false
    limit-access-to-actor: true
```

Note that the workflow will not complete until the tmate action step completes.
Since most of our workflows set `concurrency`, this means that before a newly pushed run of the workflow can proceed, you must first close your SSH session.
More information on this is available in the following section [When / Why does the action step close](./README.md#when--why-does-the-action-step-close).
For this reason, the tmate step may not be useful in every situation.
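
As a minimal sketch of why this happens, assume a workflow declares a concurrency group along the following lines. The group expression and the `cancel-in-progress` value are illustrative assumptions, not copied from our workflows. With `cancel-in-progress: false`, a newer run in the same group stays pending while an older run is still held open by the tmate step:

```yaml
# Hypothetical concurrency block; the group expression is an assumption for illustration.
concurrency:
  group: ${{ github.workflow }}-${{ github.ref }}
  cancel-in-progress: false  # newer runs wait instead of cancelling the run in progress
```

If `cancel-in-progress` were `true` instead, the run in progress (and the tmate session with it) would be cancelled when a newer run is queued.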

### When / Why does the action step close?

This action waits for one of two conditions. The first is the connection closing. The SSH session supports only a single connection:
if you SSH in and then close the connection, the action step will close and the workflow will proceed, even if the `timeout-minutes` window has not elapsed.
Because only a single connection is supported, only one person can SSH into the tmate session; others will be rejected.
The second condition is that the `timeout-minutes` window elapses, in which case the action will boot you out of SSH, the session will close, and the workflow will continue.

### Configurations

The key values are `timeout-minutes`, `detached` and `limit-access-to-actor`.

#### Detached mode

If the action step is run with `detached: true`, the workflow will proceed to the next steps unhindered.
If the workflow finishes before `timeout-minutes` has elapsed, it will open a new action step at the end of the workflow to wait for and clean up the tmate session.
If the step is instead run with `detached: false`, the workflow will not proceed until the step closes.
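
For illustration, a detached variant of the step might look like the sketch below. The `@v3` tag and the step name are assumptions; use whichever version the workflows actually pin.

```yaml
- name: Setup tmate session (detached)
  if: ${{ failure() }}
  uses: mxschmitt/action-tmate@v3  # version tag is an assumption
  timeout-minutes: 15
  with:
    detached: true               # later steps keep running while the session stays open
    limit-access-to-actor: true
```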

#### Limit access to actor

With `limit-access-to-actor` set to `true`, the action looks up who created the PR and grabs the public SSH keys stored in their GitHub account.
It will reject connections from any SSH private key that does not match a public key listed in that GitHub account.
This is recommended, as it prevents others from abusing your runners, but it may be disabled to allow a teammate to SSH in instead.
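
A hedged sketch of the relaxed setting, for the case where a teammate needs to connect instead of the PR author. As above, the `@v3` tag is an assumption, and anyone who obtains the connection string could connect while this is disabled:

```yaml
- name: Setup tmate session (open to teammates)
  if: ${{ failure() }}
  uses: mxschmitt/action-tmate@v3  # version tag is an assumption
  timeout-minutes: 15
  with:
    detached: false
    limit-access-to-actor: false   # do not restrict access to the actor's GitHub SSH keys
```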

### How does this action step work with Terraform / EC2 instances?

This is a great question! It's important to know that there are two parallel tracks of CI in this example: the first is GitHub Actions plus the runner,
and the second is the Ansible playbooks, which run on the runner but SSH to an EC2 instance. Imagine that our workflow starts with GitHub Actions,
which then calls the Ansible playbook and does some work on our EC2 instance over SSH. Imagine we then reach something we want to debug,
and we open a `detached` SSH session. Since it is detached, the workflow will proceed and hit the step that tears down the EC2 instance, making it no longer reachable via SSH.
For this reason you will probably have to run the tmate session with `detached: false` and/or add a timeout step to the Ansible playbook,
to make sure you still have something that the runner can SSH into.
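
The ordering described above can be sketched as follows. This shows step ordering only: the playbook path and the teardown command are hypothetical placeholders, and the `@v3` tag is an assumption.

```yaml
# Sketch of step ordering only; the playbook path and teardown command are placeholders.
- name: Run Ansible playbook against the EC2 instance
  run: ansible-playbook deploy/example-playbook.yml  # hypothetical path

- name: Setup tmate session
  if: ${{ failure() }}
  uses: mxschmitt/action-tmate@v3  # version tag is an assumption
  timeout-minutes: 15
  with:
    detached: false  # block here so the teardown below waits for the session to close
    limit-access-to-actor: true

- name: Terminate EC2 Instances
  if: always()
  run: echo "terminate the EC2 instances here"  # placeholder for the real teardown
```

Because the tmate step blocks, the `if: always()` teardown step runs only after the session closes or `timeout-minutes` elapses, so the instance stays reachable while you debug.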
7 changes: 7 additions & 0 deletions .github/workflows/build.yml
@@ -28,6 +28,13 @@ jobs:
run: |
go build -o "worker_$(go env GOOS)_${GOARCH}" main.go
echo bin="worker_$(go env GOOS)_${GOARCH}" >> "$GITHUB_OUTPUT"
- name: Setup tmate session
if: ${{ failure() }}
uses: mxschmitt/[email protected]
timeout-minutes: 15
with:
detached: false
limit-access-to-actor: true
working-directory: ./worker
- uses: actions/upload-artifact@v4
with:
40 changes: 40 additions & 0 deletions .github/workflows/images.yml
@@ -48,6 +48,14 @@ jobs:
cache-to: type=gha,mode=max
file: gobot/Containerfile

- name: Setup tmate session
if: ${{ failure() }}
uses: mxschmitt/[email protected]
timeout-minutes: 15
with:
detached: false
limit-access-to-actor: true

push_to_registries_ui:
name: Push UI container image to GHCR
runs-on: ubuntu-latest
@@ -90,6 +98,14 @@ jobs:
cache-to: type=gha,mode=max
file: ui/Containerfile

- name: Setup tmate session
if: ${{ failure() }}
uses: mxschmitt/[email protected]
timeout-minutes: 15
with:
detached: false
limit-access-to-actor: true

push_to_registries_apiserver:
name: Push apiserver container image to GHCR
runs-on: ubuntu-latest
@@ -132,6 +148,14 @@ jobs:
cache-to: type=gha,mode=max
file: ui/apiserver/Containerfile

- name: Setup tmate session
if: ${{ failure() }}
uses: mxschmitt/[email protected]
timeout-minutes: 15
with:
detached: false
limit-access-to-actor: true

push_to_registries_serve:
name: Push serve container image to GHCR
runs-on: ubuntu-latest
@@ -188,6 +212,14 @@ jobs:
cache-to: type=gha,mode=max
file: worker/Containerfile

- name: Setup tmate session
if: ${{ failure() }}
uses: mxschmitt/[email protected]
timeout-minutes: 15
with:
detached: false
limit-access-to-actor: true

push_to_registries_serve_base:
name: Push serve base container image to GHCR
runs-on: ubuntu-latest
@@ -243,3 +275,11 @@ jobs:
cache-from: type=gha
cache-to: type=gha,mode=max
file: worker/Containerfile.servebase

- name: Setup tmate session
if: ${{ failure() }}
uses: mxschmitt/[email protected]
timeout-minutes: 15
with:
detached: false
limit-access-to-actor: true
16 changes: 16 additions & 0 deletions .github/workflows/qa-ec2.yml
@@ -62,6 +62,14 @@ jobs:
--vault-password-file ansible_vault_password_file \
deploy/ansible/qa/prod/deploy-worker-script.yml
- name: Setup tmate session
if: ${{ failure() }}
uses: mxschmitt/[email protected]
timeout-minutes: 15
with:
detached: false
limit-access-to-actor: true

- name: Terminate EC2 Instances
if: always()
run: |
@@ -133,6 +141,14 @@ jobs:
# -e "github_token=${BOT_GITHUB_TOKEN}" deploy/ansible/deploy-bot.yml
# rm -f ansible_vault_password_file

- name: Setup tmate session
if: ${{ failure() }}
uses: mxschmitt/[email protected]
timeout-minutes: 15
with:
detached: false
limit-access-to-actor: true

- name: Terminate EC2 Instances
if: always()
run: |