code-kern-ai/cicd-deployment-scripts

cicd-deployment-scripts

Scripts used for Kern AI CI/CD efforts.

GitHub Actions

GitHub: Admin Repositories Settings

Workflow file: admin_update_repo_settings.yml

Triggers:

  • workflow_call

Description:

  • updates IaC repository General Settings and Rulesets

Jobs:

  • GitHub: Update General Repository Settings

    • Update General Repository Settings
  • GitHub: Update tf-module Rulesets

    • Update tf-module Rulesets
  • GitHub: Update tf-iac Rulesets

    • Update tf-iac Rulesets

ACR: Delete Docker Images

Workflow file: az_acr_delete.yml

Triggers:

  • workflow_call

Description:

  • deletes Container Images specified by the workflow input

Jobs:

  • Docker: Delete Test Tags

    • Configure branch name
    • Delete Container Image
  • Docker: Delete Branch Tags

    • Configure branch name
    • Delete Branch Container Image

ACR: Docker Push

Workflow file: az_acr_push.yml

Triggers:

  • workflow_dispatch
  • push

Description:

  • before pushing the Docker image, the branch name is resolved to replace / with - and the image is built with the resolved branch name
  • builds and deploys Docker images in multiple steps
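The branch-name resolution described above can be sketched roughly as follows. This is a minimal illustration, not the workflow's actual "Configure branch name" step: the helper names `resolve_branch_name` and `image_tag`, and the example platform and branch values, are assumptions made for this example.

```python
def resolve_branch_name(ref_name: str) -> str:
    """Replace '/' with '-' so the branch name can be used in a Docker tag."""
    return ref_name.replace("/", "-")


def image_tag(platform: str, ref_name: str) -> str:
    """Build a tag of the form <platform>-<resolved-branch> (hypothetical helper)."""
    return f"{platform}-{resolve_branch_name(ref_name)}"


if __name__ == "__main__":
    # Example values only; real platforms come from the workflow matrix.
    print(image_tag("amd64", "feature/add-login"))
```

Docker tags cannot contain a slash, which is presumably why the branch name is normalized before the image is built.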

Jobs:

  • Docker: Build & Push
    • Configure branch name
    • Build & Push <application-repo>:${{ matrix.platform }}-<feature-hotfix>
    • Build & Push <application-repo>:${{ matrix.platform }}-gpu

ACR: Docker Push Release

Workflow file: az_acr_release.yml

Triggers:

  • workflow_dispatch
  • pull_request_closed
  • release

Description:

  • builds and deploys Docker images in multiple steps

Jobs:

  • Docker: Build & Push
    • Build & Push <application-repo>:amd64
    • Build & Push <application-repo>:arm64
    • Build & Push <application-repo>:latest

ACR: Docker Push Test

Workflow file: az_acr_test.yml

Triggers:

  • pull_request_opened_synchronized

Outputs:

  • GH_REF_NAME

Description:

  • before pushing the Docker image, the branch name is resolved to replace / with - and the image is built with the resolved branch name
  • builds and deploys the test Docker Image used by the K8: Test workflow

Jobs:

  • Docker: Build & Push (Test)
    • Configure branch name
    • Build & Push <application-repo>:test-<feature-hotfix>

Azure: Function App Deployment

Workflow file: az_fnapp_deploy.yml

Triggers:

  • workflow_dispatch
  • push

Description:

  • builds and deploys the Azure Function App
  • currently used to deploy the self-hosted GitHub Actions Runner Monitor

Jobs:

  • Azure: Build & Deploy Function App
    • Resolve Project Dependencies Using Pip
    • Run Azure Functions Action

GitHub: Delete Branch

Workflow file: gh_delete_branch.yml

Triggers:

  • pull_request_closed

Description:

  • calls the ACR: Delete Docker Images job, targeting the tag :test-<feature/hotfix>
  • deletes the feature/hotfix branch Container Images (:<platform>-<feature/hotfix>)
  • deletes the feature/hotfix branch

Troubleshooting:

  • this job will fail when the feature/hotfix branch is deleted manually

Jobs:

  • ACR: Delete Test Image

    • Configure branch name
    • Delete Container Image
  • ACR: Delete Branch Images

    • Configure branch name
    • Delete Branch Container Image
  • GitHub: Delete Branch

    • Delete Branch

GitHub: Release

Workflow file: gh_release.yml

Triggers:

  • release

Inputs:

  • deployment_status

Description:

  • publishes a release on GitHub with the tag generated by the pre-release that triggered this workflow
  • deletes a pre-release on GitHub with the tag generated by the pre-release that triggered this workflow
  • runs in case of a release deployment failure

Troubleshooting:

  • after fixing the error that caused the release deployment failure, recreate the pre-release to trigger the release deployment again

Jobs:

  • GitHub: Publish Release

    • Publish Release
  • GitHub: Delete Prerelease

    • Delete Prerelease

GitHub: Validate Release

Workflow file: gh_validate_release.yml

Triggers:

  • release

Description:

  • validates the release tag generated by the pre-release that triggered this workflow, using a RegEx check for semantic versioning

Troubleshooting:

  • inspect the pre-release tag name and ensure it follows the RegEx check for semantic versioning
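A check of this kind could look like the sketch below. The pattern is an assumption, since the workflow's actual RegEx is not shown here: it accepts an optional `v` prefix followed by MAJOR.MINOR.PATCH without leading zeros, and it rejects pre-release or build suffixes. The name `is_valid_release_tag` is hypothetical.

```python
import re

# Assumed pattern: optional "v" prefix, then MAJOR.MINOR.PATCH without leading zeros.
SEMVER_TAG = re.compile(r"v?(0|[1-9]\d*)\.(0|[1-9]\d*)\.(0|[1-9]\d*)")


def is_valid_release_tag(tag: str) -> bool:
    """Return True when the whole tag matches the assumed semantic-version pattern."""
    return SEMVER_TAG.fullmatch(tag) is not None


if __name__ == "__main__":
    for candidate in ("v1.2.3", "1.0.0", "v1.2", "release-1"):
        print(candidate, is_valid_release_tag(candidate))
```

Running a candidate tag through a check like this locally, before creating the pre-release, can save a failed workflow run.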

Jobs:

  • GitHub: Validate Release
    • Validate Release Tag

K8: Apply

Workflow file: k8s_apply.yml

Triggers:

  • pull_request_closed (dev)
  • workflow_dispatch

Description:

  • generates a Kubernetes kustomization diff and applies it to the cluster
  • differs from the k8s-deploy job in that it applies the entire namespace, as opposed to application-specific configurations

Jobs:

  • K8: Apply Cluster Resources
    • Generate Kustomization
    • Apply Kustomization
    • Assert Deploy Success
    • Revert on failure

K8: Cluster Deploy

Workflow file: k8s_deploy.yml

Triggers:

  • workflow_call

Inputs:

  • environment

Outputs:

  • deployment_status

Description:

  • deploys the application to the Kubernetes cluster
  • differs from the k8s-apply job in that it applies application-specific configurations, as opposed to the entire namespace
  • uses Canary Deployment strategy

Jobs:

  • K8: Deploy
    • Generate Kustomization
    • Generate Deployment
    • Assert Deployment Success
    • Promote Deployment
    • Reject Deployment

K8: Destroy

Workflow file: k8s_destroy.yml

Triggers:

  • workflow_dispatch

Description:

  • deletes all deployment and service Kubernetes resources in the namespace configured by GitHub Actions Environment Variables

Jobs:

  • K8: Destroy Cluster Namespace
    • Destroy Cluster Namespace

K8: Edit

Workflow file: k8s_edit.yml

Triggers:

  • pull_request_closed

Description:

  • updates the Kubernetes deployment image tags to the latest release
  • creates a new branch automated-release-dev and a corresponding Pull Request in the k8-cluster-cognition repository
  • when a Pull Request already exists, deployment image tag updates are accumulated on the existing Pull Request

Jobs:

  • K8: Edit Cluster Deployment
    • Perform Edit/Git Operations

K8: Execution Environments

Workflow file: k8s_exec_env_pull.yml

Triggers:

  • workflow_dispatch

Description:

  • pulls execution environment images inside the Kubernetes cluster

Jobs:

  • K8: Docker Pulls
    • Execute docker pull

K8: Reload Secrets

Workflow file: k8s_reload_secrets.yml

Triggers:

  • workflow_dispatch

Inputs:

  • deployment_name

Description:

  • recreates a secret in the Kubernetes cluster with the latest value from Azure Key Vault, specified by the workflow input (deployment name)
  • restarts a deployment in the Kubernetes cluster, specified by the workflow input (deployment name)

Jobs:

  • K8: Reload Secrets
    • Run Secret Reload

K8: Release

Workflow file: k8s_release.yml

Triggers:

  • pull_request_closed
  • release

Description:

  • calls GitHub: Validate Release job
  • calls ACR: Docker Push Release job
  • calls K8: Edit job
  • calls GitHub: Release job
  • forwards deployment status to the GitHub: Release job
  • calls GitHub: Delete Branch job

Jobs:

  • call-gh-validate-release

  • call-az-acr-release

  • call-k8-edit

  • call-gh-release

  • GitHub: Delete Branch

    • Delete Branch

K8: Restart

Workflow file: k8s_restart.yml

Triggers:

  • workflow_dispatch

Inputs:

  • deployment_name

Description:

  • restarts a deployment in the Kubernetes cluster, specified by the workflow input

Jobs:

  • K8: Restart Cluster Deployment
    • Restart Cluster Deployment

K8: Test

Workflow file: k8s_test.yml

Triggers:

  • pull_request_opened_synchronized

Inputs:

  • test_cmd

Description:

  • calls ACR: Docker Push Test job
  • runs alembic upgrade on the application that triggered this workflow
  • if the workflow is triggered by an application that depends on refinery-gateway database changes (e.g. refinery-tokenizer), the alembic upgrade is run on the refinery-gateway database, provided the same test Docker Image tag exists
  • uses the test Docker Image generated by the ACR: Docker Push Test job to run tests in the Kubernetes cluster
  • uses the revision number generated in the first step to downgrade the database
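One plausible reading of the upgrade, test, downgrade sequence above is sketched below. It assumes the revision is captured with `alembic current` before upgrading and that the revision id is the first token of that command's output; neither detail is confirmed by this document. `run_migration_test` and the injected `run` callable are hypothetical names, so the sketch does not require a database.

```python
from typing import Callable, List


def run_migration_test(run: Callable[[List[str]], str], test_cmd: List[str]) -> List[List[str]]:
    """Record the revision, upgrade, run tests, then downgrade to the recorded revision."""
    executed: List[List[str]] = []

    def step(cmd: List[str]) -> str:
        executed.append(cmd)
        return run(cmd)

    # Assumption: the first token printed by `alembic current` is the revision id.
    tokens = step(["alembic", "current"]).split()
    revision = tokens[0] if tokens else "base"  # an empty database has no revision yet

    step(["alembic", "upgrade", "head"])
    try:
        step(test_cmd)
    finally:
        # Always restore the schema, even when the tests fail.
        step(["alembic", "downgrade", revision])
    return executed
```

In a real job, `run` would wrap something like `subprocess.run` inside the test container; keeping it injectable makes the ordering easy to check in isolation.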

Troubleshooting:

  • in case of a failed test, inspect the logs of this job to identify the issue and resolve it by updating the application code
  • in case this workflow corrupted app.dev.kern.ai, manually run K8: Apply in k8-cluster-cognition to apply the latest container images available on dev
  • in case of a workflow failure (TBD), ignore the failure and proceed with Pull Request merge

Jobs:

  • call-az-acr-push-test

  • K8: Test Cluster Deployment

    • Test Cluster Deployment

Parent Images: Build

Workflow file: pi_build.yml

Triggers:

  • pull_request_opened_synchronized

Description:

  • builds & pushes refinery-parent-images:<branch>-<type> to registry.dev.kern.ai

Jobs:

  • Configure Head Branch Name

    • Configure branch name
  • pi-matrix

  • Parent Images: Docker Build

    • Set up Python
    • Install Dependencies
    • Compile Requirements
    • Build & Push refinery-parent-images:${{ needs.configure-branch-name.outputs.gh_head_ref }}-${{ matrix.parent_image_type }}
    • Build & Push refinery-parent-images:${{ needs.configure-branch-name.outputs.gh_head_ref }}-${{ matrix.parent_image_type }}-arm64

Parent Images: Matrix

Workflow file: pi_matrix.yml

Triggers:

  • workflow_call

Inputs:

  • repository
  • checkout_ref
  • parent_image_type

Outputs:

  • parent_image_type
  • include

Description:

  • creates a Matrix Strategy input for GitHub Action with the following structure:
  • {
      "parent_image_type": [ "mini", "next" ],
      "include": [
        { "parent_image_type": "mini", "app": "refinery-authorizer" },
        { "parent_image_type": "mini", "app": "refinery-gateway-proxy" },
        { "parent_image_type": "next", "app": "admin-dashboard" },
        { "parent_image_type": "next", "app": "refinery-ui" },
        { "parent_image_type": "next", "app": "cognition-ui" }
      ]
    }
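As an illustration, a matrix with the structure above could be generated from a mapping of parent image types to applications. This is a sketch under assumptions: the `APPS_BY_TYPE` mapping and the `build_matrix` helper are hypothetical, and the way the real workflow discovers the applications is not described here.

```python
import json
from typing import Dict, List

# Hypothetical mapping, copied from the example structure above.
APPS_BY_TYPE: Dict[str, List[str]] = {
    "mini": ["refinery-authorizer", "refinery-gateway-proxy"],
    "next": ["admin-dashboard", "refinery-ui", "cognition-ui"],
}


def build_matrix(apps_by_type: Dict[str, List[str]]) -> dict:
    """Return a dict shaped like a GitHub Actions matrix with an `include` list."""
    return {
        "parent_image_type": list(apps_by_type),
        "include": [
            {"parent_image_type": image_type, "app": app}
            for image_type, apps in apps_by_type.items()
            for app in apps
        ],
    }


if __name__ == "__main__":
    # A workflow would typically expose this JSON string as a job output.
    print(json.dumps(build_matrix(APPS_BY_TYPE)))
```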

Jobs:

  • Parent Images: Generate Matrix
    • Generate Matrix

Parent Images: Submodule Merge

Workflow file: pi_merge_submodule.yml

Triggers:

  • pull_request_closed (dev)

Description:

  • updates Parent Image repositories' submodule reference

Jobs:

  • Configure Head Branch Name

    • Configure branch name
  • pi-matrix

  • Parent Images: Submodule

    • Set up Python
    • Install Dependencies
    • Perform Edit/Git Operations
  • GitHub: Delete Branch

    • Delete Branch

Parent Images: Parent Image Merge

Workflow file: pi_merge_parent_image.yml

Triggers:

  • pull_request_closed (dev)

Description:

  • builds & pushes refinery-parent-images:dev-<type> to registry.dev.kern.ai
  • updates Application repositories' -requirements.in and requirements.txt

Troubleshooting:

  • package version resolution failure (ResolutionImpossible) (example)
  • resolved by updating the package version in the Application repository's -requirements.in file
  • worked around by manually performing the requirements compilation

Jobs:

  • Configure Head Branch Name

    • Configure branch name
  • pi-matrix

  • Parent Images: Docker Build

    • Set up Python
    • Install Dependencies
    • Compile Requirements
    • Build & Push refinery-parent-images:${{ github.event.pull_request.base.ref }}-${{ env.PARENT_IMAGE_TYPE }}
    • Build & Push refinery-parent-images:${{ github.event.pull_request.base.ref }}-${{ env.PARENT_IMAGE_TYPE }}-arm64
    • Build & Push refinery-parent-images:sha-${{ env.PARENT_IMAGE_TYPE }}
    • Build & Push refinery-parent-images:sha-${{ env.PARENT_IMAGE_TYPE }}-arm64
  • Parent Images: App

    • Set up Python
    • Install Dependencies
    • Clone ${{ matrix.app }}
    • Compile Requirements (Python)
    • Compile Requirements (Next)
    • Perform Edit/Git Operations (Python)
    • Perform Edit/Git Operations (Next)
  • GitHub: Delete Branch

    • Delete Branch
  • GitHub: Delete Branch

    • Delete Branch

Parent Images: Release

Workflow file: pi_release.yml

Triggers:

  • prerelease

Description:

  • builds & pushes refinery-parent-images:vX.X.X-<type> to Docker Hub
  • updates Application repositories' Dockerfiles to use the new parent image (updates Application repositories' open PRs)

Jobs:

  • pi-matrix

  • Parent Images: Dockerfile

    • Perform Edit/Git Operations

OpenTofu: Release

Workflow file: release_please.yml

Triggers:

  • workflow_call

Description:

  • generates a release Pull Request with CHANGELOG updates for the calling repository
  • requires Conventional Commits

Jobs:

  • tf-module-release
    • googleapis/release-please-action@v4

OpenTofu: Generate Docs

Workflow file: tf_docs.yml

Triggers:

  • push

Description:

  • generates documentation for the OpenTofu module

Jobs:

  • tf-module-docs
    • actions/checkout@v4
    • Render OpenTofu docs and push changes back to PR

OpenTofu: Plan/Apply

Workflow file: tf_plan_apply.yml

Triggers:

  • workflow_dispatch
  • push

Outputs:

  • tf_plan_exit_code
  • tf_destroy

Description:

  • executes tofu plan on the repository that triggered this workflow
  • creates a destruction plan when the calling repository's GitHub Actions Environment Variable TF_DESTROY is set to -destroy
  • executes tofu apply on the repository that triggered this workflow, assuming that the tofu plan job has succeeded
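The plan and apply logic above can be approximated as follows. This sketch assumes the plan runs with `-detailed-exitcode`, which the `tf_plan_exit_code` output suggests but this document does not confirm. With that flag, `tofu plan` exits with 0 when there are no changes, 1 on error, and 2 when changes are present. The helper names and the `tfplan` file name are hypothetical.

```python
import os
from typing import List


def plan_command(tf_destroy: str = "") -> List[str]:
    """Build the `tofu plan` arguments; add -destroy when TF_DESTROY is '-destroy'."""
    cmd = ["tofu", "plan", "-detailed-exitcode", "-out=tfplan"]
    if tf_destroy.strip() == "-destroy":
        cmd.append("-destroy")
    return cmd


def should_apply(plan_exit_code: int) -> bool:
    """Assumed policy: apply only when the plan succeeded and reported changes (code 2)."""
    return plan_exit_code == 2


if __name__ == "__main__":
    print(plan_command(os.environ.get("TF_DESTROY", "")))
```

Applying only on exit code 2 is one possible policy; a workflow could equally run apply on exit code 0 and let it finish as a no-op.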

Troubleshooting:

  • inspect the logs of the failing job (tofu plan or tofu apply) to identify the issue and resolve it by updating Infrastructure as Code (IaC) files

Jobs:

  • OpenTofu Plan

    • OpenTofu Plan
  • OpenTofu Apply

    • OpenTofu Apply