Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Blog - Konfig - How Earthly Solved Our CI Problem #758

Merged
merged 9 commits into from
Jan 18, 2024
Merged
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
2 changes: 2 additions & 0 deletions .github/styles/HouseStyle/tech-terms/general.txt
Original file line number Diff line number Diff line change
Expand Up @@ -606,4 +606,6 @@ Ryzen
imagetools
multicontainer
retrigger

Konfig
docformatter
6 changes: 5 additions & 1 deletion blog/_data/authors.yml
Original file line number Diff line number Diff line change
Expand Up @@ -557,7 +557,11 @@ Rajkumar Venkatasamy:
name : "Rajkumar Venkatasamy"
bio : "Rajkumar has nearly sixteen years of experience in the software industry as a developer, data modeler, tester, project lead, product consultant, data architect, ETL specialist, and technical architect. Currently he is a principal architect at an MNC. He has hands-on experience in various technologies, tools, and libraries."
avatar : "/assets/images/authors/rajkumar-venkatasamy.jpg"
Eddie Chayes:
name : "Eddie Chayes"
bio : "Eddie is an early career software engineer with a few years of experience working with AI platforms; he’s now a founding engineer at Konfig, a startup that generates SDKs for API companies."
avatar : "/assets/images/authors/eddie-chayes.jpeg"
Furqan Butt:
name : "Furqan Butt"
bio : "Furqan is a software developer with more than three years of experience in the computer software industry. He's completed the AWS Certified Solutions Architect and Certified Cloud Practitioner certifications, writes on Medium for publications including Towards Data Science, and is part of the AWS Community Builders Program for data and analytics. His core expertise is in data engineering, big data, cloud technologies, and backend development."
avatar : "/assets/images/authors/furqan-butt.jpg"
avatar : "/assets/images/authors/furqan-butt.jpg"
97 changes: 97 additions & 0 deletions blog/_posts/2024-01-22-konfig.md
Original file line number Diff line number Diff line change
@@ -0,0 +1,97 @@
---
title: "How Earthly Solved Our CI Problem"
categories:
- Tutorials
toc: true
author: Eddie Chayes

internal-links:
- ci problem solved
- how earthly helps with ci problems
- solving ci problems with earthly
- earthly the problem solver
excerpt: |
In this guest post, the Konfig team discusses how they solved their complex Continuous Integration challenges with Earthly, sharing insights valuable for any software development team.
---

msali123 marked this conversation as resolved.
Show resolved Hide resolved
![*In this guest post, the [Konfig](https://konfigthis.com/) team discusses how they solved their complex Continuous Integration challenges with Earthly, sharing insights valuable for any software development team.*]({{site.images}}{{page.slug}}/img3.png)

The value of a continuous integration pipeline in software development is clear. It allows engineers to test every code change, ensuring deployments are free from regressions in functionality and performance. At Konfig, a startup creating SDKs in many coding languages from our clients' OpenAPI specs, not having Continuous Integration was a major issue. Code generation software tends to be especially fickle, and many of our code paths are shared across generators for different coding languages, making it easy to introduce bugs. When we made revisions for one customer, it wasn't uncommon for these changes to be incompatible for another customer. We badly needed CI -- and quickly, too. Thus we embarked on (what we thought would be) the treacherous journey of building a CI pipeline that could accommodate our architecture.

![Img]({{site.images}}{{page.slug}}/img1.png)\
msali123 marked this conversation as resolved.
Show resolved Hide resolved

## The Problem: Multiple Languages and Components

From the get-go, we presumed that building an efficient CI pipeline for Konfig would not be straightforward. None of us had any experience creating a CI pipeline for a multi-service architecture. Our product naturally interacts with many languages, necessitating the deployment of different services tailored to each language. For instance, certain functionalities, like formatting, require environments that support their specific language. Despite our complicated architecture, we wanted our CI to be speedy and cost-effective. As a startup, we were moving fast, and could not afford for build times to stretch into hours. This was a tricky problem.

The nature of our system lent itself well to docker compose, where we could bake each service into a docker image and use a compose file to orchestrate their communication. Docker compose proved extremely effective for local testing, as docker layer caching offered a huge benefit: Only components that changed needed to be rebuilt. It wasn't too problematic to translate each of our components into docker images that could be used to run the testing pipeline locally.

~~~{.yaml caption="compose.yaml"}
version: "3.8"
services:
konfig-api:
image: konfig-api:latest
ports:
- "8911:8911"
depends_on:
- konfig-db

konfig-python-formatter:
image: konfig-python-formatter:latest
ports:
- "10000:10000"

konfig-generator:
image: konfig-generator:latest
ports:
- "8080:8080"

konfig-db:
image: postgres
restart: always
ports:
- "5432:5432"
~~~

## A Simplified Version of our Docker Compose

However, the hard part was imitating this behavior in the cloud in order to create a fast and economical CI pipeline. Of course, we could just blindly port this docker-compose setup to Github actions, but then we would crucially lose the benefit of caching. When developing locally, docker uses your local disk to cache images in layers, allowing you to re-use these layers in subsequent builds if nothing has changed. Because there is no built-in persistent storage across CI builds, two builds running one after the other would each be forced to build all of the images from the ground up. One potential solution to this is to introduce a persistent storage that is pushed to and pulled from before and after each CI build; the problem with this is that our images are large -- in the realm of multiple GB apiece -- and the network cost of uploading/downloading these images would offset any gains from reusing them.

Thus, to avoid a network bottleneck, we need persistent storage that lives next to the compute instance (much like it does when running docker on your laptop!) To this end, we explored the nascent github actions cache feature, only to unfortunately find that it was limited to 10GB -- not enough space for our various large images.

So, to recap: In order to achieve CI with our desired speed, efficiency, and tooling, we were looking for a CI that wasn't too dissimilar from docker with a large amount of persistent storage living next to the execution environment. Luckily for us, that last sentence is a perfect characterization of Earthly.

![Img]({{site.images}}{{page.slug}}/img2.png)\

## The Solution: Earthly

When Konfig first found Earthly, I'll admit, I immediately liked the "Earth" branding; as a result, I really wanted Earthly to work out as the solution we were searching for.

After installing, it was easy to get our system ported into the Earthly ecosystem -- the Earthfile doesn't fall too far from the Dockerfile tree, and Earthly's integration with docker compose let me reuse the same `compose.yaml` file. The target syntax, which lets Earthfiles reference files and images from other targets in any Earthfile, was advantageous, as it let me clean up some duplicated resources between different images. (This is also possible with docker multi-stage builds, but Earthly made it even easier and more intuitive!)

~~~{.dockerfile caption="Earthfile"}
​test:
FROM earthly/dind:alpine-3.18-docker-23.0.6-r4
COPY compose/.env .
COPY compose/compose.yaml .
ARG TEST_ARGS
WITH DOCKER \
--compose compose.yaml \
--load konfig-python-formatter:latest=../konfig-python-formatter-server-blackd+run-python-formatter \
--load konfig-generator:latest=../konfig-generator-api+run-generator \
--load konfig-api:latest=../konfig-dash+run-konfig-api \
--load konfig-integration-tests:latest=+integration-tests
RUN docker run --network="konfig-network" konfig-integration-tests:latest $TEST_ARGS
END

~~~

## Very Simplified Test Against Docker-Compose Structure from One of our Earthfile

One favorable characteristic that I enjoyed was how I could build and experiment with Earthly locally, much as I would with docker. In fact, its use of docker under the hood made it easy to reuse familiar tools for local development and debugging, like docker desktop. Our Earthly pipeline essentially functions to build the images for each of our components, then use docker compose (integrated into Earthly) to orchestrate them in order to run our tests against them. Earthly builds targets in parallel, only waiting when required by interdependencies between targets, which made the builds quite speedy without requiring us to write any scripts.

Once everything was working locally, it was beyond simple to get Earthly running in the cloud -- launching an Earthly "satellite" took just a single CLI command. After that, everything stayed the same, except it wasn't running on my machine anymore -- although, the logs streaming back to my terminal in real-time could have fooled me. Satellites have large built-in caches, so the caching worked just as well in the cloud as it was on my laptop. The final step was just integrating the process into Github actions to trigger on pull requests, which once again could not have been easier: just a "Setup Earthly" action, followed by the same CLI command I'd been using to run Earthly locally.

I can now look back and say my presumptions were wrong, and building an efficient CI for Konfig was not so treacherous once we found Earthly, a tool seemingly built with our exact conundrum in mind. To this day, it never fails to make me happy seeing `*cached* *cached* *cached*` in our github action logs!

{% include_html cta/bottom-cta.html %}
Binary file added blog/assets/images/authors/eddie-chayes.jpeg
Loading
Sorry, something went wrong. Reload?
Sorry, we cannot display this file.
Sorry, this file is invalid so it cannot be displayed.
Binary file added blog/assets/images/konfig/header.jpg
Loading
Sorry, something went wrong. Reload?
Sorry, we cannot display this file.
Sorry, this file is invalid so it cannot be displayed.
Binary file added blog/assets/images/konfig/img1.png
Loading
Sorry, something went wrong. Reload?
Sorry, we cannot display this file.
Sorry, this file is invalid so it cannot be displayed.
Binary file added blog/assets/images/konfig/img2.png
Loading
Sorry, something went wrong. Reload?
Sorry, we cannot display this file.
Sorry, this file is invalid so it cannot be displayed.
Binary file added blog/assets/images/konfig/img3.png
Loading
Sorry, something went wrong. Reload?
Sorry, we cannot display this file.
Sorry, this file is invalid so it cannot be displayed.
Loading