This is a repository for my home infrastructure and Kubernetes cluster. I try to adhere to Infrastructure as Code (IaC) and GitOps practices using tools like Kubernetes, Flux, Renovate and GitHub Actions.
This semi hyper-converged cluster operates on Talos Linux, an immutable and ephemeral Linux distribution tailored for Kubernetes, and is deployed on bare-metal MS-01 workstations. Rook supplies my workloads with persistent block, object, and file storage, while a separate server handles media file storage. The cluster is designed to enable a full teardown without any data loss.
There is a template at onedr0p/cluster-template if you want to follow along with some of the practices I use here.
- actions-runner-controller: Self-hosted GitHub runners.
- cert-manager: Creates SSL certificates for services in my cluster.
- cilium: Internal Kubernetes container networking interface.
- cloudflared: Enables Cloudflare secure access to my ingresses.
- external-dns: Automatically syncs ingress DNS records to a DNS provider.
- external-secrets: Managed Kubernetes secrets using 1Password Connect.
- ingress-nginx: Kubernetes ingress controller using NGINX as a reverse proxy and load balancer.
- multus: Multi-homed pod networking.
- rook: Distributed block storage for persistent storage.
- sops: Managed secrets for Kubernetes which are committed to Git.
- spegel: Stateless cluster local OCI registry mirror.
- tailscale: Private WireGuard-based VPN.
- volsync: Backup and recovery of persistent volume claims.
Flux monitors my kubernetes folder (see Directories below) and implements changes to my cluster based on the YAML manifests.
Flux operates by recursively searching the kubernetes/apps folder until it locates the top-level kustomization.yaml in each directory. It then applies all the resources listed in it. This kustomization.yaml typically contains a namespace resource and one or more Flux kustomizations. These Flux kustomizations usually include a HelmRelease or other application-related resources, which are then applied.
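As a rough sketch of that layout, a per-namespace kustomization.yaml and one of the Flux kustomizations it lists might look like the following. The `database` namespace, the file names, the paths, and the `flux-system` GitRepository name are assumptions for illustration and may not match this repository.

```yaml
# kubernetes/apps/database/kustomization.yaml (hypothetical path)
# Plain Kustomize file that Flux finds while walking kubernetes/apps.
apiVersion: kustomize.config.k8s.io/v1beta1
kind: Kustomization
resources:
  - ./namespace.yaml    # the namespace resource
  - ./postgres/ks.yaml  # a Flux Kustomization (shown below)
---
# kubernetes/apps/database/postgres/ks.yaml (hypothetical path)
# Flux Kustomization pointing at the folder that holds the HelmRelease.
apiVersion: kustomize.toolkit.fluxcd.io/v1
kind: Kustomization
metadata:
  name: postgres
  namespace: flux-system
spec:
  interval: 30m
  path: ./kubernetes/apps/database/postgres/app  # assumed app folder
  prune: true
  sourceRef:
    kind: GitRepository
    name: flux-system  # assumed name of the Git source
  targetNamespace: database
```

Note that the two files share the kind name `Kustomization` but belong to different API groups: the first is read by Kustomize, the second is reconciled by Flux.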
Renovate monitors my entire repository for dependency updates and automatically creates a PR when updates are found. Once a PR is merged, Flux applies the changes to my cluster.
This Git repository contains the following directories under kubernetes.
📁 kubernetes    # Kubernetes cluster defined as code
├─📁 apps        # Apps deployed into my cluster grouped by namespace (see below)
├─📁 bootstrap   # Flux installation
└─📁 flux        # Main Flux configuration of repository
This is a high-level look at how Flux deploys my applications with dependencies. Below there are 3 Flux kustomizations: postgres, postgres-cluster, and atuin. postgres is the first app that needs to be running and healthy before postgres-cluster, and once postgres-cluster is healthy, atuin will be deployed.
graph TD;
id1>Kustomization: cluster] -->|Creates| id2>Kustomization: cluster-apps];
id2>Kustomization: cluster-apps] -->|Creates| id3>Kustomization: postgres];
id2>Kustomization: cluster-apps] -->|Creates| id5>Kustomization: postgres-cluster]
id2>Kustomization: cluster-apps] -->|Creates| id8>Kustomization: atuin]
id3>Kustomization: postgres] -->|Creates| id4[HelmRelease: postgres];
id5>Kustomization: postgres-cluster] -->|Depends on| id3>Kustomization: postgres];
id5>Kustomization: postgres-cluster] -->|Creates| id10[Postgres Cluster];
id8>Kustomization: atuin] -->|Creates| id9(HelmRelease: atuin);
id8>Kustomization: atuin] -->|Depends on| id5>Kustomization: postgres-cluster];
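The "Depends on" edges in the graph map to the `dependsOn` field of a Flux Kustomization. A minimal sketch for the last two links of the chain could look like this; the paths, intervals, and the `flux-system` source name are assumptions, not values taken from this repository.

```yaml
# Hypothetical Flux Kustomization for postgres-cluster.
apiVersion: kustomize.toolkit.fluxcd.io/v1
kind: Kustomization
metadata:
  name: postgres-cluster
  namespace: flux-system
spec:
  dependsOn:
    - name: postgres  # reconciled only after postgres is Ready
  interval: 30m
  path: ./kubernetes/apps/database/postgres/cluster  # assumed path
  prune: true
  wait: true  # Ready only once the applied resources pass health checks
  sourceRef:
    kind: GitRepository
    name: flux-system  # assumed name of the Git source
---
# Hypothetical Flux Kustomization for atuin.
apiVersion: kustomize.toolkit.fluxcd.io/v1
kind: Kustomization
metadata:
  name: atuin
  namespace: flux-system
spec:
  dependsOn:
    - name: postgres-cluster  # reconciled only after postgres-cluster is Ready
  interval: 30m
  path: ./kubernetes/apps/default/atuin/app  # assumed path
  prune: true
  sourceRef:
    kind: GitRepository
    name: flux-system
```

Setting `wait: true` on a dependency is what ties "Ready" to "healthy": without it (or explicit `healthChecks`), Flux marks the Kustomization Ready as soon as its manifests are applied.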
I have two instances of external-dns running in my cluster. The private DNS instance synchronizes DNS records with a UDM Pro Max, while the public DNS instance does the same with Cloudflare. This setup is managed by creating ingresses with specific ingress classes: internal for the private DNS and external for the public DNS. Both ingresses use the external-dns.alpha.kubernetes.io/target annotation to specify the target. The external-dns instances then sync the DNS records to their respective platforms accordingly.
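For illustration, an ingress meant for the public DNS instance might look roughly like this. The hostnames, the service name and port, and the target value are placeholders rather than values from this cluster; only the `external` class name and the annotation key come from the description above.

```yaml
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: example-app  # hypothetical application
  annotations:
    # Record target that external-dns should publish (placeholder value)
    external-dns.alpha.kubernetes.io/target: external.example.com
spec:
  ingressClassName: external  # use "internal" for the private DNS instance
  rules:
    - host: app.example.com  # placeholder hostname
      http:
        paths:
          - path: /
            pathType: Prefix
            backend:
              service:
                name: example-app
                port:
                  number: 80
```

Switching `ingressClassName` to `internal` would hand the same record to the private instance instead, assuming each external-dns instance is configured to watch only its own ingress class.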
Device | Count | OS Disk Size | Data Disk Size | RAM | Operating System | Purpose |
---|---|---|---|---|---|---|
MS-01 (i9-13900H) | 3 | 1.92TB M.2 NVMe | 3.84TB U.2 NVMe (rook-ceph) | 96GB | Talos | Kubernetes |
USW Pro Max 24 PoE | 1 | - | - | - | UniFi OS | 2.5G PoE Switch |
USW Pro Aggregation | 1 | - | - | - | UniFi OS | 10G/25G Switch |
USP PDU Pro | 1 | - | - | - | UniFi OS | PDU |
UDM Pro Max | 1 | - | 2x16TB HDD | - | UniFi OS | Router & NVR |
Synology NAS RS1221+ | 1 | - | 8x22TB HDD | 32GB | - | NFS |
APC SMT15000RM2UNC | 1 | - | - | - | - | UPS |
TESmart 8 Port KVM Switch | 1 | - | - | - | - | KVM |
PiKVM (RasPi 4) | 1 | 64GB (SD) | - | 4GB | PiKVM (Arch) | KVM |
Many thanks to my friend @onedr0p and all the fantastic people who donate their time to the Home Operations Discord community. Be sure to check out kubesearch.dev for ideas on what to deploy and how to deploy it.
See the latest release notes.
See LICENSE.