feat: K8S IaC - GitOps with ArgoCD (#93)
* feat: enable helm chart inflation

* feat: add loki and promtail

* fix: some changes on the names - tags - azs

* feat: remove redis

* feat: move obsy from argo to kust

* feat: rename folder

* feat: try different configuration

* Connectivity working

* enable cni - argocd manage addons

* undo - to _

* cleanup commented code

* fix: change name to root

* fix: wrong name

* feat: pending code to push

* fix: app generation

* refactor: split backend.tf file

* feat: update apps

* feat: add comments - remove branch

* feat: add tooling app

* feat: split in folders

* feat: add branch

* fix: change target revision value in root app

* fix: update tools path

* fix: values file

* feat: add influxdb for tg-sidecar

* feat: fix influx

* try to use containerd sock

* undo containerd sock

* feat: upgrade versions - some tests

* some updates and fixes - cluster can be created now

* specify ns

* feat: add argocd values file - more nodes/masters

* feat: change ami zone to eu-west-1

* feat: add comments

* feat: add resources required when creating a cluster with kops

* feat: add efs helm chart installation

* feat: add context name

* feat: change tag

* feat: comment not create sa

* resources for efs

* feat: add rbac

* feat: remove efs roles

* fix: issues with EFS - update AMI

* feat: add nfs client installation to the nodes

* feat: test create sa from helm need permission cannot get resource leases in API group coordination.k8s.io

* remove comments

* temp commented

* feat: uncomment networking

* update registry

* fix: cleanup ds_store

* feat: add delete tf resources + cluster

* feat: create tf resources + add description

* feat: move ns to kustomization

* feat: comment influx observability out

* feat: add grafana resources

* feat: remove influxdb

* feat: add grafana crds

* feat: add olm + grafana operator

* feat: add ns

* cleanup comments

* test another version

* move resources

* add monitoring

* rename monitoring

* add ns observability

* add crds

* feat: redis change name

* feat: redis add ns

* feat: redis not specify ns

* feat: redis add url

* add permissions

* redis - bitnami

* add influxdb charts - bitnami

* add redis charts - bitnami

* fix prometheus

* add ns

* change service to redis

* change img

* fix: lb permissions issues

* fix: crds names

* fix: olm

* fix: prometheus admission controller

* add ns

* add system-tools

* add grafana-agent-operator

* add crd

* add grafana-agent crd - 2

* add crd grafana-dashboard

* add loki config

* add storage class

* feat: add multi master in 3 pols, rename cluster - rename tf resources

* fix: CNI to AWS-CNI

* fix: update telegraf-operator

* fix: move to default ns

* fix: comment telegraf - manual installation

* fix: move telegraf to tools

* fix: install telegraf again

* refactor: cleanup not needed files

* add serversiderender to all

* fix: commete

* fix: values inline

* feat: use argocd app instead

* feat: rename app

* network resources updated

* deactivate telegraf

* activate telegraf

* activate networking

* refactor: cleanup

* use files for telegraf

* feat: add clusterautoscaler - tags

* feat: add clusterautoscaler - increase number of nodes

* feat: test versions

* feat: update comments

* feat: cleanup not needed files

* feat: update variables

* feat: cleanup not needed files

* remove comments

* docs: update readme + content

* fix: blank spaces

* feat: resources increased

* feat: multus resources increased

* feat: increase timeouts

* fix: solve comments in the PR

* feat: add more replicas to influxdb

* feat: update cidr - nodes - types

* feat: reduce number of instances on the infra pool

* feat: update to the latest version 900c23e

* feat: influxdb changes url

* feat: add autoscaler

* feat: update sync-service - increase timeouts

* feat: use latest commit for sync-service

* feat: update sync-service - increase timeouts

* feat: update resources - running 6k nodes

* feat: update resources

* feat: dont autosync

* feat: undo autosync

* feat: move observability to default ns

* feat: update redis svc

* feat: change references to observability ns

* feat: use testground org and specify the tag v0.7.0

* feat: remove comments - move to v1.23.4

* feat: reduce number of instances in tginfra nodepool

---------

Signed-off-by: Smuu <[email protected]>
Co-authored-by: Smuu <[email protected]>
Co-authored-by: Jose Ramon Mañes <[email protected]>
3 people authored May 24, 2023
1 parent 6c104e5 commit 580da20
Showing 567 changed files with 119,766 additions and 24,567 deletions.
12 changes: 12 additions & 0 deletions .editorconfig
@@ -0,0 +1,12 @@
root = true

[*]
charset = utf-8
end_of_line = lf
indent_size = 4
indent_style = space
insert_final_newline = true
trim_trailing_whitespace = true

[*.{yaml,yml,tf}]
indent_size = 2
5 changes: 5 additions & 0 deletions argocd-root/kustomization.yaml
@@ -0,0 +1,5 @@
apiVersion: kustomize.config.k8s.io/v1beta1
kind: Kustomization

resources:
- root.yaml
22 changes: 22 additions & 0 deletions argocd-root/root.yaml
@@ -0,0 +1,22 @@
# We generate this ArgoCD application with Terraform, but we keep it here as a workaround
apiVersion: argoproj.io/v1alpha1
kind: Application
metadata:
  name: root
spec:
  project: default
  source:
    repoURL: https://github.com/testground/infra.git
    path: 'argocd'
    targetRevision: v0.7.0
  destination:
    name: in-cluster
    namespace: argocd
  syncPolicy:
    automated:
      prune: true
      allowEmpty: true
      selfHeal: true
    syncOptions:
      - ApplyOutOfSyncOnly=true
      - CreateNamespace=true
34 changes: 34 additions & 0 deletions argocd/aws-applicationset.yaml
@@ -0,0 +1,34 @@
apiVersion: argoproj.io/v1alpha1
kind: ApplicationSet
metadata:
  name: aws
  namespace: argocd
spec:
  generators:
    - git:
        repoURL: https://github.com/testground/infra.git
        revision: v0.7.0
        directories:
          - path: 'manifests/aws/*'
          - path: manifests/aws/deactivated
            exclude: true
  template:
    metadata:
      name: 'aws-{{path[2]}}'
    spec:
      project: default
      source:
        repoURL: https://github.com/testground/infra.git
        path: 'manifests/aws/{{path[2]}}'
        targetRevision: v0.7.0
      destination:
        name: in-cluster
        namespace: default
      syncPolicy:
        automated:
          prune: true
          allowEmpty: true
          selfHeal: true
        syncOptions:
          - ApplyOutOfSyncOnly=true
          - CreateNamespace=true
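Reviewer note: the ApplicationSet above relies on the git directory generator, which produces one Application per matched directory and exposes the path segments as `path[0]`, `path[1]`, `path[2]`. The sketch below only emulates the naming rule in shell so the result is easy to reason about; `app_names` is a hypothetical helper, and the directory names `efs` and `autoscaler` are illustrative assumptions, not taken from the repository.

```shell
#!/bin/sh
# Hypothetical helper (not part of the repo): emulate how the git directory
# generator would name Applications for a list of matched directories.
# Usage: app_names <prefix> <dir>...
app_names() {
  prefix=$1
  shift
  for dir in "$@"; do
    # Third path segment, i.e. what the template sees as {{path[2]}}.
    leaf=$(echo "$dir" | cut -d/ -f3)
    # Mirror the 'exclude: true' entry for the deactivated directory.
    if [ "$leaf" = "deactivated" ]; then
      continue
    fi
    echo "${prefix}-${leaf}"
  done
}

# Illustrative directory names only.
app_names aws manifests/aws/efs manifests/aws/deactivated manifests/aws/autoscaler
```

Under these assumptions, moving a directory into `manifests/aws/deactivated` should drop its Application from the generated set, and with `prune: true` the resources it managed would be removed on the next sync.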
11 changes: 11 additions & 0 deletions argocd/kustomization.yaml
@@ -0,0 +1,11 @@
apiVersion: kustomize.config.k8s.io/v1beta1
kind: Kustomization

namespace: argocd

resources:
- aws-applicationset.yaml
- observability-applicationset.yaml
- testground-applicationset.yaml
- networking-applicationset.yaml
- testground-tools-applicationset.yaml
35 changes: 35 additions & 0 deletions argocd/networking-applicationset.yaml
@@ -0,0 +1,35 @@
apiVersion: argoproj.io/v1alpha1
kind: ApplicationSet
metadata:
  name: networking
  namespace: argocd
spec:
  generators:
    - git:
        repoURL: https://github.com/testground/infra.git
        revision: v0.7.0
        directories:
          - path: 'manifests/networking/*'
          - path: manifests/networking/deactivated
            exclude: true
  template:
    metadata:
      name: 'networking-{{path[2]}}'
    spec:
      project: default
      source:
        repoURL: https://github.com/testground/infra.git
        path: 'manifests/networking/{{path[2]}}'
        targetRevision: v0.7.0
      destination:
        name: in-cluster
        namespace: default
      syncPolicy:
        automated:
          prune: true
          allowEmpty: true
          selfHeal: true
        syncOptions:
          - ApplyOutOfSyncOnly=true
          - CreateNamespace=true
          - ServerSideApply=true
35 changes: 35 additions & 0 deletions argocd/observability-applicationset.yaml
@@ -0,0 +1,35 @@
apiVersion: argoproj.io/v1alpha1
kind: ApplicationSet
metadata:
  name: observability
  namespace: argocd
spec:
  generators:
    - git:
        repoURL: https://github.com/testground/infra.git
        revision: v0.7.0
        directories:
          - path: 'manifests/observability/*'
          - path: manifests/observability/deactivated
            exclude: true
  template:
    metadata:
      name: 'observability-{{path[2]}}'
    spec:
      project: default
      source:
        repoURL: https://github.com/testground/infra.git
        path: 'manifests/observability/{{path[2]}}'
        targetRevision: v0.7.0
      destination:
        name: in-cluster
        namespace: default
      syncPolicy:
        automated:
          prune: true
          allowEmpty: true
          selfHeal: true
        syncOptions:
          - ApplyOutOfSyncOnly=true
          - CreateNamespace=true
          - ServerSideApply=true
41 changes: 41 additions & 0 deletions argocd/telegraf-operator.yaml
@@ -0,0 +1,41 @@
apiVersion: argoproj.io/v1alpha1
kind: Application
metadata:
  name: testground-tools-telegraf-operator
  namespace: argocd
spec:
  project: default
  source:
    chart: telegraf-operator
    repoURL: https://helm.influxdata.com/
    targetRevision: 1.3.11
    helm:
      releaseName: telegraf-operator
      values: |
        replicaCount: 2
        classes:
          data:
            default: |
              [[outputs.influxdb]]
                urls = ["http://influxdb.default.svc:8086"]
                database = "testground"
        resources:
          limits:
            cpu: 400m
            memory: 256Mi
          requests:
            cpu: 50m
            memory: 64Mi
        hotReload: true
  destination:
    namespace: default
    name: in-cluster
  syncPolicy:
    automated:
      prune: true
      selfHeal: true
      allowEmpty: true
    syncOptions:
      - ApplyOutOfSyncOnly=true
      - CreateNamespace=true
      - ServerSideApply=true
35 changes: 35 additions & 0 deletions argocd/testground-applicationset.yaml
@@ -0,0 +1,35 @@
apiVersion: argoproj.io/v1alpha1
kind: ApplicationSet
metadata:
  name: testground
  namespace: argocd
spec:
  generators:
    - git:
        repoURL: https://github.com/testground/infra.git
        revision: v0.7.0
        directories:
          - path: 'manifests/testground/*'
          - path: manifests/testground/deactivated
            exclude: true
  template:
    metadata:
      name: 'testground-{{path[2]}}'
    spec:
      project: default
      source:
        repoURL: https://github.com/testground/infra.git
        path: 'manifests/testground/{{path[2]}}'
        targetRevision: v0.7.0
      destination:
        name: in-cluster
        namespace: default
      syncPolicy:
        automated:
          prune: true
          allowEmpty: true
          selfHeal: true
        syncOptions:
          - ApplyOutOfSyncOnly=true
          - CreateNamespace=true
          - ServerSideApply=true
35 changes: 35 additions & 0 deletions argocd/testground-tools-applicationset.yaml
@@ -0,0 +1,35 @@
apiVersion: argoproj.io/v1alpha1
kind: ApplicationSet
metadata:
  name: testground-tools
  namespace: argocd
spec:
  generators:
    - git:
        repoURL: https://github.com/testground/infra.git
        revision: v0.7.0
        directories:
          - path: 'manifests/tooling/*'
          - path: manifests/tooling/deactivated
            exclude: true
  template:
    metadata:
      name: 'testground-tools-{{path[2]}}'
    spec:
      project: default
      source:
        repoURL: https://github.com/testground/infra.git
        path: 'manifests/tooling/{{path[2]}}'
        targetRevision: v0.7.0
      destination:
        name: in-cluster
        namespace: default
      syncPolicy:
        automated:
          prune: true
          allowEmpty: true
          selfHeal: true
        syncOptions:
          - ApplyOutOfSyncOnly=true
          - CreateNamespace=true
          - ServerSideApply=true
Binary file added docs/argocd_dashboard.png
Binary file added docs/argocd_root_app.png
34 changes: 17 additions & 17 deletions k8s/eks/bash/functions.sh
@@ -1,7 +1,7 @@
#!/bin/bash

# error log prep check
prep_log_dir(){
prep_log_dir(){
mkdir -p $real_path/log/$start-log/
mkdir -p $real_path/.cluster/
}
@@ -319,7 +319,7 @@ aws_get_vpc_id(){
vpc_id=$(aws ec2 describe-vpcs --region $REGION --filters Name=tag:Name,Values=eksctl-$CLUSTER_NAME-cluster/VPC |jq -r ".Vpcs[] | .VpcId")
}

aws_get_subnet_id(){
aws_get_subnet_id(){
aws_get_vpc_id
concat_availability_zone
upper_az=$(echo $AVAILABILITY_ZONE | tr '[:lower:]' '[:upper:]' | tr -d \-)
@@ -334,7 +334,7 @@ aws_get_sg_id(){
aws_create_efs_mount_point(){
aws efs create-mount-target --file-system-id $efs_fs_id --subnet-id $subnet_id --security-group $efs_sg_id --region $REGION
efs_dns=$efs_fs_id.efs.$REGION.amazonaws.com

}

create_cm_efs(){
@@ -365,10 +365,10 @@ aws_create_ebs(){
echo -e "EBS created with this volume ID: $ebs_volume\n"
else
echo "EBS already exists, skipping to the next step."
fi
fi
}

make_persistent_volume(){
make_persistent_volume(){
export TG_EBS_DATADIR_VOLUME_ID=$ebs_volume

EBS_PV=$(mktemp)
@@ -378,11 +378,11 @@ make_persistent_volume(){

helm_redis_add_repo(){
helm repo add bitnami https://charts.bitnami.com/bitnami
}
}

helm_infra_install_redis(){
helm install testground-infra-redis --set auth.enabled=false --set master.nodeSelector='testground.node.role.infra: "true"' bitnami/redis
}
}

helm_infra_install_influx_db(){
# We are using v2.6.1 of the helm chart, which has been evicted from the regular index.yaml.
@@ -402,7 +402,7 @@ metadata:
data:
.env.toml: |
["aws"]
region = "$REGION"
region = "$REGION"
[runners."cluster:k8s"]
run_timeout_min = 15
@@ -506,26 +506,26 @@ log(){
echo "========================"
echo "Log file generated with name $start-$CLUSTER_NAME.tar.gz"
echo -e "\n"
rm -rf $real_path/log/$start-log/
rm -rf $real_path/log/$start-log/
}

##### Functions below are used by the 'testground_uninstall.sh' script #######

remove_efs_mp_timer(){
remove_efs_mp_timer(){
efs_mp_state=available # setting the start value for the loop to consider
sleep 15
while [[ $efs_mp_state == available ]];do
while [[ $efs_mp_state == available ]];do
efs_mp_state=$(aws efs describe-mount-targets --file-system-id $efs --region $region | jq -r ".MountTargets[] | .LifeCycleState")
sleep 1
done
done
}

remove_efs_fs_timer(){
remove_efs_fs_timer(){
efs_fs_state=available # setting the start value for the loop to consider
while [[ $efs_fs_state == available ]];do
while [[ $efs_fs_state == available ]];do
efs_fs_state=$(aws efs describe-file-systems --region $region --file-system-id $efs | jq -r ".FileSystems[] | .LifeCycleState")
sleep 1
done
done
}

obtain_efs_id(){
@@ -621,12 +621,12 @@ cleanup(){
else
echo -e "Looks like the EBS you have specified ($ebs) does not exist in the selected region ($region).\nIt is possible that it has already been deleted.\n"
fi

if [ "$efs_deleted" == "true" ] && [ "$ebs_deleted" == "true" ] && [ "$cluster_deleted" == "true" ]
then
rm -f $real_path/.cluster/$cluster_name-$region.cs
echo -e "Uninstall script completed and removed the '.cluster/$cluster_name-$region.cs' file.\n"
else
echo -e "Uninstall script completed, but did not remove the '.cluster/$cluster_name-$region.cs' file due to other resources not being deleted.\nPlease check the '.cluster/$cluster_name-$region.cs' file and try again.\n"
fi
}
}
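Reviewer note: the `remove_efs_mp_timer` and `remove_efs_fs_timer` loops above poll until the reported state is no longer `available`, with no upper bound on how long they wait. A bounded variant could look roughly like the sketch below. `wait_while_state` is a hypothetical helper, not something in this commit, and the command passed to it is assumed to print the current lifecycle state on stdout.

```shell
#!/bin/sh
# Hypothetical helper (not part of functions.sh): poll a command until its
# output stops matching <state>, giving up after <timeout_seconds>.
# Usage: wait_while_state <state> <timeout_seconds> <command> [args...]
wait_while_state() {
  local state=$1
  local timeout=$2
  local elapsed=0
  shift 2
  while [ "$("$@")" = "$state" ]; do
    if [ "$elapsed" -ge "$timeout" ]; then
      echo "Timed out after ${timeout}s; state is still '$state'" >&2
      return 1
    fi
    sleep 1
    elapsed=$((elapsed + 1))
  done
}

# Example wiring (assumption): wrap the existing describe call in a function
# that prints the state, then bound the wait to five minutes.
#   get_efs_fs_state() { aws efs describe-file-systems ... | jq -r ...; }
#   wait_while_state available 300 get_efs_fs_state
```

Returning a non-zero status on timeout would let the uninstall script report a stuck resource instead of hanging.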