K8s control plane high-availability mode #940

Open
wants to merge 7 commits into
base: main
2 changes: 1 addition & 1 deletion .github/workflows/stargz_tests.yml
@@ -67,7 +67,7 @@ jobs:
kubectl patch configmap -n knative-serving config-autoscaler -p "{\"data\": {\"allow-zero-initial-scale\": \"true\"}}"

- name: Setup stock-only node
-      run: ./scripts/setup_tool setup_node stock-only use-stargz
+      run: ./scripts/setup_tool setup_node REGULAR stock-only use-stargz

- name: Check containerd service is running
run: sudo screen -list | grep "containerd"
11 changes: 11 additions & 0 deletions configs/k8s_ha/check_apiserver.sh
@@ -0,0 +1,11 @@
#!/bin/sh

errorExit() {
    echo "*** $*" 1>&2
    exit 1
}

curl --silent --max-time 2 --insecure https://localhost:6443/ -o /dev/null || errorExit "Error GET https://localhost:6443/"
if ip addr | grep -q 10.0.1.254; then
    curl --silent --max-time 2 --insecure https://10.0.1.254:6443/ -o /dev/null || errorExit "Error GET https://10.0.1.254:6443/"
fi
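
Review note: the check above probes the local API server and, only when this node currently holds the virtual IP, probes the virtual IP as well. Because `grep -q 10.0.1.254` treats each dot as a regex wildcard, a stricter match may be preferable. The sketch below is one possible variant, not the PR's code; `VIP`, `PORT`, `holds_vip`, and `check_apiserver` are hypothetical names introduced for illustration.

```bash
#!/bin/sh
# Hypothetical parameters; the defaults mirror the values hardcoded in the diff.
VIP="${VIP:-10.0.1.254}"
PORT="${PORT:-6443}"

# Succeeds when the given `ip addr` output lists the VIP as an address.
# -F matches the dots literally; -w rejects longer numbers such as 10.0.1.2540.
holds_vip() {
    printf '%s\n' "$1" | grep -qwF "$VIP"
}

# Same probing order as the original script: localhost first, then the VIP
# if this node holds it. Returns non-zero on the first failed probe.
check_apiserver() {
    curl --silent --max-time 2 --insecure "https://localhost:${PORT}/" -o /dev/null || return 1
    if holds_vip "$(ip addr)"; then
        curl --silent --max-time 2 --insecure "https://${VIP}:${PORT}/" -o /dev/null || return 1
    fi
}
```

Calling `check_apiserver` as the last line of the script would make its exit status available to keepalived's `vrrp_script`.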
53 changes: 53 additions & 0 deletions configs/k8s_ha/haproxy.cfg
@@ -0,0 +1,53 @@
# /etc/haproxy/haproxy.cfg
#---------------------------------------------------------------------
# Global settings
#---------------------------------------------------------------------
global
    log /dev/log local0
    log /dev/log local1 notice
    daemon

#---------------------------------------------------------------------
# common defaults that all the 'listen' and 'backend' sections will
# use if not designated in their block
#---------------------------------------------------------------------
defaults
    mode http
    log global
    option httplog
    option dontlognull
    option http-server-close
    option forwardfor except 127.0.0.0/8
    option redispatch
    retries 1
    timeout http-request 10s
    timeout queue 20s
    timeout connect 5s
    timeout client 20s
    timeout server 20s
    timeout http-keep-alive 10s
    timeout check 10s

#---------------------------------------------------------------------
# apiserver frontend which proxies to the control plane nodes
#---------------------------------------------------------------------
frontend apiserver
    bind *:6443
    mode tcp
    option tcplog
    default_backend apiserverbackend

#---------------------------------------------------------------------
# round robin balancing for apiserver
#---------------------------------------------------------------------
backend apiserverbackend
    option httpchk GET /healthz
    http-check expect status 200
    mode tcp
    option ssl-hello-chk
    balance roundrobin
    server control_plane_1 10.0.1.1:6443 check
    server control_plane_2 10.0.1.2:6443 check
    server control_plane_3 10.0.1.3:6443 check
    server control_plane_4 10.0.1.4:6443 check
    server control_plane_5 10.0.1.5:6443 check
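
Review note: the backend above hardcodes five control-plane addresses. If the number of control-plane nodes varies between deployments, the `server` lines could be generated instead. The following is only a sketch under that assumption; `gen_server_lines` is a hypothetical helper and is not part of this PR.

```bash
#!/bin/sh
# Print one haproxy `server` line per control-plane address.
# $1 is the API server port; the remaining arguments are node IPs.
gen_server_lines() {
    port=$1
    shift
    i=1
    for ip in "$@"; do
        printf '    server control_plane_%d %s:%s check\n' "$i" "$ip" "$port"
        i=$((i + 1))
    done
}

# Example: three control-plane nodes on the 10.0.1.0/24 network used in the diff.
gen_server_lines 6443 10.0.1.1 10.0.1.2 10.0.1.3
```

A generated configuration can be syntax-checked with `haproxy -c -f <path>` before the service is reloaded.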
29 changes: 29 additions & 0 deletions configs/k8s_ha/keepalived_backup.conf
@@ -0,0 +1,29 @@
! /etc/keepalived/keepalived.conf
! Configuration File for keepalived
global_defs {
    router_id LVS_DEVEL
}
vrrp_script check_apiserver {
    script "/etc/keepalived/check_apiserver.sh"
    interval 3
    weight -2
    fall 10
    rise 2
}

vrrp_instance VI_1 {
    state BACKUP
    interface $INTERFACE_NAME
    virtual_router_id 51
    priority 100
    authentication {
        auth_type PASS
        auth_pass 42
    }
    virtual_ipaddress {
        10.0.1.254
    }
    track_script {
        check_apiserver
    }
}
29 changes: 29 additions & 0 deletions configs/k8s_ha/keepalived_master.conf
@@ -0,0 +1,29 @@
! /etc/keepalived/keepalived.conf
! Configuration File for keepalived
global_defs {
    router_id LVS_DEVEL
}
vrrp_script check_apiserver {
    script "/etc/keepalived/check_apiserver.sh"
    interval 3
    weight -2
    fall 10
    rise 2
}

vrrp_instance VI_1 {
    state MASTER
    interface $INTERFACE_NAME
    virtual_router_id 51
    priority 101
    authentication {
        auth_type PASS
        auth_pass 42
    }
    virtual_ipaddress {
        10.0.1.254
    }
    track_script {
        check_apiserver
    }
}
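
Review note: the two keepalived configurations above differ only in `state` and `priority` (101 for the master, 100 for backups). With `weight -2`, a failed `check_apiserver` lowers the master's priority to 99, below a healthy backup's 100, so the virtual IP moves. With `interval 3` and `fall 10`, roughly 30 seconds of consecutive failures pass before the script counts as failed. The arithmetic can be sketched as follows; `effective_priority` is a hypothetical helper for illustration, not a keepalived command.

```bash
#!/bin/sh
# Effective VRRP priority for a tracked script with a negative weight:
# the base priority is kept while the script passes and is reduced by
# |weight| once the script is marked failed.
effective_priority() {
    base=$1
    weight=$2
    script_ok=$3
    if [ "$script_ok" = "yes" ]; then
        echo "$base"
    else
        echo $((base + weight))
    fi
}

master_failed=$(effective_priority 101 -2 no)
backup_healthy=$(effective_priority 100 -2 yes)
echo "master=$master_failed backup=$backup_healthy"   # prints "master=99 backup=100"

# Approximate time until a failing check takes effect: interval * fall.
echo "detection delay: $((3 * 10))s"                   # prints "detection delay: 30s"
```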
10 changes: 10 additions & 0 deletions configs/k8s_ha/substitute_interface.sh
@@ -0,0 +1,10 @@
#!/bin/bash

readonly DIR="$(cd "$(dirname "${BASH_SOURCE[0]}")" > /dev/null 2>&1 && pwd)"

export INTERFACE_NAME=$(ifconfig | grep -B1 "10.0.1" | head -n1 | sed 's/:.*//')

cat $DIR/keepalived_master.conf | envsubst > $DIR/keepalived_master.conff
cat $DIR/keepalived_backup.conf | envsubst > $DIR/keepalived_backup.conff

echo "Successfully created HA load balancer configuration!"
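
Review note: the script above derives `INTERFACE_NAME` from the layout of `ifconfig` output, which requires the net-tools package and depends on its formatting. A possible alternative, sketched here under the assumption that `ip` from iproute2 is available, parses the one-line-per-address output of `ip -o -4 addr show`. `iface_from_ip_output` is a hypothetical name, not part of this PR.

```bash
#!/bin/sh
# Read `ip -o -4 addr show` output on stdin and print the name of the first
# interface whose IPv4 address lies in 10.0.1.0/24.
# In that output, field 2 is the interface name and field 4 is address/prefix.
iface_from_ip_output() {
    awk '$3 == "inet" && $4 ~ /^10\.0\.1\./ { print $2; exit }'
}

# Intended use on a node (left commented out so the sketch has no side effects):
# export INTERFACE_NAME=$(ip -o -4 addr show | iface_from_ip_output)
```

If no interface matches, the function prints nothing, so a caller could test for an empty `INTERFACE_NAME` before running `envsubst`.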
5 changes: 4 additions & 1 deletion configs/setup/kube.json
@@ -5,6 +5,9 @@
"PodNetworkCidr": "192.168.0.0/16",
"ApiserverPort": "6443",
"ApiserverToken": "",
"ApiserverTokenHash": "",
"ApiserverDiscoveryToken": "",
"ApiserverCertificateKey": "",
"CPHAEndpoint": "10.0.1.254",
"CPHAPort": "6443",
"CalicoVersion": "3.27.2"
}
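
Review note: this diff does not show how `setup_tool` consumes the new fields. As a rough sketch only, an additional control-plane node joining through the HA endpoint with `kubeadm` would need the endpoint and port (`CPHAEndpoint`, `CPHAPort`), a bootstrap token, the CA certificate hash, and the certificate key. `build_join_cmd` and its argument order are hypothetical, and the mapping from the remaining JSON fields to these arguments is an assumption.

```bash
#!/bin/sh
# Assemble (but do not run) a kubeadm join command for an extra control-plane node.
# Arguments: endpoint, port, bootstrap token, CA cert hash, certificate key.
build_join_cmd() {
    endpoint=$1
    port=$2
    token=$3
    ca_hash=$4
    cert_key=$5
    printf 'kubeadm join %s:%s --token %s --discovery-token-ca-cert-hash %s --control-plane --certificate-key %s\n' \
        "$endpoint" "$port" "$token" "$ca_hash" "$cert_key"
}

# Placeholder values; real ones are produced when the first control-plane node is initialized.
build_join_cmd 10.0.1.254 6443 '<token>' 'sha256:<hash>' '<certificate-key>'
```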
6 changes: 3 additions & 3 deletions docs/developers_guide.md
@@ -13,7 +13,7 @@ cd vhive
./scripts/install_go.sh; source /etc/profile # or install Go manually
pushd scripts && go build -o setup_tool && popd && mv scripts/setup_tool .

- ./setup_tool setup_node [stock-only|gvisor|firecracker]
+ ./setup_tool setup_node REGULAR [stock-only|gvisor|firecracker]
sudo containerd
./setup_tool create_one_node_cluster [stock-only|gvisor|firecracker]
# wait for the containers to boot up using
@@ -115,7 +115,7 @@ Assuming you rented a node using the vHive CloudLab profile:
1. Setup the node for the desired sandbox:

```bash
- ./setup_tool setup_node [firecracker|gvisor]
+ ./setup_tool setup_node REGULAR [firecracker|gvisor]
```

2. Setup the CRI test environment for the desired sandbox:
@@ -240,7 +240,7 @@ Knative functions can use GPU although only `stock-only` mode is supported.
Follow the guide to [setup stock knative](#testing-stock-knative-setup-or-images).

``` bash
- ./setup_tool setup_node stock-only
+ ./setup_tool setup_node REGULAR stock-only
```

### Install NVIDIA Driver and NVIDIA Container Toolkit
2 changes: 1 addition & 1 deletion docs/logging.md
@@ -41,7 +41,7 @@ We present how to set up a multi-node cluster, however, the same modifications c

3. Run the node setup script:
```bash
- ./setup_tool setup_node
+ ./setup_tool setup_node REGULAR
```
> **BEWARE:**
>
8 changes: 4 additions & 4 deletions docs/quickstart_guide.md
@@ -92,7 +92,7 @@ Another option is to install using official instructions: [https://golang.org/do
> flags as follows:
>
> ```bash
- > ./setup_tool setup_node stock-only use-stargz
+ > ./setup_tool setup_node REGULAR stock-only use-stargz
> ```
> **IMPORTANT**
> Currently `stargz` is only supported in native kubelet contexts without firecracker.
@@ -103,7 +103,7 @@ Another option is to install using official instructions: [https://golang.org/do

For the standard setup, run the following script:
```bash
- ./setup_tool setup_node firecracker
+ ./setup_tool setup_node REGULAR firecracker
```
> **BEWARE:**
>
@@ -250,13 +250,13 @@ In essence, you will execute the same commands for master and worker setups but
Execute the following below **as a non-root user with sudo rights** using **bash**:
1. Run the node setup script:
```bash
- ./setup_tool setup_node firecracker
+ ./setup_tool setup_node REGULAR firecracker
```
> **Note:**
> To enable runs with `stargz` images, setup kubelet by adding the `stock-only` and `use-stargz`
> flags as follows:
> ```bash
- > ./setup_tool setup_node stock-only use-stargz
+ > ./setup_tool setup_node REGULAR stock-only use-stargz
> ```
> **IMPORTANT**
> Currently `stargz` is only supported in native kubelet contexts without firecracker.