Extend manifests to work properly with kubeadm (#22)
* Add ServiceAccount to kube-router manifest

When initializing a local single-node Kubernetes
cluster with kubeadm, kube-router is not able to
access certain resources and fails to start.
This commit adds a ServiceAccount and a
ClusterRoleBinding to the kube-router manifest.
This manifest comes from the official kube-router
repository:
(daemonset/kubeadm-kuberouter-all-features.yaml).

* Add checkpoint-rbac.yaml

This allows the kubectl checkpoint plugin to
create the container checkpoint.

* Run curl with sudo in kubectl-checkpoint

Previously, the curl command could not access
the kubelet's client certificate and key.

* Update README.md with kubeadm instructions

- Updated with specific instructions for kubeadm
- Added instruction to apply the RBAC manifest
- Updated the commands of step 9 to run as root
- Added a note that the local registry is optional

* Add comment to describe the purpose of checkpoint-rbac.yaml

* Replace system:node:codingbeast with generic name to fill in

* Update README to mention replacing machine name
stano45 authored Aug 20, 2024
1 parent 998d1dc commit af59f69
Showing 4 changed files with 262 additions and 106 deletions.
52 changes: 35 additions & 17 deletions examples/container_migration_in_kubernetes/README.md
@@ -14,7 +14,7 @@ checkpointing feature in Kubernetes, please refer to the following pages:

## Running the example

1. Install CNI Plugins on each node
### 1. Install CNI Plugins on each node

The CNI configuration file is expected to be present as `/etc/cni/net.d/10-kuberouter.conf`
```
@@ -35,24 +35,24 @@ sudo mkdir -p /opt/cni/bin
sudo cp bin/* /opt/cni/bin/
```

2. Deploy daemonset
### 2. Initialize the Kubernetes cluster using kubeadm (optional):
```
kubectl apply -f manifests/kube-router-daemonset.yaml
sudo kubeadm init --pod-network-cidr=10.85.0.0/16 --cri-socket=unix:///var/run/crio/crio.sock
```

3. Setup a local container registry

### 3. Untaint the master node to allow pods to be scheduled (optional, assuming a single node cluster):
```
kubectl taint nodes --all node-role.kubernetes.io/master-
kubectl taint nodes --all node-role.kubernetes.io/control-plane-
```
cd local-registry/
./generate-password.sh <user>
./generate-certificates.sh <hostname>
./trust-certificates.sh
./run.sh

buildah login <hostname>:5000

### 4. Deploy daemonset
```
kubectl apply -f manifests/kube-router-daemonset.yaml
```

3. Deploy an HTTP server
### 5. Deploy an HTTP server

```
kubectl apply -f manifests/http-server-deployment.yaml
@@ -65,33 +65,51 @@ kubectl get deployments
kubectl get service http-server
```

4. Install kubectl checkpoint plugin
### 6. Apply the RBAC configuration to allow the checkpoint plugin to create a checkpoint (optional if your config already allows this):
First, replace `<your_machine_name>` with the name of your machine. Then, run:
```
kubectl apply -f manifests/checkpoint-rbac.yaml
```
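
The placeholder substitution in this step can also be done on the fly instead of editing the manifest by hand. The following is a minimal sketch, not part of the repository: `render_checkpoint_rbac` is a hypothetical helper, and it assumes the node name matches the output of `hostname`, which is worth confirming with `kubectl get nodes`.

```shell
#!/bin/sh
# Hypothetical helper (not part of this repository): print the manifest with
# the <your_machine_name> placeholder replaced by the given node name.
render_checkpoint_rbac() {
    manifest="$1"
    node_name="$2"
    # '|' is used as the sed delimiter; node names are DNS labels and
    # therefore contain neither '|' nor '&'.
    sed "s|<your_machine_name>|${node_name}|g" "$manifest"
}

# Usage (assumption: the node is registered under the machine's hostname):
#   render_checkpoint_rbac manifests/checkpoint-rbac.yaml "$(hostname)" | kubectl apply -f -
```

Piping the rendered output to `kubectl apply -f -` leaves the manifest in the working tree unchanged, so the placeholder stays in place for other machines.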

### 7. Setup a local container registry (optional, you can use any other registry)

```
cd local-registry/
./generate-password.sh <user>
./generate-certificates.sh <hostname>
./trust-certificates.sh
./run.sh
buildah login <hostname>:5000
```

### 8. Install the kubectl checkpoint plugin

```
sudo cp kubectl-plugin/kubectl-checkpoint /usr/local/bin/
```

5. Enable checkpoint/restore with established TCP connections
### 9. Enable checkpoint/restore with established TCP connections
```
sudo mkdir -p /etc/criu/
echo "tcp-established" | sudo tee -a /etc/criu/runc.conf
```
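
Because `tee -a` appends, running this step more than once leaves duplicate `tcp-established` lines in the file. A minimal sketch of an idempotent variant follows; `ensure_criu_option` is a hypothetical helper, not part of the repository, and it has to run from a shell that can write to `/etc/criu/`.

```shell
#!/bin/sh
# Hypothetical helper: append an option to a CRIU config file only if that
# exact line is not already present.
ensure_criu_option() {
    option="$1"
    conf="$2"
    # -q: quiet, -x: match the whole line, -F: treat the option as a fixed string.
    # A missing file counts as "option not present".
    if ! grep -qxF "$option" "$conf" 2>/dev/null; then
        printf '%s\n' "$option" >> "$conf"
    fi
}

# Usage from a root shell, after `mkdir -p /etc/criu/`:
#   ensure_criu_option tcp-established /etc/criu/runc.conf
```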

6. Create container checkpoint
### 10. Create container checkpoint

```
kubectl checkpoint <pod> <container>
```
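
The next step needs the path of the checkpoint archive. As a sketch, the newest archive can be looked up as shown below. This assumes the kubelet's default checkpoint directory, `/var/lib/kubelet/checkpoints`, and its `checkpoint-*.tar` naming, which may differ on a customized setup. `latest_checkpoint` is a hypothetical helper, not part of the repository.

```shell
#!/bin/sh
# Hypothetical helper: print the most recently modified checkpoint archive
# in the given directory (default: the kubelet's checkpoint directory).
latest_checkpoint() {
    dir="${1:-/var/lib/kubelet/checkpoints}"
    # ls -t sorts by modification time, newest first; errors for a missing
    # directory or an empty glob are suppressed, yielding empty output.
    ls -t "$dir"/checkpoint-*.tar 2>/dev/null | head -n 1
}

# Usage: the default directory is normally readable by root only, so run
# this from a root shell:
#   latest_checkpoint
```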

7. Build a checkpoint OCI image and push to registry
### 11. Build a checkpoint OCI image and push to registry

```
build-image/build-image.sh -a <annotations-file> -c <checkpoint-path> -i <hostname>:5000/<image>:<tag>
buildah push <hostname>:5000/<image>:<tag>
```

7. Restore container from checkpoint image
### 12. Restore container from checkpoint image

Replace the container `image` field in `http-server-deployment.yaml` with the
checkpoint OCI image `<hostname>:5000/<image>:<tag>` and apply the new deployment.
@@ -13,7 +13,7 @@ elif [ "$#" -ne 3 ]; then
exit 1
fi

curl --insecure \
sudo curl --insecure \
--cert /var/lib/kubelet/pki/kubelet-client-current.pem \
--key /var/lib/kubelet/pki/kubelet-client-current.pem \
-X POST \
@@ -0,0 +1,44 @@
# When running the `kubectl checkpoint` command, you may see the following error:
# Forbidden (user=system:node:<your_machine_name>, verb=create, resource=nodes, subresource=checkpoint)
# This ClusterRole allows the create verb on the nodes resource and the nodes/checkpoint subresource.

apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  name: checkpoint-role
rules:
  - apiGroups: [""]
    resources: ["nodes"]
    verbs: ["list", "get", "watch", "create"]
  - apiGroups: [""]
    resources: ["nodes/checkpoint"]
    verbs: ["create"]

---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  name: checkpoint-role-binding
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: checkpoint-role
subjects:
  - kind: User
    # Replace <your_machine_name> with the name of your machine
    name: system:node:<your_machine_name>
    apiGroup: rbac.authorization.k8s.io

---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  name: kube-router-checkpoint-binding
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: checkpoint-role
subjects:
  - kind: ServiceAccount
    name: kube-router
    namespace: kube-system