From 37f2f1017e3c9b69c5d70568629115e9cbdccd22 Mon Sep 17 00:00:00 2001
From: fabriziopandini
Date: Fri, 15 Mar 2024 15:10:20 +0100
Subject: [PATCH] Document vcsim and vm-operator

---
 test/infrastructure/vcsim/README.md           | 156 +++++++++++++++++-
 .../vcsim/architecture-part1.drawio.svg       |   4 +
 .../vcsim/architecture-part2.drawio.svg       |   4 +
 test/infrastructure/vm-operator/README.md     | 102 ++++++++++++
 .../vm-operator/architecture-part1.drawio.svg |   4 +
 .../vm-operator/architecture-part2.drawio.svg |   4 +
 6 files changed, 272 insertions(+), 2 deletions(-)
 create mode 100644 test/infrastructure/vcsim/architecture-part1.drawio.svg
 create mode 100644 test/infrastructure/vcsim/architecture-part2.drawio.svg
 create mode 100644 test/infrastructure/vm-operator/README.md
 create mode 100644 test/infrastructure/vm-operator/architecture-part1.drawio.svg
 create mode 100644 test/infrastructure/vm-operator/architecture-part2.drawio.svg

diff --git a/test/infrastructure/vcsim/README.md b/test/infrastructure/vcsim/README.md
index 97dbd16e62..188d8a1c5e 100644
--- a/test/infrastructure/vcsim/README.md
+++ b/test/infrastructure/vcsim/README.md
@@ -1,4 +1,156 @@
 # vcsim controller
 
-vcsim controller provides one or more vcsim instances, as well as the fake API server / etcd running on the
-simulated machines.
+The vcsim controller provides one or more vcsim instances, as well as the VIPs for control
+plane endpoints.
+
+## Architecture
+
+The vcsim controller is a regular Kubernetes controller designed to run alongside CAPV when you plan to use vcsim
+as a target infrastructure instead of a real vCenter.
+
+It is also worth noting that the vcsim controller leverages several components from Cluster API's in-memory
+provider, and thus it is recommended to become familiar with this [document](https://github.com/kubernetes-sigs/cluster-api/blob/main/test/infrastructure/inmemory/README.md)
+before reading the following paragraphs.
+
+### Preparing a test environment (using vcsim)
+
+In order to understand the architecture of the vcsim controller, it is convenient to start by looking only at the
+components that are involved in setting up a test environment that will use vcsim as a target infrastructure.
+
+A test environment for CAPV requires two main elements:
+
+- A vCenter, or in this case a vcsim instance simulating a vCenter.
+- A VIP for the control plane endpoint.
+
+In order to create a vcsim instance it is possible to use the `VCenterSimulator` resource, and the corresponding
+VCenterSimulatorReconciler will take care of the provisioning process.
+
+Note: The vcsim instance will run inside the Pod that hosts the vcsim controller, and it will be accessible through a port that
+will surface in the status of the `VCenterSimulator` resource.
+
+Note: As of today, given a limitation in the vcsim library, only a single vcsim instance can be active at any
+time.
+
+In order to get a VIP for the control plane endpoint it is possible to use the `ControlPlaneEndpoint` resource,
+and the corresponding ControlPlaneEndpointReconciler will take care of the provisioning process.
+
+Note: The code from CAPI's in-memory provider is used to create VIPs; they are implemented as listeners on ports
+in the range 20000-24000 on the vcsim controller Pod. The port will surface in the status of the `ControlPlaneEndpoint`
+resource.
+
+![Architecture](architecture-part1.drawio.svg)
+
+The vcsim controller also implements two additional CRDs:
+
+- The `EnvSubst` CRD, which can be used to generate EnvSubst variables to be used with cluster templates.
+  Each `EnvSubst` generates variables for a single workload cluster, using the VIP from a `ControlPlaneEndpoint` resource
+  and targeting a vCenter that can be originated by a `VCenterSimulator`.
+
+- The `VMOperatorDependencies` CRD, which is explained in the [vm-operator](../vm-operator/README.md) documentation.
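+
+For instance, the two resources above can be created with plain manifests along the lines of the following sketch;
+the `vcsim.infrastructure.cluster.x-k8s.io` API group appears in the script output below, while the `v1alpha1`
+version and the minimal fields are assumptions, so check the CRDs in this folder for the actual schema:
+
+```yaml
+# Hypothetical manifests: API version and fields may differ from the actual CRDs.
+apiVersion: vcsim.infrastructure.cluster.x-k8s.io/v1alpha1
+kind: VCenterSimulator
+metadata:
+  name: vcsim1
+---
+apiVersion: vcsim.infrastructure.cluster.x-k8s.io/v1alpha1
+kind: ControlPlaneEndpoint
+metadata:
+  name: cluster1
+```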
+
+Please also note that the [vcsim.sh](scripts/vcsim.sh) script provides a simplified way of creating a `VCenterSimulator`,
+a `ControlPlaneEndpoint`, and an `EnvSubst` referencing both, and finally copies all the variables to a .env file.
+With the env file it is then possible to create a cluster.
+
+```shell
+# vcsim1 is the name of the vcsim instance to be created
+# cluster1 is the name of the workload cluster to be created (and it is also used as the name for the ControlPlaneEndpoint resource)
+$ test/infrastructure/vcsim/scripts/vcsim.sh vcsim1 cluster1
+created VCenterSimulator vcsim1
+created ControlPlaneEndpoint cluster1
+envvar.vcsim.infrastructure.cluster.x-k8s.io/cluster1 created
+created EnvVar cluster1
+done!
+GOVC_URL=https://user:pass@127.0.0.1:36401/sdk
+
+source vcsim.env
+
+# After this command completes you have to run the printed source command
+$ source vcsim.env
+
+# Then you are ready to create a workload cluster
+$ cat | envsubst | kubectl apply -f -
+```
+
+### Cluster provisioning with vcsim
+
+In the previous paragraph we explained the process and the components used to set up a test environment with a
+`VCenterSimulator` and a `ControlPlaneEndpoint`, and then to create a cluster.
+
+In this paragraph we are going to describe the components of the vcsim controller that oversee the actual provisioning
+of the cluster.
+
+The picture below explains how this works in detail:
+
+![Architecture](architecture-part2.drawio.svg)
+
+- When the Cluster API controllers (KCP, MD controllers) create a `VSphereMachine`, the CAPV controllers first create
+  a `VSphereVM` resource.
+- Then, when reconciling the `VSphereVM`, the CAPV controllers use the govmomi library to connect to vCenter (in this case to vcsim)
+  to provision a VM.
+
+Given that VMs provisioned by vcsim are fake, there won't be cloud-init running on the machine, and thus it is the responsibility
+of the vmBootstrapReconciler component inside the vcsim controller to "mimic" the machine bootstrap process:
+NOTE: This process is implemented using the code from CAPI's in-memory provider.
+
+- If the machine is a control plane machine, fake static pods are created for the API server, etcd and other control plane
+  components (controller manager and scheduler are omitted from the picture for the sake of simplicity).
+- Fake API server pods are registered as "backends" for the cluster VIP, and fake etcd pods are registered as
+  "members" of a fake etcd cluster.
+
+The cluster VIP, the fake API server pods and the fake etcd pods provide a minimal fake Kubernetes control plane, just capable of the
+operations required to trick Cluster API into believing a real Kubernetes cluster is there.
+
+Finally, in order to complete the provisioning of the fake VM, it is necessary to assign an IP to it. This task is
+performed by the vmIPReconciler component inside the vcsim controller.
+
+## Working with vcsim
+
+### Tilt
+
+vcsim can be used with Tilt for local development.
+
+To use vcsim it is required to add it to the list of enabled providers in your `tilt-settings.yaml/json`; you can also
+provide extra args or enable debugging for this provider, e.g.
+
+```yaml
+...
+provider_repos:
+  - ../cluster-api-provider-vsphere
+enable_providers:
+  - kubeadm-bootstrap
+  - kubeadm-control-plane
+  - vsphere
+  - vcsim
+extra_args:
+  vcsim:
+    - "--v=2"
+    - "--logging-format=json"
+debug:
+  vcsim:
+    continue: true
+    port: 30040
+...
+```
+
+Note: vcsim is not a Cluster API provider; however, for the sake of convenience we are "disguising" it as a Cluster API
+runtime extension, so it is possible to deploy it leveraging the existing machinery in Tilt (as well as in E2E tests).
+
+After starting tilt with `make tilt-up`, you can use the [vcsim.sh](scripts/vcsim.sh) script and the instructions above
+to create a test cluster.
+
+See [Developing Cluster API with Tilt](https://cluster-api.sigs.k8s.io/developer/tilt) for more details.
+
+### E2E tests
+
+vcsim can be used to run a subset of CAPV E2E tests; they can be executed by setting `GINKGO_FOCUS="\[vcsim\]"`.
+
+See [Running the end-to-end tests locally](https://cluster-api.sigs.k8s.io/developer/testing#running-the-end-to-end-tests-locally) for more details.
+
+Note: The code for the E2E test setup will take care of creating the `VCenterSimulator` and the `ControlPlaneEndpoint`,
+and of grabbing the required variables from the corresponding `EnvSubst`.
+
+### Clusterctl
+
+Even if technically possible, we are not providing an official way to run the vcsim controller using clusterctl,
+and as of today there are no plans to do so.
diff --git a/test/infrastructure/vcsim/architecture-part1.drawio.svg b/test/infrastructure/vcsim/architecture-part1.drawio.svg
new file mode 100644
index 0000000000..a868c4778b
--- /dev/null
+++ b/test/infrastructure/vcsim/architecture-part1.drawio.svg
@@ -0,0 +1,4 @@
+
+
+
+
InMemoryCluster
Reconciler
VIP
InMemory
Clusters
vcsim controller
watch
create
vCenterSimulator
reconciler
vCenterSimulators
create
vcsim instance
watch
\ No newline at end of file diff --git a/test/infrastructure/vcsim/architecture-part2.drawio.svg b/test/infrastructure/vcsim/architecture-part2.drawio.svg new file mode 100644 index 0000000000..b3aacf3a12 --- /dev/null +++ b/test/infrastructure/vcsim/architecture-part2.drawio.svg @@ -0,0 +1,4 @@ + + + +
CAPI
controllers
InMemory
API Server Pod
VIP
InMemory
Node
kubeadm's
config maps etc.
vmBootstrap
reconciler
vcsim VM
KCPs
MDs
VSphere
Machines
InMemory
etcd Pod
vcsim controller
InMemory fake objects
watch
watch
create
create
CAPV
controllers
VSphere
VMs
create
vmIP
reconciler
vcsim instance
watch
create
(interacts using the govmomi library)
\ No newline at end of file
diff --git a/test/infrastructure/vm-operator/README.md b/test/infrastructure/vm-operator/README.md
new file mode 100644
index 0000000000..5b6765bebd
--- /dev/null
+++ b/test/infrastructure/vm-operator/README.md
@@ -0,0 +1,102 @@
+# vm-operator
+
+In this folder we are maintaining the code for building the vm-operator manifest.
+
+vm-operator is a component of the vCenter supervisor.
+CAPV, when running in supervisor mode, delegates to the vm-operator the responsibility to create and manage VMs.
+
+**NOTE:** The vm-operator manifest in this folder and everything else described in this page is **not** designed for
+production use and is intended for CAPV development and test only.
+
+## "limited version of the supervisor"
+
+This project has the requirement to test CAPV in supervisor mode using all the supported versions of
+CAPI, CAPV and the vCenter supervisor, and also all the versions built from open PRs.
+
+In order to achieve this without incurring the cost/complexity of creating multiple, ad-hoc vCenter distributions,
+we are using a "limited version of the supervisor", composed of the vm-operator only.
+
+This "limited version of the supervisor" is considered enough to provide a signal for CAPV development and test;
+however, due to the necessary trade-offs required to get a simple and cheap test environment, the solution described below
+is not fit for other use cases.
+
+The picture explains how this works in detail:
+
+![Architecture](architecture-part1.drawio.svg)
+
+As you might notice, it is required to have an additional component taking care of setting up the management cluster
+and vCenter as required by the vm-operator. This component exists in different variants according to the use cases
+described in the following paragraphs.
+
+## Building and pushing the VM-operator manifest
+
+Run `make release-vm-operator` to build the vm-operator manifest and image and push them to the vSphere staging bucket.
+
+Note: we are maintaining a copy of those artefacts to ensure CAPV test isolation and to allow small customizations
+that make it easier to run the vm-operator in the "limited version of the supervisor", but this might change in the future.
+
+## Tilt for CAPV in supervisor mode using vcsim
+
+NOTE: As of today we are not supporting Tilt development of CAPV in supervisor mode when targeting a real vCenter.
+
+Before reading this paragraph, please familiarize yourself with the [vcsim](../vcsim/README.md) documentation.
+
+To use vSphere in supervisor mode it is required to add it to the list of enabled providers in your `tilt-settings.yaml/json`
+(note that we are adding `vsphere-supervisor`, which is a variant that deploys the supervisor's CRDs);
+in this case, it is also required to add both the vm-operator and vcsim.
+
+```yaml
+...
+provider_repos:
+  - ../cluster-api-provider-vsphere
+enable_providers:
+  - kubeadm-bootstrap
+  - kubeadm-control-plane
+  - vsphere-supervisor
+  - vm-operator
+  - vcsim
+extra_args:
+  vcsim:
+    - "--v=2"
+    - "--logging-format=json"
+debug:
+  vcsim:
+    continue: true
+    port: 30040
+...
+```
+
+Note: the default configuration does not allow debugging the vm-operator.
+
+While starting tilt, the vcsim controller will also automatically set up the `default` namespace with
+all the prerequisites for the vm-operator to reconcile machines created in it.
+
+If there is a need to create machines in a different namespace, it is required to manually create a
+`VMOperatorDependencies` resource to instruct the vcsim controller to set up additional namespaces too.
+
+The following image summarizes all the moving parts involved in this scenario.
+
+![Architecture](architecture-part2.drawio.svg)
+
+## E2E tests for CAPV in supervisor mode
+
+A subset of CAPV E2E tests can be executed using the supervisor mode by setting `GINKGO_FOCUS="\[supervisor\]"`.
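+
+As a minimal sketch, assuming the repository's usual Ginkgo-based E2E make target (the exact target name and any
+extra flags may differ in your checkout), a focused run looks like:
+
+```shell
+# Run only the supervisor-mode E2E tests (hypothetical make target name).
+GINKGO_FOCUS="\[supervisor\]" make e2e
+```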
+
+See [Running the end-to-end tests locally](https://cluster-api.sigs.k8s.io/developer/testing#running-the-end-to-end-tests-locally) for more details.
+
+Note: The code responsible for the E2E test setup will take care of ensuring the management cluster
+and vCenter have all the dependencies required by the vm-operator; the only exception is the Content Library with
+machine templates, which must be created before running the tests.
+
+Note: the operation above (ensuring the vm-operator dependencies) considers the values provided via
+env variables or via test config variables, thus making it possible to run E2E tests on the VMC instance used
+for vSphere CI as well as on any other vCenter.
+
+## E2E tests for CAPV in supervisor mode using vcsim
+
+A subset of CAPV E2E tests can be executed using the supervisor mode and vcsim as a target infrastructure by setting
+`GINKGO_FOCUS="\[vcsim\]\s+\[supervisor\]"`.
+
+Note: The code responsible for the E2E test setup will take care of creating the `VCenterSimulator` and the `ControlPlaneEndpoint`,
+and of grabbing the required variables from the corresponding `EnvSubst`. On top of that, the setup code will also
+create the required `VMOperatorDependencies` resource for configuring the test namespace.
diff --git a/test/infrastructure/vm-operator/architecture-part1.drawio.svg b/test/infrastructure/vm-operator/architecture-part1.drawio.svg
new file mode 100644
index 0000000000..f5dd947f5b
--- /dev/null
+++ b/test/infrastructure/vm-operator/architecture-part1.drawio.svg
@@ -0,0 +1,4 @@
+
+
+
+
CAPI
controllers
KCPs
MDs
VSphere
Machines
watch
watch
create
CAPV
controllers
Virtual
Machines
create
vm-operator
watch
vCenter
Setup Prerequisites
\ No newline at end of file diff --git a/test/infrastructure/vm-operator/architecture-part2.drawio.svg b/test/infrastructure/vm-operator/architecture-part2.drawio.svg new file mode 100644 index 0000000000..950aac7651 --- /dev/null +++ b/test/infrastructure/vm-operator/architecture-part2.drawio.svg @@ -0,0 +1,4 @@ + + + +
CAPI
controllers
InMemoryCluster
Reconciler
InMemory
API Server Pod
VIP
InMemory
Node
kubeadm's
config maps etc.
vmBootstrap
reconciler
vcsim VM
KCPs
MDs
VSphere
Machines
InMemory
Clusters
InMemory
etcd Pod
vcsim controller
InMemory fake objects
watch
watch
watch
create
create
create
CAPV
controllers
Virtual
Machines
create
vmIP
reconciler
vCenterSimulator
reconciler
vCenterSimulators
create
vcsim instance
watch
watch
create
vm-operator
VMOperator
Dependencies
setup vm-operator pre requisites in vCenter and in the Management Cluster
\ No newline at end of file