Document vcsim and vm-operator
fabriziopandini committed Mar 15, 2024
1 parent 374c3e3 commit 37f2f10
Showing 6 changed files with 272 additions and 2 deletions.
156 changes: 154 additions & 2 deletions test/infrastructure/vcsim/README.md
# vcsim controller

The vcsim controller provides one or more vcsim instances, as well as the VIPs for control
plane endpoints.

## Architecture

The vcsim controller is a regular Kubernetes controller designed to run alongside CAPV when you are planning to use vcsim
as a target infrastructure instead of a real vCenter.

It is also worth noting that the vcsim controller leverages several components from Cluster API's in-memory
provider, and thus it is recommended to become familiar with this [document](https://github.com/kubernetes-sigs/cluster-api/blob/main/test/infrastructure/inmemory/README.md)
before reading the following paragraphs.

### Preparing a test environment (using vcsim)

In order to understand the architecture of the vcsim controller, it is convenient to start by looking only at the
components involved in setting up a test environment that uses vcsim as a target infrastructure.

A test environment for CAPV requires two main elements:

- A vCenter, or in this case a vcsim instance simulating a vCenter.
- A VIP for the control plane endpoint.

In order to create a vcsim instance, it is possible to use the `VCenterSimulator` resource; the corresponding
VCenterSimulatorReconciler will take care of the provisioning process.

Note: The vcsim instance runs inside the Pod that hosts the vcsim controller, and is accessible through a port that
surfaces in the status of the `VCenterSimulator` resource.

Note: As of today, due to a limitation in the vcsim library, only a single vcsim instance can be active at any
time.
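
As a minimal sketch, a `VCenterSimulator` manifest could look like the following; the API group matches the one surfaced by the vcsim.sh script output further below, but the API version is an assumption, so treat this as illustrative rather than the actual schema:

```yaml
# Minimal sketch of a VCenterSimulator resource.
# The apiVersion (v1alpha1) is an assumption; no spec fields are set here.
apiVersion: vcsim.infrastructure.cluster.x-k8s.io/v1alpha1
kind: VCenterSimulator
metadata:
  name: vcsim1
```

Once reconciled, the connection details (including the port) should surface in the resource status.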

In order to get a VIP for the control plane endpoint, it is possible to use the `ControlPlaneEndpoint` resource;
the corresponding ControlPlaneEndpointReconciler will take care of the provisioning process.

Note: The code from CAPI's in-memory provider is used to create VIPs; they are implemented as listeners on ports
in the range 20000-24000 on the vcsim controller Pod. The port surfaces in the status of the `ControlPlaneEndpoint`
resource.
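
A minimal `ControlPlaneEndpoint` sketch, with the same caveat that the API version is an assumption:

```yaml
# Minimal sketch of a ControlPlaneEndpoint resource.
# The apiVersion is an assumption; the allocated VIP/port surfaces in status.
apiVersion: vcsim.infrastructure.cluster.x-k8s.io/v1alpha1
kind: ControlPlaneEndpoint
metadata:
  name: cluster1
```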

![Architecture](architecture-part1.drawio.svg)

The vcsim controller also implements two additional CRDs:

- The `EnvSubst` CRD, which can be used to generate envsubst variables to be used with cluster templates.
  Each `EnvSubst` generates variables for a single workload cluster, using the VIP from a `ControlPlaneEndpoint` resource
  and targeting a vCenter that can originate from a `VCenterSimulator`.

- The `VMOperatorDependencies` CRD, which is explained in the [vm-operator](../vm-operator/README.md) documentation.

Please also note that the [vcsim.sh](scripts/vcsim.sh) script provides a simplified way to create a `VCenterSimulator`,
a `ControlPlaneEndpoint`, and an `EnvSubst` referencing both, and finally copies all the variables to a .env file.
With the env file, it is then possible to create a cluster.

```shell
# vcsim1 is the name of the vcsim instance to be created
# cluster1 is the name of the workload cluster to be created (also used as the name for the ControlPlaneEndpoint resource)
$ test/infrastructure/vcsim/scripts/vcsim.sh vcsim1 cluster1
created VCenterSimulator vcsim1
created ControlPlaneEndpoint cluster1
envvar.vcsim.infrastructure.cluster.x-k8s.io/cluster1 created
created EnvVar cluster1
done!
GOVC_URL=https://user:[email protected]:36401/sdk

source vcsim.env

# After this command completes you have to run the printed source command
$ source vcsim.env

# Then you are ready to create a workload cluster
$ cat <your template> | envsubst | kubectl apply -f -
```

### Cluster provisioning with vcsim

In the previous paragraph we explained the process and the components to set up a test environment with a
`VCenterSimulator` and a `ControlPlaneEndpoint`, and then create a cluster.

In this paragraph we are going to describe the components of the vcsim controller that oversee the actual provisioning
of the cluster.

The picture below explains how this works in detail:

![Architecture](architecture-part2.drawio.svg)

- When the Cluster API controllers (KCP, MD controllers) create a `VSphereMachine`, the CAPV controllers first create
  a `VSphereVM` resource.
- Then, when reconciling the `VSphereVM`, the CAPV controllers use the govmomi library to connect to vCenter (in this case to vcsim)
  to provision a VM.

Given that VMs provisioned by vcsim are fake, cloud-init won't run on the machines, and thus it is the responsibility
of the vmBootstrapReconciler component inside the vcsim controller to "mimic" the machine bootstrap process
(this process is implemented using the code from CAPI's in-memory provider):

- If the machine is a control plane machine, fake static pods are created for the API server, etcd, and other control plane
  components (controller manager and scheduler are omitted from the picture for the sake of simplicity).
- Fake API server pods are registered as "backends" for the cluster VIP, and fake etcd pods are registered as
  "members" of a fake etcd cluster.

The cluster VIP, fake API server pods, and fake etcd pods provide a minimal fake Kubernetes control plane, capable of just the
operations required to trick Cluster API into believing a real Kubernetes cluster is there.

Finally, in order to complete the provisioning of the fake VM, it is necessary to assign an IP to it. This task is
performed by the vmIPReconciler component inside the vcsim controller.

## Working with vcsim

### Tilt

vcsim can be used with Tilt for local development.

To use vcsim it is required to add it to the list of enabled providers in your `tilt-settings.yaml/json`; you can also
provide extra args or enable debugging for this provider, e.g.

```yaml
...
provider_repos:
- ../cluster-api-provider-vsphere
enable_providers:
- kubeadm-bootstrap
- kubeadm-control-plane
- vsphere
- vcsim
extra_args:
vcsim:
- "--v=2"
- "--logging-format=json"
debug:
vcsim:
continue: true
port: 30040
...
```

Note: vcsim is not a Cluster API provider; however, for the sake of convenience we are "disguising" it as a Cluster API
runtime extension, so it is possible to deploy it leveraging the existing machinery in Tilt (as well as in E2E tests).

After starting Tilt with `make tilt-up`, you can use the [vcsim.sh](scripts/vcsim.sh) script and the instructions above
to create a test cluster.

See [Developing Cluster API with Tilt](https://cluster-api.sigs.k8s.io/developer/tilt) for more details.

### E2E tests

vcsim can be used to run a subset of CAPV E2E tests by setting `GINKGO_FOCUS="\[vcsim\]"`.

See [Running the end-to-end tests locally](https://cluster-api.sigs.k8s.io/developer/testing#running-the-end-to-end-tests-locally) for more details.

Note: The code for the E2E test setup will take care of creating the `VCenterSimulator` and the `ControlPlaneEndpoint`,
and of grabbing the required variables from the corresponding `EnvSubst`.

### Clusterctl

Even if technically possible, we are not providing an official way to run the vcsim controller using clusterctl,
and as of today there are no plans to do so.
4 changes: 4 additions & 0 deletions test/infrastructure/vcsim/architecture-part1.drawio.svg
4 changes: 4 additions & 0 deletions test/infrastructure/vcsim/architecture-part2.drawio.svg
102 changes: 102 additions & 0 deletions test/infrastructure/vm-operator/README.md
# vm-operator

In this folder we are maintaining code for building the vm-operator manifest.

vm-operator is a component of the vCenter supervisor.
CAPV, when running in supervisor mode, delegates to the vm-operator the responsibility to create and manage VMs.

**NOTE:** The vm-operator manifest in this folder and everything else described in this page is **not** designed for
production use and is intended for CAPV development and test only.

## "limited version of the supervisor"

This project has the requirement to test CAPV in supervisor mode using all the supported versions of
CAPI, CAPV, and vCenter supervisor, as well as all the versions built from open PRs.

In order to achieve this without incurring the cost/complexity of creating multiple, ad-hoc vCenter distributions,
we are using a "limited version of the supervisor", composed of the vm-operator only.

This "limited version of the supervisor" is considered enough to provide a signal for CAPV development and test;
however, due to the trade-offs required to get a simple and cheap test environment, the solution described below
is not fit for other use cases.

The picture explains how this works in detail:

![Architecture](architecture-part1.drawio.svg)

As you might notice, an additional component is required to take care of setting up the management cluster
and vCenter as required by the vm-operator. This component exists in different variants according to the use cases
described in the following paragraphs.

## Building and pushing the VM-operator manifest

Run `make release-vm-operator` to build and push the vm-operator manifests and image to the vSphere staging bucket.

Note: we are maintaining a copy of those artefacts to ensure CAPV test isolation and to allow small customizations
that make it easier to run the vm-operator in the "limited version of the supervisor", but this might change in the future.

## Tilt for CAPV in supervisor mode using vcsim

NOTE: As of today we are not supporting Tilt development of CAPV in supervisor mode when targeting a real vCenter.

Before reading this paragraph, please familiarize yourself with the [vcsim](../vcsim/README.md) documentation.

To use vSphere in supervisor mode it is required to add it to the list of enabled providers in your `tilt-settings.yaml/json`
(note that we are adding `vsphere-supervisor`, a variant that deploys the supervisor's CRDs);
in this case, it is also required to add both the vm-operator and vcsim.

```yaml
...
provider_repos:
- ../cluster-api-provider-vsphere
enable_providers:
- kubeadm-bootstrap
- kubeadm-control-plane
- vsphere-supervisor
- vm-operator
- vcsim
extra_args:
vcsim:
- "--v=2"
- "--logging-format=json"
debug:
vcsim:
continue: true
port: 30040
...
```

Note: the default configuration does not allow debugging the vm-operator.

While starting Tilt, the vcsim controller will also automatically set up the `default` namespace with
all the prerequisites for the vm-operator to reconcile machines created in it.

If there is a need to create machines in a different namespace, it is required to manually create a
`VMOperatorDependencies` resource to instruct the vcsim controller to set up additional namespaces too.
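
As a sketch, assuming the resource is namespaced and configures the namespace it lives in (both the apiVersion and this placement behavior are assumptions, not the actual schema):

```yaml
# Hypothetical sketch: instruct the vcsim controller to prepare
# vm-operator dependencies in the "workloads" namespace.
# apiVersion and placement semantics are assumptions.
apiVersion: vcsim.infrastructure.cluster.x-k8s.io/v1alpha1
kind: VMOperatorDependencies
metadata:
  name: vcsim
  namespace: workloads
```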

The following image summarizes all the moving parts involved in this scenario.

![Architecture](architecture-part2.drawio.svg)

## E2E tests for CAPV in supervisor mode

A subset of CAPV E2E tests can be executed using the supervisor mode by setting `GINKGO_FOCUS="\[supervisor\]"`.

See [Running the end-to-end tests locally](https://cluster-api.sigs.k8s.io/developer/testing#running-the-end-to-end-tests-locally) for more details.

Note: The code responsible for the E2E test setup will take care of ensuring the management cluster
and vCenter have all the dependencies required by the vm-operator; the only exception is the Content Library with
machine templates, which must be created before running the tests.

Note: the operation above (ensuring vm-operator dependencies) considers the values provided via
env variables or via test config variables, thus making it possible to run E2E tests on the VMC instance used
for vSphere CI as well as on any other vCenter.

## E2E tests for CAPV in supervisor mode using vcsim

A subset of CAPV E2E tests can be executed using the supervisor mode and vcsim as a target infrastructure by setting
`GINKGO_FOCUS="\[vcsim\]\s+\[supervisor\]"`.
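
The combined focus is just a Ginkgo regular expression over spec names. As an illustration (the spec names below are made up, and `[[:space:]]` stands in for `\s` in POSIX grep), this is how such a regex selects only specs labeled with both tags:

```shell
# Illustrative only: show which (made-up) spec names a combined focus
# regex selects; only the line carrying both labels matches.
printf '%s\n' \
  'Cluster creation [vcsim] [supervisor]' \
  'Cluster creation [vcsim]' \
  'Cluster creation [supervisor]' \
  | grep -E '\[vcsim\][[:space:]]+\[supervisor\]'
```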

Note: The code responsible for the E2E test setup will take care of creating the `VCenterSimulator` and the `ControlPlaneEndpoint`,
and of grabbing the required variables from the corresponding `EnvSubst`. On top of that, the setup code will also
create the required `VMOperatorDependencies` resource for configuring the test namespace.
4 changes: 4 additions & 0 deletions test/infrastructure/vm-operator/architecture-part1.drawio.svg
4 changes: 4 additions & 0 deletions test/infrastructure/vm-operator/architecture-part2.drawio.svg