From 9b4bd09556064fc776787b3fd02ad353c38b4820 Mon Sep 17 00:00:00 2001 From: fabriziopandini Date: Wed, 25 Sep 2024 12:48:36 +0200 Subject: [PATCH] Refactor InfraMachine contract --- .../src/developer/core/controllers/machine.md | 152 +---- .../providers/contracts/infra-cluster.md | 6 +- .../providers/contracts/infra-machine.md | 624 +++++++++++++----- .../providers/getting-started/webhooks.md | 22 - 4 files changed, 513 insertions(+), 291 deletions(-) diff --git a/docs/book/src/developer/core/controllers/machine.md b/docs/book/src/developer/core/controllers/machine.md index bd318a381593..279828379eb8 100644 --- a/docs/book/src/developer/core/controllers/machine.md +++ b/docs/book/src/developer/core/controllers/machine.md @@ -1,20 +1,39 @@ -# Machine Controller +# Machine Controller -![](../../../images/cluster-admission-machine-controller.png) +The Machine controller is responsible for reconciling the Machine resource. + +In order to allow Machine provisioning on different types of infrastructure, the Machine resource references +an InfraMachine object, e.g. AWSMachine, GCPMachine etc. + +The [InfraMachine resource contract](../../providers/contracts/infra-machine.md) defines a set of rules a provider is expected to comply with in order to allow +the expected interactions with the Machine controller.
+ +Among those rules: +- InfraMachine MUST report a [provider ID](../../providers/contracts/infra-machine.md#inframachine-provider-id) for the Machine +- InfraMachine SHOULD define a [failure domain](../../providers/contracts/infra-machine.md#inframachine-failure-domain) where machines should be placed +- InfraMachine SHOULD surface the machine's [addresses](../../providers/contracts/infra-machine.md#inframachine-addresses) to help operators when troubleshooting issues +- InfraMachine MUST report when the Machine's infrastructure is [fully provisioned](../../providers/contracts/infra-machine.md#inframachine-initialization-completed) +- InfraMachine SHOULD report [conditions](../../providers/contracts/infra-machine.md#inframachine-conditions) +- InfraMachine SHOULD report [terminal failures](../../providers/contracts/infra-machine.md#inframachine-terminal-failures) + +Similarly, in order to support different machine bootstrappers, the Machine resource references +a BootstrapConfig object, e.g. KubeadmConfig etc. + +The [BootstrapConfig resource contract](../../providers/contracts/bootstrap-config.md) defines a set of rules a provider is expected to comply with in order to allow +the expected interactions with the Machine controller. -The Machine controller's main responsibilities are: +Considering all the info above, the Machine controller's main responsibilities are: -* Setting an OwnerReference on: - * Each Machine object to the Cluster object. - * The associated BootstrapConfig object. - * The associated InfrastructureMachine object. -* Copy data from `BootstrapConfig.Status.DataSecretName` to `Machine.Spec.Bootstrap.DataSecretName` if -`Machine.Spec.Bootstrap.DataSecretName` is empty. -* Setting NodeRefs to be able to associate machines and Kubernetes nodes. -* Deleting Nodes in the target cluster when the associated machine is deleted. -* Cleanup of related objects. -* Keeping the Machine's Status object up to date with the InfrastructureMachine's Status object.
-* Finding Kubernetes nodes matching the expected providerID in the workload cluster. +* Setting an OwnerReference on the infrastructure object referenced in `Machine.spec.infrastructureRef`. +* Setting an OwnerReference on the bootstrap object referenced in `Machine.spec.bootstrap.configRef`. +* Keeping the Machine's status in sync with the InfraMachine and BootstrapConfig's status. + * Finding Kubernetes nodes matching the expected providerID in the workload cluster. + * Setting NodeRefs to be able to associate machines and Kubernetes nodes. + * Monitoring Kubernetes nodes and propagating labels to them. +* Cleanup of all owned objects so that nothing is dangling after deletion. + * Draining nodes and waiting for volumes to be detached by the CSI provider. + +![](../../../images/cluster-admission-machine-controller.png) After the machine controller sets the OwnerReferences on the associated objects, it waits for the bootstrap and infrastructure objects referenced by the machine to have the `Status.Ready` field set to `true`. When @@ -25,108 +44,3 @@ The machine controller uses the kubeconfig for the new workload cluster to watch When a node appears with `Node.Spec.ProviderID` matching `Machine.Spec.ProviderID`, the machine controller transitions the associated machine into the `Provisioned` state. When the infrastructure ref is also `Ready`, the machine controller marks the machine as `Running`. - -## Contracts - -### Cluster API - -Cluster associations are made via labels. - -#### Expected labels - -| what | label | value | meaning | -| --- | --- | --- | --- | -| Machine | `cluster.x-k8s.io/cluster-name` | `` | Identify a machine as belonging to a cluster with the name ``| -| Machine | `cluster.x-k8s.io/control-plane` | `true` | Identifies a machine as a control-plane node | - -### Bootstrap provider - -The BootstrapConfig object **must** have a `status` object.
- -To override the bootstrap provider, a user (or external system) can directly set the `Machine.Spec.Bootstrap.Data` -field. This will mark the machine as ready for bootstrapping and no bootstrap data will be copied from the -BootstrapConfig object. - -#### Required `status` fields - -The `status` object **must** have several fields defined: - -* `ready` - a boolean field indicating the bootstrap config data is generated and ready for use. -* `dataSecretName` - a string field referencing the name of the secret that stores the generated bootstrap data. - -#### Optional `status` fields - -The `status` object **may** define several fields that do not affect functionality if missing: - -* `failureReason` - a string field explaining why a fatal error has occurred, if possible. -* `failureMessage` - a string field that holds the message contained by the error. - -Note: once any of `failureReason` or `failureMessage` surface on the machine who is referencing the bootstrap config object, -they cannot be restored anymore (it is considered a terminal error; the only way to recover is to delete and recreate the machine). -Also, if the machine is under control of a MachineHealthCheck instance, the machine will be automatically remediated. - -Example: - -```yaml -kind: MyBootstrapProviderConfig -apiVersion: bootstrap.cluster.x-k8s.io/v1alpha3 -status: - ready: true - dataSecretName: "MyBootstrapSecret" -``` - -### Infrastructure provider - -The InfrastructureMachine object **must** have both `spec` and `status` objects. - -#### Required `spec` fields - -The `spec` object **must** at least one field defined: - -* `providerID` - a cloud provider ID identifying the machine. - -#### Optional `spec` fields - -The `spec` object **may** define several fields that do not affect functionality if missing: - -* `failureDomain` - is a string identifying the failure domain the instance is running in. 
- -#### Required `status` fields - -The `status` object **must** at least one field defined: - -* `ready` - a boolean field indicating if the infrastructure is ready to be used or not. - -#### Optional `status` fields - -The `status` object **may** define several fields that do not affect functionality if missing: - -* `failureReason` - is a string that explains why a fatal error has occurred, if possible. -* `failureMessage` - is a string that holds the message contained by the error. -* `addresses` - is a `MachineAddresses` (a list of `MachineAddress`) which represents host names, external IP addresses, internal IP addresses, -external DNS names, and/or internal DNS names for the provider's machine instance. `MachineAddress` is -defined as: - - `type` (string): one of `Hostname`, `ExternalIP`, `InternalIP`, `ExternalDNS`, `InternalDNS` - - `address` (string) - -Note: once any of `failureReason` or `failureMessage` surface on the machine who is referencing the infrastructureMachine object, -they cannot be restored anymore (it is considered a terminal error; the only way to recover is to delete and recreate the machine). -Also, if the machine is under control of a MachineHealthCheck instance, the machine will be automatically remediated. 
- -Example: -```yaml -kind: MyMachine -apiVersion: infrastructure.cluster.x-k8s.io/v1alpha3 -spec: - providerID: cloud:////my-cloud-provider-id -status: - ready: true -``` - -### Secrets - -The Machine controller will create a secret or use an existing secret in the following format: - -| secret name | field name | content | -|:---:|:---:|---| -|`-kubeconfig`|`value`|base64 encoded kubeconfig that is authenticated with the child cluster| diff --git a/docs/book/src/developer/providers/contracts/infra-cluster.md b/docs/book/src/developer/providers/contracts/infra-cluster.md index 0908e1e33e87..51242cd5d88f 100644 --- a/docs/book/src/developer/providers/contracts/infra-cluster.md +++ b/docs/book/src/developer/providers/contracts/infra-cluster.md @@ -119,7 +119,7 @@ rules: - watch ``` -Note: The write permissions allow the Cluster controller to set owner references and labels on the InfraCluster” resources; +Note: The write permissions allow the Cluster controller to set owner references and labels on the InfraCluster resources; write permissions are not used for general mutations of InfraCluster resources, unless specifically required (e.g. when using ClusterClass and managed topologies). @@ -271,7 +271,7 @@ Each InfraCluster MUST report when Cluster's infrastructure is fully provisioned ```go type FooClusterStatus struct { - // Ready denotes that the foo cluster infrastructure fully provisioned. + // Ready denotes that the foo cluster infrastructure is fully provisioned. // +optional Ready bool `json:"ready"` @@ -282,7 +282,7 @@ type FooClusterStatus struct { Once `status.ready` the Cluster "core" controller will bubbles up this info in Cluster's `status.infrastructureReady`; If defined, also InfraCluster's `spec.controlPlaneEndpoint` and `status.failureDomains` will be surfaced on Cluster's -corresponding field at the same time. +corresponding fields at the same time.