Merge pull request #848 from Nordix/Humain-readable-docs-e2e/mohammed
🌱 Add high level description to e2e tests
metal3-io-bot authored Jul 25, 2023
2 parents c874115 + 974a5b8 commit 976b904
Showing 12 changed files with 175 additions and 0 deletions.
1 change: 1 addition & 0 deletions test/e2e/inspection.go
@@ -24,6 +24,7 @@ type InspectionInput struct {
SpecName string
}

// Inspection test requests inspection on all the available BMHs using annotations.
func inspection(ctx context.Context, inputGetter func() InspectionInput) {
Logf("Starting inspection tests")
input := inputGetter()
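The new comment describes requesting inspection through an annotation. As an illustrative sketch only (not code from this commit), the request could look roughly like this with controller-runtime; the helper name and the assumption that an empty "inspect.metal3.io" annotation triggers inspection are mine:

package e2e

import (
    "context"

    bmov1alpha1 "github.com/metal3-io/baremetal-operator/apis/metal3.io/v1alpha1"
    "sigs.k8s.io/controller-runtime/pkg/client"
)

// requestInspection is a hypothetical helper: it fetches a BareMetalHost and adds the
// inspect annotation (assumed key "inspect.metal3.io"), asking BMO/Ironic to (re-)inspect it.
func requestInspection(ctx context.Context, c client.Client, name, namespace string) error {
    bmh := &bmov1alpha1.BareMetalHost{}
    if err := c.Get(ctx, client.ObjectKey{Namespace: namespace, Name: name}, bmh); err != nil {
        return err
    }
    patch := client.MergeFrom(bmh.DeepCopy())
    if bmh.Annotations == nil {
        bmh.Annotations = map[string]string{}
    }
    bmh.Annotations["inspect.metal3.io"] = "" // empty value requests inspection (assumed)
    return c.Patch(ctx, bmh, patch)
}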
11 changes: 11 additions & 0 deletions test/e2e/integration_test.go
@@ -7,6 +7,17 @@ import (
"sigs.k8s.io/cluster-api/test/framework"
)

// Integration tests in CAPM3 focus on validating the seamless integration between different components of the CAPM3 project,
// including CAPM3, IPAM, CAPI, BMO, and Ironic. These tests ensure that these components work together cohesively to provision a
// workload cluster and perform pivoting operations between the bootstrap cluster and the target cluster.
// The primary goal is to detect any compatibility issues or conflicts that may arise during integration.

// By executing the integration tests, CAPM3 verifies that:

// - The CAPM3 controller effectively interacts with the IPAM, CAPI, BMO, and Ironic components.
// - The provisioning of a workload cluster proceeds smoothly, and BMHs are created, inspected and provisioned as expected.
// - The pivoting functionality enables the seamless movement of resources and control components from the bootstrap cluster to the target cluster and vice versa.
// - Deprovisioning the cluster and BMHs happens smoothly.
var _ = Describe("When testing integration [integration]", func() {

It("CI Test Provision", func() {
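To make the "BMHs are created, inspected and provisioned as expected" check concrete, here is a hedged sketch (not from this commit) of polling all BareMetalHosts in a namespace until they report an expected provisioning state; the BMO state constants and status fields are assumptions based on the v1alpha1 API:

package e2e

import (
    "context"
    "fmt"
    "time"

    bmov1alpha1 "github.com/metal3-io/baremetal-operator/apis/metal3.io/v1alpha1"
    "sigs.k8s.io/controller-runtime/pkg/client"
)

// waitForAllBMHsInState is a hypothetical helper that polls until every BMH in the
// namespace reports the wanted provisioning state (e.g. bmov1alpha1.StateProvisioned).
func waitForAllBMHsInState(ctx context.Context, c client.Client, namespace string,
    want bmov1alpha1.ProvisioningState, timeout, interval time.Duration) error {
    deadline := time.Now().Add(timeout)
    for {
        bmhList := &bmov1alpha1.BareMetalHostList{}
        if err := c.List(ctx, bmhList, client.InNamespace(namespace)); err != nil {
            return err
        }
        allMatch := len(bmhList.Items) > 0
        for _, bmh := range bmhList.Items {
            if bmh.Status.Provisioning.State != want {
                allMatch = false
                break
            }
        }
        if allMatch {
            return nil
        }
        if time.Now().After(deadline) {
            return fmt.Errorf("timed out waiting for BMHs in %s to reach state %q", namespace, want)
        }
        time.Sleep(interval)
    }
}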
18 changes: 18 additions & 0 deletions test/e2e/live_iso_test.go
@@ -14,10 +14,28 @@ import (
"sigs.k8s.io/controller-runtime/pkg/client"
)

/*
* The purpose of the live-iso feature in Metal3 is to allow booting a BareMetalHost with a live ISO image instead of deploying an image to the local disk using the IPA deploy ramdisk. This feature is useful in scenarios where reducing boot time for ephemeral workloads is desired, or when integrating with third-party installers distributed as a CD image.
*
* This test demonstrates the usage of the live-iso feature. It performs the following steps:
*
* The live ISO image URL is retrieved from the test configuration.
* The list of bare metal hosts (BMHs) in the namespace is displayed.
* It waits for all BMHs to be in the "Available" state.
* It retrieves all BMHs and selects the first available BMH that supports the "redfish-virtualmedia" mechanism for provisioning the live image.
* The selected BMH is updated with the live ISO image URL and marked as online.
* It waits for the BMH to transition to the "Provisioned" state, indicating successful booting from the live ISO image.
* The list of BMHs in the namespace is displayed.
* Serial logs are read to verify that the node was booted from the live ISO image.
* The test is considered passed.
*/

var _ = Describe("When testing live iso [live-iso] [features]", func() {
liveIsoTest()
})

// liveIsoTest provisions a live-iso image on a host.
// It lists all the BMHs and selects one that supports redfish-virtualmedia for provisioning the live image.
func liveIsoTest() {
BeforeEach(func() {
validateGlobals(specName)
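A hedged sketch of the provisioning step the comment describes: pointing a BMH at a live ISO and marking it online. The helper name and exact field usage are assumptions based on the BMO v1alpha1 Image type, not the test's actual code:

package e2e

import (
    "context"

    bmov1alpha1 "github.com/metal3-io/baremetal-operator/apis/metal3.io/v1alpha1"
    "sigs.k8s.io/controller-runtime/pkg/client"
)

// provisionLiveISO is a hypothetical helper: it sets the BMH image to a live ISO
// (diskFormat "live-iso", no checksum needed) and marks the host online, after which
// the test waits for the host to reach the "provisioned" state.
func provisionLiveISO(ctx context.Context, c client.Client, bmh *bmov1alpha1.BareMetalHost, isoURL string) error {
    patch := client.MergeFrom(bmh.DeepCopy())
    liveISOFormat := "live-iso"
    bmh.Spec.Image = &bmov1alpha1.Image{
        URL:        isoURL,
        DiskFormat: &liveISOFormat,
    }
    bmh.Spec.Online = true
    return c.Patch(ctx, bmh, patch)
}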
28 changes: 28 additions & 0 deletions test/e2e/metal3remediation.go
@@ -27,6 +27,34 @@ type Metal3RemediationInput struct {
Namespace string
}

/*
* Metal3 Remediation Test
*
* This test evaluates the reboot remediation strategy, with the node deletion enhancement added to the CAPM3 (Cluster API Provider for Metal3) Remediation Controller:
* issue #392: Reboot remediation is incomplete
* PR #668: Fix reboot remediation by adding node deletion
*
* Tested Feature:
* - Delete Node in Reboot Remediation
*
* Workflow:
* 1. Retrieve the Metal3Machines associated with the worker nodes in the cluster.
* 2. Identify the target worker Machine, its node, and the associated BMH object corresponding to the Metal3Machine.
* 3. Create a Metal3Remediation resource, specifying the remediation strategy as "Reboot" with a retry limit and timeout.
* 4. Wait for the VM (Virtual Machine) associated with the target BMH to power off.
* 5. Wait for the target worker node to be deleted from the cluster.
* 6. Wait for the VM to power on.
* 7. Verify that the target worker node becomes ready.
* 8. Verify that the Metal3Remediation resource is successfully deleted.
*
* The Metal3Remediation test ensures that the Metal3 Remediation Controller can effectively remediate worker nodes by
* orchestrating the reboot process and validating the successful recovery of the nodes. It helps ensure the stability
* and resiliency of the cluster by allowing workloads to be seamlessly migrated from unhealthy nodes to healthy nodes.
*
* TODO: Add full metal3remediation test (issue #1060: Add Healthcheck Test to E2E for CAPM3).
*/

func metal3remediation(ctx context.Context, inputGetter func() Metal3RemediationInput) {
Logf("Starting metal3 remediation tests")
input := inputGetter()
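Step 3 of the workflow creates a Metal3Remediation resource. A rough sketch follows, using an unstructured object so the exact Go types are not pinned down; the group/version and the spec field paths (strategy.type, retryLimit, timeout) are assumptions about the CAPM3 v1beta1 API:

package e2e

import (
    "context"

    "k8s.io/apimachinery/pkg/apis/meta/v1/unstructured"
    "k8s.io/apimachinery/pkg/runtime/schema"
    "sigs.k8s.io/controller-runtime/pkg/client"
)

// createRebootRemediation is a hypothetical helper creating a Metal3Remediation CR
// with the "Reboot" strategy for the given (unhealthy) Machine.
func createRebootRemediation(ctx context.Context, c client.Client, machineName, namespace string) error {
    remediation := &unstructured.Unstructured{}
    remediation.SetGroupVersionKind(schema.GroupVersionKind{
        Group:   "infrastructure.cluster.x-k8s.io",
        Version: "v1beta1",
        Kind:    "Metal3Remediation",
    })
    remediation.SetName(machineName) // conventionally named after the Machine it remediates
    remediation.SetNamespace(namespace)
    spec := map[string]interface{}{
        "strategy": map[string]interface{}{
            "type":       "Reboot",
            "retryLimit": int64(1),
            "timeout":    "300s",
        },
    }
    if err := unstructured.SetNestedMap(remediation.Object, spec, "spec"); err != nil {
        return err
    }
    return c.Create(ctx, remediation)
}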
1 change: 1 addition & 0 deletions test/e2e/node_reuse.go
@@ -32,6 +32,7 @@ type NodeReuseInput struct {
Namespace string
}

// nodeReuse verifies the feature of reusing the same node after upgrading KCP (KubeadmControlPlane) and MD (MachineDeployment) nodes.
func nodeReuse(ctx context.Context, inputGetter func() NodeReuseInput) {
Logf("Starting node reuse tests [node_reuse]")
input := inputGetter()
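The central knob for this test is enabling node reuse on the Metal3MachineTemplate referenced by the KCP or MachineDeployment. A hedged sketch using an unstructured patch; the "spec.nodeReuse" field path is an assumption about the CAPM3 API:

package e2e

import (
    "context"

    "k8s.io/apimachinery/pkg/apis/meta/v1/unstructured"
    "k8s.io/apimachinery/pkg/runtime/schema"
    "sigs.k8s.io/controller-runtime/pkg/client"
)

// enableNodeReuse is a hypothetical helper: it sets the assumed spec.nodeReuse flag
// on an existing Metal3MachineTemplate so that deprovisioned BMHs are reused on upgrade.
func enableNodeReuse(ctx context.Context, c client.Client, name, namespace string) error {
    m3mt := &unstructured.Unstructured{}
    m3mt.SetGroupVersionKind(schema.GroupVersionKind{
        Group:   "infrastructure.cluster.x-k8s.io",
        Version: "v1beta1",
        Kind:    "Metal3MachineTemplate",
    })
    if err := c.Get(ctx, client.ObjectKey{Namespace: namespace, Name: name}, m3mt); err != nil {
        return err
    }
    patch := client.MergeFrom(m3mt.DeepCopy())
    if err := unstructured.SetNestedField(m3mt.Object, true, "spec", "nodeReuse"); err != nil {
        return err
    }
    return c.Patch(ctx, m3mt, patch)
}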
1 change: 1 addition & 0 deletions test/e2e/pivoting.go
@@ -50,6 +50,7 @@ type PivotingInput struct {
ClusterctlConfigPath string
}

// pivoting implements a test that verifies successful moving of management resources (CRs, BMO, Ironic) to a target cluster after initializing it with provider components.
func pivoting(ctx context.Context, inputGetter func() PivotingInput) {
Logf("Starting pivoting tests")
input := inputGetter()
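The heart of the pivot is the clusterctl move between clusters. A sketch using the CAPI e2e framework call; the MoveInput field names shown are assumptions based on recent framework versions, and all paths are placeholders:

package e2e

import (
    "context"

    "sigs.k8s.io/cluster-api/test/framework/clusterctl"
)

// movePivot is a hypothetical wrapper: it moves Cluster API and Metal3 resources from
// the bootstrap (ephemeral) cluster to the freshly initialized target cluster.
func movePivot(ctx context.Context, clusterctlConfigPath, fromKubeconfig, toKubeconfig, namespace, logFolder string) {
    clusterctl.Move(ctx, clusterctl.MoveInput{
        LogFolder:            logFolder,
        ClusterctlConfigPath: clusterctlConfigPath,
        FromKubeconfigPath:   fromKubeconfig,
        ToKubeconfigPath:     toKubeconfig,
        Namespace:            namespace,
    })
}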
51 changes: 51 additions & 0 deletions test/e2e/pivoting_based_feature_test.go
@@ -26,6 +26,57 @@ var (
workerMachineCount int64
)

/*
* Pivoting-based feature tests
* These tests evaluate CAPM3 features in a pivoted cluster, i.e. after migrating resources and control components from
* the bootstrap cluster to the target cluster.
*
* Tested Features:
* - Create a workload cluster
* - Pivot to self-hosted
* - Test Certificate Rotation
* - Test Node Reuse
*
* Pivot to self-hosted:
* - The Ironic containers are removed from the source cluster, and a new Ironic namespace is created in the target cluster.
* - The provider components are initialized in the target cluster using `clusterctl.Init`.
* - Ironic is installed in the target cluster, followed by the installation of BMO.
* - The stability of the API servers is checked before proceeding with the move to self-hosted.
* - The cluster is moved to self-hosted using `clusterctl.Move`.
* - After the move, various checks are performed to ensure that the cluster resources are in the expected state.
* - If all checks pass, the test is considered successful.
*
* Certificate Rotation:
* This test ensures that certificate rotation in the Ironic pod functions correctly
* by forcing certificate regeneration and verifying container restarts.
* - It starts by checking if the Ironic pod is running. It retrieves the Ironic deployment and waits for the pod to be in the "Running" state.
* - The test forces cert-manager to regenerate the certificates by deleting the relevant secrets.
* - It then waits for the containers in the Ironic pod to be restarted. It checks if each container exists and compares the restart count with the previously recorded values.
* - If all containers are restarted successfully, the test passes.
*
* Node Reuse:
* This test verifies the feature of reusing the same node after upgrading Kubernetes version in KubeadmControlPlane (KCP) and MachineDeployment (MD) nodes.
* - The test starts with a cluster containing 3 KCP (Kubernetes control plane) nodes and 1 MD (MachineDeployment) node.
* - The control plane nodes are untainted to allow scheduling new pods on them.
* - The MachineDeployment is scaled down to 0 replicas, ensuring that all worker nodes will be deprovisioned. This provides 1 BMH (BareMetalHost) available for reuse during the upgrade.
* - The code waits for one BareMetalHost (BMH) to become available, indicating that one worker node is deprovisioned and available for reuse.
* - The names and UUIDs of the provisioned BMHs before the upgrade are obtained.
* - An image is downloaded, and the nodeReuse field is set to True for the existing KCP Metal3MachineTemplate to reuse the node.
* - A new Metal3MachineTemplate with the upgraded image is created for the KCP.
* - The KCP is updated to upgrade the Kubernetes version and binaries. The rolling update strategy is set to update one machine at a time (MaxSurge: 0).
* - The code waits for one machine to enter the deleting state and ensures that no new machines are in the provisioning state.
* - The code waits for the deprovisioning BMH to become available again.
* - It checks if the deprovisioned BMH is reused for the next provisioning.
* - The code waits for the second machine to become running and updated with the new Kubernetes version.
* - The upgraded control plane nodes are untainted to allow scheduling worker pods on them.
* - It checks that all control plane nodes become running and are updated with the new Kubernetes version.
* - The names and UUIDs of the provisioned BMHs after the upgrade are obtained.
* - The difference between the mappings before and after the upgrade is checked to ensure that the same BMHs were reused.
* - Similar steps are performed to test machine deployment node reuse.
*
* Finally, the cluster is re-pivoted and cleaned up.
*/

var _ = Describe("Testing features in ephemeral or target cluster [pivoting] [features]", func() {

BeforeEach(func() {
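For the certificate-rotation part described above, the "force regeneration" step can be sketched with core Kubernetes APIs only; the secret names and namespace passed in are placeholders rather than what the test actually deletes:

package e2e

import (
    "context"

    corev1 "k8s.io/api/core/v1"
    apierrors "k8s.io/apimachinery/pkg/api/errors"
    "sigs.k8s.io/controller-runtime/pkg/client"
)

// forceCertRegeneration is a hypothetical helper: deleting the TLS secrets makes
// cert-manager reissue them, which in turn should restart the Ironic containers that
// mount those certificates. The test then compares container restart counts.
func forceCertRegeneration(ctx context.Context, c client.Client, namespace string, secretNames []string) error {
    for _, name := range secretNames {
        secret := &corev1.Secret{}
        secret.Name = name
        secret.Namespace = namespace
        if err := c.Delete(ctx, secret); err != nil && !apierrors.IsNotFound(err) {
            return err
        }
    }
    return nil
}

// containerRestartCounts records per-container restart counts for a pod, so the test
// can later assert that each count increased after the secrets were regenerated.
func containerRestartCounts(pod *corev1.Pod) map[string]int32 {
    counts := map[string]int32{}
    for _, cs := range pod.Status.ContainerStatuses {
        counts[cs.Name] = cs.RestartCount
    }
    return counts
}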
34 changes: 34 additions & 0 deletions test/e2e/remediation.go
@@ -40,6 +40,40 @@ type RemediationInput struct {
ClusterctlConfigPath string
}

/*
* Remediation Test
* The remediation test focuses on verifying various annotations and actions related to remediation in CAPM3.
* It ensures that the cluster can recover from different failure scenarios and perform necessary actions for remediation.
*
* Test Steps:
* - 1. Reboot Annotation: This step marks a worker BareMetalHost (BMH) for reboot and waits for the associated Virtual Machine (VM) to transition to the "shutoff" state and then to the "running" state.
* - 2. Poweroff Annotation: The test verifies the power off and power on actions by turning off and on the specified machines.
* - 3. Inspection Annotation: The test runs an inspection test alongside the remediation steps. The inspection test verifies the inspection annotation functionality.
* - 4. Unhealthy Annotation: This step tests the unhealthy annotation by marking a BMH as unhealthy and ensuring it is not picked up for provisioning.
* - 5. Metal3 Data Template: The test creates a new Metal3DataTemplate (M3DT), then creates a new Metal3MachineTemplate (M3MT), and updates the MachineDeployment (MD) to point to the new M3MT. It then waits for the old worker to deprovision.
*
* The following code snippet represents the workflow of the remediation test:
*
* // Function: remediation
*
* func remediation(ctx context.Context, inputGetter func() RemediationInput) {
* Logf("Starting remediation tests")
* input := inputGetter()
* // Step 1: Reboot Annotation
* // ...
* // Step 2: Poweroff Annotation
* // ...
* // Step 3: Inspection Annotation
* // ...
* // Step 4: Unhealthy Annotation
* // ...
* // Step 5: Metal3 Data Template
* // ...
* Logf("REMEDIATION TESTS PASSED!")
* }
*
* The remediation test ensures that CAPM3 can effectively remediate worker nodes by performing the necessary actions and applying the relevant annotations. It helps ensure the stability and resiliency of the cluster by allowing the cluster to recover from failure scenarios and successfully restore nodes to the desired state.
*/
func remediation(ctx context.Context, inputGetter func() RemediationInput) {
Logf("Starting remediation tests")
input := inputGetter()
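A sketch of step 1 (the reboot annotation); the assumption that an empty "reboot.metal3.io" annotation requests a power cycle is mine, not taken from this commit:

package e2e

import (
    "context"

    bmov1alpha1 "github.com/metal3-io/baremetal-operator/apis/metal3.io/v1alpha1"
    "sigs.k8s.io/controller-runtime/pkg/client"
)

// markBMHForReboot is a hypothetical helper: adding the reboot annotation asks BMO to
// power-cycle the host, after which the test watches the VM go to "shutoff" and back
// to "running".
func markBMHForReboot(ctx context.Context, c client.Client, bmh *bmov1alpha1.BareMetalHost) error {
    patch := client.MergeFrom(bmh.DeepCopy())
    if bmh.Annotations == nil {
        bmh.Annotations = map[string]string{}
    }
    bmh.Annotations["reboot.metal3.io"] = "" // assumed reboot annotation key
    return c.Patch(ctx, bmh, patch)
}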
27 changes: 27 additions & 0 deletions test/e2e/remediation_based_feature_test.go
@@ -11,6 +11,33 @@ import (
"sigs.k8s.io/controller-runtime/pkg/client"
)

/*
* Remediation-based Tests
* These tests focus on verifying the effectiveness of fixes or remedial actions taken to address node failures.
* These tests involve simulating failure scenarios, triggering the remediation process, and then verifying that the remediation actions successfully restore the nodes to the desired state.
*
* Test Types:
* 1. Metal3Remediation Test: This test specifically evaluates the Metal3 Remediation Controller's node deletion feature in the reboot remediation strategy.
* 2. Remediation Test: This test focuses on verifying various annotations and actions related to remediation in the CAPM3 (Cluster API Provider for Metal3).
*
* Metal3Remediation Test:
* - Retrieve the list of Metal3 machines associated with the worker nodes.
* - Identify the target worker Metal3Machine and its corresponding BareMetalHost (BMH) object.
* - Create a Metal3Remediation resource with a remediation strategy of type "Reboot" and a specified timeout.
* - Wait for the associated virtual machine (VM) to power off.
* - Wait for the node (VM) to be deleted.
* - Wait for the VM to power on.
* - Wait for the node to be in a ready state.
* - Delete the Metal3Remediation resource.
* - Verify that the Metal3Remediation resource has been successfully deleted.
*
* Remediation Test:
* - Reboot Annotation: Mark a worker BMH for reboot and wait for the associated VM to transition to the "shutoff" state and then to the "running" state.
* - Poweroff Annotation: Verify the power off and power on actions by turning off and on the specified machines.
* - Inspection Annotation: Run an inspection test alongside the remediation steps to verify the inspection annotation functionality.
* - Unhealthy Annotation: Mark a BMH as unhealthy and ensure it is not picked up for provisioning.
* - Metal3 Data Template: Create a new Metal3DataTemplate (M3DT), create a new Metal3MachineTemplate (M3MT), and update the MachineDeployment (MD) to point to the new M3MT. Wait for the old worker to deprovision.
*/
var _ = Describe("Testing nodes remediation [remediation] [features]", func() {

var (
1 change: 1 addition & 0 deletions test/e2e/upgrade_baremetal_operator.go
@@ -17,6 +17,7 @@ type upgradeBMOInput struct {
SpecName string
}

// upgradeBMO upgrades the BMO image to the latest version.
func upgradeBMO(ctx context.Context, inputGetter func() upgradeBMOInput) {
Logf("Starting BMO containers upgrade tests")
input := inputGetter()
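The upgrade mechanic can be sketched as bumping the manager container image in the BMO Deployment and letting the rollout proceed; the deployment, namespace, and container names passed in are placeholders:

package e2e

import (
    "context"
    "fmt"

    appsv1 "k8s.io/api/apps/v1"
    "sigs.k8s.io/controller-runtime/pkg/client"
)

// upgradeDeploymentImage is a hypothetical helper: it patches the image of one
// container in a Deployment (e.g. the BMO controller manager) to the target tag.
func upgradeDeploymentImage(ctx context.Context, c client.Client, namespace, deployName, containerName, newImage string) error {
    deploy := &appsv1.Deployment{}
    if err := c.Get(ctx, client.ObjectKey{Namespace: namespace, Name: deployName}, deploy); err != nil {
        return err
    }
    patch := client.MergeFrom(deploy.DeepCopy())
    for i := range deploy.Spec.Template.Spec.Containers {
        if deploy.Spec.Template.Spec.Containers[i].Name == containerName {
            deploy.Spec.Template.Spec.Containers[i].Image = newImage
            return c.Patch(ctx, deploy, patch)
        }
    }
    return fmt.Errorf("container %q not found in deployment %s/%s", containerName, namespace, deployName)
}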
1 change: 1 addition & 0 deletions test/e2e/upgrade_ironic.go
@@ -17,6 +17,7 @@ type upgradeIronicInput struct {
SpecName string
}

// upgradeIronic upgrades the Ironic image to the latest version.
func upgradeIronic(ctx context.Context, inputGetter func() upgradeIronicInput) {
Logf("Starting ironic containers upgrade tests")
input := inputGetter()
1 change: 1 addition & 0 deletions test/e2e/upgrade_kubernetes_test.go
@@ -80,6 +80,7 @@ type upgradeKubernetesInput struct {
Namespace string
}

// upgradeKubernetes implements a test upgrading the cluster nodes from an old k8s version to a newer version.
func upgradeKubernetes(ctx context.Context, inputGetter func() upgradeKubernetesInput) {
Logf("Starting Kubernetes upgrade tests")
input := inputGetter()
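For reference, a sketch of the control-plane half of such an upgrade: bumping spec.version on the KubeadmControlPlane, which triggers a rolling upgrade (the test additionally points the KCP at a new Metal3MachineTemplate). The use of an unstructured patch and the field path are assumptions:

package e2e

import (
    "context"

    "k8s.io/apimachinery/pkg/apis/meta/v1/unstructured"
    "k8s.io/apimachinery/pkg/runtime/schema"
    "sigs.k8s.io/controller-runtime/pkg/client"
)

// bumpKCPVersion is a hypothetical helper: it sets spec.version on the
// KubeadmControlPlane, which triggers a rolling upgrade of the control plane machines.
func bumpKCPVersion(ctx context.Context, c client.Client, name, namespace, newVersion string) error {
    kcp := &unstructured.Unstructured{}
    kcp.SetGroupVersionKind(schema.GroupVersionKind{
        Group:   "controlplane.cluster.x-k8s.io",
        Version: "v1beta1",
        Kind:    "KubeadmControlPlane",
    })
    if err := c.Get(ctx, client.ObjectKey{Namespace: namespace, Name: name}, kcp); err != nil {
        return err
    }
    patch := client.MergeFrom(kcp.DeepCopy())
    if err := unstructured.SetNestedField(kcp.Object, newVersion, "spec", "version"); err != nil {
        return err
    }
    return c.Patch(ctx, kcp, patch)
}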
