CAPVCD Control plane anti affinity rules #2941

Closed
Tracked by #2444
vxav opened this issue Nov 3, 2023 · 2 comments

Comments

@vxav

vxav commented Nov 3, 2023

The control plane nodes should be spread across different vSphere hosts.

Filed upstream issue: vmware/cluster-api-provider-cloud-director#539

@vxav
Author

vxav commented May 16, 2024

In a vSphere cluster, when an ESXi host fails, vSphere HA will restart the VMs running on it after a timeout.

To simulate a host failure that impacts 2 CP nodes, we can hard reboot 2 CP VMs at the same time and observe the behaviour.

  • After hard rebooting 2 CP nodes at the same time, the API becomes unavailable and comes back when the VMs restart.
  • I also tried with all 3 control plane nodes and everything came back ok.
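The result with 2 nodes down matches what a majority quorum would predict. As a rough sketch, assuming each control plane node runs a stacked etcd member (not stated in this issue) and using made-up VM and host names, the check below shows why an anti-affinity rule matters: with 3 CP nodes, the API only stays up through a host failure if no single host carries more than one of them.

```python
from collections import Counter


def quorum_size(members: int) -> int:
    """Majority needed by a Raft-style cluster (e.g. etcd) with `members` voters."""
    return members // 2 + 1


def survives_any_single_host_failure(placement: dict[str, str]) -> bool:
    """Return True if losing any one host still leaves a quorum of CP nodes.

    `placement` maps a control plane VM name to the host it runs on.
    Both kinds of names are hypothetical and only used for illustration.
    """
    if not placement:
        return False
    per_host = Counter(placement.values())
    # Worst case: the host carrying the most control plane VMs fails.
    worst_loss = max(per_host.values())
    return len(placement) - worst_loss >= quorum_size(len(placement))


# Anti-affinity respected: one CP VM per host.
spread = {"cp-0": "esxi-a", "cp-1": "esxi-b", "cp-2": "esxi-c"}
# No anti-affinity: two CP VMs happen to land on the same host.
colocated = {"cp-0": "esxi-a", "cp-1": "esxi-a", "cp-2": "esxi-b"}

print(survives_any_single_host_failure(spread))
print(survives_any_single_host_failure(colocated))
```

This only models availability while the host is down. As observed above, the API still recovers once vSphere HA restarts the VMs.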

tl;dr: Having an anti-affinity rule would be nice as it would reduce downtime of the API in case of hardware/network failure, but we don't need to prioritise it or take on the implementation of the feature.

@gawertm I propose that we move this issue to waiting in case a PE ever wants to take this on.

@vxav
Author

vxav commented Jul 9, 2024

Given the recent news about the upstream project, I will close this issue as we won't implement this ourselves anyway.

@vxav vxav closed this as completed Jul 9, 2024