Supporting an Inplace Update Rollout Strategy for upgrading Workload Clusters #9489

Open · dharmjit opened this issue Sep 25, 2023 · 8 comments · May be fixed by #11029
Labels: kind/feature · priority/backlog · triage/accepted

@dharmjit commented Sep 25, 2023

User Story

As a Platform Operator managing Kubernetes clusters in resource-constrained (non-HA) and/or specialized, customized environments, I want to upgrade Kubernetes clusters without rolling out new nodes.

Detailed Description

For use cases such as single-node clusters with no spare capacity, or multi-node clusters with VM/OS customizations for high-performance/low-latency workloads or a dependency on local persistent storage, upgrading a workload cluster via the RollingUpdate rollout strategy can be either infeasible or costly, since the customizations must be re-applied on the new nodes, resulting in more downtime.

CAPI uses and promotes immutable infrastructure principles for a range of advantages. With the emergence of image-based OS upgrade techniques, such as A/B partition OS upgrades or OSTree filesystem OS upgrades, which provide immutable OS characteristics, we could rethink CAPI providing another rollout strategy to update Kubernetes and the OS on workload clusters.

At a high level, below could be some of the requirements (a hypothetical API sketch follows the note below):

  • Introduce a new rollout strategy that allows upgrading workload clusters without rolling out new nodes.
  • Support this new rollout strategy for both ClusterClass and non-ClusterClass clusters.
  • Support this new rollout strategy for both the control plane and the worker nodes of a workload cluster.
  • Ensure this new rollout strategy is agnostic of the underlying implementation of image-based OS upgrades (OSTree upgrades, A/B partition upgrades, etc.).

Note: For highly available clusters in resource-constrained environments, CAPI already provides strategies like ScaleIn (KCP) and OnDelete (MD) for upgrades that do not require additional infra capacity.
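
To make the first requirement a bit more concrete, here is a minimal, purely illustrative Go sketch of what such an API surface could look like if an InPlace strategy type were added next to the existing rollout strategies. The names are loosely modeled on the shape of the current MachineDeployment API; the InPlace variant and its semantics are hypothetical, not something CAPI provides today.

```go
// Hypothetical sketch only: the InPlace strategy type below does not exist in
// Cluster API; it is loosely modeled on the shape of the existing
// MachineDeployment rollout strategy API to illustrate the requirement.
package v1beta1

// MachineDeploymentStrategyType describes how Machines are rolled out.
type MachineDeploymentStrategyType string

const (
	// RollingUpdate replaces old Machines by creating new ones (current behavior).
	RollingUpdateMachineDeploymentStrategyType MachineDeploymentStrategyType = "RollingUpdate"

	// OnDelete replaces old Machines only when they are deleted by the user.
	OnDeleteMachineDeploymentStrategyType MachineDeploymentStrategyType = "OnDelete"

	// InPlace (hypothetical) would update the Kubernetes/OS bits on existing
	// Machines without creating new ones, delegating the actual mechanism
	// (OSTree, A/B partitions, ...) to a pluggable, OS-specific updater.
	InPlaceMachineDeploymentStrategyType MachineDeploymentStrategyType = "InPlace"
)

// MachineDeploymentStrategy selects how a rollout is performed.
type MachineDeploymentStrategy struct {
	// Type of rollout; hypothetically RollingUpdate, OnDelete, or InPlace.
	Type MachineDeploymentStrategyType `json:"type,omitempty"`
}
```

Where exactly such a knob would live (MachineDeployment, KCP, ClusterClass topology, or a separate extension mechanism) is one of the questions a proposal or working group would need to settle.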

Anything else you would like to add?

There are already some CAPI Slack discussions and GitHub issues discussing in-place upgrade needs, and folks probably already have ideas or additional use cases around this. It would be great to hear and discuss those in the comments, and it may be beneficial to create a working group around this feature.

There are a few existing GitHub issues around in-place upgrades/mutability in CAPI; tagging folks who were part of those discussions.

cc: @furkatgofurov7 @pacoxu @fabriziopandini @sbueringer @shivi28

Please feel free to add more folks interested in this feature.

/kind feature
/area upgrades

@k8s-ci-robot added the kind/feature and needs-triage labels on Sep 25, 2023
@fabriziopandini (Member)

/triage accepted
I personally think this is a great discussion to have. IMO the project is now at a stage where we have all the required tools and conditions to approach this topic with confidence.

@k8s-ci-robot added the triage/accepted label and removed the needs-triage label on Sep 25, 2023
@g-gaston (Contributor)

/assign
The working group will collaborate on a design for this.

@fabriziopandini (Member)

/priority backlog

@k8s-ci-robot added the priority/backlog label on Apr 11, 2024
@nickperry commented Apr 17, 2024

As an operator of CAPI clusters at scale in regulated physical locations with bandwidth and compute hardware constraints, I would very much welcome this capability.

@ahrtr (Member) commented May 13, 2024

Thanks @fabriziopandini for pointing me to this issue (I was going to raise the same issue).

One of the problems of creating & removing nodes one by one is that you have to sync etcd's data from the leader each time you upgrade or update the cluster. That is definitely unnecessary; it would be great if we could avoid it with in-place rolling upgrades & updates.

@guettli (Contributor) commented Jul 11, 2024

@ahrtr please elaborate on why it is a problem for you that etcd data needs to be synced again. I understand that it is network traffic which could be avoided, but please explain the pain of the current "delete and recreate" approach. etcd now has learner mode, so the new etcd node only becomes a voting member after it has synced.
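
For context on the learner-mode flow mentioned above, a minimal sketch with the etcd clientv3 Go API looks roughly like this; the endpoints and peer URLs are placeholders.

```go
// Sketch of "add as learner, then promote": the new member receives the
// snapshot from the leader before it starts counting towards quorum.
package main

import (
	"context"
	"log"
	"time"

	clientv3 "go.etcd.io/etcd/client/v3"
)

func main() {
	cli, err := clientv3.New(clientv3.Config{
		Endpoints:   []string{"https://10.0.0.1:2379"}, // placeholder endpoint
		DialTimeout: 5 * time.Second,
	})
	if err != nil {
		log.Fatal(err)
	}
	defer cli.Close()

	ctx, cancel := context.WithTimeout(context.Background(), 30*time.Second)
	defer cancel()

	// Add the replacement node as a non-voting learner.
	resp, err := cli.MemberAddAsLearner(ctx, []string{"https://10.0.0.4:2380"}) // placeholder peer URL
	if err != nil {
		log.Fatal(err)
	}

	// ... wait for the learner to catch up with the leader ...

	// Promote the learner to a voting member once it is in sync.
	if _, err := cli.MemberPromote(ctx, resp.Member.ID); err != nil {
		log.Fatal(err)
	}
}
```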

@ahrtr (Member) commented Jul 11, 2024

  • It's a waste of network bandwidth: the leader needs to send a snapshot to each of the followers when you delete & recreate each of them. Obviously it's unnecessary from etcd's perspective.
  • It creates a window of reduced failure tolerance. A 3-member cluster can tolerate one member failure; when you delete & recreate one member, until its data is in sync with the leader and it is promoted to a voting member, the cluster can tolerate 0 member failures (the arithmetic is spelled out below).
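
A quick sketch of the quorum arithmetic behind that window: with n voting members, quorum is n/2 + 1 (integer division), so the cluster tolerates n minus quorum failures.

```go
// Quorum arithmetic for the failure-tolerance window described above.
package main

import "fmt"

// tolerance returns how many voting members can fail while quorum is kept.
func tolerance(votingMembers int) int {
	quorum := votingMembers/2 + 1
	return votingMembers - quorum
}

func main() {
	fmt.Println(tolerance(3)) // healthy 3-member cluster: tolerates 1 failure
	fmt.Println(tolerance(2)) // one member replaced, learner not yet promoted: tolerates 0
}
```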

@guettli (Contributor) commented Jul 11, 2024

  • It's a waste of network bandwidth: the leader needs to send a snapshot to each of the followers when you delete & recreate each of them. Obviously it's unnecessary from etcd's perspective.

Do you have numbers? How much data needs to be synced? (In my current context I have a lot of smaller clusters, so it does not matter.)

  • It creates a window of reduced failure tolerance. A 3-member cluster can tolerate one member failure; when you delete & recreate one member, until its data is in sync with the leader and it is promoted to a voting member, the cluster can tolerate 0 member failures.

Wait a second. I thought Cluster API does a scale-out during an upgrade: if you have 3 CP nodes, a 4th node gets added, then an old node gets deleted. But maybe I am missing something.

@g-gaston linked a pull request on Aug 7, 2024 that will close this issue