Consistent reporting of paused resources #10130
Comments
Just for my understanding:
Does `kubectl wait` without further configuration already look for the Paused condition, or would we pass the condition name into the `kubectl wait` command? |
I'd expect the end result would be that users could do something like `kubectl wait --for=condition=Paused` against the resource. There are plenty of examples throughout the kube docs of waiting on conditions this way. |
I discussed this issue with Joel a bit more before opening it. While trying to understand how to detect that the controller has actually acknowledged that a resource is paused, it seemed almost impossible to do without a new field or condition (the latter being easier to add). Another problem this solves is pausing/unpausing against stale caches. Generally +1 on having some sort of contract for the paused field, which can give users a signal to proceed with other operations. |
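To illustrate the kind of signal being asked for here, a minimal sketch of how a tool built on top of CAPI could wait for the proposed Paused condition before proceeding. The `Paused` condition type is the one proposed in this issue, and the `getConditions` callback is a stand-in for however the caller reads the object's `status.conditions`:

```go
package capiwait

import (
	"context"
	"time"

	"k8s.io/apimachinery/pkg/api/meta"
	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
	"k8s.io/apimachinery/pkg/util/wait"
)

// WaitForPaused polls until the (proposed) "Paused" condition reports True on
// the target object. getConditions stands in for however the caller fetches
// status.conditions, e.g. via a controller-runtime client Get.
func WaitForPaused(ctx context.Context, getConditions func(ctx context.Context) ([]metav1.Condition, error)) error {
	return wait.PollUntilContextTimeout(ctx, 2*time.Second, 5*time.Minute, true,
		func(ctx context.Context) (bool, error) {
			conds, err := getConditions(ctx)
			if err != nil {
				return false, err
			}
			c := meta.FindStatusCondition(conds, "Paused")
			return c != nil && c.Status == metav1.ConditionTrue, nil
		})
}
```

With such a condition in place, the same wait could also be expressed declaratively via `kubectl wait --for=condition=Paused`.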
/triage accepted |
Sounds reasonable in general. Thinking a bit about it, something like waiting for `.status.observedGeneration` to catch up to `.metadata.generation` wouldn't work at all, because basically all objects in CAPI can be paused without changes to these objects, simply by pausing the Cluster object. |
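To make the `.status.observedGeneration` point concrete, a small hypothetical helper showing the check that is not sufficient: it only proves the controller has seen the latest change to this object, and pausing via the Cluster's `spec.paused` never changes this object's generation, so the check can never confirm that the pause was noticed:

```go
package pausecheck

import metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"

// hasObservedLatestGeneration reports whether the controller's status reflects
// the latest spec of this particular object. It says nothing about a pause set
// on the owning Cluster, because pausing the Cluster does not change this
// object's metadata.generation at all.
func hasObservedLatestGeneration(obj metav1.Object, observedGeneration int64) bool {
	return observedGeneration >= obj.GetGeneration()
}
```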
Two comments from my side:
|
My point was that we need something additional to what we have today, because with the current implementation it is not enough to just check `.status.observedGeneration`. |
Hey all 👋 Happy to come along to the call later on to discuss this. I'm fairly new to CAPI, so all input appreciated! :) I'm not entirely following the above - I think that the way we currently check if a resource is paused is to check whether the cluster is paused by looking at the Cluster's `spec.paused` field (or the paused annotation on the resource itself). |
That is how our controllers are checking if the current resource is paused. What I meant was the way that a user of CAPI (or another tool building on top of CAPI) would check if a CAPI controller already "noticed" that a resource is paused (and thus stopped reconciling it further). Concrete example:
|
Ah right, I think I see! This then justifies the need for a condition, correct? |
I think it justifies the addition of something :). But a condition also seems like a natural choice to me |
I've had a little bit of a think about the points raised by @fabriziopandini along with @damdo. (I'm still trying to understand CAPI, so bear with me :) )

For the example Stefan outlined above, I think the trade-off is: how quickly (or slowly) are we willing to have the system react (and therefore update the condition on a child resource) to a cluster being paused? If we want a more reactive system, then we need to trade off against more requests to the API server in normal (unpaused) operation. Currently, I think it's probably best to accept a potentially slower-to-react system to avoid this (i.e. accept there may be some delay, and that the system is eventually consistent).

Currently, we do watch on clusters - so if there is a change to the cluster object we will re-reconcile. However, we exclude the event if the cluster is paused. AFAIK this means we could pause a cluster and then not reconcile any of the child resources (e.g. an MD). Once the MD controller does reconcile, it could be using a local cache - and therefore not realise it has been paused. This means that the messaging around the use of the conditions would need to clearly state that any potential consumer of the condition may need to wait for some time (up to the re-sync) for the controllers to stop reconciling. The condition would only be present once this happened, however...

We could change the current code by updating the watch to not exclude paused clusters, and then also use a direct client for the get when checking if the cluster is paused. This would ensure that once the cluster is paused, the child controllers would promptly re-reconcile and observe the pause. However, I'm not convinced this is the right choice - it would mean that in normal operation we would see more traffic to the API server (as the get of the cluster object would no longer be cached). This feels a bit like optimising for the wrong thing (we expect the majority of clusters not to be paused for the majority of the time). As I'm quite new to CAPI, I'm not sure if there are other trade-offs here I'm missing. Does this sound reasonable? :)

I'm also not entirely sure that I understand how the contract with providers is impacted here? (Sorry, again I'm new! 😅) If we want to make sure everything reflects the paused condition, then I think that I get how it could be a potential change to the contract. I'm not entirely sure this is what Joel was suggesting though...

So, to summarise:
|
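For reference, a rough sketch of the kind of watch filtering discussed above - a controller-runtime predicate that drops events for paused Clusters so child controllers are not triggered by them. This is illustrative only, not the actual predicate CAPI ships:

```go
package pausefilter

import (
	clusterv1 "sigs.k8s.io/cluster-api/api/v1beta1"
	"sigs.k8s.io/controller-runtime/pkg/client"
	"sigs.k8s.io/controller-runtime/pkg/predicate"
)

// clusterNotPaused filters out events for Clusters with spec.paused set.
// Keeping or dropping this filter is exactly the trade-off discussed above:
// with it, a newly paused Cluster does not trigger child reconciles; without
// it, every pause event fans out to the child controllers.
func clusterNotPaused() predicate.Funcs {
	return predicate.NewPredicateFuncs(func(obj client.Object) bool {
		cluster, ok := obj.(*clusterv1.Cluster)
		if !ok {
			// Not a Cluster: let the event through.
			return true
		}
		return !cluster.Spec.Paused
	})
}
```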
I would say > 99% of our API calls are hitting the local cache (we did some performance optimizations in CAPI v1.5.0).
I didn't go through all controllers, but if the exclude is done via the predicate on the Cluster watch, dropping that predicate should be enough.
I think no:
So once we drop the exclude, the event that sets the pause on the Cluster will still trigger a reconcile of the affected resources, and that reconcile will see the pause.
See above about order of CR updating the local cache and then broadcasting the event (basically we don't have to use a direct client).
Nice analysis so far, with the one miss about ordering in controller-runtime :) (assuming I'm correct - I spent quite some time on it when doing performance optimization, but it would be good to double check).

Re: contract - I think this comes down to: if we expect to have the "condition on each resource's status", and that includes infra / bootstrap / control plane provider resources, then we have to establish a contract that everyone should add the condition in basically the same way as core CAPI. One option here could be:
|
Oh that's cool and good to know! 😄
Ok - I will give this a shot on my WIP PR!
I see - I wasn't aware of how this works, but that's useful as it makes reasoning about this easier :)
Ok, so if I'm understanding you correctly, this won't be instant but it should be relatively fast?
I'll have a dig into the controller-runtime code and see if I come to the same understanding, thanks! :) Re: contract - this makes sense to me :) Having slept on it a bit, I think that it may be confusing for an end user if this is something present only on core CAPI resources and only some provider ones - so eventually making the optional contract mandatory makes sense! |
It will be relatively fast, yup (I guess usually at most a few milliseconds). But more importantly, it guarantees that we don't have to wait for a resync: the change that set the pause will trigger an event, which will trigger a reconcile, which will "see" the pause. If there is currently a reconcile in progress when the event arrives, another reconcile will be triggered afterwards.
I simplified it a bit :). This basically comes down to the fact that step 5 happens before step 6 here: https://github.com/kubernetes/sample-controller/blob/master/docs/images/client-go-controller-interaction.jpeg |
@sbueringer, I've had a stab at updating my WIP PR: #10364 Please PTAL when you have a second! I'll try and update comments etc tomorrow! |
@theobarberbany I think before going back and forth on an implementation we should cover this point:
|
I do think this needs to be discussed in office hours and maybe have a collaborative doc, because I quite like the option Stefan posted above. However, there might be something I'm missing and a different approach could be helpful.
|
From the April 10 2024 community meeting:
- If we'd like to propagate the paused condition across providers, we should author a proposal to make it clear what's expected. This could be a PR to amend the existing contract. @sbueringer or @fabriziopandini, which proposal(s) are relevant to this contract? I looked in the proposals directory, but none of them were obviously covering the contract to me.
- The proposal would need to provide an example or tool of what providers need to do in order to conform. This might be a pseudo-code algorithm for observing the pause at the top of the hierarchy and propagating it down.
- This might also apply to bootstrap providers; we should survey the resources affected and document them in the proposal PR(s).
- For core CAPI, we can implement it now. For providers, we can make this optional and eventually make it required.
- No objections to the idea in general, seems like folks are in agreement with the proposed approach. |
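A hedged sketch of the kind of conformance rule such a proposal could spell out: an object is treated as paused if its owning Cluster is paused or if the object itself carries the paused annotation. The `isPaused` helper below is illustrative only; core CAPI already has its own `util/annotations` helpers for this:

```go
package pausedstate

import (
	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"

	clusterv1 "sigs.k8s.io/cluster-api/api/v1beta1"
)

// isPaused mirrors the rule discussed in the meeting notes: an object is
// considered paused if its owning Cluster is paused or if the object itself
// carries the paused annotation. This is a sketch of the rule a provider
// would implement, not a drop-in replacement for CAPI's util/annotations.
func isPaused(cluster *clusterv1.Cluster, obj metav1.Object) bool {
	if cluster != nil && cluster.Spec.Paused {
		return true
	}
	_, hasAnnotation := obj.GetAnnotations()[clusterv1.PausedAnnotation]
	return hasAnnotation
}
```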
Just want to note that a related issue and PR might be relevant here. |
We don't have a proposal for the contract. What I meant is a PR to update the contract instead of an additional proposal. The contract is currently only documented in our book:
(I hope I didn't forget any)
I think what it comes down to is that every controller should have something like this (check the current object & the Cluster object for pause):

```go
// Return early if the object or Cluster is paused.
if annotations.IsPaused(cluster, deployment) {
	log.Info("Reconciliation is paused for this object")
	return ctrl.Result{}, nil
}
```

And when we hit this case we should set the Paused condition to true, otherwise false (I think we have to be careful that we set the Paused condition to false before we start making changes again after unpause). So I think what we can do is describe the expectation and then provide an example in some way.

As far as I'm aware we only check current object + cluster today in core CAPI. We don't propagate it downwards (e.g. MD => MS => Machine => InfraMachine / BootstrapConfig). Basically each controller can be paused by either pausing the cluster or the reconciled object.
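A minimal sketch of what "set the Paused condition" could look like using plain metav1 conditions; the `Paused` type name and the `setPausedCondition` helper are illustrations of the proposal, not existing CAPI code:

```go
package pausedcondition

import (
	"k8s.io/apimachinery/pkg/api/meta"
	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
)

// setPausedCondition records whether the controller currently considers the
// object paused (either via the Cluster's spec.paused or the object's own
// paused annotation). A controller would call it right where the
// annotations.IsPaused check above happens, and then patch status.
func setPausedCondition(conds *[]metav1.Condition, paused bool, observedGeneration int64) {
	cond := metav1.Condition{
		Type:               "Paused",
		Status:             metav1.ConditionFalse,
		Reason:             "NotPaused",
		ObservedGeneration: observedGeneration,
	}
	if paused {
		cond.Status = metav1.ConditionTrue
		cond.Reason = "Paused"
	}
	// SetStatusCondition handles LastTransitionTime and de-duplication.
	meta.SetStatusCondition(conds, cond)
}
```

As noted above, after unpause the status update flipping the condition back to False would need to be written before the controller starts making other changes.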
Thx for bringing this up. The issue (based on its title) would have had some impact, but I looked over the corresponding PR and it seems very orthogonal to me (it seems to come down to an annotation that blocks clusterctl move, but pause is not affected as far as I can tell). @fabriziopandini, I think you reviewed that PR - can you maybe double check that I'm not missing something? |
This is correct, "clusterctl.cluster.x-k8s.io/block-move" is orthogonal to pause |
Yeah, I realize this isn't how it works today. My understanding of the request is that the condition would be propagated, though. Off the top of my head, if the controllers view that the Cluster is paused, then each of their own resources should reflect that too. To spell it out explicitly - I think the intent is to put a condition on Cluster, MD, MS, Machine, and InfraMachine. If I'm misreading it, @JoelSpeed can correct me. |
My understanding was also that we want the condition everywhere. What I meant with "we don't propagate" is the following: a resource is only considered paused if the Cluster is paused or the resource itself is paused; we don't look at its owners. Or phrased differently: a resource being paused does not automatically pause the resources below it in the hierarchy.
For example, a Machine wouldn't be paused just because its MachineSet is paused. |
I see - so all resources check themselves and the Cluster, but not their owners? |
/priority important-longterm |
Yup. And I think it makes sense. Basically this allows you to:
- pause everything belonging to a Cluster at once, by pausing the Cluster object
- pause a single object via its paused annotation, without affecting anything else
|
/assign @theobarberbany |
@sbueringer @fabriziopandini, I hope you're having a good Friday! :) I've had a crack at what I've understood about updating the contract! PTAL and let me know your thoughts :) |
I've also noticed that in these same documents (or anywhere in the book that I could search) we don't specifically state the expectation that controllers will observe the paused annotation or the Cluster's `spec.paused` field. |
I wouldn't be surprised if a lot of these things were just copy&pasted across providers. |
Probably the best documentation we have about this is in https://cluster-api.sigs.k8s.io/clusterctl/provider-contract#move (as far as I remember, paused was introduced to make move more robust, but I might be wrong) |
Just FYI, this has also been included in the proposal for the new v1beta2 conditions: #10897. Based on the first round of positive feedback, this transition is going to happen in the CAPI 1.9 timeframe, so we should make a call on whether to implement this directly or defer it. |
There is an edge case for this. The Paused condition will tell you if the Cluster controller has seen your request to pause the resource, but if you want to be sure that the Topology controller has also seen it, you have to check the TopologyReconciled condition as well. |
Note: the above implements the Paused condition as a v1beta2 condition for most controllers. It also uses a shared util function, which may help providers adopt it. |
What would you like to be added (User Story)?
As a user of Cluster API, I would like to see a condition on each resource's status that tells me if the resource is paused, so that I know that controllers have seen my pause request and acknowledged it before I operate on the resources manually.
Detailed Description
Kubernetes is a distributed and event-driven system, with a tendency for caching. This means that, if I set `spec.paused` on a resource, the controller for it may be processing an older event from the same resource, and it may take some amount of time before it reconciles the event relating to my change to pause the resource.

Before I make my changes, I want to be sure that the controller has seen my request to pause the resource, so that I am certain that no further changes will be made. To do this, I think we should make all controllers add a condition that shows whether or not the controller believes it is paused.
In the general running case, I would expect a Paused condition with status False. However, when paused, I want to see the Paused condition with status True.
This will allow integrations with `kubectl wait` to wait for the condition before continuing.

Anything else you would like to add?
Alternatively, you could set this as a `pausedGeneration`, but I feel a condition is more expressive for this use case.

It would be good to define this as part of the contracts so that every CAPI controller eventually sets this in a consistent way.
Label(s) to be applied
/kind feature
One or more /area label. See https://github.com/kubernetes-sigs/cluster-api/labels?q=area for the list of labels.