🐛 Metal3DataTemplate: requeue if reconcileDelete did not clear finalizer #1997
d52ea7e to bd19a54
/ok-to-test
/test metal3-centos-e2e-integration-test-main metal3-ubuntu-e2e-integration-test-main
Please fix the DCO check by signing off the commit. The unit tests are also failing.
5ada41d to f953fc7
Signed-off-by: Thomas Morin <[email protected]>
f953fc7 to 555fb56
@tuminoid I updated the unit tests, and I also changed the code so that it requeues when the call to dataTemplateMgr.UpdateDatas fails. I updated the PR description accordingly.
/test metal3-centos-e2e-integration-test-main metal3-ubuntu-e2e-integration-test-main
I had an interesting discussion with @pierrecregut on #1994, leading to the conclusion that the cleanest solution would be to fix the miscomputation of indexes. Indeed, requeuing here amounts to sweeping dust under the rug, even if it ultimately achieves the goal of keeping the finalizer from remaining stuck. We'll work with Pierre to push an alternative solution that fixes the miscomputation of indexes in UpdateDatas. Feedback is welcome on whether it's worth keeping my requeue proposal as a kind of safeguard, despite the "sweeping dust under the rug" aspect.
The alternative implementation is here: #2000
I find the solution proposed by @pierrecregut in #2000 much nicer, and if reviewers' feedback on #2000 is positive, I'll close this PR.
Closing this PR; #2000 is the right solution.
What this PR does / why we need it:
This MR is a proposal to address #1994.
The reconcileDelete implementation of the Metal3DataTemplate controller needs to requeue if it decides that it can't yet unset the finalizer on the Metal3DataTemplate. Without a requeue, there is a scenario where a Metal3Data still exists when reconcileDelete is first called, so the finalizer is not unset. In that situation, if no change to the Metal3DataTemplate triggers a new reconciliation, the finalizer never gets unset.
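The requeue behavior described above can be sketched as follows. This is a minimal illustration, not the controller's actual code: `Result` is a local stand-in for controller-runtime's `ctrl.Result`, and `template`, `finalizerName`, `requeueDelay`, and `removeString` are hypothetical names introduced only for this example.

```go
package main

import (
	"fmt"
	"time"
)

// Result is a local stand-in for controller-runtime's ctrl.Result,
// so that this sketch compiles with the standard library only.
type Result struct {
	RequeueAfter time.Duration
}

// template is a hypothetical, simplified view of a Metal3DataTemplate.
type template struct {
	finalizers []string
	dataCount  int // Metal3Data objects that still reference the template
}

const (
	finalizerName = "example.io/finalizer" // hypothetical finalizer name
	requeueDelay  = 10 * time.Second       // hypothetical delay
)

// reconcileDelete removes the finalizer only when no Metal3Data remains.
// Otherwise it asks to be called again instead of returning an empty result,
// so the finalizer cannot stay set forever for lack of a new event.
func reconcileDelete(tmpl *template) Result {
	if tmpl.dataCount > 0 {
		return Result{RequeueAfter: requeueDelay}
	}
	tmpl.finalizers = removeString(tmpl.finalizers, finalizerName)
	return Result{}
}

// removeString returns a copy of list without any occurrence of s.
func removeString(list []string, s string) []string {
	out := make([]string, 0, len(list))
	for _, item := range list {
		if item != s {
			out = append(out, item)
		}
	}
	return out
}

func main() {
	tmpl := &template{finalizers: []string{finalizerName}, dataCount: 1}

	// First pass: a Metal3Data still exists, so a requeue is requested.
	fmt.Println(reconcileDelete(tmpl), tmpl.finalizers)

	// Later pass: the Metal3Data is gone, so the finalizer is removed.
	tmpl.dataCount = 0
	fmt.Println(reconcileDelete(tmpl), tmpl.finalizers)
}
```

The point of the sketch is only the first branch: returning a non-empty result when the finalizer is kept guarantees a later pass even if the Metal3DataTemplate itself never changes again.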
To finalize this PR I had to adjust the expectations of the unit tests (a requeue is now expected in cases where it initially wasn't).
This made me realize that there is another situation where we want to requeue: the case where dataTemplateMgr.UpdateDatas returns an error. The code was relying on checkReconcileError, which (a) can only return requeue results on baremetal.ReconcileError errors (not on other errors), and (b) even for those would not return a requeue for "Terminal" errors. I think (b) is only a valid choice when this function is called for reconcileNormal; for deletes, it seems that requeues should always be done, to avoid the resource remaining stuck on a finalizer.