feat: Node Repair implementation #1793
ctx = injection.WithControllerName(ctx, "node.health")
ctx = log.IntoContext(ctx, log.FromContext(ctx).WithValues("Node", klog.KRef(node.Namespace, node.Name)))

// Validate that the node is owned by us and is not being deleted
Suggested change:
- // Validate that the node is owned by us and is not being deleted
+ // Validate that the node is owned by us
nit: we aren't doing the deletion check anymore
cpPolicyFound := cloudprovider.RepairPolicy{}
// Find a node with a condition that matches one of the unhealthy conditions defined by the cloud provider
// If there are multiple unhealthy status condition we will requeue based on the condition closest to its terminationDuration
for _, policy := range c.cloudProvider.RepairPolicies() {
nit: This feels like it should be a separate function given it feels like it has "arguments" and it's pretty self-contained
	return reconcile.Result{}, client.IgnoreNotFound(err)
}
if err := c.kubeClient.Delete(ctx, nodeClaim); err != nil {
	return reconcile.Result{}, err
This delete could have a NotFound error -- consider adding a client.IgnoreNotFound
Fixes #N/A
Description
Adds a cloud provider RepairPolicy that defines the node conditions for which Karpenter will forcefully terminate a node. Each cloud provider policy names an unhealthy condition a node can enter and the duration Karpenter waits before reacting.
How was this change tested?
make presubmit
By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.