✨ machine: Introduce Deletion status field and add timestamps for drain and volumeDetach instead of using the condition #11166

Merged · 7 commits · Sep 30, 2024

Changes from 4 commits

22 changes: 22 additions & 0 deletions api/v1beta1/machine_types.go
@@ -230,10 +230,32 @@ type MachineStatus struct {
// Conditions defines current service state of the Machine.
// +optional
Conditions Conditions `json:"conditions,omitempty"`

// Deletion contains information relating to removal of the Machine.
// Only present when the Machine has a deletionTimestamp and is being removed from the cluster.
// +optional
Deletion MachineStatusDeletion `json:"deletion,omitempty"`
}

// ANCHOR_END: MachineStatus

// MachineStatusDeletion is the deletion state of the Machine.
type MachineStatusDeletion struct {
Contributor:

Should we include information here related to the drain, such as the configuration for the timeout?

Since the drain fields are optional in the spec, it would perhaps be good to show the configured values here, so that you can correlate start time and expected end time just by looking at the status?

Member:

This is an interesting idea; I'm not really sure how we can represent in a clear way that we are waiting forever (other than showing timeout 0).

Contributor:

A value of -1 potentially? But you're right, timeout 0 is awkward 🤔

Member Author:

I think the fact that it is still waiting should then be part of the Deleting condition message? 🤔 (Or, as long as that one is not around, the DrainingSucceeded condition message.)

Member:

Yep, having this in the conditions seems the easiest way to address this.
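
As a rough sketch of that direction (not code from this PR; the helper name setDrainingConditionMessage and the message wording are made up for illustration), the configured timeout, or the fact that no timeout is set, could be folded into the existing drain condition message:

```go
package machine

import (
	"fmt"

	clusterv1 "sigs.k8s.io/cluster-api/api/v1beta1"
	"sigs.k8s.io/cluster-api/util/conditions"
)

// setDrainingConditionMessage is a hypothetical helper: instead of adding extra fields to
// status.deletion, it surfaces the configured NodeDrainTimeout (or the fact that the
// controller waits indefinitely) through the condition message.
func setDrainingConditionMessage(machine *clusterv1.Machine) {
	msg := "Draining the node, no NodeDrainTimeout configured, waiting indefinitely"
	if machine.Spec.NodeDrainTimeout != nil && machine.Spec.NodeDrainTimeout.Seconds() > 0 {
		msg = fmt.Sprintf("Draining the node, NodeDrainTimeout is %s", machine.Spec.NodeDrainTimeout.Duration)
	}
	conditions.MarkFalse(machine, clusterv1.DrainingSucceededCondition, clusterv1.DrainingReason,
		clusterv1.ConditionSeverityInfo, msg)
}
```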

// NodeDrainStartTime is the time when the drain of the node started.
Contributor:

I feel like we can probably add some more context to this. What does it mean when it's not present, for example? What does it mean if it has elapsed for some period and the Machine is still here? What hints can we give to end users?

Member Author:

Tried to add some more context on both keys.

// Only present when the Machine has a deletionTimestamp, is being removed from the cluster
// and draining the node had been started.
Contributor:

Why would drain not have been started when there's a deletion timestamp?

Member Author:

Because it's waiting for pre-drain hooks, for one thing.

Member Author:

Note: the best pointer for an overview of the deletion process may be this: https://main.cluster-api.sigs.k8s.io/tasks/automated-machine-management/machine_deletions

Contributor:

I was thinking it might be useful to point a user to that, or at least give them a hint as to why node draining might not have started. Can we include a line?

Member Author:

Another one may be that the node is excluded from the drain entirely (there is an annotation for it).

I shortened it to:

	// NodeDrainStartTime is the time when the drain of the node started and is used to determine
	// if the NodeDrainTimeout is exceeded.
	// Only present when the Machine has a deletionTimestamp and draining the node had been started.

Happy to get other suggestions though :-)

One way would be to say:

	// NodeDrainStartTime is the time when the drain of the node started and is used to determine
	// if the NodeDrainTimeout is exceeded.
	// Only present when the Machine has a deletionTimestamp and draining the node had been started.
    // NodeDrainStartTime may not be set because the deletion is blocked by a pre-drain hook or draining is skipped for the machine.

But doing the same for WaitForNodeVolumeDetachStartTime would get very verbose 🤔:

	// WaitForNodeVolumeDetachStartTime is the time when waiting for volume detachment started
	// and is used to determine if the NodeVolumeDetachTimeout is exceeded.
	// Detaching volumes from nodes is usually done by CSI implementations and the current state
	// is observed from the node's `.Status.VolumesAttached` field.
	// Only present when the Machine has a deletionTimestamp and waiting for volume detachments had been started.
    // WaitForNodeVolumeDetachStartTime may not be set because the deletion is blocked by a pre-drain hook, stuck in drain or waiting for volume detachment is skipped for the machine.
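For reference, this is roughly how the full type reads with the shortened comments in place (a sketch assembled from the snippets in this thread; see the surrounding diff for the exact lines in this revision):

```go
package v1beta1

import metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"

// MachineStatusDeletion is the deletion state of the Machine.
type MachineStatusDeletion struct {
	// NodeDrainStartTime is the time when the drain of the node started and is used to determine
	// if the NodeDrainTimeout is exceeded.
	// Only present when the Machine has a deletionTimestamp and draining the node had been started.
	// +optional
	NodeDrainStartTime *metav1.Time `json:"nodeDrainStartTime,omitempty"`

	// WaitForNodeVolumeDetachStartTime is the time when waiting for volume detachment started
	// and is used to determine if the NodeVolumeDetachTimeout is exceeded.
	// Detaching volumes from nodes is usually done by CSI implementations and the current state
	// is observed from the node's `.Status.VolumesAttached` field.
	// Only present when the Machine has a deletionTimestamp and waiting for volume detachments had been started.
	// +optional
	WaitForNodeVolumeDetachStartTime *metav1.Time `json:"nodeVolumeDetachStartTime,omitempty"`
}
```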

// +optional
NodeDrainStartTime *metav1.Time `json:"nodeDrainStartTime,omitempty"`
Contributor:

Do we also need a finish time? Same for the other field?

Member:

I personally don't think that the drain or detach finish time is important info to have in the API (users & SREs mostly care about what is going on now and eventually why it is stuck; they rarely care about what happened in the past, and for that the logs are more exhaustive).
But no strong opinion.

Contributor:

If I know that these have started, how do I know that they have finished if I don't have some field to tell me? 🤔 I realise eventually the machine is going away, but what if it gets stuck terminating the instance? Will that show up somewhere, and will I know that drain and volume detach are done?

Member Author:

> how do I know that they have finished if I don't have some field to tell me? 🤔

From the controller's perspective we (at least currently) do not care:

  • either the controller tries to drain again which should be a no-op (happy path)
  • or the drain is skipped because the timeout is reached.

From the user perspective: the information about where the deletion currently is should be part of the Deleting condition, I'd say.

Member:

I think in general the new Deleting condition should make clear at which phase of the deletion workflow we are (including making clear which parts are already completed).
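
As a purely illustrative sketch of that idea (not part of this PR; the helper name deletingPhaseMessage and the message wording are assumptions), such a condition message could derive the current phase from the new timestamps:

```go
package machine

import (
	"fmt"

	clusterv1 "sigs.k8s.io/cluster-api/api/v1beta1"
)

// deletingPhaseMessage is a hypothetical helper showing how a Deleting condition message
// could report which phase of the deletion workflow the Machine is in, based on the
// status.deletion timestamps introduced in this PR.
func deletingPhaseMessage(m *clusterv1.Machine) string {
	switch {
	case m.Status.Deletion.NodeDrainStartTime == nil:
		// Drain has not started yet, e.g. blocked by a pre-drain hook or skipped entirely.
		return "Deletion started, Node drain not started yet"
	case m.Status.Deletion.WaitForNodeVolumeDetachStartTime == nil:
		return fmt.Sprintf("Draining the Node (started at %s)", m.Status.Deletion.NodeDrainStartTime)
	default:
		return fmt.Sprintf("Node drain completed or timed out, waiting for volumes to be detached (since %s)",
			m.Status.Deletion.WaitForNodeVolumeDetachStartTime)
	}
}
```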


// WaitForNodeVolumeDetachStartTime is the time when waiting for volume detachment started.
// Detaching volumes from nodes is usually done by CSI implementations and the current state
// is observed from the node's `.Status.VolumesAttached` field.
// Only present when the Machine has a deletionTimestamp, is being removed from the cluster
// and waiting for volume detachments had been started.
// +optional
WaitForNodeVolumeDetachStartTime *metav1.Time `json:"nodeVolumeDetachStartTime,omitempty"`
}

// SetTypedPhase sets the Phase field to the string representation of MachinePhase.
func (m *MachineStatus) SetTypedPhase(p MachinePhase) {
m.Phase = string(p)
24 changes: 24 additions & 0 deletions api/v1beta1/zz_generated.deepcopy.go

Some generated files are not rendered by default.

37 changes: 36 additions & 1 deletion api/v1beta1/zz_generated.openapi.go

Some generated files are not rendered by default.

22 changes: 22 additions & 0 deletions config/crd/bases/cluster.x-k8s.io_machines.yaml

Some generated files are not rendered by default.

1 change: 1 addition & 0 deletions internal/apis/core/v1alpha3/conversion.go
@@ -101,6 +101,7 @@ func (src *Machine) ConvertTo(dstRaw conversion.Hub) error {
dst.Spec.NodeVolumeDetachTimeout = restored.Spec.NodeVolumeDetachTimeout
dst.Status.NodeInfo = restored.Status.NodeInfo
dst.Status.CertificatesExpiryDate = restored.Status.CertificatesExpiryDate
dst.Status.Deletion = restored.Status.Deletion
return nil
}

1 change: 1 addition & 0 deletions internal/apis/core/v1alpha3/zz_generated.conversion.go

Some generated files are not rendered by default.

1 change: 1 addition & 0 deletions internal/apis/core/v1alpha4/conversion.go
@@ -190,6 +190,7 @@ func (src *Machine) ConvertTo(dstRaw conversion.Hub) error {
dst.Spec.NodeDeletionTimeout = restored.Spec.NodeDeletionTimeout
dst.Status.CertificatesExpiryDate = restored.Status.CertificatesExpiryDate
dst.Spec.NodeVolumeDetachTimeout = restored.Spec.NodeVolumeDetachTimeout
dst.Status.Deletion = restored.Status.Deletion
return nil
}

1 change: 1 addition & 0 deletions internal/apis/core/v1alpha4/zz_generated.conversion.go

Some generated files are not rendered by default.

27 changes: 17 additions & 10 deletions internal/controllers/machine/machine_controller.go
@@ -31,6 +31,7 @@ import (
"k8s.io/apimachinery/pkg/util/wait"
"k8s.io/client-go/tools/record"
"k8s.io/klog/v2"
"k8s.io/utils/ptr"
ctrl "sigs.k8s.io/controller-runtime"
"sigs.k8s.io/controller-runtime/pkg/builder"
"sigs.k8s.io/controller-runtime/pkg/client"
@@ -415,6 +416,10 @@ func (r *Reconciler) reconcileDelete(ctx context.Context, cluster *clusterv1.Clu
conditions.MarkFalse(m, clusterv1.VolumeDetachSucceededCondition, clusterv1.WaitingForVolumeDetachReason, clusterv1.ConditionSeverityInfo, "Waiting for node volumes to be detached")
}

if m.Status.Deletion.WaitForNodeVolumeDetachStartTime == nil {
m.Status.Deletion.WaitForNodeVolumeDetachStartTime = ptr.To(metav1.Now())
}

if ok, err := r.shouldWaitForNodeVolumes(ctx, cluster, m.Status.NodeRef.Name); ok || err != nil {
if err != nil {
r.recorder.Eventf(m, corev1.EventTypeWarning, "FailedWaitForVolumeDetach", "error waiting for node volumes detaching, Machine's node %q: %v", m.Status.NodeRef.Name, err)
@@ -527,34 +532,32 @@ func (r *Reconciler) nodeDrainTimeoutExceeded(machine *clusterv1.Machine) bool {
return false
}

// if the draining succeeded condition does not exist
if conditions.Get(machine, clusterv1.DrainingSucceededCondition) == nil {
// if the NodeDrainStartTime does not exist
if machine.Status.Deletion.NodeDrainStartTime == nil {
return false
}

now := time.Now()
firstTimeDrain := conditions.GetLastTransitionTime(machine, clusterv1.DrainingSucceededCondition)
diff := now.Sub(firstTimeDrain.Time)
diff := now.Sub(machine.Status.Deletion.NodeDrainStartTime.Time)
return diff.Seconds() >= machine.Spec.NodeDrainTimeout.Seconds()
}

// nodeVolumeDetachTimeoutExceeded returns False if either NodeVolumeDetachTimeout is set to nil or <=0 OR
// VolumeDetachSucceededCondition is not set on the Machine. Otherwise returns true if the timeout is expired
// since the last transition time of VolumeDetachSucceededCondition.
// WaitForNodeVolumeDetachStartTime is not set on the Machine. Otherwise returns true if the timeout is expired
// since the WaitForNodeVolumeDetachStartTime.
func (r *Reconciler) nodeVolumeDetachTimeoutExceeded(machine *clusterv1.Machine) bool {
// if the NodeVolumeDetachTimeout type is not set by user
if machine.Spec.NodeVolumeDetachTimeout == nil || machine.Spec.NodeVolumeDetachTimeout.Seconds() <= 0 {
return false
}

// if the volume detaching succeeded condition does not exist
if conditions.Get(machine, clusterv1.VolumeDetachSucceededCondition) == nil {
// if the NodeVolumeDetachStartTime does not exist
if machine.Status.Deletion.WaitForNodeVolumeDetachStartTime == nil {
return false
}

now := time.Now()
firstTimeDetach := conditions.GetLastTransitionTime(machine, clusterv1.VolumeDetachSucceededCondition)
diff := now.Sub(firstTimeDetach.Time)
diff := now.Sub(machine.Status.Deletion.WaitForNodeVolumeDetachStartTime.Time)
return diff.Seconds() >= machine.Spec.NodeVolumeDetachTimeout.Seconds()
}

@@ -697,6 +700,10 @@ func (r *Reconciler) drainNode(ctx context.Context, cluster *clusterv1.Cluster,
log.V(3).Info("Node is unreachable, draining will use 1s GracePeriodSeconds and will ignore all Pods that have a deletionTimestamp > 1s old")
}

if machine.Status.Deletion.NodeDrainStartTime == nil {
machine.Status.Deletion.NodeDrainStartTime = ptr.To(metav1.Now())
}

if err := drainer.CordonNode(ctx, node); err != nil {
// Machine will be re-reconciled after a cordon failure.
return ctrl.Result{}, errors.Wrapf(err, "failed to cordon Node %s", node.Name)
48 changes: 12 additions & 36 deletions internal/controllers/machine/machine_controller_test.go
@@ -1386,12 +1386,8 @@ func TestIsNodeDrainedAllowed(t *testing.T) {
},

Status: clusterv1.MachineStatus{
Conditions: clusterv1.Conditions{
{
Type: clusterv1.DrainingSucceededCondition,
Status: corev1.ConditionFalse,
LastTransitionTime: metav1.Time{Time: time.Now().Add(-(time.Second * 70)).UTC()},
},
Deletion: clusterv1.MachineStatusDeletion{
NodeDrainStartTime: &metav1.Time{Time: time.Now().Add(-(time.Second * 70)).UTC()},
},
},
},
@@ -1412,12 +1408,8 @@ func TestIsNodeDrainedAllowed(t *testing.T) {
NodeDrainTimeout: &metav1.Duration{Duration: time.Second * 60},
},
Status: clusterv1.MachineStatus{
Conditions: clusterv1.Conditions{
{
Type: clusterv1.DrainingSucceededCondition,
Status: corev1.ConditionFalse,
LastTransitionTime: metav1.Time{Time: time.Now().Add(-(time.Second * 30)).UTC()},
},
Deletion: clusterv1.MachineStatusDeletion{
NodeDrainStartTime: &metav1.Time{Time: time.Now().Add(-(time.Second * 30)).UTC()},
},
},
},
@@ -1437,12 +1429,8 @@ func TestIsNodeDrainedAllowed(t *testing.T) {
Bootstrap: clusterv1.Bootstrap{DataSecretName: ptr.To("data")},
},
Status: clusterv1.MachineStatus{
Conditions: clusterv1.Conditions{
{
Type: clusterv1.DrainingSucceededCondition,
Status: corev1.ConditionFalse,
LastTransitionTime: metav1.Time{Time: time.Now().Add(-(time.Second * 1000)).UTC()},
},
Deletion: clusterv1.MachineStatusDeletion{
NodeDrainStartTime: &metav1.Time{Time: time.Now().Add(-(time.Second * 1000)).UTC()},
},
},
},
@@ -1896,12 +1884,8 @@ func TestIsNodeVolumeDetachingAllowed(t *testing.T) {
},

Status: clusterv1.MachineStatus{
Conditions: clusterv1.Conditions{
{
Type: clusterv1.VolumeDetachSucceededCondition,
Status: corev1.ConditionFalse,
LastTransitionTime: metav1.Time{Time: time.Now().Add(-(time.Second * 60)).UTC()},
},
Deletion: clusterv1.MachineStatusDeletion{
WaitForNodeVolumeDetachStartTime: &metav1.Time{Time: time.Now().Add(-(time.Second * 60)).UTC()},
},
},
},
@@ -1922,12 +1906,8 @@ func TestIsNodeVolumeDetachingAllowed(t *testing.T) {
NodeVolumeDetachTimeout: &metav1.Duration{Duration: time.Second * 60},
},
Status: clusterv1.MachineStatus{
Conditions: clusterv1.Conditions{
{
Type: clusterv1.VolumeDetachSucceededCondition,
Status: corev1.ConditionFalse,
LastTransitionTime: metav1.Time{Time: time.Now().Add(-(time.Second * 30)).UTC()},
},
Deletion: clusterv1.MachineStatusDeletion{
WaitForNodeVolumeDetachStartTime: &metav1.Time{Time: time.Now().Add(-(time.Second * 30)).UTC()},
},
},
},
@@ -1947,12 +1927,8 @@ func TestIsNodeVolumeDetachingAllowed(t *testing.T) {
Bootstrap: clusterv1.Bootstrap{DataSecretName: ptr.To("data")},
},
Status: clusterv1.MachineStatus{
Conditions: clusterv1.Conditions{
{
Type: clusterv1.VolumeDetachSucceededCondition,
Status: corev1.ConditionFalse,
LastTransitionTime: metav1.Time{Time: time.Now().Add(-(time.Second * 1000)).UTC()},
},
Deletion: clusterv1.MachineStatusDeletion{
WaitForNodeVolumeDetachStartTime: &metav1.Time{Time: time.Now().Add(-(time.Second * 1000)).UTC()},
},
},
},