Describe the bug
When scaling down, the Cassandra operator always decommissions a Cassandra node (i.e., a Cassandra pod) before deleting the pod. However, we find that the Cassandra node can sometimes be left in a decommissioned state forever, without ever being deleted, when the Cassandra operator misses certain events.
The scaling down logic is implemented as follows:
Assume we have a Cassandra datacenter with three (`currentSpecReplicas`) nodes and the user wants to scale it down to two (`desiredSpecReplicas`). When it sees `desiredSpecReplicas < currentSpecReplicas`, the operator first finds that there is no decommissioned node (`len(decommissionedNodes) == 0`), so it decommissions one of the Cassandra nodes and finishes the current reconcile. Ideally, the operator would then delete the decommissioned node in the next reconcile.
However, if the user changes the replica count back to three before the operator enters the next reconcile (this can happen when the operator runs slowly or crashes), the operator will find that `desiredSpecReplicas == currentSpecReplicas` in the next reconcile, and the decommissioned node will not be deleted. The node is therefore left in the decommissioned state forever, until the user issues another scale-down later. Only two Cassandra nodes keep functioning, even though the StatefulSet still hosts three Cassandra pods.
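To make the scenario easier to follow, here is a minimal Go sketch of the scale-down flow as we understand it. This is not the operator's actual code: apart from `desiredSpecReplicas`, `currentSpecReplicas`, and `decommissionedNodes`, every name (the helpers, the node names) is a hypothetical stand-in.

```go
package main

import "fmt"

// Hypothetical stand-ins for the operator's real helpers; these names are
// ours, not identifiers from the operator's code base.
func decommissionNode(node string) error {
	fmt.Printf("decommissioning Cassandra node %s\n", node)
	return nil
}

func deletePod(node string) error {
	fmt.Printf("deleting pod of decommissioned node %s\n", node)
	return nil
}

// reconcileScaleDown sketches the scale-down flow described above.
func reconcileScaleDown(desiredSpecReplicas, currentSpecReplicas int, decommissionedNodes, nodes []string) error {
	if desiredSpecReplicas < currentSpecReplicas {
		if len(decommissionedNodes) == 0 {
			// Reconcile #1: decommission one node and stop; the pod is
			// expected to be deleted in a later reconcile.
			return decommissionNode(nodes[len(nodes)-1])
		}
		// Reconcile #2 (normally): delete the pod of the decommissioned node.
		return deletePod(decommissionedNodes[0])
	}
	// If the user scales back up before the next reconcile, this branch is
	// taken instead: the decommissioned node is neither deleted nor brought
	// back, which is the behavior reported here.
	return nil
}

func main() {
	nodes := []string{"dc1-node-0", "dc1-node-1", "dc1-node-2"}

	// Reconcile #1: the user scaled 3 -> 2, nothing decommissioned yet.
	_ = reconcileScaleDown(2, 3, nil, nodes)

	// The user scales 2 -> 3 before the operator reconciles again.
	// Reconcile #2: desired == current, so the decommissioned node is stuck.
	_ = reconcileScaleDown(3, 3, []string{"dc1-node-2"}, nodes)
}
```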
To Reproduce
Steps to reproduce the behavior:
1. Create a Cassandra datacenter with three replicas.
2. Scale down: three -> two. The operator decommissions the node but has not deleted the pod yet.
3. Scale up: two -> three. The operator finds `desiredSpecReplicas == currentSpecReplicas` and leaves the node decommissioned.
Expected behavior
The operator should check whether any node is decommissioned and bring back the node if it is not supposed to be deleted.
Environment
OS: Linux
Kubernetes version: v1.18.9
kubectl version: v1.20.1
Go version: 1.13.9
Cassandra version: 3
Additional context
We are willing to help fix this bug. One potential fix is to delete the pod whose Cassandra node has been decommissioned. Since the pod is managed by the StatefulSet, it will be automatically recreated and come back out of the decommissioned state.
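To make the proposal concrete, here is a minimal client-go sketch, not a tested patch against the operator: the namespace, pod name, and helper name are placeholders, and the operator would invoke something like this wherever it detects a node that was decommissioned but is no longer scheduled for deletion.

```go
package main

import (
	"context"
	"fmt"

	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
	"k8s.io/client-go/kubernetes"
	"k8s.io/client-go/rest"
)

// recreateDecommissionedPod deletes the pod whose Cassandra node was
// decommissioned. The owning StatefulSet then recreates the pod, and the new
// Cassandra process rejoins the ring instead of staying decommissioned.
func recreateDecommissionedPod(ctx context.Context, client kubernetes.Interface, namespace, podName string) error {
	if err := client.CoreV1().Pods(namespace).Delete(ctx, podName, metav1.DeleteOptions{}); err != nil {
		return fmt.Errorf("deleting decommissioned pod %s/%s: %w", namespace, podName, err)
	}
	return nil
}

func main() {
	cfg, err := rest.InClusterConfig()
	if err != nil {
		panic(err)
	}
	client, err := kubernetes.NewForConfig(cfg)
	if err != nil {
		panic(err)
	}
	// Placeholder namespace/pod name; in the operator this would be the pod
	// identified as decommissioned during the current reconcile.
	if err := recreateDecommissionedPod(context.Background(), client, "default", "cassandra-dc1-2"); err != nil {
		panic(err)
	}
}
```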