Describe the bug
When scaling down, the Cassandra operator always decommissions a Cassandra node (i.e., a Cassandra pod) before deleting the pod. However, we find that the Cassandra node can sometimes be left in a decommissioned state forever, without ever being deleted, when the Cassandra operator misses certain events.
The scaling down logic is implemented as follows:
Assume we have a Cassandra datacenter with three (`currentSpecReplicas`) nodes and the user wants to scale it down to two (`desiredSpecReplicas`). When it sees `desiredSpecReplicas < currentSpecReplicas`, the operator first finds that there is no decommissioned node (`len(decommissionedNodes) == 0`), so it decommissions one of the Cassandra nodes and finishes the current reconcile. Ideally, the operator would then delete the decommissioned node in the next reconcile.
However, if the user changes the replica count back to three before the operator enters the next reconcile (this can happen when the operator runs slowly or crashes), the operator will find that `desiredSpecReplicas == currentSpecReplicas` in the next reconcile, and the decommissioned node will not be deleted. The node is therefore left in the decommissioned state forever, until the user issues another scale-down later. Only two Cassandra nodes keep functioning, even though the StatefulSet still hosts three Cassandra pods.
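To make the scenario easier to follow, here is a minimal Go sketch of the scale-down flow as we understand it. This is not the operator's actual code: apart from `desiredSpecReplicas`, `currentSpecReplicas`, and `decommissionedNodes`, every name (the helpers, the node names) is a hypothetical stand-in.

```go
package main

import "fmt"

// Hypothetical stand-ins for the operator's real helpers; these names are
// ours, not identifiers from the operator's code base.
func decommissionNode(node string) error {
	fmt.Printf("decommissioning Cassandra node %s\n", node)
	return nil
}

func deletePod(node string) error {
	fmt.Printf("deleting pod of decommissioned node %s\n", node)
	return nil
}

// reconcileScaleDown sketches the scale-down flow described above.
func reconcileScaleDown(desiredSpecReplicas, currentSpecReplicas int, decommissionedNodes, nodes []string) error {
	if desiredSpecReplicas < currentSpecReplicas {
		if len(decommissionedNodes) == 0 {
			// Reconcile #1: decommission one node and stop; the pod is
			// expected to be deleted in a later reconcile.
			return decommissionNode(nodes[len(nodes)-1])
		}
		// Reconcile #2 (normally): delete the pod of the decommissioned node.
		return deletePod(decommissionedNodes[0])
	}
	// If the user scales back up before the next reconcile, this branch is
	// taken instead: the decommissioned node is neither deleted nor brought
	// back, which is the behavior reported here.
	return nil
}

func main() {
	nodes := []string{"dc1-node-0", "dc1-node-1", "dc1-node-2"}

	// Reconcile #1: the user scaled 3 -> 2, nothing decommissioned yet.
	_ = reconcileScaleDown(2, 3, nil, nodes)

	// The user scales 2 -> 3 before the operator reconciles again.
	// Reconcile #2: desired == current, so the decommissioned node is stuck.
	_ = reconcileScaleDown(3, 3, []string{"dc1-node-2"}, nodes)
}
```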
To Reproduce
Steps to reproduce the behavior:
1. Create a Cassandra datacenter with three replicas.
2. Scale down: three -> two. The operator decommissions the node but has not deleted the pod yet.
3. Scale up: two -> three. The operator finds `desiredSpecReplicas == currentSpecReplicas` and leaves the node decommissioned.
Expected behavior
The operator should check whether any node is decommissioned and bring back the node if it is not supposed to be deleted.
Environment
OS: Linux
Kubernetes version: v1.18.9
kubectl version: v1.20.1
Go version: 1.13.9
Cassandra version: 3
Additional context
We are willing to help fix this bug. One potential fix is to delete the pod whose Cassandra node has been decommissioned. Since the pod is managed by the StatefulSet, it will be automatically recreated and come back out of the decommissioned state.
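To make the proposal concrete, here is a minimal client-go sketch, not a tested patch against the operator: the namespace, pod name, and helper name are placeholders, and the operator would invoke something like this wherever it detects a node that was decommissioned but is no longer scheduled for deletion.

```go
package main

import (
	"context"
	"fmt"

	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
	"k8s.io/client-go/kubernetes"
	"k8s.io/client-go/rest"
)

// recreateDecommissionedPod deletes the pod whose Cassandra node was
// decommissioned. The owning StatefulSet then recreates the pod, and the new
// Cassandra process rejoins the ring instead of staying decommissioned.
func recreateDecommissionedPod(ctx context.Context, client kubernetes.Interface, namespace, podName string) error {
	if err := client.CoreV1().Pods(namespace).Delete(ctx, podName, metav1.DeleteOptions{}); err != nil {
		return fmt.Errorf("deleting decommissioned pod %s/%s: %w", namespace, podName, err)
	}
	return nil
}

func main() {
	cfg, err := rest.InClusterConfig()
	if err != nil {
		panic(err)
	}
	client, err := kubernetes.NewForConfig(cfg)
	if err != nil {
		panic(err)
	}
	// Placeholder namespace/pod name; in the operator this would be the pod
	// identified as decommissioned during the current reconcile.
	if err := recreateDecommissionedPod(context.Background(), client, "default", "cassandra-dc1-2"); err != nil {
		panic(err)
	}
}
```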