Operator does not recreate Databases and Tables on replicas if not scaled to 0 before deleting the StatefulSet #1500

Open
kaitimmer opened this issue Sep 2, 2024 · 0 comments


We use release-0.23.7 of the operator to manage our ClickHouse instances.

We are currently migrating all ClickHouse servers to a different storage class. This is what we do for each replica:

  1. Change the storageClass and PVC size (making sure it is large enough for the data in the replica) in the ClickHouseInstallation
  2. Delete the replica's StatefulSet, PVC, and PV
  3. Have the Operator recreate them

Usually, this works fine. The replica is created on a new PVC with the new settings, and the data is synced back from the remaining replicas. We do this one replica after another until we are running entirely on the new storage.

However, this never works for replica 0 (I've also seen it for replicas >0, but only sometimes). The new StatefulSet is created with a new PVC, but the database and tables are not created and, therefore, are not synced.
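One way to confirm this symptom is to list the user tables on a healthy replica and on the recreated one, then compare the results. The following is only a sketch: `system.tables` is a built-in ClickHouse table, but the helper `missing_tables` and the way the rows are fetched (for example with clickhouse-client) are assumptions for illustration.

```python
# Query to run on each replica; it lists every table outside the built-in
# system and information-schema databases.
LIST_TABLES_SQL = (
    "SELECT database, name FROM system.tables "
    "WHERE database NOT IN ('system', 'INFORMATION_SCHEMA', 'information_schema')"
)


def missing_tables(healthy_rows, recreated_rows):
    """Return the (database, table) pairs found only on the healthy replica.

    Both arguments are iterables of (database, table) rows, e.g. the parsed
    output of LIST_TABLES_SQL. This helper is hypothetical, not operator code.
    """
    healthy = {tuple(row) for row in healthy_rows}
    recreated = {tuple(row) for row in recreated_rows}
    return sorted(healthy - recreated)
```

A non-empty result for the recreated replica means the schema was not created there, which matches the behaviour described above.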

In these cases, the Operator somehow ends up at this line in its code:

"No need to add tables on host %d to shard %d in cluster %s",

but we do not understand how this happens.

What works in these scenarios is this:

  1. Scale the Operator to 0
  2. Delete the StatefulSet and PVC/PV
  3. Restart the Operator

In this case, the operator reconciles the installation fine and creates the database and tables.
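The workaround above can be sketched as a sequence of kubectl invocations. This is a minimal sketch under assumptions: the operator is assumed to run as a Deployment, "restart" is interpreted as scaling it back up, and every resource name and namespace is a placeholder passed in by the caller rather than something taken from our setup. The function only builds the commands and does not execute them.

```python
import shlex


def workaround_commands(namespace, operator_namespace, operator_deployment,
                        statefulset, pvc, pv, operator_replicas=1):
    """Build the kubectl commands for the workaround, in order.

    All arguments are placeholders supplied by the caller (assumptions).
    """
    scale = ["kubectl", "scale", "deployment", operator_deployment,
             "-n", operator_namespace]
    return [
        # 1. Scale the operator to 0 so it cannot react to the deletions.
        scale + ["--replicas=0"],
        # 2. Delete the replica's StatefulSet and its PVC/PV.
        ["kubectl", "delete", "statefulset", statefulset, "-n", namespace],
        ["kubectl", "delete", "pvc", pvc, "-n", namespace],
        # PVs are cluster-scoped; with a Delete reclaim policy the PV may
        # already be gone once the PVC is deleted.
        ["kubectl", "delete", "pv", pv],
        # 3. Bring the operator back so it reconciles the installation.
        scale + [f"--replicas={operator_replicas}"],
    ]


if __name__ == "__main__":
    # Print the commands for review instead of running them.
    for cmd in workaround_commands("ch", "ops", "operator", "sts-0", "pvc-0", "pv-0"):
        print(shlex.join(cmd))
```

Printing the commands first makes it easy to review them before running anything against a live cluster.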
