-
Notifications
You must be signed in to change notification settings - Fork 15
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Added cleanup meachnism for tendrl setup #129
base: master
Are you sure you want to change the base?
Conversation
@nthomas-redhat @mbukatov @GowthamShanmugam @anmolsachan @cloudbehl please review |
cleanup.yml
Outdated
roles: | ||
- tendrl-server-cleanup | ||
|
||
- hosts: gluster_servers |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I feel stopping storage nodes services first and then stop server is safer
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done in next update.
The cleanup playbook performs the below steps - Stops collectd, tendrl-node-agent and tendrl-gluster-integration on storage nodes - Stops tendrl-node-agent, tendrl-monitoring-integration, tendrl-notifier, tendrl-api on tendrl server node - Stops grafana-server, carbon-cache and etcd services on storage nodes - Cleans up etcd, grafana and carbon data files tendrl-bug-id: Tendrl/issues/63 Signed-off-by: Shubhendu <[email protected]>
LGTM |
name: httpd | ||
state: stopped | ||
|
||
- name: Cleanup etcd data |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
isn't it destructive to remove the ETCD? User may have different namespace running for different things can we just delete the data?
Doing it for the dev purpose makes sense to me but doing this same on production server of a client, I am a bit skeptical about it.
thoughts? @shtripat @nthomas-redhat @anmolsachan @GowthamShanmugam
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
if we are sure then the customer doesn't have anything else running on the tendrl server machine then we can take this path.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We save tendrl data in default profile. That itself is not very correct I feel.
We can have a disclaimer in document regarding the same, that it will flush all the etcd data.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Correct I agree shubhendu. IF some other data in default profile that will also get flushed.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There could be another option to delete all tendrl created dirs under etcd like /clusters
, /nodes
and tags etc. After this keep etcd running as usual and tendrl-ansible shouldnt cause issues. We will need to test the scenario though.
path: "{{etcd_db_path}}/" | ||
state: absent | ||
|
||
- name: Cleanup grafana data |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
same as above
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Here as well, I feel we are using default location to save tendrl data.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
what if someone is having other dashboards on grafana which are not related to tendrl. By cleaning grafana I think those will be lost too and then there will be no recovery for it.
I think grafanadb file will be common for all
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Here as well we can try to remove only tendrl specific dashboards and keep grafana-server running. Need to verify though.
path: "{{grafana_db_file}}" | ||
state: absent | ||
|
||
- name: Cleanup carbon data |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
same as above
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In case of carbon we will delete only tendrl specific directory and nothing else.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Correct. I missed that.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm providing few suggestions/question in comments below.
But main question is, what purpose this cleanup has. Because it cleans up the data, but keeps the configuration files intact, which means that it is not guaranteed that it will be always possible to install tendrl again via tendrl-ansible as if it was done from scratch.
Also make sure that the Travis CI checks are passing (yamlling and ansible-lint).
``` | ||
$ ansible-playbook -i inventory_file cleanup.yml | ||
``` |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe I would not repeat all the details already mentioned in the installation section to avoid duplication a bit, and make this little more succinct (after all, one needs to install first to do the cleanup). We could point out that the same ideas as for installation apply.
grafana_db_file: "/var/lib/grafana/grafana.db" | ||
|
||
# carbon data to be deleted | ||
carbon_data_path: "/var/lib/carbon/whisper/tendrl/" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I would rather moved these default values into roles/tendrl-ansible.tendrl-server-cleanup/defaults/main.yml
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ack
@@ -317,6 +328,63 @@ tendrl-ansible: | |||
on [TEN-257](https://tendrl.atlassian.net/browse/TEN-257)). | |||
## How do I cleanup Tendrl setup using tendrl-ansible? | |||
1) If you use tendrl-ansible from rpm package, copy `cleanup.yml` playbook into |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Main information we are missing here is what does this cleanup mean and why would one want to do one. This helps to convey context of this feature.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Also it seems that the cleanup deletes all tendrl internal data (etcd, graphite, ...). It seems to me that this needs to be highlighted here again.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ack
- name: Stop tendrl services | ||
service: | ||
name: tendrl-node-agent | ||
state: stopped |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why we are not stopping tendrl-api?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ack. Will add tendrl-api as well.
The cleanup playbook performs the below steps
on storage nodes
tendrl-api on tendrl server node
tendrl-bug-id: /issues/63
Signed-off-by: Shubhendu [email protected]