[Question] Interesting fork of Ansible Playbook by remil1000 #37

Open
peterzhuamazon opened this issue Feb 4, 2022 · 7 comments
Comments

@peterzhuamazon
Member

peterzhuamazon commented Feb 4, 2022

Hi all,

@saravanan30erd I have noticed that @remil1000 has an interesting fork which changes a lot of the structure of our initial Ansible playbook setup; support for Debian 11 has even been added.
https://github.com/remil1000/opensearch-ansible-playbook

@remil1000 we would love to have your changes ported back to the upstream repo here.
Is it possible for you to contribute directly to this repo?

Thanks.

cc: @stockholmux

@bbarani
Member

bbarani commented Feb 4, 2022

@remil1000, we would like to invite you to join us and contribute to the OpenSearch Ansible repository in order to create a robust Ansible playbook for the OpenSearch community. Please feel free to reach out to us for any additional information.

@saravanan30erd
Collaborator

saravanan30erd commented Feb 5, 2022

@peterzhuamazon yes, it has interesting changes. But it was forked from our old state, when we supported only single-node installation, and he added a multi-node (cluster) installation setup and other changes. Since we have since added the multi-node feature and now support many OS distros, merging those changes directly might create a lot of conflicts. Instead, @remil1000 can add the missing features and changes one by one here.

@remil1000

Hi,
you're right about the fork & feature addition
I started working on this back in November 2021, and some unexpected delays led to this very large drift and now to a near-impossible merge between your work and mine

a few items I could try to backport are the following:

  • PKI management using Ansible modules and no external tools (I have a fairly hard requirement to set this up in air-gapped environments)
  • bcrypt hash from a plain-text password with a static salt (idempotency) using an Ansible module
  • some tweaks for cluster initialization (initial_master_nodes)
  • idempotency of the whole role

let me know how you would like to proceed, if you have any priorities or current pain points to work on
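
To illustrate the static-salt point in the list above, here is a minimal Python sketch, not code from either repository. Since bcrypt is not in the Python standard library, PBKDF2-HMAC-SHA256 is used as a stand-in; the idempotency argument is the same. The function names, the seed string and the password are all hypothetical.

```python
import hashlib


def derive_salt(seed: str, username: str) -> bytes:
    """Derive a stable 16-byte salt from a per-deployment seed (hypothetical scheme)."""
    return hashlib.sha256(f"{seed}:{username}".encode()).digest()[:16]


def hash_password(password: str, salt: bytes, rounds: int = 100_000) -> str:
    """Stand-in for bcrypt: PBKDF2-HMAC-SHA256, hex encoded."""
    return hashlib.pbkdf2_hmac("sha256", password.encode(), salt, rounds).hex()


salt = derive_salt("example-cluster", "admin")
first_run = hash_password("s3cret", salt)
second_run = hash_password("s3cret", salt)

# Same salt and password give the same hash, so a config file rendered
# from it is byte-identical between runs and nothing needs to change.
print(first_run == second_run)
```

With a random salt on every run, the hash would differ each time, so a configuration management tool comparing file contents would see a change on every run even though the password is the same.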

@saravanan30erd
Collaborator

@remil1000 thanks for the response. We don't have any specific priorities or pain points, so you can work on all 4 points above. My suggestion is to create a separate PR for each feature, and please make sure the newly added changes support all our currently supported distros (CentOS 7, RHEL 7, Amazon Linux 2, Ubuntu 20.04) with minimal complexity.

@peterzhuamazon
Member Author

I second @saravanan30erd's response and fully welcome your changes, @remil1000.
Please let us know if you have any questions or concerns that we can help with.
Thanks.

@remil1000

I'm in the process of installing the upstream/official role with the multi-node cluster type but I'm a bit lost:

  • it seems the whole configuration is changed on each run (security config removed, then added back by two blockinfile tasks) - https://github.com/opensearch-project/ansible-playbook/blob/main/roles/linux/opensearch/tasks/security.yml#L64-L80 - making the whole cluster restart on every run
  • also it seems there is no way to add a node to a running cluster, as the certificate generation tool is only triggered when /tmp/opensearch-nodecerts is created on the controller
  • and as /tmp/opensearch-nodecerts is wiped at the end of the run, aren't the whole PKI and all node certificates/keys rerolled on each run? are you handling any host verification on your clients (API, logstash, etc.)? the CA may change at any moment
  • last item, but already discussed: as soon as the opensearch tar.gz is missing or an update is performed, the whole config is wiped, which could lead to nasty broken clusters with no easy rollback for any unaware user
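
The first point above (config removed and re-added on every run) can be sketched outside Ansible. The following Python is a rough illustration of an idempotent "managed block" update, similar in spirit to what a single blockinfile task with fixed markers does; it is not the role's code, and the marker strings and the sample setting are assumptions chosen for the example.

```python
from typing import Tuple

# Hypothetical marker lines delimiting the managed section of the file.
BEGIN = "# BEGIN MANAGED SECURITY CONFIG"
END = "# END MANAGED SECURITY CONFIG"


def ensure_block(text: str, block: str) -> Tuple[str, bool]:
    """Return (new_text, changed) with `block` placed between the markers."""
    managed = f"{BEGIN}\n{block.rstrip()}\n{END}\n"
    start = text.find(BEGIN)
    end = text.find(END)
    if start != -1 and end != -1 and end > start:
        stop = end + len(END)
        # Consume the newline that follows the end marker, if any.
        if text[stop:stop + 1] == "\n":
            stop += 1
        new_text = text[:start] + managed + text[stop:]
    else:
        # No managed section yet: append one, keeping a clean line break.
        sep = "" if text == "" or text.endswith("\n") else "\n"
        new_text = text + sep + managed
    return new_text, new_text != text


config = "cluster.name: demo\n"
block = "plugins.security.ssl.http.enabled: true"  # example setting only
config, changed_first = ensure_block(config, block)
config, changed_second = ensure_block(config, block)
print(changed_first, changed_second)  # True False
```

Because the second call reports no change, a restart handler keyed on that result would not fire; a remove-then-add sequence always reports a change, which matches the restart-on-every-run behaviour described above.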

also would it be possible to use a common prefix for the role variables? Ansible has no namespaces, which is quite an issue: if for any reason other roles or users' playbooks use variables like ip, roles, cluster_type or admin_password, they would have quite a hard time
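
As a loose analogy for the collision risk, the sketch below models variables as plain Python dictionaries, where a later definition silently overrides an earlier one, and adds a small check that lists names lacking a common prefix. The variable names come from this comment; the values, the `os_` prefix, `os_cluster_name` and the helper name are assumptions made for the example.

```python
def unprefixed(variables: dict, prefix: str) -> list:
    """Return the variable names that do not start with `prefix`, sorted."""
    return sorted(name for name in variables if not name.startswith(prefix))


# Names from the discussion; values are placeholders.
role_defaults = {
    "ip": "10.0.0.1",
    "roles": "data,master",
    "cluster_type": "multi-node",
    "admin_password": "changeme",
    "os_cluster_name": "demo",  # hypothetical, already prefixed
}

# Another role or playbook defining `ip` for its own purposes.
other_vars = {"ip": "192.168.1.5"}

# In a flat namespace the later definition wins without any warning.
merged = {**role_defaults, **other_vars}
print(merged["ip"])  # 192.168.1.5

print(unprefixed(role_defaults, "os_"))
# ['admin_password', 'cluster_type', 'ip', 'roles']
```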

Are any current users of the role facing the same issues? especially with idempotency, upgrades and cluster extension

@saravanan30erd
Collaborator

saravanan30erd commented Feb 11, 2022

@remil1000 I created the role initially focused only on first-time installation and didn't have time to support upgrades and cluster extension. So it works properly only for cluster creation and doesn't support upgrade (re-run) or cluster extension. To support these, we need to work on the points you mentioned above (especially certs).

Regarding certs, I am currently using the searchguard tool to create the certs and configuration templates, which are later added to the main configuration. I see you already plan to use Ansible modules for PKI management, so we can replace the current mechanism with that and try to use it to support upgrade or re-run. Otherwise we can keep the same external tool and change how we use it to support re-run and node addition, as the tool supports not overwriting certs that are already present in the output folder. I have never tried this, since I haven't started working on upgrade or re-run support yet.
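
The "do not overwrite if already present" behaviour can be sketched as follows. This is only an illustration of the control flow in Python, not the searchguard tool or any Ansible module: the standard library has no X.509 support, so random bytes stand in for real keys and certificates, and every name here (`ensure_node_material`, the `.key` layout, the node names) is hypothetical.

```python
import secrets
import tempfile
from pathlib import Path


def make_placeholder_key() -> bytes:
    """Placeholder for real key/cert generation (not actual PKI material)."""
    return secrets.token_hex(32).encode()


def ensure_node_material(out_dir: Path, nodes: list) -> list:
    """Create <node>.key only where missing; return the nodes that were created."""
    out_dir.mkdir(parents=True, exist_ok=True)
    created = []
    for node in nodes:
        key_path = out_dir / f"{node}.key"
        if key_path.exists():
            continue  # keep existing material so it is not rerolled on re-run
        key_path.write_bytes(make_placeholder_key())
        created.append(node)
    return created


with tempfile.TemporaryDirectory() as tmp:
    out = Path(tmp) / "certs"
    first = ensure_node_material(out, ["node1", "node2"])
    before = (out / "node1.key").read_bytes()
    # Re-run with one extra node: only the new node gets material.
    second = ensure_node_material(out, ["node1", "node2", "node3"])
    unchanged = (out / "node1.key").read_bytes() == before

print(first, second, unchanged)
# ['node1', 'node2'] ['node3'] True
```

This only works if the output folder persists between runs; a directory that is wiped at the end of each run would cause everything to be regenerated.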

Regarding the variable prefix, sure, we can do that. Some of the variables already have the prefix os_ or dashboards_, so we can make that the standard for all variables.
