all_nodes.py is killing my ansible worker #635

Open
jbe33 opened this issue May 7, 2024 · 1 comment

jbe33 commented May 7, 2024

Hello,
We are encountering an issue with the Ansible tasks in the linux_update_etc_hosts.yml file. Our inventories include several hundred machines, and these plays spend a long time parsing the entire inventory to build the /etc/hosts list, which adds little value for us. Our Ansible machines have limited RAM, and we frequently hit the error "A worker was found in a dead state."

We've noticed that we get the same result much faster and more efficiently by replacing the call to the all_nodes lookup plugin (all_nodes.py) with a call to the pg_sr_cluster_nodes lookup plugin.

Old:

```yaml
- name: Build hosts_lines, based on the inventory
  ansible.builtin.set_fact:
    hosts_lines: >
      {{ hosts_lines | default([]) + [
        {
          'line': item.private_ip + ' ' + item.inventory_hostname,
          'regexp': '.*\s' + item.inventory_hostname | regex_escape() + '$'
        }
      ] }}
  loop: "{{ lookup('edb_devops.edb_postgres.all_nodes', wantlist=True) }}"
```

New:

```yaml
- name: Build hosts_lines, based on the inventory
  ansible.builtin.set_fact:
    hosts_lines: >
      {{ hosts_lines | default([]) + [
        {
          'line': item.private_ip + ' ' + item.inventory_hostname,
          'regexp': '.*\s' + item.inventory_hostname | regex_escape() + '$'
        }
      ] }}
  loop: "{{ lookup('edb_devops.edb_postgres.pg_sr_cluster_nodes', wantlist=True) }}"
```

We would like to know your thoughts on this modification. Another approach could be to bypass this build_host_lines step using a when condition.
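
For the bypass option, here is a rough sketch of what I have in mind. One caveat: when a task combines `loop` and `when`, Ansible evaluates the condition separately for each item, so the loop expression (and therefore the lookup) still runs first. To really skip the inventory parsing, the condition probably has to sit inside the loop expression itself. The variable name `update_etc_hosts` is only an assumption for illustration; it is not an existing option of the role.

```yaml
- name: Build hosts_lines, based on the inventory
  ansible.builtin.set_fact:
    hosts_lines: >
      {{ hosts_lines | default([]) + [
        {
          'line': item.private_ip + ' ' + item.inventory_hostname,
          'regexp': '.*\s' + item.inventory_hostname | regex_escape() + '$'
        }
      ] }}
  # update_etc_hosts is a hypothetical toggle (defaults to the current behavior).
  # The inline "if" keeps the lookup from being called when the toggle is false.
  loop: >-
    {{ lookup('edb_devops.edb_postgres.all_nodes', wantlist=True)
    if (update_etc_hosts | default(true) | bool) else [] }}
```

With the toggle set to false the loop is empty, so `hosts_lines` is never defined; any later task that reads it would need `hosts_lines | default([])` or its own condition.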

I can submit a pull request (PR) if the solution works for you.

Thank you.

vibhorkumar123 commented

We used all_nodes because it lets all the nodes communicate with each other by adding every node's information to /etc/hosts. If you switch to pg_sr_cluster_nodes and then deploy backup nodes or monitoring nodes as part of the deployment, the primary and standby nodes won't be able to connect to them.
