incompatible with podman-compose - alpine musl gethostbyname and gethostbyaddr only return single ip #672

Open
1 task
coredump17 opened this issue Apr 25, 2023 · 2 comments

@coredump17
Contributor

Describe the bug

Podman 4 has DNS and 'should' be compatible with Docker. podman-compose up -d shows the error below for the configuration and autostarter containers.

File "core/autostarter.py", line 269, in core.autostarter.AutostarterWorker.check_and_control_services
File "/usr/local/lib/pyenv/versions/3.6.8/lib/python3.6/site-packages/artemis_utils/service.py", line 37, in service_to_ips_and_replicas_in_compose
replica_name = "{}-{}".format(base_service_name, replica_name_match.group(1))
AttributeError: 'NoneType' object has no attribute 'group'

Upon investigation, it appears that this issue is caused by Podman having multiple PTR records per IP (container ID, container name, service name). Alpine, which uses musl, returns only one host or IP per gethostbyname/gethostbyaddr call. That is unexpected, as you would only ever find one replica. In Podman's case the PTR lookup never matches the expected replica name, because the single entry returned out of the multiple entries is the container ID. See below:

bash-4.4# dig +short configuration
10.89.0.97

bash-4.4# dig +short -x 10.89.0.97
artemis_configuration_1.
configuration.
5cf0e12b159a. <---- this will always be returned

The single name returned by the above PTR lookup (the container ID) will not match the regex below, used in the 'service_to_ips_and_replicas_in_compose' call.

    replica_name_match = re.match(
        r"^"
        + re.escape(COMPOSE_PROJECT_NAME)
        + r"[_|-]"
        + re.escape(base_service_name)
        + r"[_|-](\d+)",
        replica_host_by_addr,
    )
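To make the mismatch concrete, here is a minimal sketch that runs the same pattern against the three PTR names from the dig output above. The value "artemis" for COMPOSE_PROJECT_NAME is an assumption inferred from the container name artemis_configuration_1, and replica_index is a hypothetical helper, not part of artemis_utils.

```python
import re

# Assumed values: the project name is inferred from "artemis_configuration_1".
COMPOSE_PROJECT_NAME = "artemis"
base_service_name = "configuration"

# Same pattern as in service_to_ips_and_replicas_in_compose.
pattern = re.compile(
    r"^"
    + re.escape(COMPOSE_PROJECT_NAME)
    + r"[_|-]"
    + re.escape(base_service_name)
    + r"[_|-](\d+)"
)

# The three PTR names returned by `dig +short -x 10.89.0.97` in this issue.
ptr_names = ["artemis_configuration_1.", "configuration.", "5cf0e12b159a."]


def replica_index(name):
    """Return the replica number captured by the pattern, or None if no match."""
    match = pattern.match(name)
    return match.group(1) if match else None


results = {name: replica_index(name) for name in ptr_names}
for name, index in results.items():
    print(name, "->", index)
```

Only the compose-style name yields a replica number. If the resolver hands back just the container ID, re.match returns None and the later .group(1) call raises the AttributeError shown in the traceback.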

If DNS calls made through the socket module return only one value each time, I believe this would limit the platform to one replica.
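A small diagnostic along these lines could be run inside the container to see how many names the socket module actually reports for an address. This is only a sketch: names_from_socket and fake_single_name_lookup are hypothetical helpers, and the fake lookup merely stands in for the behaviour described in this issue (one name, with an empty alias list assumed).

```python
import socket


def names_from_socket(ip, lookup=socket.gethostbyaddr):
    """Collect every distinct name the socket-level reverse lookup reports."""
    # gethostbyaddr returns (hostname, aliaslist, ipaddrlist).
    hostname, aliases, _addresses = lookup(ip)
    names = []
    for name in [hostname] + list(aliases):
        if name not in names:
            names.append(name)
    return names


def fake_single_name_lookup(ip):
    """Stand-in for the reported behaviour: only the container ID comes back."""
    return ("5cf0e12b159a", [], [ip])


print(names_from_socket("10.89.0.97", lookup=fake_single_name_lookup))
```

Inside a real container, calling names_from_socket("10.89.0.97") without the lookup argument uses the system resolver, so the result can be compared directly with the dig output.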

Affected Component(s)

  • Back-End (configuration, autostarter)

To Reproduce
Steps to reproduce the behavior:

  1. centos/rocky/redhat/alma OS 8+
  2. yum install podman
  3. enable the EPEL repo
  4. yum install podman-compose
  5. pull the repo, cd artemis; podman-compose up -d
  6. the UI starts, but the admin/system page errors
  7. podman logs configuration or podman logs autostarter shows the error

Expected behavior
I would expect a DNS lookup to respond with all entries, not just the last one. I believe this would impact replica sets if you wish to run more than one container of the same kind. It also makes the solution incompatible with Podman.

Screenshots
File "core/autostarter.py", line 269, in core.autostarter.AutostarterWorker.check_and_control_services
File "/usr/local/lib/pyenv/versions/3.6.8/lib/python3.6/site-packages/artemis_utils/service.py", line 37, in service_to_ips_and_replicas_in_compose
replica_name = "{}-{}".format(base_service_name, replica_name_match.group(1))
AttributeError: 'NoneType' object has no attribute 'group'

System (please complete the following information):

  • OS: rocky 9
  • Browser: chrome/edge

Additional context
Alpine uses musl, which behaves differently from glibc and appears to return only one DNS entry per lookup.

@coredump17
Contributor Author

To work around the above issue, we can use dnspython to perform our DNS lookups.

Copy /usr/local/lib/pyenv/versions/3.6.8/lib/python3.6/site-packages/artemis_utils/service.py locally. The code below replaces 'service_to_ips_and_replicas_in_compose' with a dnspython-based definition.

artemis_utils will require the dnspython module.

```python
import re

import dns.message
import dns.query
import dns.resolver
import dns.reversename

# log, get_local_ip and COMPOSE_PROJECT_NAME are expected to come from the
# existing service.py module that this code is pasted into.


def resolve_dns(query: str, rtype: str = "A", timeout: int = 2) -> list:
    """Return every record of the first answer RRset, or [] if nothing resolves."""
    rtype = rtype.upper()  # str.upper() returns a new string, so reassign it
    resolver = dns.resolver.Resolver()
    if rtype == "PTR":
        # turn the IP into its reverse-lookup name before querying
        query = dns.reversename.from_address(query)
    msg = dns.message.make_query(query, rtype)
    for dns_server in resolver.nameservers:
        try:
            resp = dns.query.udp(msg, dns_server, timeout=timeout)
            if resp.answer:
                return [str(a) for a in resp.answer[0]]
        except Exception as e:
            log.error("DNS query to %s failed: %s", dns_server, e)
    return []


def service_to_ips_and_replicas_in_compose(own_service_name, base_service_name):
    local_ip = get_local_ip()
    service_to_ips_and_replicas_set = set()
    for replica_ip in resolve_dns(base_service_name):
        # do not include yourself
        if base_service_name == own_service_name and replica_ip == local_ip:
            continue
        # check every PTR record, not only a single name
        for replica_host_by_addr in resolve_dns(replica_ip, "PTR"):
            replica_name_match = re.match(
                r"^"
                + re.escape(COMPOSE_PROJECT_NAME)
                + r"[_|-]"
                + re.escape(base_service_name)
                + r"[_|-](\d+)",  # capture the replica number used by group(1)
                replica_host_by_addr,
            )
            if replica_name_match:
                replica_name = "{}-{}".format(
                    base_service_name, replica_name_match.group(1)
                )
                service_to_ips_and_replicas_set.add((replica_name, replica_ip))
    return service_to_ips_and_replicas_set
```
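For readers unfamiliar with the PTR branch of the workaround: dns.reversename.from_address turns an IP address into the reverse-lookup name that is actually queried. The standard-library sketch below shows what that name looks like for the address from this issue. ptr_query_name is a hypothetical helper used only for illustration and is not part of the workaround.

```python
import ipaddress


def ptr_query_name(ip):
    """Build the fully qualified reverse-lookup name for an IP address."""
    # reverse_pointer omits the trailing root dot, so add it explicitly.
    return ipaddress.ip_address(ip).reverse_pointer + "."


print(ptr_query_name("10.89.0.97"))
```

Querying that name for PTR records is what `dig +short -x 10.89.0.97` does, which is why the workaround sees all three names instead of only one.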

@vkotronis
Member

@mooneym17 thanks for reporting this! Could you issue a PR with the potential fix? We will also update Artemis utils accordingly with the fix. Thank you!
