Upgrade to docker-ce-cli 5:27.0.3 breaks nomad #23523
Comments
Hi @ebarriosjr! Nomad doesn't use the Docker CLI. From the package version number you've got there, I'm assuming you're using a downstream distribution and not Docker's own package? If I look at docker/cli@v27.0.2...v27.0.3 I see that they vendored the main moby/moby project at v27.0.3. And then if I look at the release notes for v27.0.3 I see some interesting suspects. So my guess is that |
For what it's worth, I've upgraded my local environment to 27.0.3 and tested out a Nomad job with networking and wasn't able to reproduce any problems. Maybe there's something specific to your client configuration or job that you could share? output of docker version
The other weird item here is this error |
Yesterday, after building a new Nomad client, I found that the Connect Envoy sidecar ports are not being published correctly. From what I can see, the other clients were running docker-ce 26.X and the new one is running 27.X. The other clients have since had package updates (mostly the kernel, and Docker to 27.X) and they've also started failing in the same way. Happy to supply any info. From what I can see, iptables has the entries for the allocations/ports, but I'm getting connection refused. The client was running 1.7.7 and I have upgraded it to 1.8.1, but I'm still seeing the same issue. I'm going to try downgrading Docker to see if it helps and will get back to you. Matt |
Any chance you upgraded the host distro at the same time? There's an open issue around the |
Hi @tgross, the output of my
|
Weird that your client and server don't match. But the server looks identical to what I've posted above. Any thoughts about the networking discussion above? |
That's because I reverted docker-ce-cli to 27.0.2. On 27.0.3, all the jobs that I have running on Nomad stop working with the missing network error. |
Assuming this was aimed at me: I'm running Debian bookworm, which definitely hasn't changed. As I say, it could be something completely unrelated, but a port-forwarding issue would presumably be a Nomad client issue (as opposed to one with the Nomad servers, Consul, etc.). All the clients started failing after they were rebooted, and the only thing that had changed was package updates (plus a re-install, which included the latest Docker version). I'm just following up on the downgrade to see if it helped :) Matt Edit: No, the downgrade didn't help, so probably completely unrelated. Apologies, I'll continue my investigation. Edit edit: Yes, please completely ignore me. Mine was actually the Connect PKI root CA expiring (but it happened during a powerdown, so the effect was quite different: Envoy would start "happily" without any errors/warnings, but just didn't listen on any of the service ports!) |
Ok, thanks @MatthewJohn. So @ebarriosjr that leaves the networking, as I mentioned earlier:
#23583 suggests that something may have changed in the environment where the bridge kernel module is unavailable, but I'd expect to see a network still. For us to make further progress on this we'll need information from you on the network fingerprint (and/or client logs from the network fingerprinting), whether the distro has been updated, whether the kernel module is present, etc. |
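For anyone gathering the "is the kernel module present" detail requested above, here is a minimal sketch of one way to check it. It is not part of Nomad and does not reproduce Nomad's network fingerprinting; the function names `loaded_modules` and `bridge_module_present` are hypothetical. It assumes a Linux host where `/proc/modules` lists loadable modules, and it falls back to looking for a `/sys/module/bridge` directory because built-in modules do not appear in `/proc/modules`.

```python
from pathlib import Path


def loaded_modules(proc_modules_text: str) -> set[str]:
    """Return the module names found in /proc/modules-formatted text."""
    names = set()
    for line in proc_modules_text.splitlines():
        fields = line.split()
        if fields:
            # The first whitespace-separated field is the module name.
            names.add(fields[0])
    return names


def bridge_module_present(
    proc_modules_text: str, sysfs_root: Path = Path("/sys/module")
) -> bool:
    """True if `bridge` is loaded, or has an entry under sysfs_root."""
    if "bridge" in loaded_modules(proc_modules_text):
        return True
    # Assumption: a built-in bridge driver may still expose /sys/module/bridge.
    return (sysfs_root / "bridge").is_dir()


if __name__ == "__main__":
    text = Path("/proc/modules").read_text()
    print("bridge module present:", bridge_module_present(text))
```

A `False` result only says the module was not visible through these two paths. The network fingerprint and client logs are still needed to see what Nomad itself detected.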
Hi @tgross, I just upgraded my system to the latest packages and the issue is gone. Maybe it was related to the bridge issue. |
Nomad version
Nomad v1.8.1
BuildDate 2024-06-19T06:43:57Z
Revision 5022543
Operating system and Environment details
lsb_release -a
No LSB modules are available.
Distributor ID: Ubuntu
Description: Ubuntu 22.04.4 LTS
Release: 22.04
Codename: jammy
Issue
After upgrading docker-ce-cli from 5:27.0.2 to 5:27.0.3 nomad breaks. No containers were deployed. Some of them had the issue:
Constraint "missing network": 1 nodes excluded by filter
, others were trying to use IPv6 instead of IPv4.
Reproduction steps
Update docker-ce-cli to version 5:27.0.3 and reboot.
Expected Result
Nomad would be able to spawn Docker containers without issue.
Actual Result
No container could be started