Log flooded with "dns_get_record(): A temporary server error occurred" #34205
Comments
Same as #28105 ? |
Same error here with the latest stable Docker image.
No updates, no apps, no connection to any external resources. |
Is this still happening? |
362 entries in the last 8 days |
As far as I understand, this happens when the Guzzle HTTP client library connects to an external server. That could be a federated share, but also the update notification or the internet connection check? |
It's the reference link previews of Talk, Text and others |
Still happening as of Nextcloud 25.0.4
That's an example from the logs I had. Another one, with the full message of another error when trying to reach github.com (as it seems).
|
Still a thing in 25.0.7 and 26.0.2 |
tldr:
I think there are two "problems" here, both upstream, based on my review of the upstream code (php-src). I say "problems" in quotes because it's not clear the second one is even an actual problem. 1.
|
My gut is telling me the inconsistent errno may be a red herring in the end, but I haven't dug into what would account for the errno changing in between calls. Possibly some DNS resolver caching or similar, where in one case it isn't certain it's more than a temporary failure, and in another it's able to conclude "nah, this really doesn't exist". It would be a bigger deal if we weren't getting false back from dns_get_record when expected...

Clarification about how I'm able to consistently reproduce the behavior: the "A temporary server error occurred" message goes away when I do an nslookup in between on the same non-existent domain. If I don't, it continues. To repeat the same behavior (i.e. the noisy output), you have to pick a new non-existent domain each time you've gone through the dns_get_record/nslookup/dns_get_record sequence.

Upstream (PHP): I've also added a (briefer) comment on the upstream bug report. |
This issue has been around for a long time. I happened to find one source of the problem, and it can be reproduced: the router's "Reverse DNS Lookup Timeout". The network can still be used normally while it occurs, so it is not easily noticed. |
Guess: Nextcloud and PHP acquire and verify the validity of the remote host's IP address by performing a double DNS lookup.
|
I'm getting these errors occasionally on 28.0.5 as well. They typically look like this:
{
  "reqId": "uAUQFEwikg5QBkd5s9AK",
  "level": 3,
  "time": "2024-04-26T01:57:20+00:00",
  "remoteAddr": "",
  "user": "--",
  "app": "PHP",
  "method": "",
  "url": "--",
  "message": "dns_get_record(): A temporary server error occurred. at /srv/www/nextcloud/lib/private/Http/Client/DnsPinMiddleware.php#111",
  "userAgent": "--",
  "version": "28.0.5.1",
  "data": {
    "app": "PHP"
  },
  "id": "6638a4774308c"
}
It does not happen often, only once since the update to 28.0.5, but they always come in a bunch (8 last time). Last time they were also followed by a warning: "LocalServerException No DNS record found for apps.nextcloud.com", with details:
I do not know how to trigger them; it seems to happen randomly (like at 4am, when probably no one is using the server actively). Let me know if you want configuration or something else! |
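For anyone wanting to quantify how often this shows up (as in the "362 entries in the last 8 days" comment above), here is a minimal Python sketch that counts matching entries per day. It assumes the log file holds one JSON object per line with "message" and "time" fields like the entry above; the function name `count_dns_warnings` is made up for this example and is not part of Nextcloud.

```python
import json
from collections import Counter

# Substring taken from the log message quoted in this thread.
NEEDLE = "dns_get_record(): A temporary server error occurred"


def count_dns_warnings(lines):
    """Count matching log entries per calendar day (YYYY-MM-DD).

    `lines` is any iterable of strings, e.g. an open log file.
    Assumption: each line is one JSON object with "message" and "time".
    """
    per_day = Counter()
    for line in lines:
        line = line.strip()
        if not line:
            continue
        try:
            entry = json.loads(line)
        except json.JSONDecodeError:
            continue  # skip anything that is not valid JSON
        if not isinstance(entry, dict):
            continue
        if NEEDLE in str(entry.get("message", "")):
            # "time" looks like 2024-04-26T01:57:20+00:00; keep the date part.
            per_day[str(entry.get("time", ""))[:10]] += 1
    return per_day
```

To use it, open your own log file and pass the file object, e.g. `count_dns_warnings(open("nextcloud.log", encoding="utf-8"))`; the path depends on your installation.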
My solution: create a Dockerfile and 99-fix-dns-get-record-server-error.sh. Dockerfile:
99-fix-dns-get-record-server-error.sh:
Next, build the image and use it instead of your old nextcloud image.
|
@skl256 By chance, does either of the following trigger the failure in your environment? Either running this within the app container:
Or something like this in a dedicated (throwaway)
I've seen some reports around of failures when php+dns_get_record+docker+localhost+CNAME are combined. I can't reproduce it in my environments, but there are multiple reports so perhaps a buggy Docker Engine version or something. |
Still occurring on 29.0.1. Or rather, only since the update. Maybe I never looked at the admin panel before when it happened.
But |
Same issue at 29.0.4 |
I've checked on a standard installation without Docker; the command you gave also raises the same error:
Btw, asking to search a CNAME for |
Hm.. run but run At the same time, the DNS server on the router 192.168.2.1 is set to 192.168.0.1 |
@skl256 So, with a little dusting on tcpdump, and changing the /etc/resolv.conf file of the webserver
=> Result from the php command :
=> Result from the php command :
=> Result from the php command : All will answer NXDomain to a But for a Knowing what to look for, according to the RFC 1912 => 4.1 :
There are other discussions about this, like on Stack Overflow. Btw, it seems the PHP implementation is also not valid: an error should be raised when an empty answer is given. @joshtrichards |
@Daryes Does that mean adding some manual handling for |
It seems a custom handler for localhost is doing the job, the log went silent when I plugged this in :
I'm waiting some more days to confirm, still |
Is an empty answer the correct answer for localhost? |
@come-nc This time, those are from dnsGetRecord(), a wrapper for dns_get_record(). As there are different DNS functions, manual handling of some reserved names, starting with "localhost", might do the trick, but it seems knowledge of how networking works is required. |
Is this fixed yet? I'm having the same problem |
Still an issue at 29.0.7
|
I had the same problem on a Debian 12 machine with systemd-resolved in stub mode; the problem was the libnss-resolve package. |
Maybe everyone who is still getting these errors could edit their comments to add what OS/environment they are running? And, per the comment above, whether they have libnss-resolve installed? |
I’m using an official docker fpm alpine image |
Ubuntu 22.04, installed from the official ISO image, libnss-resolve is not installed,
so it's the default config and nothing related to the linked issue in the systemd repo, which seems limited to cloud images. Btw, I already gave an explanation of why this occurs: #34205 (comment)
(The . at the end of localhost is not an error: all DNS clients add it to be able to search on different domain suffixes, as stated by the "search" parameter in resolv.conf, even if empty.) You can also expect many companies to use Active Directory for managing user accounts, which makes Windows DNS Server mandatory. To fix this, treating "localhost" as a special case that immediately returns an empty answer, instead of passing it to the PHP function, seems to work.
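To make the proposed special case concrete, here is a rough sketch of the idea, written in Python rather than PHP and not taken from Nextcloud's code. The function name `resolve_with_localhost_shortcut` and the `resolver` callable (standing in for whatever wraps dns_get_record()) are hypothetical.

```python
from typing import Callable, Dict, List

Record = Dict[str, str]


def resolve_with_localhost_shortcut(
    hostname: str,
    resolver: Callable[[str], List[Record]],
) -> List[Record]:
    """Return an empty answer for "localhost" without querying DNS.

    `resolver` is a hypothetical stand-in for the real lookup function;
    it is only called for names other than localhost.
    """
    # Treat "localhost" and "localhost." (trailing root dot) the same way.
    name = hostname.rstrip(".").lower()
    if name == "localhost":
        return []  # empty answer, so no DNS query and no warning in the log
    return resolver(hostname)
```

Whether an empty answer is the right result for localhost is the open question raised earlier in this thread; the sketch only shows where such a check would sit, before the call that produces the warning.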