Queries time out after version 0.9.0 has been running for a long time on Windows 11 #431

Open
debugg-a opened this issue Nov 7, 2024 · 21 comments

Comments

@debugg-a

debugg-a commented Nov 7, 2024

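OS: Windows 11 23H2
smartdns version: 0.9.0
Symptom: after the system has been running continuously without a shutdown for a while, DNS queries start timing out.
image
The log output is as follows: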
thread 'smartdns-runtime' panicked at library\std\src\time.rs:433:33:
overflow when subtracting duration from instant
note: run with RUST_BACKTRACE=1 environment variable to display a backtrace
(the same panic repeats 19 more times in the original log, which then cuts off mid-entry)
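
The message in the log above is the one Rust's standard library emits when `Instant - Duration` cannot be represented on the platform clock. The sketch below only illustrates that failure mode and a non-panicking alternative; it is not taken from smartdns-rs, and the helper names `cutoff` and `is_older_than` are hypothetical.

```rust
use std::time::{Duration, Instant};

/// Hypothetical helper: the instant `max_age` before `now`, or `None` when the
/// platform clock cannot represent it (the case in which `now - max_age` panics).
fn cutoff(now: Instant, max_age: Duration) -> Option<Instant> {
    now.checked_sub(max_age)
}

/// Hypothetical helper: an age check that never subtracts from an `Instant`.
fn is_older_than(created: Instant, now: Instant, max_age: Duration) -> bool {
    now.saturating_duration_since(created) > max_age
}

fn main() {
    let now = Instant::now();

    // `now - Duration::MAX` would panic with
    // "overflow when subtracting duration from instant"; `checked_sub` returns None.
    assert!(cutoff(now, Duration::MAX).is_none());

    // Subtracting nothing is always representable.
    assert_eq!(cutoff(now, Duration::ZERO), Some(now));

    // Comparing elapsed time against a limit avoids the subtraction altogether.
    assert!(!is_older_than(now, now, Duration::from_secs(300)));
    println!("no panic");
}
```

How far back an `Instant` can reach is platform-specific, so code that subtracts a long duration from `Instant::now()` may work on one system and panic on another. Whether that is what happens in this issue is not established in the thread.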

@mokeyish
Owner

mokeyish commented Nov 7, 2024

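Do you have a minimal configuration that reproduces this? It could be caused by the upstream servers, for example the rate limiting Alibaba introduced recently, or timeouts.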

@debugg-a
Author

debugg-a commented Nov 8, 2024

# Configure bootstrap-dns; if it is not configured, the system resolver is used
server https://1.12.12.12/dns-query -bootstrap-dns -exclude-default-group
# Upstream servers inside China
# TencentDNS
server https://doh.pub/dns-query -group doh-tencent -exclude-default-group
# AliDNS
server https://dns.alidns.com/dns-query -group doh-alidns -exclude-default-group
# 360DNS
server https://doh.360.cn/dns-query -group doh-cn -exclude-default-group
# OneDNS
server https://doh-pure.onedns.net/dns-query -group doh-cn -exclude-default-group
# Upstream servers outside China
server tls://9.9.9.9
server tls://1.0.0.1
# Resolve Chinese domains through the Chinese DNS servers
nameserver /domain-set:set-alibaba/doh-alidns
nameserver /domain-set:set-tencent/doh-tencent
nameserver /domain-set:set-cn/doh-cn

@xshzr

xshzr commented Nov 9, 2024

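Same problem here with 0.9.0 on Windows 10.
image
Right after smartdns starts, queries work fine. After a few minutes to ten-odd minutes (I have no exact figure, but it is short), queries start timing out.
Simply restarting smartdns from the Services panel does not help.
Stopping smartdns, deleting the cache file on disk, and starting smartdns again makes it work. Then after a while the problem reappears.
I suspect it is related to the cache.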

@mokeyish
Owner

mokeyish commented Nov 9, 2024

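@xshzr Could you disable the cache and try it for a while? If that is the cause, I will focus on reviewing that part of the code.

I ran it on my own Windows machine, with the log output changed as well, and watched it: no such problem in 6 hours. Maybe my machine has too many cores 😅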

@xshzr

xshzr commented Nov 9, 2024

After setting cache-size 0, what do you know, there has not been a single timeout all afternoon.

@mokeyish
Owner

mokeyish commented Nov 9, 2024

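After setting cache-size 0, what do you know, there has not been a single timeout all afternoon.

What about enabling the cache but disabling domain prefetch?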

@xshzr

xshzr commented Nov 10, 2024

Here is the relevant configuration from when the cache was enabled; prefetch was already off:
cache-size 30000
cache-file O:/smartdns/cache/dnscache.txt
cache-persist yes
cache-checkpoint-time 10800
prefetch-domain no
serve-expired yes
serve-expired-ttl 10000
serve-expired-reply-ttl 3
#serve-expired-prefetch-time 43200
rr-ttl-min 300

@mokeyish
Owner

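Then the messageType may be wrong: reading from the cache requires rebuilding the Message from the DNS records. I noticed this bug yesterday and fixed it along the way, and I have been monitoring on my side ever since; even with a single thread I do not see this problem. (That said, the issue reported here is about time and timeouts, which does not really look like something a wrong messageType would cause.)

I will push this fix later and we can see.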

@mokeyish
Owner

@xshzr

xshzr commented Nov 10, 2024

#433

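Download https://github.com/mokeyish/smartdns-rs/actions/runs/11762199344

Fixed. The problems mentioned above in this thread have not recurred for me.

However, I still have a long-standing problem that has been the same across several versions: on an Android phone, after setting DNS to the IP address of the computer running the smartdns service, some lookups go wrong. For example, WeChat official-account articles open, but the images in them do not display; Xianyu opens, but its check-in page keeps showing a page-load failure; Xianyu images open, but videos will not play.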

@debugg-a
Author

debugg-a commented Nov 11, 2024

#433

Download https://github.com/mokeyish/smartdns-rs/actions/runs/11762199344

With this build, with the cache and prefetch enabled, timeouts started in less than 3 hours.

cache-size 32768
cache-persist yes

image

@debugg-a
Author

#433
Download https://github.com/mokeyish/smartdns-rs/actions/runs/11762199344

With this build, with the cache and prefetch enabled, timeouts started in less than 3 hours.

cache-size 32768
cache-persist yes

image

Cache enabled
image

@xshzr

xshzr commented Nov 11, 2024

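Yesterday afternoon I really did not seem to hit the earlier timeout problem, and I also thought it was fixed. I shut the computer down in the evening, and after booting today mp.weixin.qq.com timed out again; restarting the smartdns service did not help either. My cache is saved to disk periodically via cache-checkpoint-time. Could something go wrong when the cache is read back from disk the next day?

Here is a section of the log after restarting smartdns today: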

2024-11-11 11:55:05.973:INFO: (ASCII-art startup banner)
2024-11-11 11:55:05.973:INFO: awaiting connections...
2024-11-11 11:55:05.973:INFO: server starting up
2024-11-11 11:55:06.181:DEBUG:smartdns::server::udp:45: received udp request from: 192.168.88.198:58204
2024-11-11 11:55:06.182:DEBUG:smartdns::server::udp:45: received udp request from: 192.168.88.198:58204
2024-11-11 11:55:06.182:DEBUG:smartdns::server::udp:45: received udp request from: 192.168.88.198:58204
2024-11-11 11:55:07.317:DEBUG:smartdns::server::udp:45: received udp request from: 192.168.88.246:58356
2024-11-11 11:55:08.82:DEBUG:smartdns::server::udp:45: received udp request from: 192.168.88.198:58655
2024-11-11 11:55:08.83:DEBUG:smartdns::server::udp:45: received udp request from: 192.168.88.198:58655
2024-11-11 11:55:08.83:DEBUG:smartdns::server::udp:45: received udp request from: 192.168.88.198:58655
2024-11-11 11:55:08.84:DEBUG:smartdns::server::udp:45: received udp request from: 192.168.88.198:58655
2024-11-11 11:55:08.84:DEBUG:smartdns::server::udp:45: received udp request from: 192.168.88.198:58655
2024-11-11 11:55:09.320:DEBUG:smartdns::server::udp:45: received udp request from: 192.168.88.246:58357
2024-11-11 11:55:09.320:DEBUG:smartdns::app:363: Request: 11 src:udp://192.168.88.246#58357 type:QUERY dnssec:false QUERY:mp.weixin.qq.com.:AAAA:IN qflags:RD
2024-11-11 11:55:09.320:DEBUG:smartdns::app:374: Response: ; header 0:RESPONSE::NoError:QUERY:1/0/0
; query
;; mp.weixin.qq.com. IN AAAA
; answers 1
mp.weixin.qq.com. 300 IN SOA a.gtld-servers.net nstld.verisign-grs.com 1800 1800 900 604800 86400
; nameservers 0
; additionals 0
, Duration: 40µs
2024-11-11 11:55:09.321:DEBUG:smartdns::server::udp:45: received udp request from: 192.168.88.246:58358
2024-11-11 11:55:10.614:DEBUG:smartdns::server::udp:45: received udp request from: 192.168.88.198:59628
2024-11-11 11:55:10.614:DEBUG:smartdns::server::udp:45: received udp request from: 192.168.88.198:59628
2024-11-11 11:55:10.615:DEBUG:smartdns::server::udp:45: received udp request from: 192.168.88.198:59628
2024-11-11 11:55:10.615:DEBUG:smartdns::server::udp:45: received udp request from: 192.168.88.198:59628
2024-11-11 11:55:10.616:DEBUG:smartdns::server::udp:45: received udp request from: 192.168.88.198:59628
2024-11-11 11:55:11.335:DEBUG:smartdns::server::udp:45: received udp request from: 192.168.88.246:58359
2024-11-11 11:55:11.335:DEBUG:smartdns::app:363: Request: 13 src:udp://192.168.88.246#58359 type:QUERY dnssec:false QUERY:mp.weixin.qq.com.:AAAA:IN qflags:RD
2024-11-11 11:55:11.335:DEBUG:smartdns::app:374: Response: ; header 0:RESPONSE::NoError:QUERY:1/0/0
; query
;; mp.weixin.qq.com. IN AAAA
; answers 1
mp.weixin.qq.com. 300 IN SOA a.gtld-servers.net nstld.verisign-grs.com 1800 1800 900 604800 86400
; nameservers 0
; additionals 0
, Duration: 33.5µs

@mokeyish
Owner

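This is hard for me to reproduce. Please help narrow down which part is responsible.

  1. Cache prefetch (disable prefetch, keep the cache enabled)
  2. Cache (disable the cache)
  3. Speed test (enable fastest-response; this is the fastest-response mode, which amounts to no speed test)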

@debugg-a
Author

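This is hard for me to reproduce. Please help narrow down which part is responsible.

  1. Cache prefetch (disable prefetch, keep the cache enabled)
  2. Cache (disable the cache)
  3. Speed test (enable fastest-response; this is the fastest-response mode, which amounts to no speed test)

image

Prefetch disabled, cache enabled: timeouts occur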

@debugg-a
Author

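This is hard for me to reproduce. Please help narrow down which part is responsible.

  1. Cache prefetch (disable prefetch, keep the cache enabled)
  2. Cache (disable the cache)
  3. Speed test (enable fastest-response; this is the fastest-response mode, which amounts to no speed test)

image

Prefetch disabled, cache enabled: timeouts occur

Prefetch disabled, cache enabled, fastest-response enabled: timeouts occur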
image

@mokeyish
Owner

mokeyish commented Nov 11, 2024

One more test: for each server that is specified by domain name, pin an IP for it, for example:

server https://cloudflare-dns.com/dns-query?ip=1.1.1.1

and see whether it still times out.

@debugg-a
Author

debugg-a commented Nov 12, 2024

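One more test: for each server that is specified by domain name, pin an IP for it, for example:

server https://cloudflare-dns.com/dns-query?ip=1.1.1.1

and see whether it still times out.

image

Almost an hour in, no timeouts yet.

// Update: a full day now, and no timeouts so far.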
image

@xshzr

xshzr commented Nov 12, 2024

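Here is a pattern I have observed. In my setup smartdns runs on a personal computer, and the computer is shut down every day.

Every time the computer boots, smartdns shows query timeouts. At first I thought the cache saved to disk before the last shutdown was bad, so I stopped smartdns, deleted the cache, and started smartdns. Queries worked at that point, but a few minutes later the domains that had already been queried timed out again, until the expired-cache timeout serve-expired-ttl kicked in, after which everything was fine and nothing timed out any more. With serve-expired-ttl set to 5 minutes it recovers after 5 minutes; with 1 hour, after one hour.

Then I found that even without shutting the computer down, restarting the smartdns service brings back the same timeouts. Whether the on-disk cache is deleted during the restart makes little difference; the only difference is that with the cache deleted the first query succeeds and then times out a few minutes later, whereas with the cache kept it times out immediately.

Also, after the timeouts start, the log only shows AAAA responses and no A responses, even when -type=A is specified explicitly: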
2024-11-12 10:36:38.563:DEBUG:smartdns::app:363: Request: 11 src:udp://192.168.88.246#55040 type:QUERY dnssec:false QUERY:www.msn.com.:AAAA:IN qflags:RD
2024-11-12 10:36:38.563:DEBUG:smartdns::app:374: Response: ; header 0:RESPONSE::NoError:QUERY:1/0/0
; query
;; www.msn.com. IN AAAA
; answers 1
www.msn.com. 300 IN SOA a.gtld-servers.net nstld.verisign-grs.com 1800 1800 900 604800 86400

The above was observed without pinning an IP for the servers that are specified by domain name.
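
The observation above ties recovery to serve-expired-ttl, so it may help to spell out the three windows a cached record passes through: fresh, expired but still servable, and fully expired. The following is a hypothetical model for illustration only, not smartdns-rs code. `EntryState` and `classify` are assumed names, the stale window is assumed to start when the TTL runs out, and the numbers reuse `rr-ttl-min 300` and `serve-expired-ttl 10000` from the configuration quoted earlier in the thread.

```rust
use std::time::{Duration, Instant};

#[derive(Debug, PartialEq)]
enum EntryState {
    /// Within the record's TTL.
    Fresh,
    /// Past the TTL but still inside the serve-expired window.
    Stale,
    /// Past both windows; the entry should be dropped or re-resolved.
    Expired,
}

/// Hypothetical classifier. It compares elapsed time against the limits
/// and never subtracts a `Duration` from an `Instant`.
fn classify(
    stored_at: Instant,
    now: Instant,
    ttl: Duration,
    serve_expired_ttl: Duration,
) -> EntryState {
    let age = now.saturating_duration_since(stored_at);
    if age <= ttl {
        EntryState::Fresh
    } else if age <= ttl.saturating_add(serve_expired_ttl) {
        EntryState::Stale
    } else {
        EntryState::Expired
    }
}

fn main() {
    let stored_at = Instant::now();
    let ttl = Duration::from_secs(300);
    let serve_expired_ttl = Duration::from_secs(10_000);
    for secs in [60, 301, 10_301] {
        let now = stored_at + Duration::from_secs(secs);
        println!("{secs:>6}s -> {:?}", classify(stored_at, now, ttl, serve_expired_ttl));
    }
}
```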

@mokeyish
Owner

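I have created a repository for DNS performance testing.

https://github.com/mokeyish/dnsperf-testing

I tested hickory, which this repository depends on, and saw a large share of timeouts, around 50%. Someone overseas then ran the test and, surprisingly, could not reproduce it. Could you check whether you can reproduce it? The test tool it relies on may not be available on Windows, so you would need to install WSL.

Below is the issue I filed, which describes the test steps. The idea is to test 119.29.29.29 directly and compare that with forwarding to 119.29.29.29 through hickory.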

hickory-dns/hickory-dns#2613
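
For anyone without the benchmark tool at hand, the direct-versus-forwarded comparison described above can be roughly approximated with a small probe. This is a minimal sketch using only the Rust standard library, not the procedure from the linked issue; `build_query` and `count_unanswered` are hypothetical names, and the query count and timeout in `main` are arbitrary choices.

```rust
use std::net::UdpSocket;
use std::time::Duration;

/// Encodes a minimal DNS query (RFC 1035 wire format) with the RD flag set.
/// Labels longer than 63 bytes are not validated in this sketch.
fn build_query(id: u16, name: &str, qtype: u16) -> Vec<u8> {
    let mut buf = Vec::with_capacity(32);
    buf.extend_from_slice(&id.to_be_bytes());
    buf.extend_from_slice(&[0x01, 0x00]); // flags: recursion desired
    buf.extend_from_slice(&[0, 1, 0, 0, 0, 0, 0, 0]); // QDCOUNT=1, other counts 0
    for label in name.split('.').filter(|l| !l.is_empty()) {
        buf.push(label.len() as u8);
        buf.extend_from_slice(label.as_bytes());
    }
    buf.push(0); // root label terminates the name
    buf.extend_from_slice(&qtype.to_be_bytes());
    buf.extend_from_slice(&[0, 1]); // class IN
    buf
}

/// Sends `count` A queries to `server` (IPv4 "ip:port") one at a time and
/// returns how many received no reply with a matching ID within `timeout`.
fn count_unanswered(server: &str, name: &str, count: u16, timeout: Duration) -> std::io::Result<u16> {
    let socket = UdpSocket::bind("0.0.0.0:0")?;
    socket.set_read_timeout(Some(timeout))?;
    let mut unanswered = 0;
    let mut reply = [0u8; 512];
    for id in 0..count {
        socket.send_to(&build_query(id, name, 1), server)?;
        match socket.recv_from(&mut reply) {
            Ok((n, _)) if n >= 2 && u16::from_be_bytes([reply[0], reply[1]]) == id => {}
            _ => unanswered += 1, // timed out, errored, or reply did not match
        }
    }
    Ok(unanswered)
}

fn main() -> std::io::Result<()> {
    let mut args = std::env::args().skip(1);
    match (args.next(), args.next()) {
        (Some(server), Some(name)) => {
            let missed = count_unanswered(&server, &name, 20, Duration::from_secs(2))?;
            println!("{server}: {missed}/20 queries unanswered");
        }
        _ => eprintln!("usage: probe <server-ip:port> <domain>"),
    }
    Ok(())
}
```

Running it once against `119.29.29.29:53` and once against the address of the local forwarder gives two counts to compare. It sends queries sequentially, so it measures reachability rather than throughput.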

@mokeyish
Owner

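@xshzr The pattern you describe effectively amounts to emptying the cache. It now looks like the problem may be in the underlying library hickory; the hickory timeout problem has to be solved first, because smartdns-rs depends on its proto protocol library.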
