Friend-link crawl fails, run.py errors out #153

Open
Akimio521 opened this issue Jul 21, 2024 · 6 comments

@Akimio521

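I use GitHub to crawl the blog links and MongoDB to store the data. The problem occurs during the crawl stage.
https://blog.akimio.top/links/ uses a modified butterfly theme, solitude (https://github.com/everfu/hexo-theme-solitude), and it used to be crawled without problems. At first I suspected the theme, so I tried the friend-links page of a site running the stock butterfly theme, but got the same error.
Relevant configuration in fc_setting.yaml: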

LINK: [
     { link: "https://blog.akimio.top/links/", theme: "butterfly" },  # friend-links page URL 1; change this to your own friend-links page URL
#     { link: "https://noionion.top/link/", theme: "butterfly" }, # friend-links page URL 2
#     { link: "https://immmmm.com/about/", theme: "common1" }, # friend-links page URL 3
#     ...
]
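
Before looking at the crawler itself, it can help to rule out a malformed `LINK` entry. The following is a minimal sketch, not part of hexo-circle-of-friends: the helper name `check_link_entries` is hypothetical, and it assumes the entries have already been loaded into Python dictionaries with the `link` and `theme` keys shown above. It only checks the shape of each entry, not whether the theme is one the project supports.

```python
from urllib.parse import urlparse


def check_link_entries(entries):
    """Return a list of human-readable problems found in LINK-style entries.

    Hypothetical helper for illustration; it checks only that each entry has
    an absolute http(s) URL and a non-empty theme name.
    """
    problems = []
    for i, entry in enumerate(entries):
        link = entry.get("link", "")
        theme = entry.get("theme", "")
        parsed = urlparse(link)
        if parsed.scheme not in ("http", "https") or not parsed.netloc:
            problems.append(f"entry {i}: link is not an absolute http(s) URL: {link!r}")
        if not theme:
            problems.append(f"entry {i}: theme is missing")
    return problems


# The single active entry from the configuration above.
LINK = [{"link": "https://blog.akimio.top/links/", "theme": "butterfly"}]
print(check_link_entries(LINK))
```

An empty list means the entries are at least well-formed, which points the investigation at the later stages (page parsing or storage) instead.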

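Error log (the pipeline reports 友链总数, the total number of friend links, as 0 and 失联友链数, the number of unreachable friend links, as 0):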

2024-07-21 18:23:02|INFO|2009|hexo_circle_of_friends.pipelines.***_pipe|----------------------
2024-07-21 18:23:02|INFO|2009|hexo_circle_of_friends.pipelines.***_pipe|友链总数 : 0
2024-07-21 18:23:02|INFO|2009|hexo_circle_of_friends.pipelines.***_pipe|失联友链数 : 0
Unhandled error in Deferred:
2024-07-21 18:23:12 [twisted] CRITICAL: Unhandled error in Deferred:
2024-07-21 18:23:12|CRITICAL|2009|twisted|Unhandled error in Deferred:

Traceback (most recent call last):
  File "/opt/hostedtoolcache/Python/3.8.18/x64/lib/python3.8/site-packages/scrapy/crawler.py", line 192, in crawl
    return self._crawl(crawler, *args, **kwargs)
  File "/opt/hostedtoolcache/Python/3.8.18/x64/lib/python3.8/site-packages/scrapy/crawler.py", line 196, in _crawl
    d = crawler.crawl(*args, **kwargs)
  File "/opt/hostedtoolcache/Python/3.8.18/x64/lib/python3.8/site-packages/twisted/internet/defer.py", line 1909, in unwindGenerator
    return _cancellableInlineCallbacks(gen)  # type: ignore[unreachable]
  File "/opt/hostedtoolcache/Python/3.8.18/x64/lib/python3.8/site-packages/twisted/internet/defer.py", line 1816, in _cancellableInlineCallbacks
    _inlineCallbacks(None, gen, status)
--- <exception caught here> ---
  File "/opt/hostedtoolcache/Python/3.8.18/x64/lib/python3.8/site-packages/twisted/internet/defer.py", line 1661, in _inlineCallbacks
    result = current_context.run(gen.send, result)
  File "/opt/hostedtoolcache/Python/3.8.18/x64/lib/python3.8/site-packages/scrapy/crawler.py", line 89, in crawl
    yield self.engine.open_spider(self.spider, start_requests)
pymongo.errors.OperationFailure: bad auth : authentication failed, full error: {'ok': 0, 'errmsg': 'bad auth : authentication failed', 'code': 8000, 'codeName': 'AtlasError'}

2024-07-21 18:23:12|CRITICAL|2009|twisted|
Traceback (most recent call last):
  File "/opt/hostedtoolcache/Python/3.8.18/x64/lib/python3.8/site-packages/twisted/internet/defer.py", line 1661, in _inlineCallbacks
    result = current_context.run(gen.send, result)
  File "/opt/hostedtoolcache/Python/3.8.18/x64/lib/python3.8/site-packages/scrapy/crawler.py", line 89, in crawl
    yield self.engine.open_spider(self.spider, start_requests)
pymongo.errors.OperationFailure: bad auth : authentication failed, full error: {'ok': 0, 'errmsg': 'bad auth : authentication failed', 'code': 8000, 'codeName': 'AtlasError'}
2024-07-21 18:23:12 [twisted] CRITICAL: 
Traceback (most recent call last):
  File "/opt/hostedtoolcache/Python/3.8.18/x64/lib/python3.8/site-packages/twisted/internet/defer.py", line 1661, in _inlineCallbacks
    result = current_context.run(gen.send, result)
  File "/opt/hostedtoolcache/Python/3.8.18/x64/lib/python3.8/site-packages/scrapy/crawler.py", line 89, in crawl
    yield self.engine.open_spider(self.spider, start_requests)
pymongo.errors.OperationFailure: bad auth : authentication failed, full error: {'ok': 0, 'errmsg': 'bad auth : authentication failed', 'code': 8000, 'codeName': 'AtlasError'}
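
The `bad auth : authentication failed` message with `codeName: 'AtlasError'` indicates that MongoDB Atlas rejected the credentials in the connection string, so the spider never got as far as writing data. The log does not say why the credentials were rejected. One common cause, which is an assumption here and not something confirmed in this thread, is a username or password containing reserved characters such as `@`, `:` or `/` that were not percent-encoded. A minimal sketch of building an encoded URI with the standard library follows; the function name `build_mongo_uri` and the host name are hypothetical.

```python
from urllib.parse import quote_plus


def build_mongo_uri(username, password, host):
    """Build a mongodb+srv URI with percent-encoded credentials.

    Hypothetical helper for illustration. Reserved characters in the
    username or password must be percent-encoded, otherwise the URI is
    parsed incorrectly and authentication can fail.
    """
    return f"mongodb+srv://{quote_plus(username)}:{quote_plus(password)}@{host}/"


# Placeholder values only; substitute the real Atlas user and cluster host.
uri = build_mongo_uri("user", "p@ss:word/1", "cluster0.example.mongodb.net")
print(uri)
```

If the password has no special characters, the other usual suspects are a mistyped secret in the repository settings or a database user that does not exist on the Atlas project; neither can be verified from the log alone.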
@xyhcode
Contributor

xyhcode commented Jul 22, 2024

Just deploy it on a server and it will work.

@Akimio521
Author

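> Just deploy it on a server and it will work.

I don't have a machine that is really suitable for running it in the background long-term.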

@LanYunDev
Contributor

Fixed in the latest commit.

@CCKNBC CCKNBC closed this as completed Jul 27, 2024
@CCKNBC CCKNBC reopened this Jul 27, 2024
@Akimio521
Author

It doesn't seem to have been fixed; the number of friend links crawled is still 0.

@xyhcode
Contributor

xyhcode commented Jul 27, 2024

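> It doesn't seem to have been fixed; the number of friend links crawled is still 0.

With the latest commit, Vercel does not crawl 0 links; it only returns a 500 error when accessed.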

@Akimio521
Author

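> > It doesn't seem to have been fixed; the number of friend links crawled is still 0.
>
> With the latest commit, Vercel does not crawl 0 links; it only returns a 500 error when accessed.

I synced the main branch of the repository and then configured it. My environment is GitHub Actions + MongoDB, and after the run completes the MongoDB data is still empty.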

4 participants