[QUESTION] How should I download from Twitter? #217

LRTFK · 2024-12-18T15:48:02Z

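Problem description
When downloading Twitter videos, I get status code 403. I would like to know the correct way to obtain and use cookies.

Screenshots / code examples
INFO App: twitter
INFO Mode: post
INFO Main config path: D:\Temp\f2\f2\conf\app.yaml
INFO Custom config path: D:\Temp\f2\twi.yaml
WARNING Request attempt 1 returned an empty response, status code: 403,
URL:https://x.com/i/api/graphql/laYnJPCAcVo0o6pzcnlVxQ/UserByScreenName?xxxxxxxxxxxx (truncated because it was too long)
WARNING Request attempt 2 returned an empty response, status code: 403,
URL:https://x.com/i/api/graphql/laYnJPCAcVo0o6pzcnlVxQ/UserByScreenName?xxxxxxxxxxxx (truncated because it was too long)
WARNING Request attempt 3 returned an empty response, status code: 403,
URL:https://x.com/i/api/graphql/laYnJPCAcVo0o6pzcnlVxQ/UserByScreenName?xxxxxxxxxxxx (truncated because it was too long)
WARNING Request attempt 4 returned an empty response, status code: 403,
URL:https://x.com/i/api/graphql/laYnJPCAcVo0o6pzcnlVxQ/UserByScreenName?xxxxxxxxxxxx (truncated because it was too long)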

dosubot · 2024-12-18T15:49:58Z

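@LRTFK Hello! I'm Dosu, a bot that can help you fix bugs, answer questions, and even become a contributor. I can help while you wait for a human maintainer.

To obtain and use cookies correctly for downloading Twitter videos, you need to supply a valid cookie parameter. If no cookie is supplied, the program raises a ValueError stating that the cookie must not be empty and suggests using the --auto-cookie option to fetch the cookie from a browser automatically. For example, --auto-cookie edge fetches the cookie from the Edge browser.

The --auto-cookie option retrieves cookies automatically from the specified browser (such as Edge, Chrome, or Firefox). Make sure all open instances of that browser are closed and that you have permission to access it.

The following code snippet shows the logic that handles automatic cookie retrieval: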

def handler_auto_cookie(
    ctx: click.Context,
    param: typing.Union[click.Option, click.Parameter],
    value: typing.Any,
) -> None:
    if not value or ctx.resilient_parsing or ctx.params.get("cookie"):
        return

    try:
        cookie_value = split_dict_cookie(get_cookie_from_browser(value, "twitter.com"))

        if not cookie_value:
            raise ValueError(_("无法从 {0} 浏览器中获取cookie").format(value))

        manager = ConfigManager(
            ctx.params.get("config", get_resource_path(f2.APP_CONFIG_FILE_PATH))
        )
        manager.update_config_with_args("twitter", cookie=cookie_value)
    except PermissionError:
        logger.error(_("请关闭所有已打开的浏览器重试，并且你有适当的权限访问浏览器！"))
        ctx.abort()
    except Exception as e:
        logger.error(_("自动获取Cookie失败：{0}").format(str(e)))
        ctx.abort()
    finally:
        ctx.exit(0)

Make sure a cookie parameter is supplied when initializing the TwitterDownloader class, either manually or through the --auto-cookie option^[1]^[2]^[3].
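For context, the snippet above passes whatever the browser lookup returns through split_dict_cookie before saving it to the config. A minimal sketch of that kind of conversion, assuming the browser lookup yields a name-to-value mapping and the config expects a single Cookie header string; join_cookie_dict is a hypothetical stand-in written for illustration, not F2's actual helper:

```python
from typing import Mapping


def join_cookie_dict(cookies: Mapping[str, str]) -> str:
    """Join cookie name/value pairs into one 'name=value; name=value' string."""
    return "; ".join(f"{name}={value}" for name, value in cookies.items())


if __name__ == "__main__":
    # Placeholder values only; real values come from a logged-in browser session.
    sample = {"auth_token": "xxxxx", "ct0": "yyyyy"}
    print(join_cookie_dict(sample))
```

The resulting string has the same shape as the Cookie request header a browser sends, which is the form the rest of this thread pastes into the YAML file.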

Johnserf-Seed · 2024-12-18T15:50:28Z

An X-Csrf-Token is required; get it from the developer console of the web version.

LRTFK · 2024-12-18T16:19:38Z

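Thanks for the reply, but I updated the X-Csrf-Token value in the yaml and it still does not work; I still get 403.
My cookie is in the following format, copied from the developer console. Running with it gives a 403 error straight away.
lang=zh-cn; guest_id=xxxxx; night_mode=2; guest_id_marketing=v1%xxxxx; guest_id_ads=v1%xxxxx; kdt=xxxxx; auth_token=xxxxx; ct0=xxxxx; lang=zh-cn; twid=u%xxxxx; _twitter_sess=xxxxx%xxxxx%xxxxx%xxxxx%253D--xxxxx; personalization_id="xxxxx=="
When I take the value of _twitter_sess from the cookies under the Application tab of the developer console and put it in the yaml, the script enters the retry loop, but the result is still 403.
So I still don't know which value I should use to get it running.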
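One quick sanity check on a cookie string like the one above is whether it carries the login cookies at all. The sketch below assumes, as is commonly the case for the X web client, that auth_token and ct0 must both be present and that the X-Csrf-Token header has to equal the ct0 cookie. parse_cookie_string and check_twitter_cookie are hypothetical helpers written for illustration, not part of F2:

```python
from typing import Dict, List


def parse_cookie_string(raw: str) -> Dict[str, str]:
    """Split 'a=1; b=2' into a dict; later duplicates overwrite earlier ones."""
    cookies: Dict[str, str] = {}
    for part in raw.split(";"):
        name, sep, value = part.strip().partition("=")
        if sep and name:
            cookies[name] = value
    return cookies


def check_twitter_cookie(raw_cookie: str, csrf_token: str) -> List[str]:
    """Return a list of problems found; an empty list means the basics look right."""
    cookies = parse_cookie_string(raw_cookie)
    problems: List[str] = []
    # Assumption: both cookies are needed for logged-in API calls.
    for required in ("auth_token", "ct0"):
        if not cookies.get(required):
            problems.append(f"missing cookie: {required}")
    # Assumption: the CSRF header must mirror the ct0 cookie value.
    if cookies.get("ct0") and cookies["ct0"] != csrf_token:
        problems.append("X-Csrf-Token does not match the ct0 cookie")
    return problems
```

A check like this only rules out an obviously incomplete cookie. It says nothing about whether the session is still valid on the server side, or whether the token is being read from the right config file.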

Johnserf-Seed · 2024-12-18T16:29:29Z

I tested it and it works fine with the X-Csrf-Token and cookie filled in as usual. The cause is your network; check your proxy.

LRTFK · 2024-12-18T16:45:26Z

Thanks for the reply, I'll try again later. I had actually already switched to a different proxy when I got the 403...

Johnserf-Seed · 2024-12-18T16:47:59Z

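I don't know whether it is a problem with your proxy configuration. Consider whether the outbound proxy supports https. A later update will add support for socks5 proxies.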

LRTFK · 2024-12-18T16:56:25Z

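I'm using Clash directly. After switching proxies in Clash, I dropped the previous connections, reopened Twitter in Chrome, and then ran the script again; still 403.
proxies:
  http://: "http://127.0.0.1:7890"
So I don't know what to do.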
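One thing worth checking in a mapping like this: the failing requests go to https://x.com, while the only key shown is http://. The error output in this thread shows F2 making its requests through httpx, and httpx matches proxy keys against the scheme of the request URL, so an http:// entry on its own would not cover HTTPS requests. The sketch below is a simplified model of that matching, assuming only the scheme-level keys http://, https:// and all://. pick_proxy is a hypothetical helper written for illustration, not httpx's or F2's code:

```python
from typing import Mapping, Optional
from urllib.parse import urlsplit


def pick_proxy(url: str, proxies: Mapping[str, str]) -> Optional[str]:
    """Return the proxy URL whose key matches the request scheme, if any."""
    scheme = urlsplit(url).scheme
    # A scheme-specific key wins over the catch-all key.
    for key in (f"{scheme}://", "all://"):
        if key in proxies:
            return proxies[key]
    return None


if __name__ == "__main__":
    only_http = {"http://": "http://127.0.0.1:7890"}
    both = {"http://": "http://127.0.0.1:7890", "https://": "http://127.0.0.1:7890"}
    target = "https://x.com/i/api/graphql/"
    print(pick_proxy(target, only_http))  # None
    print(pick_proxy(target, both))       # http://127.0.0.1:7890
```

If the pasted config was shortened and an https:// entry already exists, this check does not apply.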

Johnserf-Seed · 2024-12-18T17:01:10Z

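Then it is still your cookie. You don't need to take it from the Application tab; find a relevant API request among the XHR entries in the Network tab, then copy the cookie and X-Csrf-Token from its request headers into the config file.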
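The copy step described above can be sketched as follows, assuming the request headers were copied from the Network tab as plain "name: value" lines. extract_auth_headers is a hypothetical helper written for illustration, and the header values are placeholders:

```python
from typing import Dict

WANTED = ("cookie", "x-csrf-token")


def extract_auth_headers(raw_headers: str) -> Dict[str, str]:
    """Pick the Cookie and X-Csrf-Token values out of copied request headers."""
    found: Dict[str, str] = {}
    for line in raw_headers.splitlines():
        name, sep, value = line.partition(":")
        key = name.strip().lower()  # header names are case-insensitive
        if sep and key in WANTED:
            found[key] = value.strip()
    return found


if __name__ == "__main__":
    copied = """\
accept: */*
cookie: auth_token=xxxxx; ct0=yyyyy
x-csrf-token: yyyyy
"""
    print(extract_auth_headers(copied))
```

Taking both values from the same request keeps the cookie and the token consistent with each other, which mixing values from the Application tab and an older request may not.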

LRTFK · 2024-12-18T17:13:10Z

Among the XHR entries in the Network tab I found one for https://api.x.com/1.1/live_pipeline/update_subscriptions, then copied the cookie and X-Csrf-Token from its request headers into the config file. This time it reported an error and exited after a single attempt.
INFO App: twitter
INFO Mode: post
INFO Main config path: D:\Temp\f2\f2\conf\app.yaml
INFO Custom config path: D:\Temp\f2\twi.yaml
ERROR HTTP status error: Client error '403 Forbidden' for url
'https://x.com/i/api/graphql/laYnJPCAcVo0o6pzcnlVxQ/UserByScreenName?variables=xxxxxx
For more information check: https://developer.mozilla.org/en-US/docs/Web/HTTP/Status/403,
, attempts: 1
ERROR Please see the QA documentation at https://johnserf-seed.github.io/f2/question-answer/qa.html for help

Traceback (most recent call last):
File "D:\Temp\f2\f2\crawlers\base_crawler.py", line 229, in get_fetch_data
response.raise_for_status()
File "d:\Temp\f2.venv\Lib\site-packages\httpx\_models.py", line 763, in raise_for_status
raise HTTPStatusError(message, request=request, response=self)
httpx.HTTPStatusError: Client error '403 Forbidden' for url 'https://x.com/i/api/graphql/laYnJPCAcVo0o6pzcnlVxQ/UserByScreenName?variables=xxxx
For more information check: https://developer.mozilla.org/en-US/docs/Web/HTTP/Status/403

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
File "<frozen runpy>", line 198, in _run_module_as_main
File "<frozen runpy>", line 88, in _run_code
File "D:\Temp\f2.venv\Scripts\f2.exe\__main__.py", line 7, in <module>
File "d:\Temp\f2.venv\Lib\site-packages\click\core.py", line 1157, in __call__
return self.main(*args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^
File "d:\Temp\f2.venv\Lib\site-packages\click\core.py", line 1078, in main
rv = self.invoke(ctx)
^^^^^^^^^^^^^^^^
File "d:\Temp\f2.venv\Lib\site-packages\click\core.py", line 1688, in invoke
return _process_result(sub_ctx.command.invoke(sub_ctx))
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "d:\Temp\f2.venv\Lib\site-packages\click\core.py", line 1434, in invoke
return ctx.invoke(self.callback, **ctx.params)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "d:\Temp\f2.venv\Lib\site-packages\click\core.py", line 783, in invoke
return __callback(*args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "d:\Temp\f2.venv\Lib\site-packages\click\decorators.py", line 33, in new_func
return f(get_current_context(), *args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "D:\Temp\f2\f2\apps\twitter\cli.py", line 391, in twitter
ctx.invoke(set_cli_config, **kwargs)
File "d:\Temp\f2.venv\Lib\site-packages\click\core.py", line 783, in invoke
return __callback(*args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "d:\Temp\f2.venv\Lib\site-packages\click\decorators.py", line 33, in new_func
return f(get_current_context(), *args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "D:\Temp\f2\f2\cli\cli_commands.py", line 169, in set_cli_config
asyncio.run(run_app(kwargs))
File "C:\Users\zhangsan\scoop\apps\python311\current\Lib\asyncio\runners.py", line 190, in run
return runner.run(main)
^^^^^^^^^^^^^^^^
File "C:\Users\zhangsan\scoop\apps\python311\current\Lib\asyncio\runners.py", line 118, in run
return self._loop.run_until_complete(task)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "C:\Users\zhangsan\scoop\apps\python311\current\Lib\asyncio\base_events.py", line 654, in run_until_complete
return future.result()
^^^^^^^^^^^^^^^
File "D:\Temp\f2\f2\cli\cli_commands.py", line 175, in run_app
await app_module.main(kwargs)
File "D:\Temp\f2\f2\apps\twitter\handler.py", line 465, in main
await mode_function_mapmode
File "D:\Temp\f2\f2\apps\twitter\handler.py", line 184, in handle_post_tweet
user = await self.fetch_user_profile(uniqueID)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "D:\Temp\f2\f2\apps\twitter\handler.py", line 62, in fetch_user_profile
response = await crawler.fetch_user_profile(params)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "D:\Temp\f2\f2\apps\twitter\crawler.py", line 60, in fetch_user_profile
return await self._fetch_get_json(endpoint)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "D:\Temp\f2\f2\crawlers\base_crawler.py", line 150, in _fetch_get_json
response = await self.get_fetch_data(endpoint)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "D:\Temp\f2\f2\crawlers\base_crawler.py", line 286, in get_fetch_data
self.handle_http_status_error(exc, url, attempt + 1)
File "D:\Temp\f2\f2\crawlers\base_crawler.py", line 415, in handle_http_status_error
raise APIResponseError(("HTTP状态码错误："), status_code)
f2.exceptions.api_exceptions.APIResponseError: HTTP状态码错误： Status Code: 403

Johnserf-Seed · 2024-12-18T17:20:24Z

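So the problem is still your cookie. I think its format is wrong: note that it contains double quotes, so if you run it from a script you need to wrap the whole cookie in single quotes.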
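For the shell case mentioned above, Python's standard shlex.quote shows what a safely single-quoted cookie looks like on a POSIX shell (Windows cmd follows different quoting rules). The cookie value is a placeholder, and how the string is then handed to F2 is left out because that depends on the command being run:

```python
import shlex

# Placeholder cookie containing double quotes, like personalization_id in this thread.
cookie = 'auth_token=xxxxx; ct0=yyyyy; personalization_id="v1_zzzzz=="'

# shlex.quote wraps the value in single quotes so a POSIX shell keeps the
# inner double quotes, semicolons, and spaces as literal characters.
quoted = shlex.quote(cookie)
print(quoted)
```

Without the outer single quotes, the shell would treat each semicolon as a command separator and strip the inner double quotes, so the program would receive only a fragment of the cookie.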

LRTFK · 2024-12-18T17:25:45Z

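My cookie is in the yaml. Both the cookie and X-Csrf-Token are in the yaml, and their position shouldn't matter, right?
twitter:
  cookie: guest_id=123455xxxx; night_mode=2; guest_id_marketing=v1%123455xxxx; guest_id_ads=v1%123455xxxx; gt=123455xxxx; kdt=123455xxxx; auth_token=123455xxxx; ct0=123455xxxx; twid=u%123455xxxx; att=1-123455xxxx; personalization_id="v1_RB60dKTB/123455xxxx=="
  X-Csrf-Token: "123455xxxx"
I noticed the double quotes. In the yaml, I get the error whether or not I wrap the whole cookie in single quotes.
It's too late today. Tomorrow I'll also try with Postman. The cause is either the cookie plus X-Csrf-Token or the proxy; everything else should be fine.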

Johnserf-Seed · 2024-12-18T17:27:30Z

You put the X-Csrf-Token in the wrong place. It belongs in the F2 configuration file conf.yaml, not in app.yaml or the custom config file.

Johnserf-Seed · 2024-12-18T17:28:24Z

LRTFK · 2024-12-18T17:43:51Z

Wow, finally found the cause. However, I couldn't find this screenshot about adding the X-Csrf-Token anywhere in the documentation.
After changing conf it can start downloading, so it works a little better, but now a new problem has appeared:
INFO [ Done ]: 2024-12-08 13-45-22_Love_life__love_freedom____desc.txt
INFO [ Done ]: 2024-12-03 13-46-24_President_Biden_pardoned_his_son_and_it_s_really_hilarious___desc.txt
INFO [ Done ]: 2024-12-01 13-31-56_Can_anyone_tell_me_what_Trump_s_tariff_policy_will_do_to_the_United_States__desc.txt
ERROR httpx request error: https://pbs.twimg.com/media/Gdt3queXcAAgtDW.jpg?format=jpg&name=large, error details:
WARNING Link https://pbs.twimg.com/media/Gdt3queXcAAgtDW.jpg?format=jpg&name=large has content length 0, checking whether the next link is available
WARNING None of the links could be downloaded
ERROR [ Missing ]: unable to download file: 2024-12-01

Johnserf-Seed · 2024-12-18T17:45:37Z

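That is because the documentation is still being updated and has not been merged into the main branch. If it still cannot download, you will need to track down the cause yourself, because all the tests pass.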

LRTFK · 2024-12-18T17:47:08Z

Thanks a lot.

LRTFK added the 提问(question) 想得到更多的详细支持(Further information is requested) label Dec 18, 2024

LRTFK closed this as completed Dec 30, 2024
