
Will glm-4v support vLLM inference? #583

Open
2500035435 opened this issue Oct 12, 2024 · 4 comments
@2500035435

Feature request

Run glm-4v inference with vLLM.

Motivation

I want to try a server deployment where multiple clients send requests to a glm-4v API server, but inference with Transformers returns results too slowly. Since glm4-chat inference with vLLM is noticeably faster, I hope glm-4v can also support vLLM inference.

Your contribution

@elesun2018

Same question here.

@sixsixcoder sixsixcoder self-assigned this Oct 14, 2024
@sixsixcoder
Collaborator

sixsixcoder commented Oct 14, 2024

GLM-4v has been adapted to vllm=0.6.2 in a recent PR, which should be merged soon. You can deploy the latest version of vLLM and follow the examples in the PR 585 readme for inference.
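For reference, below is a minimal offline-inference sketch using vLLM's generic multimodal interface. It assumes the merged support exposes GLM-4V through the standard `LLM` / `multi_modal_data` path; the model path, image path, and prompt text are placeholders, and the exact prompt/chat template GLM-4V expects should be taken from the PR 585 readme.

```python
from PIL import Image
from vllm import LLM, SamplingParams

# Placeholder model and image paths; replace with your own.
llm = LLM(
    model="THUDM/glm-4v-9b",
    trust_remote_code=True,   # GLM-4V ships custom modeling code
    max_model_len=8192,
)

sampling_params = SamplingParams(temperature=0.2, max_tokens=512)
image = Image.open("example.jpg").convert("RGB")

# vLLM's multimodal request format: prompt text plus the raw image.
# The prompt template shown here is illustrative; see the PR 585 readme
# for the format GLM-4V actually expects.
outputs = llm.generate(
    {
        "prompt": "Describe this image.",
        "multi_modal_data": {"image": image},
    },
    sampling_params=sampling_params,
)

print(outputs[0].outputs[0].text)
```

For the multi-client API-server scenario described in the issue, the same model can presumably be served with vLLM's OpenAI-compatible server, e.g. `vllm serve THUDM/glm-4v-9b --trust-remote-code`, once the PR is merged.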

@neblen

neblen commented Oct 16, 2024

@sixsixcoder May I ask whether inference with glm 4v 9b int4 is supported?

@sixsixcoder
Collaborator

@sixsixcoder May I ask whether inference with glm 4v 9b int4 is supported?

Not supported yet.
