The GPU is still used even when the device is set to cpu
#1804
Replies: 1 comment 2 replies
- Don't use `auto`; specify `cpu` directly, and you also need to disable vllm.
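Beyond setting the device in the config, a blunt way to guarantee the process never touches the GPU is to hide all CUDA devices from the runtime before torch/vllm is imported. This is standard CUDA/PyTorch behavior, not something stated in this thread; a minimal sketch:

```python
import os

# Assumption (not from the thread): setting CUDA_VISIBLE_DEVICES to an empty
# string BEFORE any CUDA-using framework is imported makes the process see
# zero CUDA devices, so it cannot initialize or allocate on the GPU at all.
os.environ["CUDA_VISIBLE_DEVICES"] = ""

def cuda_hidden() -> bool:
    """Return True if no CUDA device is visible to this process."""
    return os.environ.get("CUDA_VISIBLE_DEVICES") == ""

print(cuda_hidden())
```

Equivalently, export the variable when launching (`CUDA_VISIBLE_DEVICES="" python startup.py ...`) so it is in effect before any import runs.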
-
The following is already set in the config:

```python
# Device the Embedding model runs on. "auto" detects automatically;
# can also be set manually to one of "cuda", "mps", "cpu".
EMBEDDING_DEVICE = "cpu"  # "auto"

# LLM name
LLM_MODEL = "internlm-chat-7b"

# Device the LLM runs on. "auto" detects automatically;
# can also be set manually to one of "cuda", "mps", "cpu".
LLM_DEVICE = "cpu"  # "auto"
```

startup.py was also modified:

```python
sys.modules["fastchat.serve.vllm_worker"].worker = worker
```
But the GPU is still started and about 20 GB of VRAM is occupied. The same problem occurs after switching to chatGLM6b-2.
```text
| 1 NVIDIA GeForce ...    Off | 00000000:1B:00.0  Off |                  N/A |
| 30%  30C  P8  19W / 350W   | 21736MiB / 24268MiB   |      0%      Default |
|                            |                       |                  N/A |
```
Is there any way to disable the GPU so I can test pure-CPU performance?