This repository has been archived by the owner on Jun 24, 2024. It is now read-only.

Multi-GPU support for inferencing #371

Open
philpax opened this issue Jul 16, 2023 · 1 comment
Labels: issue:enhancement (New feature or request), topic:clblast (https://github.com/CNugteren/CLBlast support), topic:cublas (https://developer.nvidia.com/cublas support)

Comments

philpax (Collaborator) commented Jul 16, 2023

With #325, we now have GPU acceleration. However, this is currently limited to a single GPU. We'll need to mimic the functionality from llama.cpp in order to distribute tensors across GPUs as appropriate.
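For reference, llama.cpp's multi-GPU support distributes the model's repeating layers across devices according to user-supplied proportions (its `--tensor-split` option). A minimal sketch of that assignment logic, assuming a hypothetical `assign_layers_to_gpus` helper (not part of any real crate API):

```rust
// Hypothetical sketch of llama.cpp-style layer distribution: given per-GPU
// split proportions (like llama.cpp's --tensor-split), assign each of the
// model's layers to a device index. Names are illustrative only.
fn assign_layers_to_gpus(n_layers: usize, tensor_split: &[f32]) -> Vec<usize> {
    let total: f32 = tensor_split.iter().sum();
    // Cumulative fraction boundaries, e.g. [0.5, 1.0] for an even 2-GPU split.
    let mut bounds = Vec::with_capacity(tensor_split.len());
    let mut acc = 0.0;
    for s in tensor_split {
        acc += s / total;
        bounds.push(acc);
    }
    (0..n_layers)
        .map(|layer| {
            // Midpoint of this layer's slot in [0, 1); find the first GPU
            // whose cumulative boundary covers it.
            let frac = (layer as f32 + 0.5) / n_layers as f32;
            bounds
                .iter()
                .position(|&b| frac <= b)
                .unwrap_or(tensor_split.len() - 1)
        })
        .collect()
}

fn main() {
    // 8 layers split 3:1 across two GPUs -> 6 layers on GPU 0, 2 on GPU 1.
    let assignment = assign_layers_to_gpus(8, &[3.0, 1.0]);
    println!("{:?}", assignment); // [0, 0, 0, 0, 0, 0, 1, 1]
}
```

The actual implementation would also need to place the per-layer KV cache and scratch buffers on the same device as the layer's weights, as llama.cpp does.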

philpax added the issue:enhancement, topic:cublas, and topic:clblast labels on Jul 16, 2023
jacohend commented Sep 3, 2023

Possibly addressed in #419?

I'm still getting some strange assert failures on llama.cpp's end when I try to set CUDA device 1 instead of 0.
