This repository has been archived by the owner on Jun 24, 2024. It is now read-only.
Multi-GPU support for inference #371
Labels:
- issue:enhancement (New feature or request)
- topic:clblast (https://github.com/CNugteren/CLBlast support)
- topic:cublas (https://developer.nvidia.com/cublas support)
With #325, we now have GPU acceleration. However, it is limited to a single GPU at present. We'll need to mimic the functionality from llama.cpp in order to distribute a model's tensors across multiple GPUs as appropriate.
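For illustration, here is a minimal sketch of the kind of proportional layer-to-GPU assignment llama.cpp performs with its `tensor_split` option. The `assign_layers` helper and its signature are hypothetical, not part of this repository's API; the real implementation would also have to place the actual tensor data on each device.

```rust
// Hypothetical sketch: assign each transformer layer to a GPU in proportion
// to the requested split fractions, similar in spirit to llama.cpp's
// `tensor_split`. Returns the GPU index for each layer.
fn assign_layers(n_layers: usize, splits: &[f32]) -> Vec<usize> {
    let total: f32 = splits.iter().sum();
    let mut assignment = Vec::with_capacity(n_layers);
    for layer in 0..n_layers {
        // Layer i belongs to the first GPU whose cumulative share of the
        // split covers the fraction i / n_layers.
        let frac = layer as f32 / n_layers as f32;
        let mut acc = 0.0;
        let mut gpu = splits.len() - 1;
        for (i, s) in splits.iter().enumerate() {
            acc += s / total;
            if frac < acc {
                gpu = i;
                break;
            }
        }
        assignment.push(gpu);
    }
    assignment
}

fn main() {
    // 32 layers split 3:1 between two GPUs: the first 24 layers land on
    // GPU 0 and the remaining 8 on GPU 1.
    let a = assign_layers(32, &[3.0, 1.0]);
    assert_eq!(a.iter().filter(|&&g| g == 0).count(), 24);
    assert_eq!(a.iter().filter(|&&g| g == 1).count(), 8);
    println!("{:?}", a);
}
```

A scheme like this only decides placement; the per-backend work (cuBLAS contexts per device, CLBlast queues, and cross-device transfers at layer boundaries) is where most of the effort would go.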