OpenBLAS and CUBLAS #1574

aneeshjoy · 2023-05-23T16:38:32Z

I only have a 6 GB NVIDIA GPU, so most of the time I will be offloading just some of the model's layers to the GPU.

Does it make sense to compile with both LLAMA_OPENBLAS=1 and LLAMA_CUBLAS=1 enabled?

Will that give any overall performance improvement?

SlyEcho · 2023-05-26T07:44:47Z

Even if a layer is not offloaded to the GPU, it will still use cuBLAS; the only difference is that its data has to be copied to the device before the calculation.
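
For reference, a cuBLAS build with partial offload might look like the sketch below. It assumes a llama.cpp checkout from around the time of this thread, built with the Makefile and with the CUDA toolkit installed. The model path and the layer count of 20 are placeholders, not recommendations.

```shell
# Build with the cuBLAS backend enabled (flag taken from the question above).
make clean
make LLAMA_CUBLAS=1

# Run with only part of the model offloaded to the GPU.
# -ngl / --n-gpu-layers sets how many layers are kept in VRAM.
# The model path and the value 20 are hypothetical; lower the number
# if the 6 GB card runs out of memory.
./main -m ./models/7B/ggml-model-q4_0.bin -ngl 20 -p "Hello"
```

Layers beyond the `-ngl` count stay in host memory, which is the case described above where the data is copied to the device before each calculation.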

1 reply

Got it. Thanks