I tried to use Phi3v on a Mac, and it worked, albeit slowly and without CUDA. However, when I switched to CUDA on an RTX 4070 (12 GB), I always got an "out of memory" error. I tried the same with the Hugging Face transformers Python library using an even larger model, and it seems transformers can handle loading the model partially into the GPU and partially into system memory. I am still not as experienced with LLMs as I would like to be, so maybe I need to add some extra settings to the example code, or does mistral.rs simply not support this right now?
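For reference, this is roughly the transformers behaviour I was comparing against: with accelerate installed, `device_map="auto"` splits the weights between the GPU and system RAM instead of failing outright. A minimal sketch; the model id is a placeholder (I actually tested a larger model) and the dtype is just what I tried:

```python
import torch
from transformers import AutoModelForCausalLM, AutoProcessor

model_id = "microsoft/Phi-3-vision-128k-instruct"  # placeholder; substitute the model you test

model = AutoModelForCausalLM.from_pretrained(
    model_id,
    device_map="auto",        # lets accelerate offload layers that don't fit in 12 GB of VRAM to CPU RAM
    torch_dtype=torch.float16,
    trust_remote_code=True,   # Phi-3 vision ships custom modeling code
)
processor = AutoProcessor.from_pretrained(model_id, trust_remote_code=True)
```

Is there an equivalent setting in mistral.rs to split a model between the GPU and system memory like this?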