I've tried several LLM utilities based on recent versions of llama.cpp (Kobold, Jan, etc.) on my machine with dual 12-core/24-thread CPUs (SSE4.2, no AVX), 128 GB of RAM, and GPUs.
It always crashes when I try to offload to the GPUs, and everything seems to be compiled so that you can either
use slow CPU-only inference with SSE3, or
offload to the GPU, but only if the CPU supports AVX-family instructions.
I can't file this in the bug category because the logs are usually uninformative, though they sometimes report something like an unsupported instruction set (e.g. SSE unsupported).
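For reference, here is the check I used to confirm the CPU really does lack AVX (this just reads /proc/cpuinfo, so it only applies on Linux):

```sh
# List the unique SIMD feature flags the kernel reports for this CPU.
# If AVX is absent, no avx/avx2/avx512* entries show up in the output.
grep -o 'sse4_[12]\|avx[a-z0-9_]*' /proc/cpuinfo | sort -u
```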
So, my questions are:
Is that conclusion correct, or can one offload to the GPU with llama.cpp on a machine that only supports SSE4.1/4.2?
If so, what do I need to do? Do I need to recompile llama.cpp with SSE4.1/4.2 support (something like the build sketch below)?
Or do I need to add or rewrite some routines/functions in llama.cpp to support SSE4.1/4.2?
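For concreteness, a from-source build along these lines is what I have in mind. This is only a sketch based on the CMake options in recent llama.cpp checkouts (GGML_NATIVE, GGML_CUDA, and the per-ISA GGML_AVX* toggles; older versions used LLAMA_-prefixed names such as LLAMA_CUBLAS), so the exact flag names may differ for your version:

```sh
# Sketch: build llama.cpp so the CPU code targets the host CPU
# (SSE4.2-only here) while GPU offload is still enabled via CUDA.
git clone https://github.com/ggml-org/llama.cpp
cd llama.cpp

# GGML_NATIVE=ON compiles with -march=native, so no AVX instructions are
# emitted on a CPU that lacks them; GGML_CUDA=ON enables GPU offload.
cmake -B build -DGGML_NATIVE=ON -DGGML_CUDA=ON
cmake --build build --config Release

# Alternatively, turn native detection off and disable the AVX paths
# explicitly (assuming these per-ISA toggles exist in your checkout):
# cmake -B build -DGGML_NATIVE=OFF \
#       -DGGML_AVX=OFF -DGGML_AVX2=OFF -DGGML_FMA=OFF -DGGML_F16C=OFF \
#       -DGGML_CUDA=ON
```

If a build like that works, it would suggest the prebuilt binaries I tried were simply compiled with AVX enabled, and that no source changes are needed.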