nvidia smi - """+---------------------------------------------------------------------------------------+
| NVIDIA-SMI 535.161.07 Driver Version: 535.161.07 CUDA Version: 12.2 |
|-----------------------------------------+----------------------+----------------------+
| GPU Name Persistence-M | Bus-Id Disp.A | Volatile Uncorr. ECC |
| Fan Temp Perf Pwr:Usage/Cap | Memory-Usage | GPU-Util Compute M. |
| | | MIG M. |
|=========================================+======================+======================|
| 0 Tesla V100-PCIE-16GB Off | 00000001:00:00.0 Off | 0 |
| N/A 34C P0 37W / 250W | 3096MiB / 16384MiB | 0% Default |
| | | N/A |
+-----------------------------------------+----------------------+----------------------+
| 1 Tesla V100-PCIE-16GB Off | 00000002:00:00.0 Off | Off |
| N/A 34C P0 37W / 250W | 3316MiB / 16384MiB | 0% Default |
| | | N/A |
+-----------------------------------------+----------------------+----------------------+
+---------------------------------------------------------------------------------------+
| Processes: |
| GPU GI CI PID Type Process name GPU Memory |
| ID ID Usage |
|=======================================================================================|
+---------------------------------------------------------------------------------------+
"""

llama-cpp-python version >0.2.85 is used in order to leverage Llama 3.1. The quantized model is initialized as follows:

"""
from llama_cpp import Llama

llm = Llama.from_pretrained(
repo_id="bartowski/Meta-Llama-3.1-8B-Instruct-GGUF",
filename="Meta-Llama-3.1-8B-Instruct-Q4_K_L.gguf",
n_ctx=4096,
n_gpu_layers=-1
)
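For reference, the installed package version can be checked with the short snippet below; this is only an illustrative check and is not part of the original report:

"""
# Illustrative only: confirm that the installed llama-cpp-python build is newer than
# 0.2.85, the version noted above as required for Llama 3.1.
import llama_cpp
print(llama_cpp.__version__)
"""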
""". """ ggml_cuda_compute_forward: ROPE failed
CUDA error: unspecified launch failure
current device: 0, in function ggml_cuda_compute_forward at /home/runner/work/llama-cpp-python/llama-cpp-python/vendor/llama.cpp/ggml/src/ggml-cuda.cu:2313
err
/home/runner/work/llama-cpp-python/llama-cpp-python/vendor/llama.cpp/ggml/src/ggml-cuda/template-instances/../mmq.cuh:2589: ERROR: CUDA kernel mul_mat_q has no device code compatible with CUDA arch 700. ggml-cuda.cu was compiled for: 500,520,530,600,610,620,700,720,750,800,860,870,890,900
/home/runner/work/llama-cpp-python/llama-cpp-python/vendor/llama.cpp/ggml/src/ggml-cuda/template-instances/../mmq.cuh:2589: ERROR: CUDA kernel mul_mat_q has no device code compatible with CUDA arch 700. ggml-cuda.cu was compiled for: 500,520,530,600,610,620,700,720,750,800,860,870,890,900""". ROPE is failing. For large token count inputs, the above error occurs and inferencing occurs on CPU only. Please advice on resolution
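For reproduction context, the failing call looks roughly like the sketch below; the exact prompt contents and generation parameters are not part of the report above and are only assumptions:

"""
# Illustrative only: a long-context request of the kind described above. `long_document`
# stands in for any input with a large token count (close to the 4096-token context).
long_document = "..."  # placeholder for a large input text

output = llm.create_chat_completion(
    messages=[
        {"role": "system", "content": "You are a helpful assistant."},
        {"role": "user", "content": f"Summarize the following text:\n\n{long_document}"},
    ],
    max_tokens=512,
)
print(output["choices"][0]["message"]["content"])
"""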