I'm running the llama.cpp server in Docker Compose as a service like the one below. It works on an NVIDIA 2080 Ti, but the exact same service fails on an NVIDIA A16 GPU with the error: "CUDA Error: all CUDA-capable devices are busy or unavailable". It works without the GPU (the llama.cpp:server image). Do you have any idea what could have caused this? The only process using the GPU is Xorg, which I believe is the GUI of the Ubuntu Desktop.
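For reference, a minimal sketch of what such a Compose service might look like. The image tag, model path, and server flags here are assumptions for illustration, not the exact configuration from the original post; the key part is the `deploy.resources.reservations.devices` stanza, which is how Compose exposes NVIDIA GPUs via the NVIDIA Container Toolkit:

```yaml
services:
  llama-server:
    # hypothetical CUDA-enabled server image tag; substitute your own
    image: ghcr.io/ggerganov/llama.cpp:server-cuda
    command: ["-m", "/models/model.gguf", "--host", "0.0.0.0", "--port", "8080", "-ngl", "99"]
    volumes:
      - ./models:/models
    ports:
      - "8080:8080"
    deploy:
      resources:
        reservations:
          devices:
            # request one NVIDIA GPU from the container runtime
            - driver: nvidia
              count: 1
              capabilities: [gpu]
```

If a service like this fails only on one machine, it can help to first confirm the container runtime can see the GPU at all, e.g. with `docker run --rm --gpus all nvidia/cuda:12.2.0-base-ubuntu22.04 nvidia-smi`.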