
docker-compose-wsl2-and-linux-nvidia-gpu-ai.yml #7

Closed

Conversation


@coresolutiondoteu coresolutiondoteu commented Sep 23, 2024

Description

GPU Enabled AI Assistant Container for Linux and Windows OS with Docker under WSL2 (for Windows).

Explanation:

For this to work, you need Linux and/or Windows with an updated WSL2. In
either case, the machine must be equipped with a dedicated GPU (with CUDA
cores) that is usable for running LLMs.

If that is the case, the ollama service will start with GPU support, and the
AI assistant will respond far faster than with the CPU alone.

Notebook users should watch out for NVIDIA Optimus, which can break things.
If you have an Optimus-enabled notebook, switch to the dedicated GPU
(sometimes this has to be done directly in the BIOS).

Updated drivers must be installed.

A functional check can be done as described at https://docs.docker.com/desktop/gpu/

docker run --rm -it --gpus=all nvcr.io/nvidia/k8s/cuda-sample:nbody nbody -gpu -benchmark

More details (Linux): https://ollama.com/blog/ollama-is-now-available-as-an-official-docker-image
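As a sketch of what the compose file in this PR enables: Docker Compose exposes NVIDIA GPUs to a service through `deploy.resources.reservations.devices`. The service name and image tag below are assumptions for illustration; adjust them to match the actual file.

```yaml
services:
  ollama:
    image: ollama/ollama   # assumed image; use the tag this repo actually pins
    deploy:
      resources:
        reservations:
          devices:
            - driver: nvidia
              count: all            # or a specific number of GPUs
              capabilities: [gpu]
```

This requires the NVIDIA Container Toolkit on Linux, or Docker Desktop with WSL2 GPU paravirtualization on Windows.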

Type of change

  • New feature (non-breaking change which adds functionality)
    GPU Enabled AI Assistant Container for Linux and Windows with Docker under WSL2.

How Has This Been Tested?

Tested several times across many different scenarios. The Optimus-enabled notebook issue is described in the explanation above.


coresolutiondoteu commented Oct 6, 2024

One thing that is still unclear to me: after some idle time the GPU frees its resources, which means the AI becomes unresponsive for a while until the GPU memory is loaded with the model data again. That reloading is a fairly intense process for the main CPU. I will look for a dockerd/WSL configuration option that solves this problem, and I will also retest Optimus, since my earlier assumption may have been wrong (this happened with Optimus as well). More testing will be done and then I will update the PR. Linux can't be tested due to time constraints.
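If the unresponsiveness comes from ollama unloading the model after idle time (by default ollama unloads a model roughly five minutes after its last use), the `OLLAMA_KEEP_ALIVE` environment variable may help. A minimal sketch, assuming the service is named `ollama` as above:

```yaml
services:
  ollama:
    environment:
      - OLLAMA_KEEP_ALIVE=24h   # keep the model loaded; -1 keeps it loaded indefinitely
```

This keeps the model resident in GPU memory between requests, avoiding the CPU-heavy reload described above, at the cost of permanently occupying that VRAM.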

@HappyStoic
Collaborator

Hello @coresolutiondoteu, thank you for the PR, we appreciate it a lot! Sorry it took me so long to reply.

Currently one of our goals/constraints is to have the lab working out of the box on most machines:

  • Windows/macOS/Linux
  • with or without GPU
  • any processor architecture

Even though we would like to make the AI assistant faster, this change could break the lab on several machines, right?

Do you think you could instead make a section in the repository's discussions page explaining for others how to modify the docker-compose file to enable GPU acceleration? We could then incorporate this text to our wiki page. Thank you!
