Vladimir Mandic edited this page Nov 15, 2024 · 4 revisions

Docker

SD.Next includes a basic Dockerfile for use with nVidia GPU-equipped systems
Other systems may require different configurations and base images, but the principle remains the same

The goal of containerized SD.Next is to provide a fully stateless environment that can be easily deployed and scaled

The SD.Next docker template is based on the official PyTorch base image with torch==2.5.1 and cuda==12.4

The SD.Next docker image is currently not published on Docker Hub or any other registry: each user or organization typically has their own customizations and requirements, and the build process is simple and fast

Prerequisites

Important

If you already have functional Docker on your host, you can skip this section
For manual steps, see the Manual Install appendix at the end of the document

Build Image

Note

Building the SD.Next docker image is normally required only once and takes ~1min to complete
The first build also needs to download the base image, which can take a while depending on your connection
If you change the Dockerfile or update SD.Next, you need to rebuild the image

docker build \
  --debug \
  --tag sdnext/image \
  <path_to_sdnext_folder>

docker image inspect sdnext/image

  Building 68.9s (10/10) FINISHED                                                             docker:default
  [internal] load build definition from Dockerfile                                                   0.0s
  transferring dockerfile: 312B                                                                      0.0s
  [internal] load metadata for docker.io/pytorch/pytorch:2.5.1-cuda12.4-cudnn9-runtime               0.0s
  [internal] load .dockerignore                                                                      0.0s
  transferring context: 350B                                                                         0.0s
  [internal] load build context                                                                      1.5s
  transferring context: 420.53MB                                                                     1.5s
  CACHED [1/5] FROM docker.io/pytorch/pytorch:2.5.1-cuda12.4-cudnn9-runtime                          0.0s
  [2/5] COPY . .                                                                                     0.8s
  [3/5] RUN apt-get -y update                                                                        3.9s
  [4/5] RUN apt-get -y install git                                                                   5.7s
  [5/5] RUN python launch.py --debug --uv --use-cuda --test                                         54.5s
  exporting to image                                                                                 2.4s
  exporting layers                                                                                   2.4s
  writing image sha256:4cd91d5f317b6851e04e89d4e1de1c6571ee5285b7e15c9e064f78d58779b7d5              0.0s
  naming to docker.io/sdnext/dev                                                                     0.0s

The base image pytorch/pytorch:2.5.1-cuda12.4-cudnn9-runtime is 6.14GB
The resulting SD.Next image is ~8.8GB and contains all required dependencies

Warning

If you have build errors, run the build with --progress=plain to get the full build log
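
One way to keep the full build log for later inspection is to combine `--progress=plain` with `tee`. This is a minimal sketch, not part of SD.Next: the function name `sdnext_build`, the `DRY_RUN` switch and the default log file name `build.log` are assumptions made for illustration.

```shell
# Hypothetical helper: rebuild the image with plain progress output and keep a log.
# sdnext_build, DRY_RUN and build.log are illustrative names, not SD.Next settings.
sdnext_build() {
  local sdnext_dir="$1"             # path to the SD.Next folder containing the Dockerfile
  local log_file="${2:-build.log}"  # where to store the build log
  set -- docker build --progress=plain --tag sdnext/image "$sdnext_dir"

  if [ "${DRY_RUN:-0}" = "1" ]; then
    # Only show what would be executed
    printf '%s\n' "$*"
    return 0
  fi
  # Merge stderr into stdout so progress lines end up in the log as well.
  # Note: the exit status of this pipeline is that of tee, so check the log for errors.
  "$@" 2>&1 | tee "$log_file"
}
```

Calling `DRY_RUN=1 sdnext_build <path_to_sdnext_folder>` prints the command without building anything.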

Run Container

Note

The command below:

  • Publishes container port 7860 directly to the same port on the host
    You may need to remap ports if you have multiple containers running on the same host
  • Maps the host folder /server/data into the container as the data root
    This is where all state items and outputs are read from and written to
  • Maps the host folder /server/models into the container as the model root
    This is where models are read from and written to

docker run \
  --name sdnext-container \
  --rm \
  --gpus all \
  --publish 7860:7860 \
  --mount type=bind,source=/server/models,target=/mnt/models \
  --mount type=bind,source=/server/data,target=/mnt/data \
  --detach \
  sdnext/image
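
If a second container has to run on the same host, the host side of `--publish` and the container name must differ from the first one. The sketch below prints or runs such a command. The function name `sdnext_run`, the `DRY_RUN` switch, and the instance name and host port passed to it are assumptions; the image name, container port and mount targets are taken from the command above.

```shell
# Hypothetical helper: start an SD.Next container on a chosen host port.
sdnext_run() {
  local name="$1"       # container name, must be unique on the host
  local host_port="$2"  # host port that forwards to container port 7860
  set -- docker run \
    --name "$name" \
    --rm \
    --gpus all \
    --publish "${host_port}:7860" \
    --mount type=bind,source=/server/models,target=/mnt/models \
    --mount type=bind,source=/server/data,target=/mnt/data \
    --detach \
    sdnext/image

  if [ "${DRY_RUN:-0}" = "1" ]; then
    printf '%s\n' "$*"   # show the command without starting anything
  else
    "$@"
  fi
}
```

For example, `sdnext_run sdnext-second 7861` would expose the second instance on host port 7861.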

A typical SD.Next container starts in ~10sec and is then ready to accept connections on port 7860
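
Because the container is started with `--detach`, a deployment script may want to wait until the server answers before sending requests. A minimal sketch, assuming `curl` is installed on the host and that the server replies successfully to a plain HTTP GET on port 7860; `wait_for_sdnext` and its retry defaults are illustrative:

```shell
# Hypothetical helper: poll a URL until it responds or the attempts run out.
wait_for_sdnext() {
  local url="${1:-http://127.0.0.1:7860/}"
  local attempts="${2:-30}"  # number of tries
  local delay="${3:-1}"      # seconds to wait between tries
  local i=1
  while [ "$i" -le "$attempts" ]; do
    # --fail makes curl return non-zero on HTTP errors as well as on refused connections
    if curl --silent --fail --max-time 2 --output /dev/null "$url"; then
      echo "ready after $i attempt(s)"
      return 0
    fi
    sleep "$delay"
    i=$((i + 1))
  done
  echo "not ready after $attempts attempt(s)" >&2
  return 1
}
```

With the defaults, `wait_for_sdnext` gives up after roughly 30 seconds, which leaves some margin over the typical start time.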

State

As mentioned, the goal of the SD.Next docker deployment is fully stateless operation.
By default, an SD.Next docker container is stateless: any data stored inside the container is lost when the container stops.

All state items and outputs are read from and written to /server/data on the host, which is mounted at /mnt/data inside the container
This includes:

  • Configuration files: config.json, ui-config.json
  • Cache information: cache.json, metadata.json
  • Outputs of all generated images: outputs/
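
Docker does not create a missing source folder for `--mount type=bind`, so it can help to create the host folders before the first run and to check afterwards which state items have appeared. A sketch under that assumption; the helper names `prepare_roots` and `list_state` are hypothetical, and the file names are the ones listed above:

```shell
# Hypothetical helpers for the host folders used as bind-mount sources.
prepare_roots() {
  local root="${1:-/server}"
  # Create the data and model roots if they do not exist yet
  mkdir -p "$root/data" "$root/models"
}

list_state() {
  local data_root="${1:-/server/data}"
  local item
  # Report which of the documented state items are present
  for item in config.json ui-config.json cache.json metadata.json outputs; do
    if [ -e "$data_root/$item" ]; then
      echo "present: $item"
    else
      echo "missing: $item"
    fi
  done
}
```

Running `list_state` against a freshly created data root reports every item as missing until the container has been started at least once.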

Persistence

If you plan to customize your SD.Next deployment with additional extensions,
you may want to create and map a docker volume to avoid constant reinstalls on each startup.
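
A sketch of creating and mapping a named volume. The volume name `sdnext-extensions` and especially the container path `/app/extensions` are placeholders: this page does not state where extensions live inside the image, so check the working directory used by your Dockerfile and substitute the real path. The function name and `DRY_RUN` switch are also illustrative.

```shell
# Hypothetical helper: run SD.Next with an additional named volume for extensions.
# The default target path is a placeholder; the real path depends on the image layout.
sdnext_run_with_volume() {
  local volume="${1:-sdnext-extensions}"
  local ext_target="${2:-/app/extensions}"
  set -- docker run \
    --name sdnext-container \
    --rm \
    --gpus all \
    --publish 7860:7860 \
    --mount type=bind,source=/server/models,target=/mnt/models \
    --mount type=bind,source=/server/data,target=/mnt/data \
    --mount "type=volume,source=${volume},target=${ext_target}" \
    --detach \
    sdnext/image

  if [ "${DRY_RUN:-0}" = "1" ]; then
    # Show both steps without executing them
    printf '%s\n' "docker volume create $volume" "$*"
    return 0
  fi
  # Create the named volume first, then start the container with it attached
  docker volume create "$volume" >/dev/null && "$@"
}
```

Unlike the container's own filesystem, a named volume outlives a container started with `--rm`, so its contents are still there on the next start.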

Extra

Additional docker commands that may be useful

Tip

Clean Up

docker image ls --all
docker image rm <id>
docker builder prune --force  

Tip

List Containers/Images

docker image ls --all
docker container ls --all
docker ps --all

Tip

View Log

docker container logs --follow <id>

Tip

Stop Container

docker container stop <id>

Tip

Test GPU

docker info  
docker run --name cudatest --rm --gpus all nvcr.io/nvidia/k8s/cuda-sample:nbody nbody -gpu -benchmark  

Tip

Test Torch

docker pull pytorch/pytorch:2.5.1-cuda12.4-cudnn9-runtime  
docker run --name pytorch --rm --gpus all -it pytorch/pytorch:2.5.1-cuda12.4-cudnn9-runtime  
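
Instead of an interactive session, a one-off command can print the torch version and whether CUDA is visible inside the container. A sketch; the function name `check_torch_cuda` and the `DRY_RUN` switch are illustrative, while `torch.__version__` and `torch.cuda.is_available()` are standard PyTorch calls:

```shell
# Hypothetical helper: run a short torch check inside the base image.
check_torch_cuda() {
  local image="${1:-pytorch/pytorch:2.5.1-cuda12.4-cudnn9-runtime}"
  # Python one-liner executed inside the container
  local py='import torch; print(torch.__version__, torch.cuda.is_available())'
  set -- docker run --rm --gpus all "$image" python -c "$py"

  if [ "${DRY_RUN:-0}" = "1" ]; then
    printf '%s\n' "$*"   # show the command without starting a container
  else
    "$@"
  fi
}
```

If the second printed value is `False`, the GPU is not reachable from the container, and the Test GPU commands above are the next thing to check.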

Manual Install

Docker

wget https://download.docker.com/linux/ubuntu/dists/noble/pool/stable/amd64/containerd.io_1.7.23-1_amd64.deb
wget https://download.docker.com/linux/ubuntu/dists/noble/pool/stable/amd64/docker-ce_27.3.1-1~ubuntu.24.04~noble_amd64.deb
wget https://download.docker.com/linux/ubuntu/dists/noble/pool/stable/amd64/docker-ce-cli_27.3.1-1~ubuntu.24.04~noble_amd64.deb
wget https://download.docker.com/linux/ubuntu/dists/noble/pool/stable/amd64/docker-buildx-plugin_0.17.1-1~ubuntu.24.04~noble_amd64.deb
wget https://download.docker.com/linux/ubuntu/dists/noble/pool/stable/amd64/docker-compose-plugin_2.29.7-1~ubuntu.24.04~noble_amd64.deb
sudo dpkg -i *.deb

sudo groupadd docker
sudo usermod -aG docker $USER
systemctl status docker
systemctl status containerd
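
Group membership added with `usermod` only applies to new login sessions, so `docker` may still require `sudo` until you log out and back in. A small sketch for checking whether the current session already has a given group; `session_in_group` is a hypothetical helper:

```shell
# Hypothetical helper: succeed if the current session belongs to the given group.
session_in_group() {
  local group="$1"
  # Without a user argument, id -nG lists the groups of the current process,
  # which is what decides access to the docker socket
  id -nG | tr ' ' '\n' | grep -qx -- "$group"
}
```

For example, `session_in_group docker || echo "log out and back in first"` gives a quick hint before running docker without sudo.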

nVidia Container ToolKit

curl -fsSL https://nvidia.github.io/libnvidia-container/gpgkey | sudo gpg --dearmor -o /usr/share/keyrings/nvidia-container-toolkit-keyring.gpg
curl -s -L https://nvidia.github.io/libnvidia-container/stable/deb/nvidia-container-toolkit.list | sed 's#deb https://#deb [signed-by=/usr/share/keyrings/nvidia-container-toolkit-keyring.gpg] https://#g' | sudo tee /etc/apt/sources.list.d/nvidia-container-toolkit.list
sudo apt update
sudo apt install nvidia-container-toolkit
sudo nvidia-ctk runtime configure --runtime=docker
sudo systemctl restart docker
docker run --gpus all nvcr.io/nvidia/k8s/cuda-sample:nbody nbody -gpu -benchmark