ONNX Runtime
SD.Next includes support for ONNX Runtime.
Currently, --use-directml cannot be used because no release of torch-directml has been built against the latest PyTorch. (This does not mean that you can't use DmlExecutionProvider.)
Change Execution backend to diffusers and Diffusers pipeline to ONNX Stable Diffusion on the System tab.
Performance depends on the execution provider.
Currently, CUDAExecutionProvider and DmlExecutionProvider are supported.
Execution provider | ONNX | Olive | GPU | CPU |
---|---|---|---|---|
CPUExecutionProvider | ✅ | ❌ | ❌ | ✅ |
DmlExecutionProvider | ✅ | ✅ | ✅ | ❌ |
CUDAExecutionProvider | ✅ | ✅ | ✅ | ❌ |
ROCMExecutionProvider | ✅ | 🚧 | ✅ | ❌ |
OpenVINOExecutionProvider | ✅ | ✅ | ✅ | ✅ |
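As a rough illustration of how the table above could be used, the following sketch transcribes it into a dictionary and filters the providers reported by the installed onnxruntime build. `onnxruntime.get_available_providers()` is a real onnxruntime call; the `SUPPORT` dictionary and the helper names `available_providers` and `olive_capable` are hypothetical and are not part of SD.Next.

```python
# Support matrix transcribed from the table above.
# "wip" marks the 🚧 cell (Olive on ROCm).
SUPPORT = {
    "CPUExecutionProvider":      {"onnx": True, "olive": False, "gpu": False, "cpu": True},
    "DmlExecutionProvider":      {"onnx": True, "olive": True,  "gpu": True,  "cpu": False},
    "CUDAExecutionProvider":     {"onnx": True, "olive": True,  "gpu": True,  "cpu": False},
    "ROCMExecutionProvider":     {"onnx": True, "olive": "wip", "gpu": True,  "cpu": False},
    "OpenVINOExecutionProvider": {"onnx": True, "olive": True,  "gpu": True,  "cpu": True},
}


def available_providers():
    """Return providers exposed by the installed onnxruntime build, or [] if it is missing."""
    try:
        import onnxruntime as ort
    except ImportError:
        return []
    return ort.get_available_providers()


def olive_capable(providers):
    """Keep only the provider names that have a full ✅ in the Olive column."""
    return [p for p in providers if SUPPORT.get(p, {}).get("olive") is True]


if __name__ == "__main__":
    found = available_providers()
    print("available:", found)
    print("usable with Olive:", olive_capable(found))
```

Run it inside the activated venv so that it inspects the same onnxruntime package SD.Next uses.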
Not recommended.
Enabled by default.
You can select DmlExecutionProvider by installing onnxruntime-directml.
The DirectX 12 API is required. (Windows or WSL)
You can select CUDAExecutionProvider by installing onnxruntime-gpu. (It may have been installed automatically.)
Olive for ROCm is a work in progress.
Under development.
- Models from huggingface
- Hires and second pass (without sdxl refiner)
- .safetensors VAE
- SD Inpaint may not work.
- SD Upscale pipeline is not tested.
- SDXL Refiner does not work. (due to an onnxruntime issue)
I'm getting OnnxStableDiffusionPipeline.__init__() missing 4 required positional arguments: 'vae_encoder', 'vae_decoder', 'text_encoder', and 'unet'.
This is caused by a broken model cache left behind by a previously failed conversion or Olive run. Find the broken entry in models/ONNX/cache and remove it. You can also use the ONNX tab in the UI. (You need to enable it in settings for it to show up.)
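If you prefer to inspect the cache from a script, here is a minimal sketch. It assumes the script is run from the SD.Next root folder so that the relative path models/ONNX/cache resolves. The helper names `list_cache_entries` and `remove_cache_entry` are hypothetical, and the script does not detect which entry is broken; you still have to choose the one to delete.

```python
import shutil
from pathlib import Path


def list_cache_entries(cache_dir="models/ONNX/cache"):
    """Return the entries of the ONNX cache folder sorted by name ([] if it does not exist)."""
    root = Path(cache_dir)
    if not root.is_dir():
        return []
    return sorted(root.iterdir())


def remove_cache_entry(entry, dry_run=True):
    """Delete one cache entry; with dry_run=True only report what would be removed."""
    entry = Path(entry)
    if dry_run:
        print(f"would remove: {entry}")
        return False
    if entry.is_dir():
        shutil.rmtree(entry)
    else:
        entry.unlink()
    return True


if __name__ == "__main__":
    # List the entries first; pass the broken one to remove_cache_entry(..., dry_run=False).
    for item in list_cache_entries():
        print(item)
```

The dry-run default is a deliberate choice: nothing is deleted until you pass `dry_run=False` explicitly.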
Olive is an easy-to-use hardware-aware model optimization tool that composes industry-leading techniques across model compression, optimization, and compilation. (from pypi)
As Olive optimizes the models in ONNX format, you should set up ONNX Runtime first.
- Go to System tab → Compute Settings.
- Select Model, Text Encoder and VAE in Compile Model.
- Set Model compile backend to olive-ai.
Olive-specific settings are under Olive in Compute Settings.
Run these commands using PowerShell.
.\venv\Scripts\activate
pip uninstall torch-directml
pip install torch torchvision --upgrade
pip install onnxruntime-directml
.\webui.bat
Model optimization occurs automatically before generation.
Target models can be .safetensors, .ckpt, or Diffusers pretrained models. The optimization process takes time, depending on your system and execution provider.
The optimized models are automatically cached and used later to create images of the same size (height and width).
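To make the caching behaviour concrete, here is a toy sketch of a cache keyed by model name, height, and width. It is purely illustrative and is not SD.Next's actual implementation: the class `OptimizedModelCache` and the `optimize` callable are hypothetical stand-ins for the real Olive optimization step.

```python
class OptimizedModelCache:
    """Toy cache keyed by (model name, height, width); not SD.Next's real code."""

    def __init__(self, optimize):
        # `optimize` is a hypothetical callable standing in for the expensive Olive run.
        self._optimize = optimize
        self._entries = {}

    def get(self, model, height, width):
        """Return the cached result for this model and size, optimizing only on a miss."""
        key = (model, height, width)
        if key not in self._entries:
            self._entries[key] = self._optimize(model, height, width)
        return self._entries[key]


if __name__ == "__main__":
    def fake_optimize(model, height, width):
        print(f"optimizing {model} for {width}x{height}")
        return f"{model}@{width}x{height}"

    cache = OptimizedModelCache(fake_optimize)
    cache.get("sd15", 512, 512)  # first request at this size: runs the optimization
    cache.get("sd15", 512, 512)  # same size: served from the cache
    cache.get("sd15", 768, 512)  # different size: runs the optimization again
```

The point of the sketch is that a new image size counts as a cache miss, which is why changing height or width triggers another optimization pass.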
If your system does not have enough memory to optimize a model, or you don't want to spend time optimizing it yourself, you can download an optimized model from Huggingface.
Go to the Models → Huggingface tab and download an optimized model.
TBA
Property | Value |
---|---|
Prompt | a castle, best quality |
Negative Prompt | worst quality |
Sampler | Euler |
Sampling Steps | 20 |
Device | RX 7900 XTX 24GB |
Version | olive-ai(0.4.0) onnxruntime-directml(1.16.3) ROCm(5.6) torch(olive: 2.1.2, rocm: 2.1.0) |
Model | runwayml/stable-diffusion-v1-5 (ROCm), lshqqytiger/stable-diffusion-v1-5-olive (Olive) |
Precision | fp16 |
Token Merging | Olive(0, not supported) ROCm(0.5) |
Olive with DmlExecutionProvider | ROCm |
---|---|
- The generation is faster.
- Uses less graphics memory.
- Optimization is required for every model and image size.
- Some features are unavailable.
After activating the Python venv, run this command and try again:
(venv) $ pip uninstall onnxruntime onnxruntime-... -y
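Before uninstalling, it can help to see which onnxruntime distributions are actually present in the venv. The sketch below uses only the standard library (`importlib.metadata`); the function name `installed_onnxruntime_packages` is hypothetical. Run it with the venv's Python so that it inspects the right environment.

```python
from importlib import metadata


def installed_onnxruntime_packages():
    """Return sorted (name, version) pairs for installed distributions named onnxruntime*."""
    found = set()
    for dist in metadata.distributions():
        name = dist.metadata.get("Name") or ""
        if name.lower().startswith("onnxruntime"):
            found.add((name, dist.version or ""))
    return sorted(found)


if __name__ == "__main__":
    packages = installed_onnxruntime_packages()
    if not packages:
        print("no onnxruntime distributions found")
    for name, version in packages:
        print(f"{name}=={version}")
```

If more than one distribution is listed, those are the names to pass to the pip uninstall command above.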