This repository provides implementations of StableDiffusionXLControlNetInpaintPipeline and StableDiffusionXLControlNetImg2ImgPipeline. These pipelines are not yet officially implemented in diffusers, but they enable more precise image generation and editing.
SDXL + Inpainting + ControlNet pipeline
Run the sample scripts as follows:
# for depth conditioned controlnet
python test_controlnet_inpaint_sd_xl_depth.py
# for canny image conditioned controlnet
python test_controlnet_inpaint_sd_xl_canny.py
Of course, you can also use other ControlNets trained for SDXL, such as normal map, OpenPose, etc., as sketched below.
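For example, switching to a canny-conditioned ControlNet only requires loading a different checkpoint. A minimal sketch, assuming the publicly available diffusers/controlnet-canny-sdxl-1.0 checkpoint:

import torch
from diffusers import ControlNetModel

# load a canny-conditioned SDXL ControlNet instead of the depth-conditioned one
controlnet = ControlNetModel.from_pretrained(
    "diffusers/controlnet-canny-sdxl-1.0",
    torch_dtype=torch.float16,
)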
In test_controlnet_inpaint_sd_xl_depth.py, StableDiffusionXLControlNetInpaintPipeline is used as follows. All you have to do is specify control_image and mask_image as conditions.
# construct pipeline
import torch
from diffusers import ControlNetModel, AutoencoderKL, UniPCMultistepScheduler
from pipeline_controlnet_inpaint_sd_xl import StableDiffusionXLControlNetInpaintPipeline
controlnet = ControlNetModel.from_pretrained(
    "diffusers/controlnet-depth-sdxl-1.0",
    variant="fp16",
    use_safetensors=True,
    torch_dtype=torch.float16,
)
vae = AutoencoderKL.from_pretrained("madebyollin/sdxl-vae-fp16-fix", torch_dtype=torch.float16)
pipe = StableDiffusionXLControlNetInpaintPipeline.from_pretrained(
    "stabilityai/stable-diffusion-xl-base-1.0",
    controlnet=controlnet,
    vae=vae,
    variant="fp16",
    use_safetensors=True,
    torch_dtype=torch.float16,
)
# offloads submodules to CPU and moves them to the GPU on demand,
# so an explicit .to("cuda") is neither needed nor compatible here
pipe.enable_model_cpu_offload()
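# Optionally swap in the imported UniPCMultistepScheduler for faster sampling
# (whether the elided code below actually does this is an assumption):
pipe.scheduler = UniPCMultistepScheduler.from_config(pipe.scheduler.config)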
...
# generate an image conditioned on control_image & mask_image
# (prompt, init_image, depth_image, mask_image, and controlnet_conditioning_scale
# are prepared in the elided code above)
images = pipe(
    prompt,
    image=init_image,
    control_image=depth_image,
    mask_image=mask_image,
    num_inference_steps=30,
    controlnet_conditioning_scale=controlnet_conditioning_scale,
).images
images[0].save("dogstatue.png")
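The elided preparation step is script-specific; below is a minimal sketch of how the inputs might be built. The file paths, the prompt, and the Intel/dpt-hybrid-midas depth model are illustrative assumptions, not part of this repository:

from diffusers.utils import load_image
from transformers import pipeline as depth_pipeline

# hypothetical input images; SDXL works best at 1024x1024
init_image = load_image("./dog_statue.png").resize((1024, 1024))
mask_image = load_image("./dog_statue_mask.png").resize((1024, 1024))  # white = region to inpaint

# estimate a depth map from the init image (the model choice is an assumption)
depth_estimator = depth_pipeline("depth-estimation", model="Intel/dpt-hybrid-midas")
depth_image = depth_estimator(init_image)["depth"].convert("RGB").resize((1024, 1024))

prompt = "a marble statue of a dog, high quality"  # hypothetical prompt
controlnet_conditioning_scale = 0.5  # strength of the depth conditioning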
SDXL + Img2Img + ControlNet pipeline
Sample commands:
# for depth conditioned controlnet
python test_controlnet_img2img_sd_xl_depth.py
# for canny image conditioned controlnet
python test_controlnet_img2img_sd_xl_canny.py
The usage is as follows:
# construct pipeline
import torch
from diffusers import ControlNetModel, AutoencoderKL, UniPCMultistepScheduler
from pipeline_controlnet_img2img_sd_xl import StableDiffusionXLControlNetImg2ImgPipeline
controlnet = ControlNetModel.from_pretrained(
    "diffusers/controlnet-depth-sdxl-1.0",
    torch_dtype=torch.float16,
)
vae = AutoencoderKL.from_pretrained("madebyollin/sdxl-vae-fp16-fix", torch_dtype=torch.float16)
pipe = StableDiffusionXLControlNetImg2ImgPipeline.from_pretrained(
    "stabilityai/stable-diffusion-xl-base-1.0",
    controlnet=controlnet,
    vae=vae,
    torch_dtype=torch.float16,
)
# offloads submodules to CPU and moves them to the GPU on demand,
# so an explicit .to("cuda") is neither needed nor compatible here
pipe.enable_model_cpu_offload()
...
# generate an image conditioned on control_image & init_image
# (generator, init_image, and depth_image are prepared in the elided code above)
image = pipe(
    "futuristic-looking woman",
    num_inference_steps=20,
    generator=generator,
    image=init_image,
    control_image=depth_image,
).images[0]
...
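For the canny-conditioned scripts, the control image is an edge map rather than a depth map. A minimal sketch using OpenCV (the thresholds and preprocessing are illustrative, not prescribed by this repository):

import cv2
import numpy as np
from PIL import Image

# hypothetical edge-map preparation; init_image is the PIL image to edit
arr = np.array(init_image)
edges = cv2.Canny(arr, 100, 200)             # thresholds are assumptions
edges = edges[:, :, None].repeat(3, axis=2)  # 1-channel -> 3-channel
canny_image = Image.fromarray(edges)
# pass canny_image as control_image in the pipeline call above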