Yujun Shi Chuhui Xue Jiachun Pan Wenqing Zhang Vincent Y. F. Tan Song Bai
This is a research project, NOT a commercial product.
We recommend running our code on an NVIDIA GPU under a Linux system; we have not yet tested other configurations.
To install the required libraries, simply run the following commands:
conda env create -f environment.yaml
conda activate dragdiff
Before running DragDiffusion, you might need to set up "accelerate" with the following command:
accelerate config
In all our experiments, we used a simple single-machine, single-GPU configuration for "accelerate" (no distributed training).
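The exact answers depend on your machine; the following is only a minimal sketch of a typical single-GPU setup (the prompts are paraphrased from "accelerate config", and the answers shown are assumptions rather than a prescribed configuration):

```bash
accelerate config
# Typical answers for a minimal single-GPU setup (adjust to your hardware):
#   Compute environment        -> This machine
#   Type of machine            -> No distributed training
#   Run training on CPU only?  -> NO
#   Mixed precision            -> fp16 (or "no" if your GPU does not support it)
```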
- To train a LoRA on our input image, we first put the image in a folder. Note that this folder should contain ONLY this one image.
- Then, we set "SAMPLE_DIR" and "OUTPUT_DIR" in the script "lora/train_lora.sh" to appropriate values. "SAMPLE_DIR" should be the directory containing our input image; "OUTPUT_DIR" should be the directory where we want to save the trained LoRA.
- Also, we need to set the option "--instance_prompt" in the script "lora/train_lora.sh" to an appropriate prompt. Note that this prompt does NOT have to be a complicated one. Examples of prompts (i.e., the prompts used in our Demo video) are given in "lora/samples/prompts.txt".
- Finally, after "lora/train_lora.sh" has been configured properly (a sketch of the edited lines is given below), run the following command to train a LoRA:
bash lora/train_lora.sh
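For concreteness, here is a minimal sketch of how the relevant lines of "lora/train_lora.sh" might look after editing. The directory names and the prompt below are placeholders, and the rest of the script (which passes these values to the training command) is assumed to be left unchanged:

```bash
# Sketch of an edited lora/train_lora.sh (placeholder values; variable names as described above)
SAMPLE_DIR="lora/samples/my_image"     # directory containing ONLY the one input image
OUTPUT_DIR="lora/lora_ckpt/my_image"   # directory where the trained LoRA will be saved

# ... further down in the script, the prompt option passed to the training command:
#   --instance_prompt="a photo of a mountain lake"   # a short description of the input image
```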
After training the LoRA, we can now run the following command to start the gradio user interface:
python3 drag_ui_real.py
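Assuming gradio's default settings (the UI script may configure the host or port differently), the interface is served locally and its address is printed to the terminal when it starts:

```bash
# Example terminal output when the UI starts (the port may differ on your machine):
#   Running on local URL:  http://127.0.0.1:7860
# Open this URL in a browser to perform the editing.
```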
Please refer to our Demo video to see how to do the "drag" editing.
The editing process consists of the following steps:
- Drop our input image into the left-most box.
- Draw a mask in the left-most box to specify the editable areas.
- Click handle and target points in the middle box. Also, you may reset all points by clicking "Undo point".
- Input "prompt" and "lora path". "lora path" is the directory storing our trained LoRA; "prompt" should be the same prompt we used to train our LoRA.
- Finally, click the "Run" button to run our algorithm. Edited results will be displayed in the right-most box.
Explanation of the parameters in the user interface:
| Parameter | Explanation |
| --- | --- |
| prompt | The prompt describing the user input image (this needs to be the same as the prompt used to train the LoRA). |
| lora_path | The path to the trained LoRA. |
| n_pix_step | Maximum number of motion-supervision steps. Increase this value if the handle points have not been "dragged" to the desired position. |
| lam | The regularization coefficient controlling how strongly the unmasked region is kept unchanged. Increase this value if the unmasked region has changed more than desired (does not need tuning in most cases). |
| n_actual_inference_step | Number of DDIM inversion steps performed (does not need tuning in most cases). |
This work is inspired by the amazing DragGAN. The LoRA training code is modified from an example in diffusers. Image samples are collected from Unsplash, Pexels, and Pixabay. Finally, a huge shout-out to all the amazing open-source diffusion models and libraries.
Code related to the DragDiffusion algorithm is under Apache 2.0 license.
@article{shi2023dragdiffusion,
title={DragDiffusion: Harnessing Diffusion Models for Interactive Point-based Image Editing},
author={Shi, Yujun and Xue, Chuhui and Pan, Jiachun and Zhang, Wenqing and Tan, Vincent YF and Bai, Song},
journal={arXiv preprint arXiv:2306.14435},
year={2023}
}
- Upload the trained LoRAs of our examples
- Support inputs of arbitrary size
- Integrate LoRA training into the user interface
- Explore an alternative user interface with faster response
For any questions about this project, please contact Yujun ([email protected])