Flux.1 #1331

Open · wants to merge 5 commits into base: main

55 changes: 49 additions & 6 deletions examples/stable-diffusion/README.md
@@ -20,12 +20,12 @@
This directory contains a script that showcases how to perform text-to-image generation

Stable Diffusion was proposed in [Stable Diffusion Announcement](https://stability.ai/blog/stable-diffusion-announcement) by Patrick Esser and Robin Rombach and the Stability AI team.


## Text-to-image Generation

### Single Prompt

Here is how to generate images with one prompt:

```bash
python text_to_image_generation.py \
--model_name_or_path CompVis/stable-diffusion-v1-4 \
@@ -43,10 +43,10 @@
python text_to_image_generation.py \
> The first batch of images entails a performance penalty. All subsequent batches will be generated much faster.
> You can enable this mode with `--use_hpu_graphs`.


### Multiple Prompts

Here is how to generate images with several prompts:

```bash
python text_to_image_generation.py \
--model_name_or_path CompVis/stable-diffusion-v1-4 \
@@ -61,7 +61,9 @@
python text_to_image_generation.py \
```

### Distributed inference with multiple HPUs

Here is how to generate images with two prompts on two HPUs:

```bash
python ../gaudi_spawn.py \
--world_size 2 text_to_image_generation.py \
@@ -101,10 +103,10 @@
python text_to_image_generation.py \
```

> There are two different checkpoints for Stable Diffusion 2:
>
> - use [stabilityai/stable-diffusion-2-1](https://huggingface.co/stabilityai/stable-diffusion-2-1) for generating 768x768 images
> - use [stabilityai/stable-diffusion-2-1-base](https://huggingface.co/stabilityai/stable-diffusion-2-1-base) for generating 512x512 images


### Latent Diffusion Model for 3D (LDM3D)

[LDM3D](https://arxiv.org/abs/2305.10853) generates both image and depth map data from a given text prompt, allowing users to generate RGBD images from text prompts.
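
The RGBD output can be thought of as a color image carrying a fourth depth channel. A minimal sketch of how an RGB image and a depth map combine into one RGBD array — the arrays here are toy numpy stand-ins, not actual model output, so the shapes and values are illustrative assumptions:

```python
import numpy as np

# Toy stand-ins for LDM3D output: an RGB image and a per-pixel depth map.
rgb = np.zeros((64, 64, 3), dtype=np.uint8)        # H x W x 3 color channels
depth = np.full((64, 64, 1), 128, dtype=np.uint8)  # H x W x 1 depth channel

# Concatenate along the channel axis to form a 4-channel RGBD image.
rgbd = np.concatenate([rgb, depth], axis=-1)
print(rgbd.shape)  # (64, 64, 4)
```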
@@ -127,7 +129,9 @@
python text_to_image_generation.py \
--ldm3d \
--bf16
```

Here is how to generate images and depth maps with two prompts on two HPUs:

```bash
python ../gaudi_spawn.py \
--world_size 2 text_to_image_generation.py \
@@ -146,6 +150,7 @@
python ../gaudi_spawn.py \
```

> There are three different checkpoints for LDM3D:
>
> - use [original checkpoint](https://huggingface.co/Intel/ldm3d) to generate outputs from the paper
> - use [the latest checkpoint](https://huggingface.co/Intel/ldm3d-4c) for generating improved results
> - use [the pano checkpoint](https://huggingface.co/Intel/ldm3d-pano) to generate panoramic view
@@ -155,6 +160,7 @@
python ../gaudi_spawn.py \
Stable Diffusion XL was proposed in [SDXL: Improving Latent Diffusion Models for High-Resolution Image Synthesis](https://arxiv.org/pdf/2307.01952.pdf) by the Stability AI team.

Here is how to generate SDXL images with a single prompt:

```bash
python text_to_image_generation.py \
--model_name_or_path stabilityai/stable-diffusion-xl-base-1.0 \
@@ -174,6 +180,7 @@
python text_to_image_generation.py \
> You can enable this mode with `--use_hpu_graphs`.

Here is how to generate SDXL images with several prompts:

```bash
python text_to_image_generation.py \
--model_name_or_path stabilityai/stable-diffusion-xl-base-1.0 \
@@ -191,6 +198,7 @@
python text_to_image_generation.py \
SDXL combines a second text encoder (OpenCLIP ViT-bigG/14) with the original text encoder to significantly
increase the number of parameters. Here is how to generate images with several prompts for both `prompt`
and `prompt_2` (2nd text encoder), as well as their negative prompts:

```bash
python text_to_image_generation.py \
--model_name_or_path stabilityai/stable-diffusion-xl-base-1.0 \
@@ -209,6 +217,7 @@
python text_to_image_generation.py \
```

Here is how to generate SDXL images with two prompts on two HPUs:

```bash
python ../gaudi_spawn.py \
--world_size 2 text_to_image_generation.py \
@@ -227,14 +236,17 @@
python ../gaudi_spawn.py \
--bf16 \
--distributed
```

> HPU graphs are recommended when generating images by batches to get the fastest possible generations.
> The first batch of images entails a performance penalty. All subsequent batches will be generated much faster.
> You can enable this mode with `--use_hpu_graphs`.
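
The `--distributed` flag above splits the prompt list across the spawned workers. A simplified sketch of the contiguous chunking involved — the script itself relies on `accelerate`'s `PartialState.split_between_processes`, which also handles padding and uneven splits, so treat this as an approximation:

```python
# Each rank receives a contiguous slice of roughly equal size.
def split_prompts(prompts, world_size, rank):
    per_rank = -(-len(prompts) // world_size)  # ceiling division
    return prompts[rank * per_rank:(rank + 1) * per_rank]

prompts = ["Sailing ship in storm by Rembrandt", "An astronaut riding a green horse"]
print(split_prompts(prompts, 2, 0))  # ['Sailing ship in storm by Rembrandt']
print(split_prompts(prompts, 2, 1))  # ['An astronaut riding a green horse']
```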

### SDXL-Turbo

SDXL-Turbo is a distilled version of SDXL 1.0, trained for real-time synthesis.

Here is how to generate images with multiple prompts:

```bash
python text_to_image_generation.py \
--model_name_or_path stabilityai/sdxl-turbo \
@@ -267,11 +279,13 @@
Before running SD3 pipeline, you need to:

1. Agree to the Terms and Conditions for using the SD3 model at the [HuggingFace model page](https://huggingface.co/stabilityai/stable-diffusion-3-medium)
2. Authenticate with HuggingFace using your HF Token. For authentication, run:

```bash
huggingface-cli login
```

Here is how to generate SD3 images with a single prompt:

```bash
PT_HPU_MAX_COMPOUND_OP_SIZE=1 \
python text_to_image_generation.py \
@@ -291,12 +305,32 @@
python text_to_image_generation.py \
> For improved performance of the SD3 pipeline on Gaudi, it is recommended to configure the environment
> by setting `PT_HPU_MAX_COMPOUND_OP_SIZE` to 1.

### FLUX.1

FLUX.1 was introduced by Black Forest Labs [here](https://blackforestlabs.ai/announcing-black-forest-labs/).

Here is how to generate FLUX.1 images with a single prompt:

```bash
python text_to_image_generation.py \
--model_name_or_path black-forest-labs/FLUX.1-schnell \
--prompts "A cat holding a sign that says hello world" \
--num_images_per_prompt 10 \
--batch_size 1 \
--num_inference_steps 28 \
--image_save_dir /tmp/flux_1_images \
--scheduler flow_match_euler_discrete \
--use_habana \
--use_hpu_graphs \
--gaudi_config Habana/stable-diffusion \
--bf16
```
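
As added in `text_to_image_generation.py` in this PR, the script routes to the Flux pipeline by substring-matching the model name against known FLUX.1 variants; a minimal reproduction of that detection logic:

```python
# Mirrors the pipeline-selection check added in text_to_image_generation.py:
# the Flux path is taken when the model name contains a known FLUX.1 variant.
flux_models = ["FLUX.1-dev", "FLUX.1-schnell"]

def is_flux(model_name_or_path):
    return any(model in model_name_or_path for model in flux_models)

print(is_flux("black-forest-labs/FLUX.1-schnell"))          # True
print(is_flux("stabilityai/stable-diffusion-xl-base-1.0"))  # False
```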

## ControlNet

ControlNet was introduced in [Adding Conditional Control to Text-to-Image Diffusion Models](https://huggingface.co/papers/2302.05543) by Lvmin Zhang and Maneesh Agrawala.
It is a type of model for controlling Stable Diffusion by conditioning it on an additional input image.
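
A canny control image is just an edge map of a reference picture. A dependency-free sketch using a gradient-magnitude threshold — real pipelines typically use `cv2.Canny`, and the toy input image here is an assumption for illustration:

```python
import numpy as np

# Toy grayscale image containing a bright square; the canny ControlNet
# conditions generation on an edge map like the one computed below.
img = np.zeros((32, 32), dtype=float)
img[8:24, 8:24] = 1.0

gy, gx = np.gradient(img)                       # per-axis intensity gradients
edges = np.hypot(gx, gy) > 0.25                 # threshold gradient magnitude
control_image = (edges * 255).astype(np.uint8)  # 0/255 edge map

print(control_image.shape, control_image.max())  # (32, 32) 255
```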

Here is how to generate images conditioned by the canny edge model:

```bash
pip install -r requirements.txt
python text_to_image_generation.py \
@@ -314,6 +348,7 @@
python text_to_image_generation.py \
```

Here is how to generate images conditioned by the canny edge model and with multiple prompts:

```bash
pip install -r requirements.txt
python text_to_image_generation.py \
@@ -331,6 +366,7 @@
python text_to_image_generation.py \
```

Here is how to generate images conditioned by the canny edge model and with two prompts on two HPUs:

```bash
pip install -r requirements.txt
python ../gaudi_spawn.py \
@@ -350,6 +386,7 @@
python ../gaudi_spawn.py \
```

Here is how to generate images conditioned by the open pose model:

```bash
pip install -r requirements.txt
python text_to_image_generation.py \
@@ -368,6 +405,7 @@
python text_to_image_generation.py \
```

Here is how to generate images conditioned by the canny edge model using Stable Diffusion 2:

```bash
pip install -r requirements.txt
python text_to_image_generation.py \
@@ -392,6 +430,7 @@
Inpainting replaces or edits specific areas of an image. For more details,
please refer to [Hugging Face Diffusers doc](https://huggingface.co/docs/diffusers/en/using-diffusers/inpaint).
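
Inpainting pipelines take the base image plus a single-channel mask in which white pixels mark the region to regenerate and black pixels are kept. A toy numpy sketch of such a mask — the sizes are illustrative assumptions:

```python
import numpy as np

# Single-channel inpainting mask: 255 = repaint, 0 = keep.
mask = np.zeros((64, 64), dtype=np.uint8)
mask[16:48, 16:48] = 255  # select a centered 32x32 square for repainting

repaint = int((mask == 255).sum())
keep = int((mask == 0).sum())
print(repaint, keep)  # 1024 3072
```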

### Stable Diffusion Inpainting

```bash
python text_to_image_generation.py \
--model_name_or_path stabilityai/stable-diffusion-2-inpainting \
@@ -409,6 +448,7 @@
python text_to_image_generation.py \
```

### Stable Diffusion XL Inpainting

```bash
python text_to_image_generation.py \
--model_name_or_path diffusers/stable-diffusion-xl-1.0-inpainting-0.1 \
@@ -455,10 +495,10 @@
python image_to_image_generation.py \
> The first batch of images entails a performance penalty. All subsequent batches will be generated much faster.
> You can enable this mode with `--use_hpu_graphs`.


### Multiple Prompts

Here is how to generate images with several prompts and one image:

```bash
pip install -r requirements.txt
python image_to_image_generation.py \
@@ -481,10 +521,10 @@
python image_to_image_generation.py \
> The first batch of images entails a performance penalty. All subsequent batches will be generated much faster.
> You can enable this mode with `--use_hpu_graphs`.


### Stable Diffusion XL Refiner

Here is how to generate SDXL images with a single prompt and one image:

```bash
pip install -r requirements.txt
python image_to_image_generation.py \
@@ -505,6 +545,7 @@
python image_to_image_generation.py \
### Stable Diffusion Image Variations

Here is how to generate image variations from a single input image; this pipeline does not accept a text prompt:

```bash
pip install -r requirements.txt
python image_to_image_generation.py \
@@ -565,6 +606,7 @@
Script `image_to_video_generation.py` showcases how to perform image-to-video generation
### Single Image Prompt

Here is how to generate video with one image prompt:

```bash
PT_HPU_MAX_COMPOUND_OP_SIZE=1 \
python image_to_video_generation.py \
@@ -585,6 +627,7 @@
python image_to_video_generation.py \
### Multiple Image Prompts

Here is how to generate videos with several image prompts:

```bash
PT_HPU_MAX_COMPOUND_OP_SIZE=1 \
python image_to_video_generation.py \
34 changes: 30 additions & 4 deletions examples/stable-diffusion/text_to_image_generation.py
@@ -26,6 +26,7 @@
GaudiDDIMScheduler,
GaudiEulerAncestralDiscreteScheduler,
GaudiEulerDiscreteScheduler,
GaudiFlowMatchEulerDiscreteScheduler,
)
from optimum.habana.utils import set_seed

@@ -65,7 +66,7 @@ def main():
parser.add_argument(
"--scheduler",
default="ddim",
choices=["default", "euler_discrete", "euler_ancestral_discrete", "ddim", "flow_match_euler_discrete"],
type=str,
help="Name of scheduler",
)
@@ -275,13 +276,16 @@ def main():
# Select stable diffusion pipeline based on input
sdxl_models = ["stable-diffusion-xl", "sdxl"]
sd3_models = ["stable-diffusion-3"]
flux_models = ["FLUX.1-dev", "FLUX.1-schnell"]
sdxl = True if any(model in args.model_name_or_path for model in sdxl_models) else False
sd3 = True if any(model in args.model_name_or_path for model in sd3_models) else False
flux = True if any(model in args.model_name_or_path for model in flux_models) else False
controlnet = True if args.control_image is not None else False
inpainting = True if (args.base_image is not None) and (args.mask_image is not None) else False

# Set the scheduler
kwargs = {"timestep_spacing": args.timestep_spacing}

if args.scheduler == "euler_discrete":
scheduler = GaudiEulerDiscreteScheduler.from_pretrained(
args.model_name_or_path, subfolder="scheduler", **kwargs
@@ -292,6 +296,10 @@
)
elif args.scheduler == "ddim":
scheduler = GaudiDDIMScheduler.from_pretrained(args.model_name_or_path, subfolder="scheduler", **kwargs)
elif args.scheduler == "flow_match_euler_discrete":
scheduler = GaudiFlowMatchEulerDiscreteScheduler.from_pretrained(
args.model_name_or_path, subfolder="scheduler", **kwargs
)
else:
scheduler = None

@@ -340,16 +348,18 @@
negative_prompts = negative_prompt
kwargs_call["negative_prompt"] = negative_prompts

if sdxl or sd3 or flux:
prompts_2 = args.prompts_2
if args.distributed and args.prompts_2 is not None:
with distributed_state.split_between_processes(args.prompts_2) as prompt_2:
prompts_2 = prompt_2
kwargs_call["prompt_2"] = prompts_2

if sdxl or sd3:
negative_prompts_2 = args.negative_prompts_2
if args.distributed and args.negative_prompts_2 is not None:
with distributed_state.split_between_processes(args.negative_prompts_2) as negative_prompt_2:
negative_prompts_2 = negative_prompt_2
kwargs_call["negative_prompt_2"] = negative_prompts_2

if sd3:
@@ -428,6 +438,22 @@
args.model_name_or_path,
**kwargs,
)
elif flux:
# Flux pipelines
if controlnet:
# Import Flux+ControlNet pipeline
raise ValueError("Flux+ControlNet pipeline is not currently supported")
elif inpainting:
# Import Flux Inpainting pipeline
raise ValueError("Flux Inpainting pipeline is not currently supported")
else:
# Import Flux pipeline
from optimum.habana.diffusers import GaudiFluxPipeline

pipeline = GaudiFluxPipeline.from_pretrained(
args.model_name_or_path,
**kwargs,
)

else:
# SD pipelines (SD1.x, SD2.x)
3 changes: 2 additions & 1 deletion optimum/habana/diffusers/__init__.py
@@ -1,6 +1,7 @@
from .pipelines.auto_pipeline import AutoPipelineForInpainting, AutoPipelineForText2Image
from .pipelines.controlnet.pipeline_controlnet import GaudiStableDiffusionControlNetPipeline
from .pipelines.ddpm.pipeline_ddpm import GaudiDDPMPipeline
from .pipelines.flux.pipeline_flux import GaudiFluxPipeline
from .pipelines.pipeline_utils import GaudiDiffusionPipeline
from .pipelines.stable_diffusion.pipeline_stable_diffusion import GaudiStableDiffusionPipeline
from .pipelines.stable_diffusion.pipeline_stable_diffusion_depth2img import GaudiStableDiffusionDepth2ImgPipeline
@@ -20,4 +21,4 @@
from .pipelines.stable_diffusion_xl.pipeline_stable_diffusion_xl_inpaint import GaudiStableDiffusionXLInpaintPipeline
from .pipelines.stable_video_diffusion.pipeline_stable_video_diffusion import GaudiStableVideoDiffusionPipeline
from .pipelines.text_to_video_synthesis.pipeline_text_to_video_synth import GaudiTextToVideoSDPipeline
from .schedulers import GaudiDDIMScheduler, GaudiEulerAncestralDiscreteScheduler, GaudiEulerDiscreteScheduler, GaudiFlowMatchEulerDiscreteScheduler
4 changes: 4 additions & 0 deletions optimum/habana/diffusers/pipelines/auto_pipeline.py
@@ -33,6 +33,8 @@
from .stable_diffusion.pipeline_stable_diffusion_inpaint import GaudiStableDiffusionInpaintPipeline
from .stable_diffusion_xl.pipeline_stable_diffusion_xl import GaudiStableDiffusionXLPipeline
from .stable_diffusion_xl.pipeline_stable_diffusion_xl_inpaint import GaudiStableDiffusionXLInpaintPipeline
from .stable_diffusion_3.pipeline_stable_diffusion_3 import GaudiStableDiffusion3Pipeline
from .flux.pipeline_flux import GaudiFluxPipeline


GAUDI_PREFIX_NAME = "Gaudi"
@@ -42,6 +44,8 @@
("stable-diffusion", GaudiStableDiffusionPipeline),
("stable-diffusion-xl", GaudiStableDiffusionXLPipeline),
("stable-diffusion-controlnet", GaudiStableDiffusionControlNetPipeline),
("stable-diffusion-3", GaudiStableDiffusion3Pipeline),
("flux", GaudiFluxPipeline),
]
)
