Porting Stable Video Diffusion ControlNet to HPU #1037

Merged 3 commits on Oct 3, 2024

Conversation

wenbinc-Bin
Contributor

Enable Stable-Video-Diffusion ControlNet on Gaudi

@wenbinc-Bin wenbinc-Bin requested a review from regisss as a code owner June 4, 2024 04:50
@wenbinc-Bin wenbinc-Bin marked this pull request as draft June 4, 2024 04:50
@emascarenhas
Contributor

Please sync your PR with main/upstream and fix any merge conflicts. Thank you.

@wenbinc-Bin wenbinc-Bin marked this pull request as ready for review September 4, 2024 02:45
@yafshar
Contributor

yafshar commented Sep 6, 2024

@dsocek have you reviewed this PR? If you are done, I can start.

if isinstance(args.image_path, str):
    args.image_path = [args.image_path]
for image_path in args.image_path:
    print(image_path)
Contributor

Suggested change (remove the debug print)
- print(image_path)

Contributor Author

Done, Thanks.

type=int,
default=25,
help="The number of video frames to generate."
)
args = parser.parse_args()

from optimum.habana.diffusers import GaudiStableVideoDiffusionPipeline
Contributor

@yafshar yafshar Sep 6, 2024

@wenbinc-Bin can you move the import to the top?

Contributor Author

Done, Thanks.

from optimum.habana.diffusers import GaudiStableVideoDiffusionPipelineControlNet
from optimum.habana.diffusers.models import ControlNetSDVModel
from optimum.habana.diffusers.models import UNetSpatioTemporalConditionControlNetModel
controlnet = controlnet = ControlNetSDVModel.from_pretrained(
Contributor

Suggested change
- controlnet = controlnet = ControlNetSDVModel.from_pretrained(
+ controlnet = ControlNetSDVModel.from_pretrained(

Contributor Author

Done, Thanks.

Comment on lines 249 to 250
# Set seed before running the model
set_seed(args.seed)
Contributor

Suggested change
- # Set seed before running the model
- set_seed(args.seed)

Isn't it better to set the random seed before loading the models, ahead of the conditional?
Personal suggestion: I think this ensures that any randomness involved in the model loading process (such as weight initialization for certain layers) is controlled, leading to reproducible results.
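
To illustrate the point about seed placement, here is a minimal sketch that uses only the standard library. `load_model` is a hypothetical stand-in for a loader that randomly initializes some weights, and `random.seed` stands in for `set_seed`; neither is taken from this PR.

```python
import random


def load_model(n_weights=4):
    # Hypothetical stand-in for a loader that randomly initializes some layers.
    return [random.gauss(0.0, 1.0) for _ in range(n_weights)]


def run(seed, seed_before_load):
    """Return (weights, first_noise_sample) for one simulated run."""
    if seed_before_load:
        random.seed(seed)  # seed first, so loading is covered by the seed
        weights = load_model()
    else:
        weights = load_model()  # loading draws from whatever state the RNG is in
        random.seed(seed)  # the seed only covers what happens afterwards
    noise = random.random()  # stand-in for sampling the initial latents
    return weights, noise


# Seeding before loading makes the whole run repeatable.
print(run(0, seed_before_load=True) == run(0, seed_before_load=True))
# Seeding after loading still fixes the sampled noise, but the weights
# depend on the RNG state at the time of loading.
print(run(0, seed_before_load=False)[1] == run(0, seed_before_load=False)[1])
```

If a real loader never draws random numbers, the two placements behave the same; seeding first is simply the safer default.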

Contributor Author

Done, Thanks.

Comment on lines 276 to 277
# Set seed before running the model
set_seed(args.seed)
Contributor

Suggested change
- # Set seed before running the model
- set_seed(args.seed)

Same as above!

Contributor Author

Done, Thanks.

"--controlnet_model_name_or_path",
default="CiaraRowles/temporal-controlnet-depth-svd-v1",
type=str,
help="Path to pre-trained controlnet model",
Contributor

Suggested change
- help="Path to pre-trained controlnet model",
+ help="Path to pre-trained controlnet model.",

Contributor Author

Done, Thanks.

type=str,
default=None,
nargs="*",
help="Path to controlnet input image(s) to guide video generation",
Contributor

Suggested change
- help="Path to controlnet input image(s) to guide video generation",
+ help="Path to controlnet input image(s) to guide video generation.",

Contributor Author

Done, Thanks.

@yafshar
Contributor

yafshar commented Sep 6, 2024

@wenbinc-Bin can you check your example in the README? I am unable to run the command and am getting an ImportError:

>>> python -c "from optimum.habana.diffusers import GaudiEulerDiscreteScheduler"
/usr/local/lib/python3.10/dist-packages/diffusers/models/vq_model.py:20: FutureWarning: `VQEncoderOutput` is deprecated and will be removed in version 0.31. Importing `VQEncoderOutput` from `diffusers.models.vq_model` is deprecated and this will be removed in a future version. Please use `from diffusers.models.autoencoders.vq_model import VQEncoderOutput`, instead.
  deprecate("VQEncoderOutput", "0.31", deprecation_message)
/usr/local/lib/python3.10/dist-packages/diffusers/models/vq_model.py:25: FutureWarning: `VQModel` is deprecated and will be removed in version 0.31. Importing `VQModel` from `diffusers.models.vq_model` is deprecated and this will be removed in a future version. Please use `from diffusers.models.autoencoders.vq_model import VQModel`, instead.
  deprecate("VQModel", "0.31", deprecation_message)
Traceback (most recent call last):
  File "<string>", line 1, in <module>
  File "/root/optimum-habana/optimum/habana/diffusers/__init__.py", line 21, in <module>
    from .pipelines.controlnet.pipeline_stable_video_diffusion_controlnet import GaudiStableVideoDiffusionPipelineControlNet
  File "/root/optimum-habana/optimum/habana/diffusers/pipelines/controlnet/pipeline_stable_video_diffusion_controlnet.py", line 25, in <module>
    from diffusers.pipelines.stable_video_diffusion.pipeline_stable_video_diffusion import (
ImportError: cannot import name 'tensor2vid' from 'diffusers.pipelines.stable_video_diffusion.pipeline_stable_video_diffusion' (/usr/local/lib/python3.10/dist-packages/diffusers/pipelines/stable_video_diffusion/pipeline_stable_video_diffusion.py)

It looks like tensor2vid has been replaced. See huggingface/diffusers#9254.
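
One common way to tolerate a symbol that moves or disappears between library releases is to try several import locations in order. The helper below is a hypothetical sketch, not code from this PR, and it is demonstrated on the standard library so that it runs anywhere. The actual replacement for tensor2vid should be taken from the linked diffusers PR rather than guessed.

```python
import importlib


def import_first_available(candidates):
    """Return the first attribute that can be imported from (module, name) pairs.

    Hypothetical helper: `candidates` is an ordered list of
    (module_name, attribute_name) tuples, newest location first.
    """
    errors = []
    for module_name, attr_name in candidates:
        try:
            module = importlib.import_module(module_name)
            return getattr(module, attr_name)
        except (ImportError, AttributeError) as exc:
            # Remember why this candidate failed and try the next one.
            errors.append(f"{module_name}.{attr_name}: {exc}")
    raise ImportError("none of the candidates could be imported:\n" + "\n".join(errors))


# Demonstration with the standard library: the first location does not exist,
# so the helper falls back to the second one.
sqrt = import_first_available([("math", "no_such_function"), ("math", "sqrt")])
print(sqrt(9.0))
```

Pinning a known-good version, as was eventually done here with diffusers==0.29.2, is the simpler alternative when only one release needs to be supported.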

@yafshar
Contributor

yafshar commented Sep 6, 2024

@wenbinc-Bin, please also run make style and fix the issues!

@yafshar
Contributor

yafshar commented Sep 6, 2024

@wenbinc-Bin please fix the port, README and style, so I can continue reviewing the PR

@wenbinc-Bin
Contributor Author

It works on diffusers==0.29.2 now.

@wenbinc-Bin
Contributor Author

Also fixed the issue reported by 'make style'.

down_block_additional_residuals: Optional[Tuple[torch.Tensor]] = None,
mid_block_additional_residual: Optional[torch.Tensor] = None,
return_dict: bool = True,
added_time_ids: torch.Tensor = None,
Contributor

Suggested change
- added_time_ids: torch.Tensor = None,
+ added_time_ids: Optional[torch.Tensor] = None,

Contributor Author

Done, Thanks

def __call__(
    self,
    image: Union[PIL.Image.Image, List[PIL.Image.Image], torch.FloatTensor],
    controlnet_condition: [torch.FloatTensor] = None,
Contributor

Suggested change
- controlnet_condition: [torch.FloatTensor] = None,
+ controlnet_condition: Optional[torch.FloatTensor] = None,

Contributor Author

Done, Thanks.
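
The two annotation fixes above can be checked with a small sketch. It assumes a placeholder `FloatTensor` class in place of `torch.FloatTensor` so that it runs without third-party packages; the function names `before` and `after` are illustrative only.

```python
from typing import Optional


class FloatTensor:
    """Placeholder for torch.FloatTensor so the sketch needs no third-party imports."""


def before(controlnet_condition: [FloatTensor] = None):
    return controlnet_condition


def after(controlnet_condition: Optional[FloatTensor] = None):
    return controlnet_condition


# The original annotation is an ordinary list object, not a type.
print(type(before.__annotations__["controlnet_condition"]))
# The suggested annotation is a typing construct meaning "FloatTensor or None".
print(after.__annotations__["controlnet_condition"] == Optional[FloatTensor])
```

Both functions behave identically at runtime, since Python does not enforce annotations; the difference matters to type checkers and to readers of the signature.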

@yafshar
Contributor

yafshar commented Sep 10, 2024

@wenbinc-Bin is there a reason you have removed some functionality from pipeline_stable_video_diffusion_controlnet.py compared to the original version? https://github.com/CiaraStrawberry/svd-temporal-controlnet/blob/765cd95c3659c54593ae36a9616121f00b3d7c29/pipeline/pipeline_stable_video_diffusion_controlnet.py#L99

I see some differences and would appreciate it if you could clarify them.

@yafshar
Contributor

yafshar commented Sep 10, 2024

@wenbinc-Bin thanks for this contribution. Would you please add a test for this PR? If you can add tests, we can wrap up this PR.

@@ -1,5 +1,8 @@
from .pipelines.auto_pipeline import AutoPipelineForInpainting, AutoPipelineForText2Image
from .pipelines.controlnet.pipeline_controlnet import GaudiStableDiffusionControlNetPipeline
from .pipelines.controlnet.pipeline_stable_video_diffusion_controlnet import (
GaudiStableVideoDiffusionPipelineControlNet,
Contributor

Suggested change
- GaudiStableVideoDiffusionPipelineControlNet,
+ GaudiStableVideoDiffusionControlNetPipeline,

Just a personal suggestion, to be consistent with other names such as GaudiStableDiffusionControlNetPipeline.

Contributor Author

Done, thanks.

logger = logging.get_logger(__name__) # pylint: disable=invalid-name


class GaudiStableVideoDiffusionPipelineControlNet(GaudiStableVideoDiffusionPipeline):
Contributor

Suggested change
- class GaudiStableVideoDiffusionPipelineControlNet(GaudiStableVideoDiffusionPipeline):
+ class GaudiStableVideoDiffusionControlNetPipeline(GaudiStableVideoDiffusionPipeline):

Just a personal suggestion, to be consistent with other names such as GaudiStableDiffusionControlNetPipeline.

Contributor Author

Done, thanks.

**kwargs,
)
if args.control_image_path is not None:
    from optimum.habana.diffusers import GaudiStableVideoDiffusionPipelineControlNet
Contributor

Suggested change
- from optimum.habana.diffusers import GaudiStableVideoDiffusionPipelineControlNet
+ from optimum.habana.diffusers import GaudiStableVideoDiffusionControlNetPipeline

Just a personal suggestion, to be consistent with other names such as GaudiStableDiffusionControlNetPipeline.

Contributor Author

Done, thanks.

set_seed(args.seed)
controlnet = ControlNetSDVModel.from_pretrained(args.controlnet_model_name_or_path, subfolder="controlnet")
unet = UNetSpatioTemporalConditionControlNetModel.from_pretrained(args.model_name_or_path, subfolder="unet")
pipeline = GaudiStableVideoDiffusionPipelineControlNet.from_pretrained(
Contributor

Suggested change
- pipeline = GaudiStableVideoDiffusionPipelineControlNet.from_pretrained(
+ pipeline = GaudiStableVideoDiffusionControlNetPipeline.from_pretrained(

Contributor Author

Done, thanks.

@yafshar
Contributor

yafshar commented Sep 10, 2024

@wenbinc-Bin the naming change is just a personal suggestion, to be consistent with other names such as GaudiStableDiffusionControlNetPipeline. If you do not agree, please ignore the changes. Thanks!

@yafshar
Contributor

yafshar commented Sep 11, 2024

@wenbinc-Bin would you please respond to the comments so we can finish this PR faster?

@wenbinc-Bin
Contributor Author

wenbinc-Bin commented Sep 12, 2024

> @wenbinc-Bin is there a reason you have removed some functionality from pipeline_stable_video_diffusion_controlnet.py compared to the original version? https://github.com/CiaraStrawberry/svd-temporal-controlnet/blob/765cd95c3659c54593ae36a9616121f00b3d7c29/pipeline/pipeline_stable_video_diffusion_controlnet.py#L99
>
> I see some differences and would appreciate it if you could clarify them.

These functions also exist in the base class "StableVideoDiffusionPipeline" and are essentially the same, so I removed them to reduce redundant code.
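
The design choice described here can be sketched as follows. The class and method names are hypothetical stand-ins, not the real pipeline classes, and `check_inputs` is only an assumed example of a helper that lives in the base class.

```python
class BasePipeline:
    # Hypothetical stand-in for the base video pipeline.
    def check_inputs(self, height, width):
        if height % 8 != 0 or width % 8 != 0:
            raise ValueError("height and width must be divisible by 8")

    def __call__(self, height, width):
        self.check_inputs(height, width)
        return f"base output {height}x{width}"


class ControlNetPipeline(BasePipeline):
    # Only the behavior that differs is overridden; check_inputs is inherited
    # instead of being copied into the subclass.
    def __call__(self, height, width, control=None):
        self.check_inputs(height, width)
        return f"controlnet output {height}x{width} (control={control})"


pipe = ControlNetPipeline()
print(pipe(64, 64, control="depth"))
# The subclass reuses the very same function object as the base class.
print(ControlNetPipeline.check_inputs is BasePipeline.check_inputs)
```

The trade-off is that the subclass now depends on the base class keeping those helpers compatible across releases.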

@wenbinc-Bin
Contributor Author

> @wenbinc-Bin the naming change is just a personal suggestion, to be consistent with other names such as GaudiStableDiffusionControlNetPipeline. If you do not agree, please ignore the changes. Thanks!

I agree to change the name. Thanks for your advice.

@wenbinc-Bin
Contributor Author

> @wenbinc-Bin would you please respond to the comments so we can finish this PR faster?

Sorry, it took me some time to add a test case. I was not familiar with this part before.

@wenbinc-Bin
Contributor Author

> @wenbinc-Bin thanks for this contribution. Would you please add a test for this PR? If you can add tests, we can wrap up this PR.

I added a test case and updated the PR.

Contributor

@yafshar yafshar left a comment

Thanks for the nice contribution.

LGTM!

@regisss this PR is ready, please check it.

@yafshar
Contributor

yafshar commented Sep 18, 2024

@libinta would you please label this PR?

@libinta libinta added the run-test Run CI for PRs from external contributors label Sep 18, 2024

The code quality check failed, please run make style.

@HuggingFaceDocBuilderDev

The docs for this PR live here. All of your documentation changes will be reflected on that endpoint. The docs are available until 30 days after the last update.

@regisss regisss merged commit d613e06 into huggingface:main Oct 3, 2024
4 checks passed
7 participants