The right way of using SD XL motion model to get good quality output #382

Open

iyinchao opened this issue Sep 10, 2024 · 4 comments

iyinchao commented Sep 10, 2024

Hi, first of all, I'm very grateful for this wonderful work; AnimateDiff is really awesome 👍

I've been stuck on a quality issue for several days when using the SDXL motion model. Although the motion is very nice, the video quality seems quite low; it looks pixelated or downscaled. Here is a comparison of an SDXL image and an AnimateDiff frame:

[Attached comparison: original image by Animagine XL vs. AnimateDiff SDXL frame]

These two images use the same size configuration. I'm using the ComfyUI workflow adapted from https://civitai.com/articles/2950, with the Animagine XL V3.1 model and VAE (you can save the image below and import it into ComfyUI):

[Attached: ComfyUI workflow image]

I tried different numbers of steps, width/height settings, samplers, and guidance values, but had no luck.

I know the SDXL motion model is still in beta, but I can't get results as good as the example in the README. Is there anything I'm doing wrong here 😢 Could anyone show the right way of using the SDXL model? Thank you in advance.
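For reference, here is roughly what my ComfyUI graph does, translated into a minimal diffusers sketch (the model IDs, prompts, and parameter values below are illustrative, not my exact settings):

```python
import torch
from diffusers import AnimateDiffSDXLPipeline, DDIMScheduler
from diffusers.models import MotionAdapter
from diffusers.utils import export_to_gif

# The SDXL beta motion adapter this issue is about
adapter = MotionAdapter.from_pretrained(
    "guoyww/animatediff-motion-adapter-sdxl-beta", torch_dtype=torch.float16
)

# Swap in your SDXL checkpoint here (I'm using Animagine XL V3.1 in ComfyUI)
model_id = "stabilityai/stable-diffusion-xl-base-1.0"
scheduler = DDIMScheduler.from_pretrained(
    model_id, subfolder="scheduler",
    clip_sample=False, timestep_spacing="linspace", beta_schedule="linear",
)
pipe = AnimateDiffSDXLPipeline.from_pretrained(
    model_id, motion_adapter=adapter, scheduler=scheduler, torch_dtype=torch.float16
).to("cuda")

output = pipe(
    prompt="1girl, solo, upper body, night city, masterpiece, best quality",
    negative_prompt="low quality, worst quality",
    width=1024,
    height=1024,
    num_frames=16,
    num_inference_steps=25,
    guidance_scale=7.0,
)
export_to_gif(output.frames[0], "animatediff_sdxl.gif")
```

Even with this standard setup, the frames come out noticeably softer than a plain SDXL still at the same resolution and prompt.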

F0xbite commented Sep 10, 2024

You're not doing anything wrong. The SDXL beta motion model is just pure garbage. We're all in the same boat with these kinds of XL results. I tried experimenting with video upscaling, but even then the quality of the results was just not as good as what we get from the 1.5 v3 motion model. If I had any understanding of how, I would train my own.

I worked around this by making a hybrid XL/SD1.5 workflow that generates an image with XL and then animates it with a 1.5 IP-Adapter. The detail isn't the same as XL, but the quality of the animation itself is far better. I'm attaching a comparison of two animations using the same parameters, along with the source image; a rough code sketch of the idea follows the images.
[Attached: XL result animation, hybrid XL/1.5 result animation, and the source image]
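In case it helps, here's a rough diffusers sketch of the two-stage hybrid idea (a sketch only; the model IDs, IP-Adapter scale, and prompt are placeholders rather than my exact ComfyUI settings):

```python
import torch
from diffusers import AnimateDiffPipeline, DDIMScheduler, StableDiffusionXLPipeline
from diffusers.models import MotionAdapter
from diffusers.utils import export_to_gif

prompt = "a red fox standing in a sunlit forest, highly detailed"

# Stage 1: render a high-detail still with SDXL
sdxl = StableDiffusionXLPipeline.from_pretrained(
    "stabilityai/stable-diffusion-xl-base-1.0", torch_dtype=torch.float16
).to("cuda")
still = sdxl(prompt=prompt, num_inference_steps=30).images[0]
sdxl.to("cpu")  # free VRAM for the animation stage

# Stage 2: animate with the SD1.5 v3 motion model, steering it with the
# XL still via IP-Adapter
adapter = MotionAdapter.from_pretrained(
    "guoyww/animatediff-motion-adapter-v1-5-3", torch_dtype=torch.float16
)
sd15_model = "emilianJR/epiCRealism"  # any SD1.5 checkpoint works here
pipe = AnimateDiffPipeline.from_pretrained(
    sd15_model, motion_adapter=adapter, torch_dtype=torch.float16
)
pipe.scheduler = DDIMScheduler.from_pretrained(
    sd15_model, subfolder="scheduler",
    clip_sample=False, beta_schedule="linear", timestep_spacing="linspace",
)
pipe.load_ip_adapter(
    "h94/IP-Adapter", subfolder="models", weight_name="ip-adapter_sd15.bin"
)
pipe.set_ip_adapter_scale(0.7)  # how strongly the XL still drives the look
pipe.to("cuda")

frames = pipe(
    prompt=prompt,
    ip_adapter_image=still,
    num_frames=16,
    num_inference_steps=25,
    guidance_scale=7.5,
).frames[0]
export_to_gif(frames, "hybrid_xl_sd15.gif")
```

The IP-Adapter scale is the main knob: higher values keep more of the XL detail but constrain the motion, while lower values animate more freely but drift from the source image.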

@iyinchao (Author)

@F0xbite Thank you for the information!
I also tried the AnimateDiff 1.5/v2 motion models, which are way better. Your solution is very enlightening 👍 I'm not going to waste time on the SDXL model.
BTW, is there any other motion model that works better with SDXL?

F0xbite commented Sep 11, 2024

Glad to help. The only other one that I know of is HotshotXL. Hotshot does have better visual quality, but it's limited to a max of 8 rendered frames, and I don't think it's possible to loop context, both of which are huge caveats for me. Also, the quality of the motion seems rather poor and distorted in my testing, but that's just my opinion.

There's also SVD, but it's strictly an image-to-video model with no prompting and basically no control over motion.

So unfortunately, I don't know of a better solution than the hybrid system I'm using now, until a better motion model is trained for XL or the Flux team releases some kind of text2video model. But I'm sure that's bound to change at some point.

@biswaroop1547

Thanks a lot for sharing this @F0xbite! I'd love to use your hybrid workflow above if you could share it 🔥
