About VAE adaption #117

Open
GengDavid opened this issue Sep 14, 2024 · 1 comment

Comments

@GengDavid

Hi @yunkchen, I noticed you use 10M images from the SAM dataset to adapt HunyuanDiT with the VAE. However, SAM images do not have text annotations. How do you use them to adapt HunyuanDiT? I think training without captions may degrade the model's text-to-image (T2I) ability.

@GengDavid
Author

Hi @yunkchen, I need to follow up on the VAE adaption. I tried to fine-tune the transformer at 256x256, but HunyuanDiT does not seem to provide any initialization for resolution 256. After training, the image generation quality is severely degraded at both resolution 256 and resolution 1024. Do you have any suggestions? Thanks.
