Hi, @yunkchen. I noticed you use 10M images from the SAM dataset to adapt HunyuanDiT to the new VAE. However, SAM images do not come with text annotations. How do you use them to adapt HunyuanDiT? I am concerned this may degrade its T2I ability.
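For context, one common workaround for caption-less data such as SAM (this is an assumption on my part, not necessarily what the authors did) is to auto-caption the images with an off-the-shelf captioner before using them for T2I adaptation. A minimal sketch with BLIP from `transformers`; the model name and file path are placeholders:

```python
# Minimal sketch: auto-caption SAM images with BLIP so they can be paired with text.
# Not the authors' pipeline; model choice and paths are assumptions.
from PIL import Image
from transformers import BlipProcessor, BlipForConditionalGeneration

processor = BlipProcessor.from_pretrained("Salesforce/blip-image-captioning-base")
model = BlipForConditionalGeneration.from_pretrained("Salesforce/blip-image-captioning-base")

def caption(image_path: str) -> str:
    """Generate a short caption for one image."""
    image = Image.open(image_path).convert("RGB")
    inputs = processor(images=image, return_tensors="pt")
    out = model.generate(**inputs, max_new_tokens=30)
    return processor.decode(out[0], skip_special_tokens=True)

print(caption("sam_image_000001.jpg"))  # hypothetical file name
```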
Hi @yunkchen, I need to contact you again about the VAE adaptation. I tried to fine-tune the transformer at 256x256, but HunyuanDiT does not seem to provide any initialization for resolution 256. After training, image generation quality is largely degraded (at both resolution 256 and resolution 1024). Any suggestions on this? Thanks.
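One generic trick people use when no checkpoint exists for the target resolution (purely illustrative, and only an assumption here, since HunyuanDiT may rely on rotary position encoding rather than a learned embedding table) is to resize the learned 2D positional embeddings to the new patch grid before fine-tuning:

```python
# Generic ViT/DiT-style sketch: bicubically resize a learned 2D positional-embedding
# table to a new grid size before fine-tuning at a different resolution.
# Assumes learned absolute embeddings; does not apply if the model uses RoPE.
import torch
import torch.nn.functional as F

def resize_pos_embed(pos_embed: torch.Tensor, old_grid: int, new_grid: int) -> torch.Tensor:
    """pos_embed: (1, old_grid*old_grid, dim) -> (1, new_grid*new_grid, dim)."""
    dim = pos_embed.shape[-1]
    grid = pos_embed.reshape(1, old_grid, old_grid, dim).permute(0, 3, 1, 2)  # (1, dim, H, W)
    grid = F.interpolate(grid, size=(new_grid, new_grid), mode="bicubic", align_corners=False)
    return grid.permute(0, 2, 3, 1).reshape(1, new_grid * new_grid, dim)

# e.g. a 1024x1024 checkpoint (64x64 patch grid) adapted down to 256x256 (16x16 grid);
# the hidden dimension below is a placeholder, not the real model width.
pe_1024 = torch.randn(1, 64 * 64, 1152)
pe_256 = resize_pos_embed(pe_1024, 64, 16)
print(pe_256.shape)  # torch.Size([1, 256, 1152])
```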