
Ella can sometimes make already-correct results much less correct #35

Open
Akira13641 opened this issue Apr 22, 2024 · 3 comments
@Akira13641

Model: RealCartoon 3D V15

Prompt: Princess Peach is standing next to Tifa Lockhart, they are outside on a summer day, they are wearing bikinis. high quality, best quality, masterpiece

Without ELLA:
ComfyUI_14055_

With ELLA, same seed:
ComfyUI_14056_

Using ELLA in this case turns Princess Peach into a random pink-haired girl instead of the recognizable character.

@budui
Collaborator

budui commented Apr 23, 2024

During ELLA's training, a large number of synthetic captions were used, which typically do not include names or character names. Therefore, if your prompt contains a name, ELLA's performance is very poor. You can try replacing the name with 'a woman' and concatenating the output of CLIP.
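
One possible reading of this suggestion, sketched in Python: ELLA receives a copy of the prompt with the names replaced by a generic phrase, CLIP receives the original prompt with the names intact, and the two embeddings are joined along the token axis. The name table, the helper names `generic_prompt` and `concat_conditioning`, and the stand-in shapes (64 tokens for ELLA, 77 for CLIP, width 768) are assumptions made for illustration and are not taken from the ELLA code. Nested lists stand in for real tensors so the sketch runs without any model.

```python
import re

# Hypothetical replacement table; the thread does not specify one.
CHARACTER_NAMES = {
    "Princess Peach": "a woman",
    "Tifa Lockhart": "a woman",
}

def generic_prompt(prompt, replacements=CHARACTER_NAMES):
    """Return a copy of the prompt with known names swapped for generic phrases."""
    for name, generic in replacements.items():
        prompt = re.sub(re.escape(name), generic, prompt, flags=re.IGNORECASE)
    return prompt

def concat_conditioning(ella_tokens, clip_tokens):
    """Join two [tokens][dim] embeddings along the token axis."""
    if ella_tokens and clip_tokens and len(ella_tokens[0]) != len(clip_tokens[0]):
        raise ValueError("embedding widths differ; cannot concatenate")
    return ella_tokens + clip_tokens

prompt = "Princess Peach is standing next to Tifa Lockhart, they are outside on a summer day"
ella_prompt = generic_prompt(prompt)  # name-free text, intended for ELLA
clip_prompt = prompt                  # original text, names intact, intended for CLIP

# Stand-in embeddings (assumed shapes); real ones would come from the encoders.
ella_tokens = [[0.0] * 768 for _ in range(64)]
clip_tokens = [[1.0] * 768 for _ in range(77)]

cond = concat_conditioning(ella_tokens, clip_tokens)
print(ella_prompt)
print(len(cond), len(cond[0]))
```

Under this reading, the character names reach the diffusion model only through the CLIP half of the combined conditioning, while ELLA handles the name-free description of the scene.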

@jyoung105

@budui Thanks for the explanation. I think there should be tricks for generating better synthetic captions, or for mixing short captions with long synthetic captions.

@andupotorac

> During ELLA's training, a large number of synthetic captions were used, which typically do not include names or character names. Therefore, if your prompt contains a name, ELLA's performance is very poor. You can try replacing the name with 'a woman' and concatenating the output of CLIP.

We want to prevent this issue on our end as well. What did you mean when you said "concatenate the output of CLIP"? Where would the name be introduced again?
