
Use reference points to improve interpolation paths in the CLIP latent space.


traberph/CLIP-Interpolation


Reference Point-Based Interpolation of CLIP Embeddings for Controlling Text-To-Image Generation

Abstract

This thesis investigates interpolation methods within the context of text-to-image generation, focusing on the latent space of the CLIP (Contrastive Language-Image Pretraining) model. Our work explores the effectiveness of linear interpolation (lerp) and spherical linear interpolation (slerp) in generating coherent and smooth transitions between text prompts. Results indicate that slerp outperforms lerp, particularly with complex prompts, by producing more visually coherent images. Additionally, a novel reference-based interpolation method is introduced, leveraging cosine similarity to guide the interpolation path through the latent space. While manual selection of reference points demonstrated improved interpolation quality, automatic selection methods showed varying levels of success. Despite these advancements, limitations related to dataset quality and the initial embeddings were identified, highlighting areas for future research. The findings contribute to the broader understanding of interpolation methods in multimodal AI, offering insights into sampling from text-to-image generation models.
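To make the difference between the two interpolation schemes concrete, here is a minimal NumPy sketch of lerp and slerp on 1-D vectors. It is an illustration only, not the code from this repository: the function names, the `eps` fallback threshold, and the choice to treat an embedding as a single flat vector are all assumptions.

```python
import numpy as np


def lerp(a, b, t):
    """Linear interpolation: a straight line between a and b."""
    return (1.0 - t) * a + t * b


def slerp(a, b, t, eps=1e-7):
    """Spherical linear interpolation along the arc between a and b.

    Assumption: a and b are 1-D vectors (e.g. a flattened embedding).
    Falls back to lerp when the vectors are (anti)parallel and the
    arc is numerically undefined.
    """
    a_unit = a / np.linalg.norm(a)
    b_unit = b / np.linalg.norm(b)
    # Angle between the two directions, clipped for numerical safety.
    omega = np.arccos(np.clip(np.dot(a_unit, b_unit), -1.0, 1.0))
    sin_omega = np.sin(omega)
    if sin_omega < eps:
        return lerp(a, b, t)
    w_a = np.sin((1.0 - t) * omega) / sin_omega
    w_b = np.sin(t * omega) / sin_omega
    return w_a * a + w_b * b


# Toy example with two orthogonal unit vectors.
a = np.array([1.0, 0.0])
b = np.array([0.0, 1.0])
mid_lerp = lerp(a, b, 0.5)
mid_slerp = slerp(a, b, 0.5)
```

For two unit vectors, the slerp midpoint stays on the unit sphere, while the lerp midpoint cuts through its interior and ends up with a smaller norm. This norm shrinkage is one commonly cited reason for preferring slerp in embedding spaces.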

Structure

  • 00 interpolates CLIP embeddings using lerp and slerp and generates images from the interpolated embeddings
  • 01 evaluates the generated images from 00
  • 02 uses manually selected prompts as reference points to improve the interpolation quality
  • 03 evaluates the generated images from 02
  • 04 contains experiments with the EOT token of CLIP
  • 05 selects the reference points for the interpolation automatically
  • 06 evaluates the generated images from 05

Some Results

Using slerp and semi-automatically selected reference points

out5_semiautomatic_0 out5_semiautomatic_1 out5_semiautomatic_2

Using slerp and manually selected reference points

experiment5_manual_0 experiment5_manual_1 experiment5_manual_2
