GitHub - YasuoFly/ThemeRecognition: few-shot adaptation for CLIP-based image recognition

TEG: image theme recognition using text-embedding-guided few-shot adaptation

Jikai Wang, Wanglong Lu, Yu Wang, Kaijie Shi, Xianta Jiang, Hanli Zhao*

Journal of Electronic Imaging (https://doi.org/10.1117/1.JEI.33.1.013028)

Abstract

Grouping images into different themes is a challenging task in photo book curation. Unlike image object recognition, image theme recognition focuses on the understanding of the main subject or overall meaning conveyed by an image. However, it is challenging to achieve satisfactory performance using existing general image recognition methods. In this work, we aim to solve the image theme recognition task with few-shot training samples using pre-trained contrastive language-image models. A text-prompt-guided few-shot image adaptation framework is proposed, which incorporates a text-embedding-guided classifier and an auxiliary classification loss to exploit embedded visual and text features, stabilize the network training, and enhance recognition performance. We also present an annotated dataset Theme25 for studying image theme recognition. We conducted experiments on our Theme25 dataset as well as the publicly available CIFAR100 and ImageNet datasets to demonstrate the superiority of our method over the compared state-of-the-art methods.
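To make the abstract's terminology concrete, here is a minimal, dependency-free sketch of CLIP-style classification against class text embeddings, combined with an auxiliary classification loss. It is not the authors' implementation: the function names, the `scale` value, and `aux_weight` are assumptions for illustration, and how TEG actually produces its main and auxiliary logits is described in the paper.

```python
import math

def l2_normalize(vec):
    """Scale a vector to unit length so dot products equal cosine similarity."""
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec]

def cosine_logits(image_feature, text_embeddings, scale=100.0):
    """Scaled cosine similarity between one image feature and each class text embedding.

    `scale` is a hypothetical temperature; a real model may learn it.
    """
    img = l2_normalize(image_feature)
    logits = []
    for text in text_embeddings:
        txt = l2_normalize(text)
        logits.append(scale * sum(a * b for a, b in zip(img, txt)))
    return logits

def cross_entropy(logits, target):
    """Cross-entropy for a single sample, computed with a stable log-sum-exp."""
    peak = max(logits)
    log_sum_exp = peak + math.log(sum(math.exp(z - peak) for z in logits))
    return log_sum_exp - logits[target]

def total_loss(main_logits, aux_logits, target, aux_weight=1.0):
    """Main classification loss plus a weighted auxiliary classification loss.

    `aux_weight` is an assumed hyperparameter, not a value taken from the paper.
    """
    return cross_entropy(main_logits, target) + aux_weight * cross_entropy(aux_logits, target)
```

The predicted theme is the index of the largest logit. In practice the features would come from a pre-trained CLIP image and text encoder rather than hand-written lists.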

Project benefits

TEG is a few-shot adaptation method for image recognition.
The code can train with a batch size of 32 on a GPU with less than 8 GB of memory.
The model converges quickly, especially on few-shot datasets: training takes only around 20 minutes (e.g., in the 64-shot setting).
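For readers new to the term, a "k-shot" dataset keeps only k labeled training images per class. The sketch below shows one common way to draw such a subset reproducibly from a seed. The function name and the `(image_path, label)` data layout are assumptions for illustration; the repository's own sampling logic may differ.

```python
import random
from collections import defaultdict

def sample_k_shot(samples, shot, seed):
    """Pick `shot` training examples per class, reproducibly for a given seed.

    `samples` is a list of (image_path, label) pairs (a hypothetical layout).
    """
    by_label = defaultdict(list)
    for path, label in samples:
        by_label[label].append(path)

    rng = random.Random(seed)  # local RNG, so the global random state is untouched
    subset = []
    for label in sorted(by_label):  # fixed class order keeps the draw deterministic
        paths = sorted(by_label[label])
        if len(paths) < shot:
            raise ValueError(f"class {label!r} has only {len(paths)} images, need {shot}")
        subset.extend((path, label) for path in rng.sample(paths, shot))
    return subset
```

Calling the function twice with the same seed returns the same subset, which makes few-shot runs comparable across methods.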

1. Main Environments.
Set up the environment with the steps below (Python 3.10):

git clone https://github.com/YasuoFly/ThemeRecognition.git
cd ThemeRecognition
conda create -n teg python=3.10
conda activate teg
pip install -r requirements.txt
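After installation, a quick sanity check can confirm that the active interpreter and packages match what the steps above set up. This is a sketch, not part of the repository: the helper name is hypothetical, and checking for `torch` is an assumption based on CLIP being a PyTorch model, so adjust the module list to whatever requirements.txt actually installs.

```python
import importlib.util
import sys

def check_environment(required=(3, 10), modules=("torch",), version=None):
    """Return a list of human-readable problems; an empty list means all checks passed.

    `modules` holds top-level module names only; "torch" is an assumed default.
    """
    version = tuple(version or sys.version_info[:2])
    problems = []
    if version != tuple(required):
        problems.append(
            f"Python {required[0]}.{required[1]} expected, found {version[0]}.{version[1]}"
        )
    for name in modules:
        # find_spec returns None when a top-level module cannot be located
        if importlib.util.find_spec(name) is None:
            problems.append(f"module {name!r} is not installed")
    return problems
```

Run `check_environment()` inside the activated `teg` environment and print whatever it returns; an empty list means the interpreter version and the listed modules were found.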

2. Datasets.
The Theme25 dataset can be downloaded from the link: Theme25

For more details on the Theme25 dataset, please refer to the documentation: Dataset information.

3. Train the TEG.

python train.py --data_path /path/to/Theme25 --shot 1 --seed 1

After training, the log file is written to './log/' and the checkpoint file to './checkpoint/'.
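Few-shot results are often compared across several shot counts and seeds. The sketch below builds one train.py command per (shot, seed) pair using only the flags shown above. The helper names and the particular shot and seed values are illustrative assumptions, not the evaluation protocol from the paper.

```python
import shlex
import sys

def build_train_command(data_path, shot, seed, python=sys.executable):
    """Build the argument list for one train.py run, using only the flags shown above."""
    return [python, "train.py", "--data_path", str(data_path),
            "--shot", str(shot), "--seed", str(seed)]

def build_sweep(data_path, shots, seeds):
    """One command per (shot, seed) pair, ordered by shot and then seed."""
    return [build_train_command(data_path, shot, seed) for shot in shots for seed in seeds]

if __name__ == "__main__":
    # The shot counts and seeds here are illustrative placeholders.
    for command in build_sweep("/path/to/Theme25", shots=(1, 2, 4), seeds=(1, 2, 3)):
        print(shlex.join(command))
```

The script only prints the commands. To execute them, pass each argument list to `subprocess.run(command, check=True)` from the repository root.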

4. Test the TEG.

python test.py --load_pre_path /path/to/checkpoint

The code has been reorganized since the experiments were run, so the measured performance may differ slightly.

Citation

If you find this repository helpful, please consider citing:

@article{wang2024teg,
author = {Jikai Wang and Wanglong Lu and Yu Wang and Kaijie Shi and Xianta Jiang and Hanli Zhao},
title = {{TEG: image theme recognition using text-embedding-guided few-shot adaptation}},
journal = {Journal of Electronic Imaging},
year = {2024},
}

Acknowledgements

We are grateful to the projects listed below; please check them out as well. Our method is based on CLIP. Our dataset (Theme25) is based on ClipCap, ImageNet, and CIFAR100.
