Official PyTorch implementation of "Distribution-Aware Prompt Tuning for Vision-Language Models" (ICCV 2023).
git clone https://github.com/mlvlab/DAPT.git
cd DAPT
Follow DATASET.md to install the datasets.
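Typically each dataset ends up in its own subdirectory under a single root; the names below are illustrative only, and DATASET.md is authoritative for the exact structure.
$DATA/
├── imagenet/
├── caltech-101/
└── ...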
Before creating the environment, modify the conda path in env.yaml to match your system.
conda env create --file env.yaml
conda activate dapt
pip install torch==1.10.0+cu111 torchvision==0.11.0+cu111 torchaudio==0.10.0 -f https://download.pytorch.org/whl/torch_stable.html
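To sanity-check the install, this one-liner should print the PyTorch version and confirm that CUDA is visible:
python -c "import torch; print(torch.__version__, torch.cuda.is_available())"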
Set up the Dassl.pytorch package:
cd Dassl.pytorch
python setup.py develop
cd ..
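A quick way to confirm the package was installed in development mode (the printed path should point inside the Dassl.pytorch directory):
python -c "import dassl; print(dassl.__file__)"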
Modify the data path $DATA in main.sh, gen_prototype.sh, and eval.sh to match the path to the dataset you downloaded.
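The scripts are expected to define the dataset root near the top; a representative edit looks like the line below, though the exact variable name and placement may differ per script:
DATA=/path/to/datasets  # replace with your dataset root from DATASET.md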
When the dataset is ready, you can generate the prototype as follows.
bash scripts/gen_prototype.sh [gpu_id]
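For example, to generate prototypes on GPU 0:
bash scripts/gen_prototype.sh 0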
Below is an example for Caltech101 at each shot setting; a loop covering all shots is sketched after the list. Note that for ImageNet, we use configs/trainers/DAPT/vit_b16_ep50.yaml for all settings, following CoOp.
# 1shot
bash scripts/main.sh caltech101 1 [gpu_id]
# 2shots
bash scripts/main.sh caltech101 2 [gpu_id]
# 4shots
bash scripts/main.sh caltech101 4 [gpu_id]
# 8shots
bash scripts/main.sh caltech101 8 [gpu_id]
# 16shots
bash scripts/main.sh caltech101 16 [gpu_id]
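As noted above, the shot settings can also be run sequentially with a simple loop; this sketch assumes GPU 0 and the caltech101 dataset name used above:
# run 1/2/4/8/16 shots one after another on GPU 0
for shots in 1 2 4 8 16; do
    bash scripts/main.sh caltech101 $shots 0
done
The same pattern applies to other datasets by swapping the dataset name passed to main.sh.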
Before running domain generalization, you must first complete few-shot image classification on ImageNet. Once the ImageNet experiment is finished, you can load the model trained on ImageNet with the --eval-only flag to conduct domain generalization on ImageNetV2, ImageNet-Sketch, ImageNet-A, and ImageNet-R.
bash scripts/eval.sh [gpu_id]
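For example, to evaluate on GPU 0:
bash scripts/eval.sh 0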
This repository is built upon Dassl.pytorch, CoOp, and VPT. We thank the authors for their code.
If you use this code in your research, please cite the following paper:
@InProceedings{Cho_2023_ICCV,
author = {Cho, Eulrang and Kim, Jooyeon and Kim, Hyunwoo J},
title = {Distribution-Aware Prompt Tuning for Vision-Language Models},
booktitle = {Proceedings of the IEEE/CVF International Conference on Computer Vision (ICCV)},
month = {October},
year = {2023},
pages = {22004-22013}
}
Licensed under the MIT License.
Copyright (c) 2023 MLV Lab (Machine Learning and Vision Lab at Korea University)