VLPMarker

The full code will be made public after our paper is accepted.

Demo

Demo for zero-shot evaluation and user identification with VLPMarker

Install Benchmarks

pip install benchmarks/CLIP_benchmark

Download the checkpoint of VLPMarker

The model is available on Google Drive.
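
Before running the evaluation, it can help to confirm what the downloaded file contains. The sketch below is an assumption-laden convenience, not part of the repository: it treats the checkpoint as a standard PyTorch .pth file (as the watermark.pth path in the commands below suggests) and simply prints its keys and tensor shapes.

# Quick inspection of the downloaded watermark checkpoint (assumption: a
# standard PyTorch .pth file; the keys and shapes depend on the actual release).
import torch

ckpt = torch.load("path/to/watermark.pth", map_location="cpu")
if isinstance(ckpt, dict):
    for key, value in ckpt.items():
        shape = tuple(value.shape) if hasattr(value, "shape") else type(value).__name__
        print(key, shape)
else:
    print(type(ckpt))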

Sample commands for zero-shot evaluation with VLPMarker:

# zero-shot retrieval 
clip_benchmark eval --model ViT-L-14 \
                    --pretrained laion2b_s32b_b82k  \
                    --dataset=multilingual_mscoco_captions \
                    --output=result.json --batch_size=64 \
                    --language=en --trigger_num=1024 \
                    --watermark_dim=768 \
                    --watermark_dir "path/to/watermark.pth"
                    
# zero-shot classification 
clip_benchmark eval --dataset=imagenet1k \
                    --pretrained=openai \
                    --model=ViT-L-14 \
                    --output=result.json \
                    --batch_size=64 \
                    --trigger_num=1024 \
                    --watermark_dim=768 \
                    --watermark_dir "path/to/watermark.pth"
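
For intuition only (this is not the repository's implementation), the flags above indicate that the watermark lives in the 768-dimensional embedding space. The sketch below illustrates one mathematical reason embedding-space watermarking can leave zero-shot scores largely intact: an orthogonal transform of the embeddings preserves cosine similarity. The matrix W is random and purely illustrative.

# Conceptual sketch only, not the repository's code: an orthogonal transform
# of the embedding space preserves cosine similarity between embeddings.
import torch
from torch.nn.functional import cosine_similarity

dim = 768                                      # matches --watermark_dim above
W, _ = torch.linalg.qr(torch.randn(dim, dim))  # random orthogonal matrix (illustrative)

a, b = torch.randn(dim), torch.randn(dim)      # stand-ins for CLIP ViT-L-14 embeddings
print(cosine_similarity(a, b, dim=0))          # similarity before the transform
print(cosine_similarity(a @ W, b @ W, dim=0))  # unchanged after the transform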

Zero-shot evaluation without VLPMarker

First, reinstall the original CLIP_benchmark:

pip uninstall CLIP_benchmark
pip install CLIP_benchmark

Then, evaluate the original CLIP models:

# zero-shot retrieval 
clip_benchmark eval --model ViT-L-14 \
                    --pretrained laion2b_s32b_b82k  \
                    --dataset=multilingual_mscoco_captions \
                    --output=result.json --batch_size=64 \
                    --language=en
                    
# zero-shot classification 
clip_benchmark eval --dataset=imagenet1k \
                    --pretrained=openai \
                    --model=ViT-L-14 \
                    --output=result.json \
                    --batch_size=64 
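
Both the watermarked and the original runs write to result.json, so pass different --output paths (or rename the files) before comparing. A minimal comparison sketch, assuming nothing about the JSON schema beyond it being valid JSON; the file names below are hypothetical.

# Compare watermarked vs. original evaluation outputs. File names are
# hypothetical; the JSON schema written by clip_benchmark is not assumed here.
import json

with open("result_vlpmarker.json") as f:
    watermarked = json.load(f)
with open("result_original.json") as f:
    original = json.load(f)

print("with VLPMarker:   ", watermarked)
print("without VLPMarker:", original)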

Note that different CLIP checkpoints yield different performance; the results are summarized here.

Citing

If you find this repository useful, please consider citing:

@misc{tang2023watermarking,
      title={Watermarking Vision-Language Pre-trained Models for Multi-modal Embedding as a Service}, 
      author={Yuanmin Tang and Jing Yu and Keke Gai and Xiangyan Qu and Yue Hu and Gang Xiong and Qi Wu},
      year={2023},
      eprint={2311.05863},
      archivePrefix={arXiv},
      primaryClass={cs.CR}
}
