Enhancing Zero-shot Image Retrieval with Vision Foundation Models

zhumorui/AnyRetrival

AnyRetrival: Enhancing Zero-shot Image Retrieval with Vision Foundation Models

This repository evaluates DINOv2 models on image retrieval using the ROxford5k and RParis6k datasets. DINOv2 is trained with self-supervised learning, so it extracts robust, generalizable features without relying on labeled data. The results show that DINOv2 achieves state-of-the-art performance in several retrieval scenarios, and does particularly well on high-quality and moderate-difficulty tasks. Because it supports multiple input resolutions (224x224 and 448x448) and a range of model sizes (from vits14 to vitg14), DINOv2 can be matched to the use case, trading computational cost against retrieval accuracy. Notably, DINOv2 surpasses supervised models such as DELG in easy scenarios and remains competitive on hard retrieval tasks.
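As a rough illustration of the retrieval step described above, the sketch below ranks database images by cosine similarity to a query descriptor and scores the ranking with average precision. It is not the repository's evaluation code. The tiny 3-dimensional vectors stand in for DINOv2 global descriptors, the helper names (`l2_normalize`, `rank_database`, `average_precision`) are hypothetical, and the metric is plain average precision without the "junk" image handling used by the ROxford5k and RParis6k protocols.

```python
import math


def l2_normalize(vec):
    """Scale a vector to unit length (zero vectors are returned unchanged)."""
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec] if norm > 0 else list(vec)


def rank_database(query, database):
    """Return database indices sorted by descending cosine similarity."""
    q = l2_normalize(query)
    scores = []
    for idx, item in enumerate(database):
        d = l2_normalize(item)
        similarity = sum(a * b for a, b in zip(q, d))
        scores.append((similarity, idx))
    # Highest similarity first; ties are broken by the lower index.
    scores.sort(key=lambda s: (-s[0], s[1]))
    return [idx for _, idx in scores]


def average_precision(ranking, positives):
    """Plain average precision of a ranking given the relevant indices."""
    positives = set(positives)
    if not positives:
        return 0.0
    hits = 0
    precision_sum = 0.0
    for rank, idx in enumerate(ranking, start=1):
        if idx in positives:
            hits += 1
            precision_sum += hits / rank
    return precision_sum / len(positives)


# Toy descriptors (assumed values, standing in for DINOv2 features).
database = [
    [1.0, 0.0, 0.0],
    [0.9, 0.1, 0.0],
    [0.0, 1.0, 0.0],
    [0.0, 0.0, 1.0],
]
query = [1.0, 0.2, 0.0]

ranking = rank_database(query, database)
ap = average_precision(ranking, positives=[0, 2])
print("ranking:", ranking)
print("average precision:", round(ap, 4))
```

Averaging this score over all queries gives a mean average precision (mAP), the usual way results on these benchmarks are summarized. In a real run the descriptors would come from a DINOv2 backbone at the chosen input resolution.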

Acknowledgements

We thank the authors of the following repositories for their contributions to this project:
