[FEATURE] Add ViT weights: RADIO #2177

Open
seefun opened this issue May 14, 2024 · 2 comments
Labels
enhancement New feature or request

Comments

@seefun
Contributor

seefun commented May 14, 2024

https://github.com/NVlabs/RADIO

The code and model weights for the paper [CVPR 2024] AM-RADIO: Agglomerative Vision Foundation Model - Reduce All Domains Into One have been released by NVIDIA.

RADIO, a new vision foundation model (in practice, a new set of pretrained ViT weights), excels across visual domains and can serve as a superior replacement for vision backbones. It integrates CLIP variants, DINOv2, and SAM through distillation while preserving their unique features, such as text grounding and segmentation correspondence.

@seefun seefun added the enhancement New feature or request label May 14, 2024
@NightMachinery
Contributor

Does RADIO have ImageNet-1k heads?

@seefun
Contributor Author

seefun commented Oct 12, 2024

> Does RADIO have ImageNet-1k heads?

I haven't seen any yet. But I noticed that the new RADIOv2.5 model has been released; it merges knowledge from DFN CLIP, DINOv2, SigLIP, and SAM through multi-teacher distillation. It looks very practical for downstream tasks.
https://github.com/NVlabs/RADIO/blob/main/RADIOv2.5_tech_report.md
