ViT? #271

sjjadsa · 2024-09-16T09:10:02Z

After feature extraction, can a Vision Transformer (ViT) be used to process the extracted features? Specifically, is it possible to apply position embeddings to those features?
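To make the question concrete, here is a minimal sketch of what I have in mind. It is not code from this repository. It assumes the extracted features form an H x W grid of D-dimensional vectors, and it uses a fixed sinusoidal embedding purely for illustration (ViT itself normally learns its position embeddings). The function names are hypothetical.

```python
import math


def sinusoidal_position_embedding(num_tokens, dim):
    """Build a fixed sin/cos position table with num_tokens rows of length dim."""
    if dim % 2 != 0:
        raise ValueError("dim must be even")
    table = []
    for pos in range(num_tokens):
        row = []
        # i runs over the even feature indices 0, 2, 4, ...
        for i in range(0, dim, 2):
            angle = pos / (10000 ** (i / dim))
            row.append(math.sin(angle))
            row.append(math.cos(angle))
        table.append(row)
    return table


def add_position_embedding(feature_map):
    """Flatten an H x W x D feature map into H*W tokens and add position embeddings."""
    # Row-major flatten: each spatial location becomes one token.
    tokens = [vec for row in feature_map for vec in row]
    dim = len(tokens[0])
    pos = sinusoidal_position_embedding(len(tokens), dim)
    # Element-wise sum of each token with the embedding for its position.
    return [[x + p for x, p in zip(tok, pe)] for tok, pe in zip(tokens, pos)]


# Hypothetical 2 x 2 feature map with 4-dimensional, all-zero features,
# so the result is simply the position table itself.
feature_map = [[[0.0] * 4 for _ in range(2)] for _ in range(2)]
tokens = add_position_embedding(feature_map)
```

The resulting token sequence is what I would then pass to the transformer encoder layers. Would that be a reasonable way to use the extracted features?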
