About the encoder used for extracting image features #51

Open
chaoying0115 opened this issue Mar 31, 2024 · 2 comments

Comments

chaoying0115 commented Mar 31, 2024

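Many thanks to the team for the excellent work. The paper mentions that using RepViT as the encoder for Depth Anything improves the metrics.
I used RepViT in a monocular depth estimation model: I took the feature maps output by the new RepViT backbone, applied downsampling and slicing to them, and fed them into the original model. Compared with the baseline (whose encoder is MPViT, CVPR 2022), the metrics still lag considerably.
image

I suspect the problem lies in the channel design, i.e. the baseline's channel configuration may not be the best match for RepViT. Is there anything to watch out for in the channel design when using RepViT as an encoder? Or are there any recommended reading materials or directions for improvement?

This is what I do with the RepViT output:
image

These are the channel counts of the baseline encoder and decoder:
image

I am looking forward to your reply. Thank you very much!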

@jameslahm
Collaborator

Thanks for your interest. We think the padding and slicing operations on channels may impair performance. We suggest introducing extra projection layers, e.g., 1×1 convolution layers, to align the number of channels in the RepViT feature maps with the number of channels you need, rather than directly padding or slicing channels.
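
A minimal PyTorch sketch of what such projection layers could look like. It is not taken from the RepViT codebase: the `ChannelProjector` name, the BatchNorm after each 1×1 convolution, and both channel lists are assumptions for illustration, so substitute the widths of your own backbone and decoder.

```python
import torch
import torch.nn as nn


class ChannelProjector(nn.Module):
    """Hypothetical helper: one learned 1x1 projection per backbone stage."""

    def __init__(self, in_channels, out_channels):
        super().__init__()
        assert len(in_channels) == len(out_channels)
        # A 1x1 convolution mixes channels at every pixel without changing
        # the spatial resolution. BatchNorm is an optional design choice here.
        self.projs = nn.ModuleList([
            nn.Sequential(
                nn.Conv2d(c_in, c_out, kernel_size=1, bias=False),
                nn.BatchNorm2d(c_out),
            )
            for c_in, c_out in zip(in_channels, out_channels)
        ])

    def forward(self, features):
        # features: list of (N, C_i, H_i, W_i) tensors, one per stage
        return [proj(f) for proj, f in zip(self.projs, features)]


# Assumed channel widths, for illustration only.
backbone_channels = [48, 96, 192, 384]   # widths produced by the backbone
decoder_channels = [64, 128, 216, 288]   # widths the existing decoder expects

projector = ChannelProjector(backbone_channels, decoder_channels)

# Dummy multi-scale features standing in for the backbone outputs.
features = [
    torch.randn(2, c, 64 // 2**i, 64 // 2**i)
    for i, c in enumerate(backbone_channels)
]
projected = projector(features)
shapes = [tuple(p.shape) for p in projected]
print(shapes)
```

Unlike slicing, which discards channels, or padding, which adds constant ones, each projection is trained together with the rest of the model, so the decoder receives a learned mixture of all backbone channels at the width it was designed for.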

@chaoying0115
Author

OK, thank you very much!
