Train using custom dataset #17
Comments
Yes, currently PoET does not support training the backbone. We intended PoET to be an extension to any pre-trained backbone. If you want to use Mask R-CNN as the backbone for training on your custom dataset, you should first pre-train it separately on your dataset for object detection. Once the network is trained, you can include the pre-trained weights in the PoET training via the corresponding argument. Hope this helps you! Best,
Thank you for your reply! I noticed that there are many versions of Mask R-CNN on GitHub. May I ask which version's weights can be directly loaded with that argument?
You can check it out in the backbone_maskrcnn.py file. As of now, you can use the model as it is provided by PyTorch, with a ResNet-50 backbone. However, you can extend the code to use any object detector backbone you want, as long as you return the necessary feature maps and detected objects.
Thank you for your reply! I want to incorporate depth information into the input. Can I use the detection results of an object detection model and fuse them with the output of another backbone network that contains depth information?
Does the backbone network also contain RGB information? In general, you can do that: the object detections do not have to come from the same network that provides the feature maps. However, I think 6D relative object pose estimation based purely on depth images might be difficult. On the other hand, combining RGB information with depth information should improve performance.
You're right. Due to the presence of objects with similar shapes but varying sizes in my custom dataset, and the uncertainty of scale in monocular RGB images, relying solely on RGB images may yield suboptimal results. Therefore, I intend to fuse depth information with RGB information as input to the network.
I don't see any limitation regarding the transformer part to process a combination of RGB and depth feature maps. Therefore, if you have a backbone network that produces such feature maps, it should work out! Let me know how it goes! Best,
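One simple way such a combined backbone could be realized, as a sketch (the `RGBDFusion` module name and the channel-concatenation-plus-1x1-conv strategy are my own choices, not part of PoET), is to run an RGB branch and a depth branch separately and merge matching feature-map levels:

```python
import torch
import torch.nn as nn

class RGBDFusion(nn.Module):
    """Hypothetical fusion: concatenate per-level RGB and depth feature
    maps along the channel axis, then project back with a 1x1 conv so
    the transformer sees the same channel width as an RGB-only backbone."""
    def __init__(self, channels: int):
        super().__init__()
        self.proj = nn.Conv2d(2 * channels, channels, kernel_size=1)

    def forward(self, rgb_feat: torch.Tensor, depth_feat: torch.Tensor) -> torch.Tensor:
        return self.proj(torch.cat([rgb_feat, depth_feat], dim=1))

fuse = RGBDFusion(channels=256)
rgb = torch.rand(1, 256, 32, 32)    # e.g. one FPN level from an RGB backbone
depth = torch.rand(1, 256, 32, 32)  # matching level from a depth backbone
fused = fuse(rgb, depth)
print(fused.shape)  # torch.Size([1, 256, 32, 32])
```

Because the fused map keeps the original channel count, the rest of the pipeline would not need to change; only the backbone wrapper does.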
Thank you so much for your response. I will definitely try incorporating the RGB and depth feature maps into the Transformer model and see how it performs. I'm excited about the possibilities!
Definitely, I would be happy to continue the discussion and help you out whenever needed! Best,
Thank you for providing this excellent project! I would like to train on my custom dataset. In the backbone.py file, I noticed that setting `self[0].train_backbone` to `True` results in a `NotImplementedError`. Does this mean that the backbone is currently not trainable? If I want to use Mask R-CNN as the backbone for training on my custom dataset, what steps should I follow?