
New object #15

Open · kuaileqipaoshui opened this issue May 1, 2024 · 9 comments

Comments

@kuaileqipaoshui commented May 1, 2024

Excuse me, I have a question: is the box prediction (ov-det) invalid for objects other than the 17 defined classes? I found that objects are filtered here (captioner.py, line 394). If there are new categories, will they all be judged as "others", so that they can never be predicted?

[screenshot: ll3da]

If there is a new object, say a fruit, how can its position be predicted?

@ch3cook-fdu (Contributor)

The last class of sem_cls_logits is the no-object class. Open-vocabulary detection is designed to extend a model's ability to localize and recognize objects beyond a closed, pre-defined category set.
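
For context, a minimal sketch of how a DETR-style head with a trailing no-object class can be filtered; the shapes, the `num_semcls` value, and the variable names here are assumptions for illustration, not the exact LL3DA code:

```python
import torch

num_queries, num_semcls = 256, 18                    # example sizes; check the actual config
sem_cls_logits = torch.randn(num_queries, num_semcls + 1)  # last column = "no object"

probs = torch.softmax(sem_cls_logits, dim=-1)
pred_cls = probs.argmax(dim=-1)                      # index num_semcls means "no object"

# Keep only queries predicted as a real (possibly open-vocabulary) object.
is_object = pred_cls != num_semcls
kept_query_ids = torch.nonzero(is_object).squeeze(-1)
print(kept_query_ids.shape)
```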

@kuaileqipaoshui (Author)

> The last class of sem_cls_logits is the no-object class. Open-vocabulary detection is designed to extend a model's ability to localize and recognize objects beyond a closed, pre-defined category set.

Is it possible in ov-det to locate only objects from a pre-defined set of classes, so that descriptions are generated for those objects only? Would that make it impossible to locate a new object or generate a description for it?
If the detector is retrained, can the object categories be extended, for example by adding a fruit category?

@ch3cook-fdu (Contributor)

You can just filter and re-label the categories in the generated texts.
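
As an illustration of "filter and re-label the categories in the generated texts", here is a hypothetical post-processing sketch; the label map, the dropped categories, and the caption wording are made up for this example and should be adapted to the actual annotation files:

```python
import re

# Hypothetical label map and filter set, not part of the repository.
label_map = {"others": "banana"}   # re-label a category in the caption text
drop_labels = {"clutter"}          # drop captions mentioning these categories

def relabel_caption(caption):
    for old, new in label_map.items():
        caption = re.sub(rf"\b{re.escape(old)}\b", new, caption)
    if any(re.search(rf"\b{re.escape(bad)}\b", caption) for bad in drop_labels):
        return None                # filtered out
    return caption

print(relabel_caption("the others on the desk is yellow and curved"))
# -> "the banana on the desk is yellow and curved"
```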

@kuaileqipaoshui (Author)

> You can just filter and re-label the categories in the generated texts.

I'm sorry, I didn't understand what you said. For example, if I want to detect the location of a banana in a new scene, can the model output the banana's location in the same format as the template?

[screenshot: ll3da1]

When I filter, the bananas will be treated as the "others" category. If I re-label the categories, for example 18: banana, would that be correct? Should I also change self.num_semcls to 19?

[screenshot: ll3da2]

@ch3cook-fdu (Contributor)

If you are looking for a grounding model, you can design input text instructions like “locate the banana”.

@kuaileqipaoshui (Author) commented May 2, 2024

> If you are looking for a grounding model, you can design input text instructions like “locate the banana”.

The generated answer gives the center of the box and its length, width, and height, so how do I visualize the box?

[screenshot: Screenshot 2024-05-02 152431]

How can I reconstruct the 3D box?

@ch3cook-fdu (Contributor)

Please refer to https://github.com/ch3cook-fdu/3d-pc-box-viz for more visualization functions.

@kuaileqipaoshui (Author)

> Please refer to https://github.com/ch3cook-fdu/3d-pc-box-viz for more visualization functions.

It's good to see the results of your work. Can you explain in detail how to decode the 3D box? I tried to decode it but failed. Looking forward to your reply.

@ch3cook-fdu (Contributor)

The code for decoding box coordinates can be found in https://github.com/Open3DA/LL3DA/blob/main/eval_utils/evaluate_ovdet.py#L163-L201. Please refer to ch3cook-fdu/Vote2Cap-DETR#11 for visualization.
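
For the visualization question above, a minimal geometry sketch (not the repository's code) that converts a predicted (cx, cy, cz, l, w, h) box into the 8 corners of an axis-aligned box, which most point-cloud viewers accept:

```python
import numpy as np

def box_center_size_to_corners(cx, cy, cz, l, w, h):
    """Return the 8 corners (shape (8, 3)) of an axis-aligned box from center + size."""
    signs = np.array([[sx, sy, sz]
                      for sx in (-1, 1) for sy in (-1, 1) for sz in (-1, 1)])
    return np.array([cx, cy, cz]) + signs * (np.array([l, w, h]) / 2.0)

corners = box_center_size_to_corners(1.2, 0.5, 0.8, 0.3, 0.2, 0.15)
print(corners)   # pass these corners to a viewer such as 3d-pc-box-viz
```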
