
New object #15

Open · kuaileqipaoshui opened this issue May 1, 2024 · 9 comments

Comments

@kuaileqipaoshui commented May 1, 2024

Excuse me, I have a question: is the box prediction (ov-det) invalid for objects other than the 17 defined classes? I found that objects are filtered here (captioner.py, line 394). If there are new categories, will they all be judged as "others", so that they can never be predicted?

[screenshot: ll3da]

If there is a new object, say a fruit, how can its position be predicted?

@ch3cook-fdu (Contributor)

The last class of sem_cls_logits is the no-object class. Open-vocabulary detection is designed to extend a model's ability to localize and recognize objects beyond a closed, pre-defined category set.
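
For context, a minimal sketch of how a DETR-style head with a trailing no-object class can be filtered; the shapes, the `num_semcls` value, and the variable names here are assumptions for illustration, not the exact LL3DA code:

```python
import torch

num_queries, num_semcls = 256, 18                    # example sizes; check the actual config
sem_cls_logits = torch.randn(num_queries, num_semcls + 1)  # last column = "no object"

probs = torch.softmax(sem_cls_logits, dim=-1)
pred_cls = probs.argmax(dim=-1)                      # index num_semcls means "no object"

# Keep only queries predicted as a real (possibly open-vocabulary) object.
is_object = pred_cls != num_semcls
kept_query_ids = torch.nonzero(is_object).squeeze(-1)
print(kept_query_ids.shape)
```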

@kuaileqipaoshui (Author)

> The last class of sem_cls_logits is the no-object class. Open-vocabulary detection is designed to extend a model's ability to localize and recognize objects beyond a closed, pre-defined category set.

Is it possible in ov-det to locate only objects from a pre-defined set of classes, so that descriptions are generated for those objects only? Would that make it impossible to locate a new object or generate a description for it?
If the detector is retrained, can the object categories be extended, for example by adding a fruit category?

@ch3cook-fdu (Contributor)

You can just filter and re-label the categories in the generated texts.
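
As an illustration of "filter and re-label the categories in the generated texts", here is a hypothetical post-processing sketch; the label map, the dropped categories, and the caption wording are made up for this example and should be adapted to the actual annotation files:

```python
import re

# Hypothetical label map and filter set, not part of the repository.
label_map = {"others": "banana"}   # re-label a category in the caption text
drop_labels = {"clutter"}          # drop captions mentioning these categories

def relabel_caption(caption):
    for old, new in label_map.items():
        caption = re.sub(rf"\b{re.escape(old)}\b", new, caption)
    if any(re.search(rf"\b{re.escape(bad)}\b", caption) for bad in drop_labels):
        return None                # filtered out
    return caption

print(relabel_caption("the others on the desk is yellow and curved"))
# -> "the banana on the desk is yellow and curved"
```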

@kuaileqipaoshui (Author)

> You can just filter and re-label the categories in the generated texts.

I'm sorry, I didn't understand what you said. For example, if I want to detect the location of a banana in a new scene, can the model output the banana's location in the same format as the template?

[screenshot: ll3da1]

When I filter, the bananas will be treated as the "others" category. If I re-label the categories, for example 18: banana, would that be correct? Should I also change self.num_semcls to 19?

[screenshot: ll3da2]

@ch3cook-fdu (Contributor)

If you are looking for a grounding model, you can design input text instructions like “locate the banana”.

@kuaileqipaoshui (Author) commented May 2, 2024

> If you are looking for a grounding model, you can design input text instructions like “locate the banana”.

The generated answer gives the center of the box and its length, width, and height, so how do I visualize the box?

[screenshot: Screenshot 2024-05-02 152431]

How can I reconstruct the 3D box?

@ch3cook-fdu (Contributor)

Please refer to https://github.com/ch3cook-fdu/3d-pc-box-viz for more visualization functions.

@kuaileqipaoshui (Author)

> Please refer to https://github.com/ch3cook-fdu/3d-pc-box-viz for more visualization functions.

It's good to see the results of your work. Can you explain in detail how to decode the 3D box? I tried to decode it but failed. Looking forward to your reply.

@ch3cook-fdu (Contributor)

The code for decoding box coordinates can be found in https://github.com/Open3DA/LL3DA/blob/main/eval_utils/evaluate_ovdet.py#L163-L201. Please refer to ch3cook-fdu/Vote2Cap-DETR#11 for visualization.
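
For the visualization question above, a minimal geometry sketch (not the repository's code) that converts a predicted (cx, cy, cz, l, w, h) box into the 8 corners of an axis-aligned box, which most point-cloud viewers accept:

```python
import numpy as np

def box_center_size_to_corners(cx, cy, cz, l, w, h):
    """Return the 8 corners (shape (8, 3)) of an axis-aligned box from center + size."""
    signs = np.array([[sx, sy, sz]
                      for sx in (-1, 1) for sy in (-1, 1) for sz in (-1, 1)])
    return np.array([cx, cy, cz]) + signs * (np.array([l, w, h]) / 2.0)

corners = box_center_size_to_corners(1.2, 0.5, 0.8, 0.3, 0.2, 0.15)
print(corners)   # pass these corners to a viewer such as 3d-pc-box-viz
```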
