Could you help me understand the diffusion process? #93

Open
ChenBarryHu opened this issue Aug 24, 2023 · 1 comment

Comments

@ChenBarryHu
Dear Shoufa,

Thank you for the innovative paper on using diffusion to solve the detection problem!

I just have one question on the diffusion process proposed in the paper:

When you train the model to denoise the bounding boxes, do you build a correspondence between the noisy boxes and the GT boxes? If only Hungarian matching is used to build this correspondence, could you help me understand how your training process differs from that of a standard detection model (for example, Sparse R-CNN) trained with Hungarian matching, except that your work uses random boxes as the input?

Based on my understanding of diffusion, we add noise to a GT box and then train the model to predict either the added noise or the GT box from which the noisy box was derived. As far as I understand, Hungarian matching might not link a noisy box to its original GT box.
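
To make the standard formulation concrete, here is a minimal sketch of the DDPM-style forward step applied to a single box. It is not taken from the DiffusionDet code: the function name `noise_box`, the normalized `(cx, cy, w, h)` box format, and the value of `alpha_bar_t` are all assumptions chosen for illustration.

```python
import math
import random

def noise_box(gt_box, alpha_bar_t, rng):
    """Sample x_t = sqrt(a) * x_0 + sqrt(1 - a) * eps (DDPM forward step).

    gt_box: (cx, cy, w, h), assumed normalized to [0, 1] (illustrative choice).
    alpha_bar_t: cumulative noise-schedule product in (0, 1].
    Returns the noisy box and the noise, so either can serve as a target.
    """
    eps = [rng.gauss(0.0, 1.0) for _ in gt_box]
    signal = math.sqrt(alpha_bar_t)
    sigma = math.sqrt(1.0 - alpha_bar_t)
    noisy = [signal * x + sigma * e for x, e in zip(gt_box, eps)]
    return noisy, eps

rng = random.Random(0)          # fixed seed so the example is repeatable
gt = [0.5, 0.5, 0.2, 0.3]       # hypothetical GT box
alpha_bar_t = 0.9               # hypothetical schedule value
noisy, eps = noise_box(gt, alpha_bar_t, rng)

# Given x_t and eps, x_0 is recovered exactly: in this formulation every
# noisy box is tied to the one GT box it was generated from.
recovered = [
    (x - math.sqrt(1.0 - alpha_bar_t) * e) / math.sqrt(alpha_bar_t)
    for x, e in zip(noisy, eps)
]
```

In this sketch the training target for `noisy` is known by construction, which is the one-to-one link the question is about.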

Looking forward to your reply!

Regards,
Barry
Technical University of Munich

@MatteoMele98
MatteoMele98 commented Aug 31, 2023

Hi! Let me try to help:

> Based on my understanding of diffusion, we add noise to a GT box and then train the model to predict either the added noise or the GT box from which the noisy box was derived. As far as I understand, Hungarian matching might not link a noisy box to its original GT box.

Definitely. Hungarian matching tries to assign the 'num_proposal' predicted (i.e. denoised) bounding boxes to the ground-truth boxes in a 1-to-k manner.
After that, as stated in the paper, OTA (Optimal Transport Assignment) is applied to keep only the top K choices of the matcher (default: cfg.MODEL.DiffusionDet.OTA_K = 5).
The loss is then computed pairwise on those top K matches.
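
As a rough illustration of the 1-to-k idea, here is a simplified sketch and not the repository's matcher. The function `top_k_match`, the cost values, and the conflict rule (a prediction claimed by several ground truths stays with the cheapest one) are assumptions made for this example.

```python
def top_k_match(cost, k):
    """Match each ground truth to its k cheapest predictions.

    cost[g][p] is the matching cost between ground truth g and prediction p.
    A prediction claimed by several ground truths stays with the cheapest one.
    Returns a sorted list of (gt_index, pred_index) pairs.
    """
    owner = {}  # pred_index -> (cost, gt_index)
    for g, row in enumerate(cost):
        cheapest = sorted(range(len(row)), key=lambda p: row[p])[:k]
        for p in cheapest:
            if p not in owner or row[p] < owner[p][0]:
                owner[p] = (row[p], g)
    return sorted((g, p) for p, (_, g) in owner.items())

# Hypothetical costs: 2 ground truths, 5 predictions (lower = better match).
cost = [
    [0.1, 0.9, 0.3, 0.8, 0.7],
    [0.6, 0.2, 0.25, 0.4, 0.9],
]
pairs = top_k_match(cost, k=2)
# pairs == [(0, 0), (1, 1), (1, 2)]

# The loss is then accumulated over matched pairs only; the matching cost
# stands in for the real classification and box losses here.
pairwise_loss = sum(cost[g][p] for g, p in pairs)
```

Note that the pairs are chosen from the costs between predictions and ground truths, so nothing in this sketch forces a prediction to be matched to the ground truth its noisy input started from.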

Hope you find it helpful!
