Dear Shoufa,
Thank you for the innovative paper on using diffusion to solve the detection problem!
I just have one question on the diffusion process proposed in the paper:
When you train the model to denoise the bounding boxes, do you build a correspondence between the noisy boxes and the ground truth (GT)? If only Hungarian matching is used to build this correspondence, could you help me understand how your training process differs from that of a normal detection model (for example, Sparse R-CNN) trained with Hungarian matching, other than that your work uses random boxes as the input?
Based on my understanding of diffusion, we add noise to a GT box and then train the model to predict either the added noise or the GT box from which the noisy box was derived. As far as I can tell, Hungarian matching might not link a noisy box to its correct GT.
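To make the concern concrete, here is a minimal sketch of what I mean. It assumes the standard DDPM forward process, x_t = sqrt(alpha_bar) * x_0 + sqrt(1 - alpha_bar) * eps. The function names, the example boxes, and the value of `alpha_bar` are all made up for illustration and are not taken from the DiffusionDet code.

```python
import math
import random


def noise_box(gt_box, alpha_bar, rng):
    """Forward-diffuse one box with the standard DDPM formula.

    Returns the noisy box and the Gaussian noise that was added.
    """
    eps = [rng.gauss(0.0, 1.0) for _ in gt_box]
    noisy = [
        math.sqrt(alpha_bar) * x + math.sqrt(1.0 - alpha_bar) * e
        for x, e in zip(gt_box, eps)
    ]
    return noisy, eps


def l1_distance(a, b):
    """Sum of absolute coordinate differences between two boxes."""
    return sum(abs(x - y) for x, y in zip(a, b))


rng = random.Random(0)

# Two hypothetical, nearby GT boxes in normalized (cx, cy, w, h) format.
gt_boxes = [[0.30, 0.30, 0.20, 0.20], [0.35, 0.32, 0.22, 0.18]]
alpha_bar = 0.1  # assumed value for a heavily noised timestep

noisy_boxes = [noise_box(box, alpha_bar, rng)[0] for box in gt_boxes]

# By construction, noisy_boxes[i] was derived from gt_boxes[i].
source = list(range(len(gt_boxes)))

# A purely geometric rule (here: nearest GT by L1 distance) only sees the
# box coordinates, so nothing forces it to agree with `source`.
nearest = [
    min(range(len(gt_boxes)), key=lambda j: l1_distance(nb, gt_boxes[j]))
    for nb in noisy_boxes
]
```

In classic diffusion training the target for `noisy_boxes[i]` would be `gt_boxes[source[i]]` (or the noise that produced it), whereas a cost-based matcher would supervise with something like `nearest`, and the two lists are not guaranteed to be equal.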
Looking forward to your reply!
Regards,
Barry,
Technical University of Munich
Based on my understanding of diffusion, we add noise to a GT box and then train the model to predict either the added noise or the GT box from which the noisy box was derived. As far as I can tell, Hungarian matching might not link a noisy box to its correct GT.
Definitely. The Hungarian matching tries to assign the 'num_proposal' predicted (i.e. denoised) bounding boxes to the ground truths in a 1-to-k manner.
After that, as stated in the paper, an OTA (Optimal Transport Assignment) step keeps only the top K choices of the matcher (default: cfg.MODEL.DiffusionDet.OTA_K = 5).
The loss is then computed pairwise on these top K matches.
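As a rough illustration of the top K selection described above, here is a simplified sketch. It assumes the matching cost between every prediction and every ground truth has already been computed. The function name `topk_assign`, the toy cost matrix, and the rule that a prediction claimed by several ground truths goes to the cheapest one are my own assumptions, not the repository's actual matcher.

```python
def topk_assign(cost, k):
    """Let each ground truth keep its k lowest-cost predictions.

    cost[i][j] is the cost of matching prediction i to ground truth j.
    Returns a dict mapping each ground-truth index to a sorted list of
    prediction indices. A prediction selected by more than one ground
    truth is given to the one with the lowest cost (an assumption).
    """
    num_preds = len(cost)
    num_gt = len(cost[0]) if cost else 0

    claimed = {}  # prediction index -> (cost, ground-truth index)
    for j in range(num_gt):
        ranked = sorted(range(num_preds), key=lambda i: cost[i][j])
        for i in ranked[:k]:
            if i not in claimed or cost[i][j] < claimed[i][0]:
                claimed[i] = (cost[i][j], j)

    assignment = {j: [] for j in range(num_gt)}
    for i in sorted(claimed):
        assignment[claimed[i][1]].append(i)
    return assignment


# Toy cost matrix: 5 predictions (rows) x 2 ground truths (columns).
cost = [
    [0.1, 0.9],
    [0.2, 0.8],
    [0.7, 0.3],
    [0.6, 0.4],
    [0.5, 0.5],
]
matches = topk_assign(cost, k=2)
print(matches)  # {0: [0, 1], 1: [2, 3]}
```

Prediction 4 is not among the two cheapest for either ground truth, so it stays unmatched and would be treated as background in the loss. Note that the assignment depends only on the cost of the denoised predictions, not on which ground truth a noisy input box was originally derived from.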