Hello, Wei
____Great work. If I understand correctly, more stages give a bigger receptive field. So why not just one stage with a bigger receptive field?
____You shrink the intermediate outputs to the same size as the final output only to apply the intermediate supervision, and to give it a physical interpretation. However, all of this seems unnecessary: mathematically, the physical interpretation does not matter and the intermediate representation could be anything. For example, it could have any number of channels (though of course that would make applying the intermediate loss inconvenient). Even without any intermediate loss, as shown in your experiment (fig. 6b), the performance drop is negligible.
____Another novelty in your paper is feeding the original VGG features into the different stages via skip connections. This is a variant of ResNet, which skips directly between neighboring stages. Again, ResNet brings only a small improvement (~3%), and only when it is very deep (>100 layers).
____So let's return to the original question: why not just one stage with a bigger receptive field (a deeper network)? Honestly, your paper seems to focus on something unnecessary... (I beg your pardon.)
I did some fast experiments mapping a 112x112x3 image to a 56x56x1 confidence map using a combination of your stages 1~3, with no intermediate inputs and no intermediate outputs:
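(The original code listing did not survive; as an illustration only, here is a hypothetical shape check, not the actual experiment, verifying that a merged conv stack with one stride-2 downsampling takes 112x112 input to 56x56 output. The layer spec is an assumption, not the paper's exact architecture:)

```python
def output_size(n, layers):
    """Spatial output size after (kernel, stride, padding) layers,
    using the standard conv formula: floor((n + 2p - k) / s) + 1."""
    for k, s, p in layers:
        n = (n + 2 * p - k) // s + 1
    return n

# Hypothetical merged stack: 'same'-padded 3x3 convs plus one
# stride-2 pool, which is all it takes to halve 112 -> 56.
layers = [(3, 1, 1), (3, 1, 1), (2, 2, 0), (3, 1, 1)]
print(output_size(112, layers))  # -> 56
```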
And saw no significant performance drop. Could you please have a check?
Hoping for a reply, thanks.