Why not one stage with bigger receptive field? #54

joyousrabbit · 2017-06-20T07:24:59Z

Hello, Wei
____Great work. If I understand well, more stages give bigger receptive field. So why not just one stage with bigger receptive field?
____You shrink the intermediate outputs the same size as the final output, only for applying the intermediate supervision, and giving it some physical explain. However, all these are meaningless, because mathematically, we don't care the physical explain which can be anything. For example, it can be any size of channels (of course it will be inconvenient to apply the intermediate loss). Even without any intermediate loss, as shown in your experiment (fig.6b), the performance's drop is negligible.
____Another novelty in your paper is skipping the original vgg to different stages. This is a variant of Resnet that skip directly between neighbor stages. Again, the Resnet could bring little improvement (3%) only when it's very deep (>100 layers).
____So let's return to the original question, why not just one stage with bigger receptive field (deeper network)? Honestly, your paper seems focus on something unnecessary... (I beg your pardon.)

I did some fast experiments on 112x112x3 image to 56x56x1 confidence map by combination of your stage1~3, no intermediate inputs, neither intermediate outputs:

conv0 = layers.conv2d( inputs, 64, [ 5, 5 ], stride = 2)

conv_decode1 = layers.conv2d( conv0, 128, [ 3, 3 ])
conv_decode2 = layers.conv2d( conv_decode1, 128, [ 3, 3 ])
conv_decode3 = layers.conv2d( conv_decode2, 128, [ 3, 3 ])

conv_decode4 = layers.conv2d( conv_decode3, 128, [ 7, 7 ])
conv_decode5 = layers.conv2d( conv_decode4, 128, [ 7, 7 ])
conv_decode6 = layers.conv2d( conv_decode5, 128, [ 7, 7 ])
conv_decode7 = layers.conv2d( conv_decode6, 128, [ 7, 7 ])
conv_decode8 = layers.conv2d( conv_decode7, 128, [ 7, 7 ])

conv_decode9 = layers.conv2d( conv_decode8, 128, [ 7, 7 ])
conv_decode10 = layers.conv2d( conv_decode9, 128, [ 7, 7 ])
conv_decode11 = layers.conv2d( conv_decode10, 128, [ 7, 7 ])
conv_decode12 = layers.conv2d( conv_decode11, 128, [ 7, 7 ])
conv_decode13 = layers.conv2d( conv_decode12, 128, [ 7, 7 ])

conv_decode_final_1 = layers.conv2d( conv_decode13, 128, [ 1, 1 ])
conv_decode_final_2 = layers.conv2d( conv_decode_final_1, 1, [ 1, 1 ], activation_fn=None)

And see no significant performence drop. Could you please have a check?
Hope reply, thanks.

The text was updated successfully, but these errors were encountered:

feitiandemiaomi · 2017-09-15T07:45:48Z

@joyousrabbit Hi，Thanks for your clear method, I woder if you make a strong and light model ?

Ai-is-light · 2018-01-03T08:44:18Z

@ @joyousrabbit , would you mind showing your work? We hope the faster and lighter model

joyousrabbit mentioned this issue Jun 22, 2017

is it possible to use this algorithm in mobile advises like android? #45

Closed

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Why not one stage with bigger receptive field? #54

Why not one stage with bigger receptive field? #54

joyousrabbit commented Jun 20, 2017 •

edited

Loading

feitiandemiaomi commented Sep 15, 2017

Ai-is-light commented Jan 3, 2018

Why not one stage with bigger receptive field? #54

Why not one stage with bigger receptive field? #54

Comments

joyousrabbit commented Jun 20, 2017 • edited Loading

feitiandemiaomi commented Sep 15, 2017

Ai-is-light commented Jan 3, 2018

joyousrabbit commented Jun 20, 2017 •

edited

Loading