
Why not one stage with bigger receptive field? #54

Open
joyousrabbit opened this issue Jun 20, 2017 · 2 comments

Comments


joyousrabbit commented Jun 20, 2017

Hello, Wei
Great work. If I understand correctly, more stages give a bigger receptive field. So why not just one stage with a bigger receptive field?
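To make the receptive-field point concrete, here is the standard accumulation formula as a small helper (my own sketch, not code from this repo):

def receptive_field(conv_specs):
    """Receptive field of a stack of convolutions.
    conv_specs: list of (kernel_size, stride) pairs, in order."""
    r, j = 1, 1  # receptive field and accumulated stride ("jump")
    for k, s in conv_specs:
        r += (k - 1) * j
        j *= s
    return r

# e.g. one 5x5 stride-2 conv, three 3x3 convs, ten 7x7 convs (as in the experiment below):
print(receptive_field([(5, 2)] + [(3, 1)] * 3 + [(7, 1)] * 10))  # 137, i.e. wider than the 112x112 input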
You shrink the intermediate outputs to the same size as the final output only to apply the intermediate supervision, and to give them some physical interpretation. However, all of this is unnecessary, because mathematically we don't care about the physical interpretation, which could be anything. For example, the intermediate tensors could have any number of channels (of course, it would then be inconvenient to apply the intermediate loss). Even without any intermediate loss, as shown in your experiment (Fig. 6b), the performance drop is negligible.
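In other words, the intermediate supervision reduces to extra loss terms on the per-stage belief maps, and dropping it just means keeping the last term (a sketch with hypothetical names `stage_outputs` and `labels`):

import tensorflow as tf

# Hypothetical sketch: one L2 term per stage's belief map vs. the target.
losses = [tf.nn.l2_loss(out - labels) for out in stage_outputs]
total_loss = tf.add_n(losses)   # with intermediate supervision
final_only = losses[-1]         # without it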
Another novelty in your paper is skipping features from the original VGG into the different stages. This is a variant of ResNet, which skips directly between neighboring stages. Again, ResNet brings only a small improvement (~3%), and only when the network is very deep (>100 layers).
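The skip pattern I mean, roughly (hypothetical names; the shared image features are concatenated into each later stage's input):

import tensorflow as tf

# Hypothetical names: each later stage sees the shared VGG features plus
# the previous stage's belief maps, concatenated along the channel axis.
stage_input = tf.concat([vgg_features, prev_beliefs], axis=3)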
So let's return to the original question: why not just one stage with a bigger receptive field (a deeper network)? Honestly, your paper seems to focus on something unnecessary... (I beg your pardon.)

I ran some quick experiments mapping a 112x112x3 image to a 56x56x1 confidence map with a combination of your stages 1~3, with no intermediate inputs and no intermediate outputs:

import tensorflow as tf
from tensorflow.contrib import layers  # TF 1.x; conv2d defaults to ReLU and SAME padding

# `inputs`: a [batch, 112, 112, 3] image tensor; one stride-2 conv brings it to 56x56
conv0 = layers.conv2d(inputs, 64, [5, 5], stride=2)

# stage-1-style trunk: 3x3 convs
conv_decode1 = layers.conv2d(conv0, 128, [3, 3])
conv_decode2 = layers.conv2d(conv_decode1, 128, [3, 3])
conv_decode3 = layers.conv2d(conv_decode2, 128, [3, 3])

# stage-2-style trunk: large 7x7 convs to grow the receptive field
conv_decode4 = layers.conv2d(conv_decode3, 128, [7, 7])
conv_decode5 = layers.conv2d(conv_decode4, 128, [7, 7])
conv_decode6 = layers.conv2d(conv_decode5, 128, [7, 7])
conv_decode7 = layers.conv2d(conv_decode6, 128, [7, 7])
conv_decode8 = layers.conv2d(conv_decode7, 128, [7, 7])

# stage-3-style trunk: five more 7x7 convs
conv_decode9 = layers.conv2d(conv_decode8, 128, [7, 7])
conv_decode10 = layers.conv2d(conv_decode9, 128, [7, 7])
conv_decode11 = layers.conv2d(conv_decode10, 128, [7, 7])
conv_decode12 = layers.conv2d(conv_decode11, 128, [7, 7])
conv_decode13 = layers.conv2d(conv_decode12, 128, [7, 7])

# 1x1 head producing the 56x56x1 confidence map (linear output)
conv_decode_final_1 = layers.conv2d(conv_decode13, 128, [1, 1])
conv_decode_final_2 = layers.conv2d(conv_decode_final_1, 1, [1, 1], activation_fn=None)
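For reference, the training objective is a plain L2 regression on the final map with no intermediate terms (a sketch; the `labels` placeholder, Adam optimizer, and learning rate are assumptions on my part):

import tensorflow as tf

# Assumed setup: `labels` holds the 56x56x1 target confidence maps.
labels = tf.placeholder(tf.float32, [None, 56, 56, 1])
loss = tf.nn.l2_loss(conv_decode_final_2 - labels)  # single loss, no intermediate supervision
train_op = tf.train.AdamOptimizer(1e-4).minimize(loss)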

I see no significant performance drop. Could you please check?
Hoping for your reply, thanks.

@feitiandemiaomi

@joyousrabbit Hi, thanks for your clear method. I wonder, did you end up with a strong and light model?

@Ai-is-light

@joyousrabbit, would you mind sharing your work? We are hoping for a faster and lighter model.
