The layers within the second and third dense blocks don't assign the least weight to the outputs of the transition layer in my trained model #53

seasonyc opened this issue Sep 28, 2018 · 0 comments



I am not sure whether it's appropriate to open this as a GitHub issue; it is a question about the heatmap in your paper.

I trained a DenseNet on C10+ with L = 40 and k = 12, the same configuration as yours, and then inspected the weights of the trained model, which reaches 94.6% accuracy. I didn't get the same result as your observation 3: in my model, the layers within the second and third dense blocks assign considerable weight to the outputs of the transition layers.

For example, the first conv layer in the second dense block has an average weight of 0.013281956 on the output of the first transition layer (168 channels, i.e. all of its input channels). The second conv layer has an average weight of 0.011933382 on the first transition layer's output (its first 168 input channels) and 0.024417713 on the 12 channels produced by the first conv layer. That ratio is reasonable, since closer channels matter more, but the weight on the transition output is clearly not the smallest. The remaining layers show similar weight distributions over the old and new channels, and the situation in dense block 3 is similar.

My DenseNet and training code follow yours, including the data augmentation and input normalization; see https://github.com/seasonyc/densenet/blob/master/densenet.py and https://github.com/seasonyc/densenet/blob/master/cifar10-test.py. The trained model file is https://github.com/seasonyc/densenet/blob/master/dense_augmodel-ep0300-loss0.112-acc0.999-val_loss0.332-val_acc0.946.h5, and the code I used to aggregate the weights is https://github.com/seasonyc/densenet/blob/master/weights-verify.py.
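For reference, here is a rough sketch of the kind of aggregation I mean (not literally the code in weights-verify.py): average the absolute Conv2D kernel values over a slice of input channels, one slice per source layer. The layer name "block2_conv2" and the use of tf.keras below are just placeholders for whatever the actual model uses.

```python
# Sketch only: compare the average absolute kernel weight a conv layer inside
# a dense block assigns to the transition-layer channels vs. the channels
# produced by preceding layers of the same block.
import numpy as np
from tensorflow.keras.models import load_model  # placeholder; the repo may use standalone Keras

model = load_model(
    "dense_augmodel-ep0300-loss0.112-acc0.999-val_loss0.332-val_acc0.946.h5"
)

def avg_abs_weight(conv_layer, channel_slice):
    """Mean |w| of a Conv2D kernel, restricted to a slice of input channels.

    Keras stores Conv2D kernels with shape (kh, kw, in_channels, out_channels).
    """
    kernel = conv_layer.get_weights()[0]
    return np.abs(kernel[:, :, channel_slice, :]).mean()

# Second conv layer of dense block 2: its input is the 168 transition-layer
# channels followed by the 12 channels from the block's first conv layer.
conv = model.get_layer("block2_conv2")  # placeholder name
print("avg |w| on transition channels:", avg_abs_weight(conv, slice(0, 168)))
print("avg |w| on newest 12 channels: ", avg_abs_weight(conv, slice(168, 180)))
```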

I know that models trained at different times differ, and even the features learned by the conv filters differ, but I believe the weight distributions should be statistically similar. So although we have different models, we should see similar results.

I did this verification because observation 3 seems a little unreasonable to me. The first conv layer makes heavy use of the information from the previous dense block, and then the second conv layer supposedly ignores the information from hundreds of channels and relies only on 12 channels. Can the first conv layer really concentrate hundreds of channels into 12 channels through training?

Could you double-check this?

Thanks,
YC
