error in loading pre-trained model to train on multi-gpu #27

Open
yxt132 opened this issue May 31, 2020 · 3 comments

yxt132 commented May 31, 2020

I am trying to load the pretrained model to fine-tune it in a multi-GPU setting. However, I am getting an error message. Here is my code:

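        from collections import OrderedDict

        import torch

        # Load the checkpoint and strip the 'module.' prefix that nn.DataParallel adds to keys
        checkpoint = torch.load(args.restore_from)
        pretrained_dict = OrderedDict()
        for key, value in checkpoint['state_dict'].items():
            if key.startswith('module.'):
                key = key[len('module.'):]
            pretrained_dict[key] = value
        net.load_state_dict(pretrained_dict)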

The error message:

RuntimeError: Error(s) in loading state_dict for hlMobileNetV2UNetDecoderIndexLearning:
Missing key(s) in state_dict: "layer0.1._tmp_running_mean", "layer0.1._tmp_running_var", "layer0.1._running_iter", "layer1.0.conv.1._tmp_running_mean", "layer1.0.conv.1._tmp_running_var", "layer1.0.conv.1._running_iter", "layer1.0.conv.4._tmp_running_mean", "layer1.0.conv.4._tmp_running_var", "layer1.0.conv.4._running_iter", "layer2.0.conv.1._tmp_running_mean", "layer2.0.conv.1._tmp_running_var", "layer2.0.conv.1._running_iter", "layer2.0.conv.4._tmp_running_mean", "layer2.0.conv.4._tmp_running_var", "layer2.0.conv.4._running_iter", "layer2.0.conv.7._tmp_running_mean", "layer2.0.conv.7._tmp_running_var", "layer2.0.conv.7._running_iter", "layer2.1.conv.1._tmp_running_mean", "layer2.1.conv.1._tmp_running_var", "layer2.1.conv.1._running_iter", "layer2.1.conv.4._tmp_running_mean", "layer2.1.conv.4._tmp_running_var", "layer2.1.conv.4._running_iter", "layer2.1.conv.7._tmp_running_mean", "layer2.1.conv.7._tmp_running_var", "layer2.1.conv.7._running_iter", "layer3.0.conv.1._tmp_running_mean", "layer3.0.conv.1._tmp_running_var", "layer3.0.conv.1._running_iter", "layer3.0.conv.4._tmp_running_mean", "layer3.0.conv.4._tmp_running_var", "layer3.0.conv.4._running_iter", "layer3.0.conv.7._tmp_running_mean", "layer3.0.conv.7._tmp_running_var", "layer3.0.conv.7._running_iter", "layer3.1.conv.1._tmp_running_mean", "layer3.1.conv.1._tmp_running_var", "layer3.1.conv.1._running_iter", "layer3.1.conv.4._tmp_running_mean", "layer3.1.conv.4._tmp_running_var", "layer3.1.conv.4._running_iter", "layer3.1.conv.7._tmp_running_mean", "layer3.1.conv.7._tmp_running_var", "layer3.1.conv.7._running_iter", "layer3.2.conv.1._tmp_running_mean", "layer3.2.conv.1._tmp_running_var", "layer3.2.conv.1._running_iter", "layer3.2.conv.4._tmp_running_mean", "layer3.2.conv.4._tmp_running_var", "layer3.2.conv.4._running_iter", "layer3.2.conv.7._tmp_running_mean", "layer3.2.conv.7._tmp_running_var", "layer3.2.conv.7._running_iter", "layer4.0.conv.1._tmp_running_mean", "layer4.0.conv.1._tmp_running_var", 
"layer4.0.conv.1._running_iter", "layer4.0.conv.4._tmp_running_mean", "layer4.0.conv.4._tmp_running_var", "layer4.0.conv.4._running_iter", "layer4.0.conv.7._tmp_running_mean", "layer4.0.conv.7._tmp_running_var", "layer4.0.conv.7._running_iter", "layer4.1.conv.1._tmp_running_mean", "layer4.1.conv.1._tmp_running_var", "layer4.1.conv.1._running_iter", "layer4.1.conv.4._tmp_running_mean", "layer4.1.conv.4._tmp_running_var", "layer4.1.conv.4._running_iter", "layer4.1.conv.7._tmp_running_mean", "layer4.1.conv.7._tmp_running_var", "layer4.1.conv.7._running_iter", "layer4.2.conv.1._tmp_running_mean", "layer4.2.conv.1._tmp_running_var", "layer4.2.conv.1._running_iter", "layer4.2.conv.4._tmp_running_mean", "layer4.2.conv.4._tmp_running_var", "layer4.2.conv.4._running_iter", "layer4.2.conv.7._tmp_running_mean", "layer4.2.conv.7._tmp_running_var", "layer4.2.conv.7._running_iter", "layer4.3.conv.1._tmp_running_mean", "layer4.3.conv.1._tmp_running_var", "layer4.3.conv.1._running_iter", "layer4.3.conv.4._tmp_running_mean", "layer4.3.conv.4._tmp_running_var", "layer4.3.conv.4._running_iter", "layer4.3.conv.7._tmp_running_mean", "layer4.3.conv.7._tmp_running_var", "layer4.3.conv.7._running_iter", "layer5.0.conv.1._tmp_running_mean", "layer5.0.conv.1._tmp_running_var", "layer5.0.conv.1._running_iter", "layer5.0.conv.4._tmp_running_mean", "layer5.0.conv.4._tmp_running_var", "layer5.0.conv.4._running_iter", "layer5.0.conv.7._tmp_running_mean", "layer5.0.conv.7._tmp_running_var", "layer5.0.conv.7._running_iter", "layer5.1.conv.1._tmp_running_mean", "layer5.1.conv.1._tmp_running_var", "layer5.1.conv.1._running_iter", "layer5.1.conv.4._tmp_running_mean", "layer5.1.conv.4._tmp_running_var", "layer5.1.conv.4._running_iter", "layer5.1.conv.7._tmp_running_mean", "layer5.1.conv.7._tmp_running_var", "layer5.1.conv.7._running_iter", "layer5.2.conv.1._tmp_running_mean", "layer5.2.conv.1._tmp_running_var", "layer5.2.conv.1._running_iter", "layer5.2.conv.4._tmp_running_mean", 
"layer5.2.conv.4._tmp_running_var", "layer5.2.conv.4._running_iter", "layer5.2.conv.7._tmp_running_mean", "layer5.2.conv.7._tmp_running_var", "layer5.2.conv.7._running_iter", "layer6.0.conv.1._tmp_running_mean", "layer6.0.conv.1._tmp_running_var", "layer6.0.conv.1._running_iter", "layer6.0.conv.4._tmp_running_mean", "layer6.0.conv.4._tmp_running_var", "layer6.0.conv.4._running_iter", "layer6.0.conv.7._tmp_running_mean", "layer6.0.conv.7._tmp_running_var", "layer6.0.conv.7._running_iter", "layer6.1.conv.1._tmp_running_mean", "layer6.1.conv.1._tmp_running_var", "layer6.1.conv.1._running_iter", "layer6.1.conv.4._tmp_running_mean", "layer6.1.conv.4._tmp_running_var", "layer6.1.conv.4._running_iter", "layer6.1.conv.7._tmp_running_mean", "layer6.1.conv.7._tmp_running_var", "layer6.1.conv.7._running_iter", "layer6.2.conv.1._tmp_running_mean", "layer6.2.conv.1._tmp_running_var", "layer6.2.conv.1._running_iter", "layer6.2.conv.4._tmp_running_mean", "layer6.2.conv.4._tmp_running_var", "layer6.2.conv.4._running_iter", "layer6.2.conv.7._tmp_running_mean", "layer6.2.conv.7._tmp_running_var", "layer6.2.conv.7._running_iter", "layer7.0.conv.1._tmp_running_mean", "layer7.0.conv.1._tmp_running_var", "layer7.0.conv.1._running_iter", "layer7.0.conv.4._tmp_running_mean", "layer7.0.conv.4._tmp_running_var", "layer7.0.conv.4._running_iter", "layer7.0.conv.7._tmp_running_mean", "layer7.0.conv.7._tmp_running_var", "layer7.0.conv.7._running_iter", "index0.indexnet1.1._tmp_running_mean", "index0.indexnet1.1._tmp_running_var", "index0.indexnet1.1._running_iter", "index0.indexnet2.1._tmp_running_mean", "index0.indexnet2.1._tmp_running_var", "index0.indexnet2.1._running_iter", "index0.indexnet3.1._tmp_running_mean", "index0.indexnet3.1._tmp_running_var", "index0.indexnet3.1._running_iter", "index0.indexnet4.1._tmp_running_mean", "index0.indexnet4.1._tmp_running_var", "index0.indexnet4.1._running_iter", "index2.indexnet1.1._tmp_running_mean", "index2.indexnet1.1._tmp_running_var", 
"index2.indexnet1.1._running_iter", "index2.indexnet2.1._tmp_running_mean", "index2.indexnet2.1._tmp_running_var", "index2.indexnet2.1._running_iter", "index2.indexnet3.1._tmp_running_mean", "index2.indexnet3.1._tmp_running_var", "index2.indexnet3.1._running_iter", "index2.indexnet4.1._tmp_running_mean", "index2.indexnet4.1._tmp_running_var", "index2.indexnet4.1._running_iter", "index3.indexnet1.1._tmp_running_mean", "index3.indexnet1.1._tmp_running_var", "index3.indexnet1.1._running_iter", "index3.indexnet2.1._tmp_running_mean", "index3.indexnet2.1._tmp_running_var", "index3.indexnet2.1._running_iter", "index3.indexnet3.1._tmp_running_mean", "index3.indexnet3.1._tmp_running_var", "index3.indexnet3.1._running_iter", "index3.indexnet4.1._tmp_running_mean", "index3.indexnet4.1._tmp_running_var", "index3.indexnet4.1._running_iter", "index4.indexnet1.1._tmp_running_mean", "index4.indexnet1.1._tmp_running_var", "index4.indexnet1.1._running_iter", "index4.indexnet2.1._tmp_running_mean", "index4.indexnet2.1._tmp_running_var", "index4.indexnet2.1._running_iter", "index4.indexnet3.1._tmp_running_mean", "index4.indexnet3.1._tmp_running_var", "index4.indexnet3.1._running_iter", "index4.indexnet4.1._tmp_running_mean", "index4.indexnet4.1._tmp_running_var", "index4.indexnet4.1._running_iter", "index6.indexnet1.1._tmp_running_mean", "index6.indexnet1.1._tmp_running_var", "index6.indexnet1.1._running_iter", "index6.indexnet2.1._tmp_running_mean", "index6.indexnet2.1._tmp_running_var", "index6.indexnet2.1._running_iter", "index6.indexnet3.1._tmp_running_mean", "index6.indexnet3.1._tmp_running_var", "index6.indexnet3.1._running_iter", "index6.indexnet4.1._tmp_running_mean", "index6.indexnet4.1._tmp_running_var", "index6.indexnet4.1._running_iter", "dconv_pp.aspp1.atrous_conv.1._tmp_running_mean", "dconv_pp.aspp1.atrous_conv.1._tmp_running_var", "dconv_pp.aspp1.atrous_conv.1._running_iter", "dconv_pp.aspp2.atrous_conv.1._tmp_running_mean", 
"dconv_pp.aspp2.atrous_conv.1._tmp_running_var", "dconv_pp.aspp2.atrous_conv.1._running_iter", "dconv_pp.aspp2.atrous_conv.4._tmp_running_mean", "dconv_pp.aspp2.atrous_conv.4._tmp_running_var", "dconv_pp.aspp2.atrous_conv.4._running_iter", "dconv_pp.aspp3.atrous_conv.1._tmp_running_mean", "dconv_pp.aspp3.atrous_conv.1._tmp_running_var", "dconv_pp.aspp3.atrous_conv.1._running_iter", "dconv_pp.aspp3.atrous_conv.4._tmp_running_mean", "dconv_pp.aspp3.atrous_conv.4._tmp_running_var", "dconv_pp.aspp3.atrous_conv.4._running_iter", "dconv_pp.aspp4.atrous_conv.1._tmp_running_mean", "dconv_pp.aspp4.atrous_conv.1._tmp_running_var", "dconv_pp.aspp4.atrous_conv.1._running_iter", "dconv_pp.aspp4.atrous_conv.4._tmp_running_mean", "dconv_pp.aspp4.atrous_conv.4._tmp_running_var", "dconv_pp.aspp4.atrous_conv.4._running_iter", "dconv_pp.global_avg_pool.2._tmp_running_mean", "dconv_pp.global_avg_pool.2._tmp_running_var", "dconv_pp.global_avg_pool.2._running_iter", "dconv_pp.bottleneck_conv.1._tmp_running_mean", "dconv_pp.bottleneck_conv.1._tmp_running_var", "dconv_pp.bottleneck_conv.1._running_iter", "decoder_layer6.dconv.1._tmp_running_mean", "decoder_layer6.dconv.1._tmp_running_var", "decoder_layer6.dconv.1._running_iter", "decoder_layer5.dconv.1._tmp_running_mean", "decoder_layer5.dconv.1._tmp_running_var", "decoder_layer5.dconv.1._running_iter", "decoder_layer4.dconv.1._tmp_running_mean", "decoder_layer4.dconv.1._tmp_running_var", "decoder_layer4.dconv.1._running_iter", "decoder_layer3.dconv.1._tmp_running_mean", "decoder_layer3.dconv.1._tmp_running_var", "decoder_layer3.dconv.1._running_iter", "decoder_layer2.dconv.1._tmp_running_mean", "decoder_layer2.dconv.1._tmp_running_var", "decoder_layer2.dconv.1._running_iter", "decoder_layer1.dconv.1._tmp_running_mean", "decoder_layer1.dconv.1._tmp_running_var", "decoder_layer1.dconv.1._running_iter", "decoder_layer0.dconv.1._tmp_running_mean", "decoder_layer0.dconv.1._tmp_running_var", "decoder_layer0.dconv.1._running_iter", 
"pred.0.1._tmp_running_mean", "pred.0.1._tmp_running_var", "pred.0.1._running_iter".

Any idea on how to resolve this? Thanks
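
For reference, the prefix-stripping step in the snippet above can be sketched a little more defensively. This is a minimal sketch in plain Python, not the repository's own loading code: the helper name `strip_module_prefix` and the example keys are hypothetical, and the only assumption is that the checkpoint was saved from a model wrapped in `torch.nn.DataParallel`, which prepends `module.` to every key.

```python
from collections import OrderedDict

PREFIX = "module."  # prefix that torch.nn.DataParallel adds to every key


def strip_module_prefix(state_dict):
    """Return a copy of state_dict with a leading 'module.' removed from each key."""
    cleaned = OrderedDict()
    for key, value in state_dict.items():
        # Strip only when the key really starts with the prefix. A plain
        # "'module' in key" test would also match keys such as 'submodule.bias'
        # and then cut off their first seven characters.
        if key.startswith(PREFIX):
            key = key[len(PREFIX):]
        cleaned[key] = value
    return cleaned


# Hypothetical keys, used only for illustration.
example = OrderedDict([("module.layer0.0.weight", 1), ("submodule.bias", 2)])
print(list(strip_module_prefix(example)))
# → ['layer0.0.weight', 'submodule.bias']
```

Note that this only normalizes key names. It would not by itself supply the missing keys listed in the error, since those keys are absent from the checkpoint rather than misnamed.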

poppinace (Owner) commented

Hi, sorry I just saw your issue. Have you resolved it?

yxt132 (Author) commented Jun 6, 2020

Nope. I used strict=False to avoid this issue. It does not seem to affect the results, but I still don't know why those keys are missing.
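
One way to see exactly what strict=False skips is to compare the key sets directly. The sketch below is plain Python with a hypothetical helper name (`diff_keys`) and made-up example keys. With PyTorch, one would presumably pass `net.state_dict().keys()` and the checkpoint's keys. The missing names in the error (`_tmp_running_mean`, `_tmp_running_var`, `_running_iter`) are not buffers of the standard `torch.nn.BatchNorm2d`, so a plausible reading, offered here only as an assumption, is that the model is built with a custom batch-norm layer that registers extra buffers the checkpoint was saved without.

```python
def diff_keys(model_keys, checkpoint_keys):
    """Return (missing, unexpected) key lists, mirroring the load_state_dict error."""
    model_keys, checkpoint_keys = set(model_keys), set(checkpoint_keys)
    missing = sorted(model_keys - checkpoint_keys)      # expected by the model, absent in the file
    unexpected = sorted(checkpoint_keys - model_keys)   # present in the file, unknown to the model
    return missing, unexpected


# Hypothetical keys, used only for illustration.
model_keys = ["layer0.1.running_mean", "layer0.1._running_iter"]
checkpoint_keys = ["layer0.1.running_mean"]

missing, unexpected = diff_keys(model_keys, checkpoint_keys)
print(missing)     # → ['layer0.1._running_iter']
print(unexpected)  # → []
```

If every missing key is an auxiliary buffer of this kind, those buffers simply keep their initial values, which would be consistent with the observation that the results look unaffected. Recent PyTorch releases also return the missing and unexpected keys from `load_state_dict` when called with strict=False, which gives the same information without a helper.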

poppinace (Owner) commented

@yxt132 Hi, I just cloned the repository and tested the code on my computer. It runs smoothly and does not report the errors you posted. I tested the code with PyTorch 1.2.

BTW, have you modified the code? I presume you didn't.
