Problem encountered when running 'bash base_train.sh' #17

Open
tangjiaxi98 opened this issue Dec 21, 2021 · 2 comments

Comments

@tangjiaxi98

2021-12-19 07:57:32,080 maskrcnn_benchmark.utils.checkpoint INFO: Saving checkpoint to ./model_final.pth
/home/hp/anaconda3/envs/dcnet/lib/python3.6/site-packages/apex-0.1-py3.6-linux-x86_64.egg/apex/amp/_initialize.py:25: UserWarning: An input tensor was not cuda.
warnings.warn("An input tensor was not cuda.")
Traceback (most recent call last):
File "../../tools/train_net.py", line 213, in <module>
main()
File "../../tools/train_net.py", line 206, in main
model = train(cfg, args.local_rank, args.distributed, phase, shot, split)
File "../../tools/train_net.py", line 97, in train
arguments
File "/media/hp/new/DCNet/DCNet/maskrcnn_benchmark/engine/trainer.py", line 149, in do_train
attentions = model(images, targets, meta_input, meta_label,average_shot=True)
File "/home/hp/anaconda3/envs/dcnet/lib/python3.6/site-packages/torch/nn/modules/module.py", line 493, in __call__
result = self.forward(*input, **kwargs)
File "/home/hp/anaconda3/envs/dcnet/lib/python3.6/site-packages/apex-0.1-py3.6-linux-x86_64.egg/apex/amp/_initialize.py", line 197, in new_fwd
**applier(kwargs, input_caster))
File "/media/hp/new/DCNet/DCNet/maskrcnn_benchmark/modeling/detector/generalized_rcnn.py", line 107, in forward
attentions = self.meta_extractor(meta_input,dr=self.dense_relation)
File "/media/hp/new/DCNet/DCNet/maskrcnn_benchmark/modeling/detector/generalized_rcnn.py", line 83, in meta_extractor
base_feat = self.backbone((meta_data,1))[2]
File "/home/hp/anaconda3/envs/dcnet/lib/python3.6/site-packages/torch/nn/modules/module.py", line 493, in __call__
result = self.forward(*input, **kwargs)
File "/home/hp/anaconda3/envs/dcnet/lib/python3.6/site-packages/torch/nn/modules/container.py", line 92, in forward
input = module(input)
File "/home/hp/anaconda3/envs/dcnet/lib/python3.6/site-packages/torch/nn/modules/module.py", line 493, in __call__
result = self.forward(*input, **kwargs)
File "/media/hp/new/DCNet/DCNet/maskrcnn_benchmark/modeling/backbone/resnet.py", line 148, in forward
x = self.stem(x,meta=meta)
File "/home/hp/anaconda3/envs/dcnet/lib/python3.6/site-packages/torch/nn/modules/module.py", line 493, in __call__
result = self.forward(*input, **kwargs)
File "/media/hp/new/DCNet/DCNet/maskrcnn_benchmark/modeling/backbone/resnet.py", line 366, in forward
x = self.conv2(x)
File "/home/hp/anaconda3/envs/dcnet/lib/python3.6/site-packages/torch/nn/modules/module.py", line 493, in __call__
result = self.forward(*input, **kwargs)
File "/media/hp/new/DCNet/DCNet/maskrcnn_benchmark/layers/misc.py", line 33, in forward
return super(Conv2d, self).forward(x)
File "/home/hp/anaconda3/envs/dcnet/lib/python3.6/site-packages/torch/nn/modules/conv.py", line 338, in forward
self.padding, self.dilation, self.groups)
RuntimeError: Input type (torch.FloatTensor) and weight type (torch.cuda.FloatTensor) should be the same
Traceback (most recent call last):
File "/home/hp/anaconda3/envs/dcnet/lib/python3.6/runpy.py", line 193, in _run_module_as_main
"__main__", mod_spec)
File "/home/hp/anaconda3/envs/dcnet/lib/python3.6/runpy.py", line 85, in _run_code
exec(code, run_globals)
File "/home/hp/anaconda3/envs/dcnet/lib/python3.6/site-packages/torch/distributed/launch.py", line 235, in <module>
main()
File "/home/hp/anaconda3/envs/dcnet/lib/python3.6/site-packages/torch/distributed/launch.py", line 231, in main
cmd=process.args)
subprocess.CalledProcessError: Command '['/home/hp/anaconda3/envs/dcnet/bin/python', '-u', '../../tools/train_net.py', '--local_rank=0', '--config-file', 'configs/base/e2e_voc_split3_base.yaml']' returned non-zero exit status 1.
mv: cannot stat 'inference/voc_2007_test_split3_base/result.txt': No such file or directory

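What kind of files are supposed to be stored in the inference folder? My inference folder is empty.
Also, why was the error above raised after model_final.pth was saved, and how can I fix it?
I look forward to your reply. Thank you.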

@EmberaThomas

RuntimeError: Input type (torch.FloatTensor) and weight type (torch.cuda.FloatTensor) should be the same

I have the same problem, and after 800 epochs my loss is NaN.

@Zhengfei-0311

I am getting the same error too.
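
For reference, the RuntimeError quoted in this thread is what PyTorch raises when a layer's weights are on the GPU but the tensor passed to it is still on the CPU. The apex warning "An input tensor was not cuda." in the log points the same way. The traceback shows the backbone being called with a tuple, `(meta_data, 1)`, so a plain `.cuda()` on the outer object would not be enough if the input is nested. Below is a minimal sketch of one common workaround, not DCNet's own code: `move_to_device` is a hypothetical helper name, and it treats anything with a `.to()` method as a tensor.

```python
def move_to_device(obj, device):
    """Recursively move every tensor-like object inside obj to device.

    "Tensor-like" means anything with a callable .to() method, which
    includes torch.Tensor. Other leaves (ints, strings, None) are
    returned unchanged. This helper is hypothetical, not part of DCNet.
    """
    if callable(getattr(obj, "to", None)):
        return obj.to(device)
    if isinstance(obj, dict):
        return {key: move_to_device(value, device) for key, value in obj.items()}
    if isinstance(obj, tuple):
        return tuple(move_to_device(item, device) for item in obj)
    if isinstance(obj, list):
        return [move_to_device(item, device) for item in obj]
    return obj
```

One place to try it, as an assumption rather than a confirmed fix, is just before the `model(images, targets, meta_input, meta_label, average_shot=True)` call in `trainer.py`, for example `meta_input = move_to_device(meta_input, "cuda")`. Whether `meta_input` is really the tensor left on the CPU in this run is not confirmed by the thread.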
