
Great job on this project! When I run bash exp.sh, training does not start and the following error is reported. #2

Open
miaomiaocoder opened this issue Nov 10, 2023 · 3 comments

@miaomiaocoder

You should probably TRAIN this model on a down-stream task to be able to use it for predictions and inference.
/root/autodl-tmp/sum/Multimodal-Summarization/oscar.py:231: UserWarning: To copy construct from a tensor, it is recommended to use sourceTensor.clone().detach() or sourceTensor.clone().detach().requires_grad_(True), rather than torch.tensor(sourceTensor).
  input_ids = torch.tensor(sent[:,0,:].clone().detach(),dtype=torch.long).cuda()
/root/autodl-tmp/sum/Multimodal-Summarization/oscar.py:232: UserWarning: To copy construct from a tensor, it is recommended to use sourceTensor.clone().detach() or sourceTensor.clone().detach().requires_grad_(True), rather than torch.tensor(sourceTensor).
  input_mask = torch.tensor(sent[:,1,:].clone().detach(),dtype=torch.long).cuda()
/root/autodl-tmp/sum/Multimodal-Summarization/oscar.py:233: UserWarning: To copy construct from a tensor, it is recommended to use sourceTensor.clone().detach() or sourceTensor.clone().detach().requires_grad_(True), rather than torch.tensor(sourceTensor).
  segment_ids = torch.tensor(sent[:,2,:].clone().detach(),dtype=torch.long).cuda()
/root/autodl-tmp/sum/Multimodal-Summarization/oscar.py:235: UserWarning: To copy construct from a tensor, it is recommended to use sourceTensor.clone().detach() or sourceTensor.clone().detach().requires_grad_(True), rather than torch.tensor(sourceTensor).
  decoder_input_ids = torch.tensor(title[:,0,:].clone().detach(),dtype=torch.long).cuda()
/root/autodl-tmp/sum/Multimodal-Summarization/oscar.py:236: UserWarning: To copy construct from a tensor, it is recommended to use sourceTensor.clone().detach() or sourceTensor.clone().detach().requires_grad_(True), rather than torch.tensor(sourceTensor).
  decoder_input_mask = torch.tensor(title[:,1,:].clone().detach(),dtype=torch.long).cuda()
/root/autodl-tmp/sum/Multimodal-Summarization/oscar.py:237: UserWarning: To copy construct from a tensor, it is recommended to use sourceTensor.clone().detach() or sourceTensor.clone().detach().requires_grad_(True), rather than torch.tensor(sourceTensor).
  decoder_segment_ids = torch.tensor(title[:,2,:].clone().detach(),dtype=torch.long).cuda()
[unused193] [unused193] [unused193] [unused193] [unused193] [unused193] [unused193] [unused193] [unused193] [unused193] [unused193] [unused193] [unused193] [unused193] [unused193] [unused193] [unused193] [unused193] [unused193] [unused193] [unused193] [unused193] [unused193] [unused215] [unused193] [unused193] [unused193] [unused193] [unused215] [unused193] [unused193] [unused193]

Then the program stopped running.

@miaomiaocoder
Author

        # input_ids = torch.tensor(sent[:,0,:].clone().detach(),dtype=torch.long).cuda()
        # input_mask = torch.tensor(sent[:,1,:].clone().detach(),dtype=torch.long).cuda()
        # segment_ids = torch.tensor(sent[:,2,:].clone().detach(),dtype=torch.long).cuda()

        # decoder_input_ids = torch.tensor(title[:,0,:].clone().detach(),dtype=torch.long).cuda()
        # decoder_input_mask = torch.tensor(title[:,1,:].clone().detach(),dtype=torch.long).cuda()
        # decoder_segment_ids = torch.tensor(title[:,2,:].clone().detach(),dtype=torch.long).cuda()
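
        # Rewritten per the warning's suggestion: call clone().detach() on the
        # existing tensor instead of copy-constructing it with torch.tensor().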

        input_ids = sent[:, 0, :].clone().detach().to(torch.long).cuda()
        input_mask = sent[:, 1, :].clone().detach().to(torch.long).cuda()
        segment_ids = sent[:, 2, :].clone().detach().to(torch.long).cuda()

        decoder_input_ids = title[:, 0, :].clone().detach().to(torch.long).cuda()
        decoder_input_mask = title[:, 1, :].clone().detach().to(torch.long).cuda()
        decoder_segment_ids = title[:, 2, :].clone().detach().to(torch.long).cuda()

Even with the warnings removed by this change, the program still stopped running.

Start to load Faster-RCNN detected objects from data/hm_vgattr5050.tsv
Loaded 20 images in file data/hm_vgattr5050.tsv in 0 seconds.
Use 20 data in torch dataset

Some weights of BertO were not initialized from the model checkpoint at bert-base-uncased and are newly initialized: ['bert.img_embedding.weight', 'bert.img_embedding.bias']
You should probably TRAIN this model on a down-stream task to be able to use it for predictions and inference.
UNEXPECTED:  []
MISSING:  ['bert.img_embedding.weight', 'bert.img_embedding.bias']
ERRORS:  []
Some weights of GPT2LMHeadModel were not initialized from the model checkpoint at gpt2 and are newly initialized: ['h.0.crossattention.bias', 'h.0.crossattention.masked_bias', 'h.0.crossattention.c_attn.weight', 'h.0.crossattention.c_attn.bias', 'h.0.crossattention.q_attn.weight', 'h.0.crossattention.q_attn.bias', 'h.0.crossattention.c_proj.weight', 'h.0.crossattention.c_proj.bias', 'h.0.ln_cross_attn.weight', 'h.0.ln_cross_attn.bias', 'h.1.crossattention.bias', 'h.1.crossattention.masked_bias', 'h.1.crossattention.c_attn.weight', 'h.1.crossattention.c_attn.bias', 'h.1.crossattention.q_attn.weight', 'h.1.crossattention.q_attn.bias', 'h.1.crossattention.c_proj.weight', 'h.1.crossattention.c_proj.bias', 'h.1.ln_cross_attn.weight', 'h.1.ln_cross_attn.bias', 'h.2.crossattention.bias', 'h.2.crossattention.masked_bias', 'h.2.crossattention.c_attn.weight', 'h.2.crossattention.c_attn.bias', 'h.2.crossattention.q_attn.weight', 'h.2.crossattention.q_attn.bias', 'h.2.crossattention.c_proj.weight', 'h.2.crossattention.c_proj.bias', 'h.2.ln_cross_attn.weight', 'h.2.ln_cross_attn.bias', 'h.3.crossattention.bias', 'h.3.crossattention.masked_bias', 'h.3.crossattention.c_attn.weight', 'h.3.crossattention.c_attn.bias', 'h.3.crossattention.q_attn.weight', 'h.3.crossattention.q_attn.bias', 'h.3.crossattention.c_proj.weight', 'h.3.crossattention.c_proj.bias', 'h.3.ln_cross_attn.weight', 'h.3.ln_cross_attn.bias', 'h.4.crossattention.bias', 'h.4.crossattention.masked_bias', 'h.4.crossattention.c_attn.weight', 'h.4.crossattention.c_attn.bias', 'h.4.crossattention.q_attn.weight', 'h.4.crossattention.q_attn.bias', 'h.4.crossattention.c_proj.weight', 'h.4.crossattention.c_proj.bias', 'h.4.ln_cross_attn.weight', 'h.4.ln_cross_attn.bias', 'h.5.crossattention.bias', 'h.5.crossattention.masked_bias', 'h.5.crossattention.c_attn.weight', 'h.5.crossattention.c_attn.bias', 'h.5.crossattention.q_attn.weight', 'h.5.crossattention.q_attn.bias', 'h.5.crossattention.c_proj.weight', 'h.5.crossattention.c_proj.bias', 'h.5.ln_cross_attn.weight', 'h.5.ln_cross_attn.bias', 'h.6.crossattention.bias', 'h.6.crossattention.masked_bias', 'h.6.crossattention.c_attn.weight', 'h.6.crossattention.c_attn.bias', 'h.6.crossattention.q_attn.weight', 'h.6.crossattention.q_attn.bias', 'h.6.crossattention.c_proj.weight', 'h.6.crossattention.c_proj.bias', 'h.6.ln_cross_attn.weight', 'h.6.ln_cross_attn.bias', 'h.7.crossattention.bias', 'h.7.crossattention.masked_bias', 'h.7.crossattention.c_attn.weight', 'h.7.crossattention.c_attn.bias', 'h.7.crossattention.q_attn.weight', 'h.7.crossattention.q_attn.bias', 'h.7.crossattention.c_proj.weight', 'h.7.crossattention.c_proj.bias', 'h.7.ln_cross_attn.weight', 'h.7.ln_cross_attn.bias', 'h.8.crossattention.bias', 'h.8.crossattention.masked_bias', 'h.8.crossattention.c_attn.weight', 'h.8.crossattention.c_attn.bias', 'h.8.crossattention.q_attn.weight', 'h.8.crossattention.q_attn.bias', 'h.8.crossattention.c_proj.weight', 'h.8.crossattention.c_proj.bias', 'h.8.ln_cross_attn.weight', 'h.8.ln_cross_attn.bias', 'h.9.crossattention.bias', 'h.9.crossattention.masked_bias', 'h.9.crossattention.c_attn.weight', 'h.9.crossattention.c_attn.bias', 'h.9.crossattention.q_attn.weight', 'h.9.crossattention.q_attn.bias', 'h.9.crossattention.c_proj.weight', 'h.9.crossattention.c_proj.bias', 'h.9.ln_cross_attn.weight', 'h.9.ln_cross_attn.bias', 'h.10.crossattention.bias', 'h.10.crossattention.masked_bias', 'h.10.crossattention.c_attn.weight', 'h.10.crossattention.c_attn.bias', 'h.10.crossattention.q_attn.weight', 
'h.10.crossattention.q_attn.bias', 'h.10.crossattention.c_proj.weight', 'h.10.crossattention.c_proj.bias', 'h.10.ln_cross_attn.weight', 'h.10.ln_cross_attn.bias', 'h.11.crossattention.bias', 'h.11.crossattention.masked_bias', 'h.11.crossattention.c_attn.weight', 'h.11.crossattention.c_attn.bias', 'h.11.crossattention.q_attn.weight', 'h.11.crossattention.q_attn.bias', 'h.11.crossattention.c_proj.weight', 'h.11.crossattention.c_proj.bias', 'h.11.ln_cross_attn.weight', 'h.11.ln_cross_attn.bias']
You should probably TRAIN this model on a down-stream task to be able to use it for predictions and inference.
[unused193] [unused193] [unused82] [unused193] [unused193] [unused193] [unused352] [unused193] [unused193] [unused193] [unused193] [unused215] [unused352] [unused92] [unused82] [unused193] [unused193] [unused193] [unused193] [unused193] [unused193] [unused193] [unused193] [unused193] [unused352] [unused193] [unused193] [unused352] [unused193] [unused193] [unused193] [unused193]

Looking forward to your help.

@darthgera123
Owner

Hi, thanks for using the repo. There seems to be a version mismatch of some kind, as the weights are not being loaded properly. You can adapt the code to work with newer versions of the models.
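
For example, with a recent transformers release you can load the GPT-2 decoder with cross-attention enabled explicitly. This is a minimal sketch, assuming the same gpt2 checkpoint as in the log above; the h.*.crossattention.* weights reported as newly initialized are created by this flag and start untrained, which is expected until fine-tuning:

    from transformers import GPT2Config, GPT2LMHeadModel

    # add_cross_attention=True creates the h.*.crossattention.* parameters
    # that the log reports as newly initialized; they are trained from
    # scratch on the summarization data.
    config = GPT2Config.from_pretrained("gpt2", add_cross_attention=True)
    decoder = GPT2LMHeadModel.from_pretrained("gpt2", config=config)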

@miaomiaocoder
Author

Sorry, I still have this problem. Could you tell me which transformers version you used? I keep hitting this error; the environment I used was transformers=3.5. Thank you very much for your help!
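
For reference, here is a minimal check of the environment (a sketch that just prints the installed library versions):

    import torch
    import transformers

    # Print the installed versions so the environments can be compared.
    print("transformers:", transformers.__version__)
    print("torch:", torch.__version__)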

(vilio) root@autodl-container-30da119afa-1465d5de:~/autodl-tmp/sum/Multimodal-Summarization# python experiment.py --seed 42 --model O --train train --valid dev_seen --test dev_seen --lr 1e-5 --batchSize 4 --tr bert-base-uncased --epochs 5 --tsv --num_features 36  --contrib --exp O36 --topk 20
Start to load Faster-RCNN detected objects from data/hm_vgattr3636.tsv
Loaded 20 images in file data/hm_vgattr3636.tsv in 0 seconds.
Use 20 data in torch dataset

Some weights of BertO were not initialized from the model checkpoint at bert-base-uncased and are newly initialized: ['bert.img_embedding.weight', 'bert.img_embedding.bias']
You should probably TRAIN this model on a down-stream task to be able to use it for predictions and inference.
UNEXPECTED:  []
MISSING:  ['bert.img_embedding.weight', 'bert.img_embedding.bias']
ERRORS:  []
Some weights of GPT2LMHeadModel were not initialized from the model checkpoint at gpt2 and are newly initialized: ['h.0.crossattention.bias', 'h.0.crossattention.masked_bias', 'h.0.crossattention.c_attn.weight', 'h.0.crossattention.c_attn.bias', 'h.0.crossattention.q_attn.weight', 'h.0.crossattention.q_attn.bias', 'h.0.crossattention.c_proj.weight', 'h.0.crossattention.c_proj.bias', 'h.0.ln_cross_attn.weight', 'h.0.ln_cross_attn.bias', 'h.1.crossattention.bias', 'h.1.crossattention.masked_bias', 'h.1.crossattention.c_attn.weight', 'h.1.crossattention.c_attn.bias', 'h.1.crossattention.q_attn.weight', 'h.1.crossattention.q_attn.bias', 'h.1.crossattention.c_proj.weight', 'h.1.crossattention.c_proj.bias', 'h.1.ln_cross_attn.weight', 'h.1.ln_cross_attn.bias', 'h.2.crossattention.bias', 'h.2.crossattention.masked_bias', 'h.2.crossattention.c_attn.weight', 'h.2.crossattention.c_attn.bias', 'h.2.crossattention.q_attn.weight', 'h.2.crossattention.q_attn.bias', 'h.2.crossattention.c_proj.weight', 'h.2.crossattention.c_proj.bias', 'h.2.ln_cross_attn.weight', 'h.2.ln_cross_attn.bias', 'h.3.crossattention.bias', 'h.3.crossattention.masked_bias', 'h.3.crossattention.c_attn.weight', 'h.3.crossattention.c_attn.bias', 'h.3.crossattention.q_attn.weight', 'h.3.crossattention.q_attn.bias', 'h.3.crossattention.c_proj.weight', 'h.3.crossattention.c_proj.bias', 'h.3.ln_cross_attn.weight', 'h.3.ln_cross_attn.bias', 'h.4.crossattention.bias', 'h.4.crossattention.masked_bias', 'h.4.crossattention.c_attn.weight', 'h.4.crossattention.c_attn.bias', 'h.4.crossattention.q_attn.weight', 'h.4.crossattention.q_attn.bias', 'h.4.crossattention.c_proj.weight', 'h.4.crossattention.c_proj.bias', 'h.4.ln_cross_attn.weight', 'h.4.ln_cross_attn.bias', 'h.5.crossattention.bias', 'h.5.crossattention.masked_bias', 'h.5.crossattention.c_attn.weight', 'h.5.crossattention.c_attn.bias', 'h.5.crossattention.q_attn.weight', 'h.5.crossattention.q_attn.bias', 'h.5.crossattention.c_proj.weight', 'h.5.crossattention.c_proj.bias', 'h.5.ln_cross_attn.weight', 'h.5.ln_cross_attn.bias', 'h.6.crossattention.bias', 'h.6.crossattention.masked_bias', 'h.6.crossattention.c_attn.weight', 'h.6.crossattention.c_attn.bias', 'h.6.crossattention.q_attn.weight', 'h.6.crossattention.q_attn.bias', 'h.6.crossattention.c_proj.weight', 'h.6.crossattention.c_proj.bias', 'h.6.ln_cross_attn.weight', 'h.6.ln_cross_attn.bias', 'h.7.crossattention.bias', 'h.7.crossattention.masked_bias', 'h.7.crossattention.c_attn.weight', 'h.7.crossattention.c_attn.bias', 'h.7.crossattention.q_attn.weight', 'h.7.crossattention.q_attn.bias', 'h.7.crossattention.c_proj.weight', 'h.7.crossattention.c_proj.bias', 'h.7.ln_cross_attn.weight', 'h.7.ln_cross_attn.bias', 'h.8.crossattention.bias', 'h.8.crossattention.masked_bias', 'h.8.crossattention.c_attn.weight', 'h.8.crossattention.c_attn.bias', 'h.8.crossattention.q_attn.weight', 'h.8.crossattention.q_attn.bias', 'h.8.crossattention.c_proj.weight', 'h.8.crossattention.c_proj.bias', 'h.8.ln_cross_attn.weight', 'h.8.ln_cross_attn.bias', 'h.9.crossattention.bias', 'h.9.crossattention.masked_bias', 'h.9.crossattention.c_attn.weight', 'h.9.crossattention.c_attn.bias', 'h.9.crossattention.q_attn.weight', 'h.9.crossattention.q_attn.bias', 'h.9.crossattention.c_proj.weight', 'h.9.crossattention.c_proj.bias', 'h.9.ln_cross_attn.weight', 'h.9.ln_cross_attn.bias', 'h.10.crossattention.bias', 'h.10.crossattention.masked_bias', 'h.10.crossattention.c_attn.weight', 'h.10.crossattention.c_attn.bias', 'h.10.crossattention.q_attn.weight', 
'h.10.crossattention.q_attn.bias', 'h.10.crossattention.c_proj.weight', 'h.10.crossattention.c_proj.bias', 'h.10.ln_cross_attn.weight', 'h.10.ln_cross_attn.bias', 'h.11.crossattention.bias', 'h.11.crossattention.masked_bias', 'h.11.crossattention.c_attn.weight', 'h.11.crossattention.c_attn.bias', 'h.11.crossattention.q_attn.weight', 'h.11.crossattention.q_attn.bias', 'h.11.crossattention.c_proj.weight', 'h.11.crossattention.c_proj.bias', 'h.11.ln_cross_attn.weight', 'h.11.ln_cross_attn.bias']
You should probably TRAIN this model on a down-stream task to be able to use it for predictions and inference.
/root/autodl-tmp/sum/Multimodal-Summarization/oscar.py:231: UserWarning: To copy construct from a tensor, it is recommended to use sourceTensor.clone().detach() or sourceTensor.clone().detach().requires_grad_(True), rather than torch.tensor(sourceTensor).
  input_ids = torch.tensor(sent[:,0,:].clone().detach(),dtype=torch.long).cuda()
/root/autodl-tmp/sum/Multimodal-Summarization/oscar.py:232: UserWarning: To copy construct from a tensor, it is recommended to use sourceTensor.clone().detach() or sourceTensor.clone().detach().requires_grad_(True), rather than torch.tensor(sourceTensor).
  input_mask = torch.tensor(sent[:,1,:].clone().detach(),dtype=torch.long).cuda()
/root/autodl-tmp/sum/Multimodal-Summarization/oscar.py:233: UserWarning: To copy construct from a tensor, it is recommended to use sourceTensor.clone().detach() or sourceTensor.clone().detach().requires_grad_(True), rather than torch.tensor(sourceTensor).
  segment_ids = torch.tensor(sent[:,2,:].clone().detach(),dtype=torch.long).cuda()
/root/autodl-tmp/sum/Multimodal-Summarization/oscar.py:235: UserWarning: To copy construct from a tensor, it is recommended to use sourceTensor.clone().detach() or sourceTensor.clone().detach().requires_grad_(True), rather than torch.tensor(sourceTensor).
  decoder_input_ids = torch.tensor(title[:,0,:].clone().detach(),dtype=torch.long).cuda()
/root/autodl-tmp/sum/Multimodal-Summarization/oscar.py:236: UserWarning: To copy construct from a tensor, it is recommended to use sourceTensor.clone().detach() or sourceTensor.clone().detach().requires_grad_(True), rather than torch.tensor(sourceTensor).
  decoder_input_mask = torch.tensor(title[:,1,:].clone().detach(),dtype=torch.long).cuda()
/root/autodl-tmp/sum/Multimodal-Summarization/oscar.py:237: UserWarning: To copy construct from a tensor, it is recommended to use sourceTensor.clone().detach() or sourceTensor.clone().detach().requires_grad_(True), rather than torch.tensor(sourceTensor).
  decoder_segment_ids = torch.tensor(title[:,2,:].clone().detach(),dtype=torch.long).cuda()
[unused361] [unused361] [unused361] [unused361] [unused361] [unused361] [unused361] [unused361] [unused361] [unused361] [unused361] [unused361] [unused361] [unused361] [unused361] [unused361] [unused361] [unused361] [unused361] [unused361] [unused361] [unused361] [unused361] [unused361] [unused361] [unused361] [unused361] [unused361] [unused361] [unused361] [unused361] [unused361]
