Size mismatches when specifying a different transformer_backbone other than flan-t5-large #52

Open
julian-fong opened this issue Sep 11, 2024 · 0 comments


julian-fong commented Sep 11, 2024

When I specify a transformer_backbone other than google/flan-t5-large, I get these errors:

size mismatch for encoder.block.11.layer.1.DenseReluDense.wo.weight: copying a param with shape torch.Size([1024, 2816]) from checkpoint, the shape in current model is torch.Size([768, 2048]).
size mismatch for encoder.block.11.layer.1.layer_norm.weight: copying a param with shape torch.Size([1024]) from checkpoint, the shape in current model is torch.Size([768]).
size mismatch for encoder.final_layer_norm.weight: copying a param with shape torch.Size([1024]) from checkpoint, the shape in current model is torch.Size([768]).
size mismatch for head.linear.weight: copying a param with shape torch.Size([8, 1024]) from checkpoint, the shape in current model is torch.Size([8, 768]).

How do I fix these errors to load and train the model properly?
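For context, these are standard PyTorch load_state_dict shape errors: the checkpoint holds flan-t5-large-sized tensors (d_model=1024, d_ff=2816) while the freshly built model uses smaller dimensions (768, 2048). A minimal sketch reproducing the same class of mismatch outside this repo, assuming the checkpoint came from google/flan-t5-large and the current model is built from google/flan-t5-base:

```python
from transformers import T5Config, T5EncoderModel

# Hypothetical reproduction (not this repo's own code): build one encoder with
# flan-t5-large dimensions and one with flan-t5-base dimensions.
large_cfg = T5Config.from_pretrained("google/flan-t5-large")  # d_model=1024, d_ff=2816
base_cfg = T5Config.from_pretrained("google/flan-t5-base")    # d_model=768,  d_ff=2048

large_encoder = T5EncoderModel(large_cfg)
base_encoder = T5EncoderModel(base_cfg)

# load_state_dict checks tensor shapes, so copying large-sized weights into the
# smaller module raises a RuntimeError listing size mismatches like the ones above
# (plus missing keys, since the two configs also differ in layer count).
base_encoder.load_state_dict(large_encoder.state_dict())
```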
