Support for Zero3 or Zero3 Offload? Error when loading model state_dict #38

Z-MU-Z · 2024-08-09T08:27:05Z

Hello,

I encountered an error while trying to load a model using the following code in [clip_encoder.py]:

self.vision_tower.load_state_dict(torch.load(self.clip_model), strict=False)

The error message is as follows:
[rank0]: raise RuntimeError('Error(s) in loading state_dict for {}:\n\t{}'.format(
[rank0]: RuntimeError: Error(s) in loading state_dict for CLIP:
[rank0]: size mismatch for visual.trunk.stem.0.weight: copying a param with shape torch.Size([192, 3, 4, 4]) from checkpoint, the shape in current model is torch.Size([0])

This happens only when I use scripts/zero3_offload.json or scripts/zero3.json.


LiWentomng · 2024-08-09T09:20:11Z

@Z-MU-Z
Hello, our code does not currently support Zero3 for model training, and we still face some unresolved issues with it. I recommend using Zero2 for now. Contributions from the community are welcome.
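As a rough illustration of the recommendation above, a launch script could check the ZeRO stage before training starts. This is only a sketch: it assumes the config files follow the standard DeepSpeed layout, where the stage is stored under `zero_optimization.stage`, and the helper names `zero_stage` and `check_zero_config` are hypothetical, not part of this repository.

```python
import json


def zero_stage(config: dict) -> int:
    """Return the ZeRO stage from a DeepSpeed-style config dict (0 if absent)."""
    return int(config.get("zero_optimization", {}).get("stage", 0))


def check_zero_config(config: dict, max_supported_stage: int = 2) -> None:
    """Raise ValueError if the config requests a ZeRO stage above the supported one.

    max_supported_stage defaults to 2, following the advice in this thread.
    """
    stage = zero_stage(config)
    if stage > max_supported_stage:
        raise ValueError(
            f"ZeRO stage {stage} is not supported; "
            f"use stage {max_supported_stage} or lower."
        )


def check_zero_config_file(path: str, max_supported_stage: int = 2) -> None:
    """Load a JSON config from disk and run the same check."""
    with open(path, "r", encoding="utf-8") as f:
        check_zero_config(json.load(f), max_supported_stage)


if __name__ == "__main__":
    # Inline example config; a real one would come from a file under scripts/.
    example = json.loads('{"zero_optimization": {"stage": 3}}')
    try:
        check_zero_config(example)
    except ValueError as err:
        print(err)
```

Failing early like this would surface the unsupported setup before `load_state_dict` reaches the partitioned parameters that show up as `torch.Size([0])` in the error above.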
