Training bloom with tensor_parallelize saves only the sharded parameters #3247
Unanswered
yangjianxin1 asked this question in Community | Q&A
Replies: 2 comments 1 reply
-
hi~ @yangjianxin1 When you enabled the tensor_parallelize strategy, did you also use a LoRA config? I'm not sure whether LoRA supports tensor-parallel training.
1 reply
-
Hi @yangjianxin1 @taishiciR We will update this part next week. #3339 Thanks.
0 replies
-
When training bloom on two GPUs with the tensor_parallelize strategy, calling colossalai.utils.save_checkpoint saves only one half of the model weights (the shard held by the local rank); the complete model weights cannot be saved.
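The behavior described above happens because under tensor parallelism each rank only holds a slice of every partitioned weight, so saving from one rank writes just that slice. A common workaround is to gather all slices (e.g. with torch.distributed.all_gather) and concatenate them along the split dimension before saving. Below is a minimal single-process sketch of that merge step; `merge_tp_shards` is a hypothetical helper written for illustration, not a ColossalAI API, and the shards are represented as plain Python lists to keep the example self-contained.

```python
def merge_tp_shards(rank_state_dicts, partitioned_keys):
    """Combine per-rank tensor-parallel shards into one full state dict.

    rank_state_dicts: list of {param_name: list-of-values}, one per rank,
                      ordered by tensor-parallel rank.
    partitioned_keys: names of parameters that were split across ranks;
                      all other parameters are replicated, so the copy
                      held by rank 0 is already complete.
    """
    full = {}
    for name, values in rank_state_dicts[0].items():
        if name in partitioned_keys:
            merged = []
            for sd in rank_state_dicts:
                # Concatenate shards in rank order along the split dim.
                merged.extend(sd[name])
            full[name] = merged
        else:
            full[name] = values  # replicated: identical on every rank
    return full


# Example: a weight split in half across 2 tensor-parallel ranks.
rank0 = {"linear.weight": [1.0, 2.0], "norm.bias": [0.5]}
rank1 = {"linear.weight": [3.0, 4.0], "norm.bias": [0.5]}
print(merge_tp_shards([rank0, rank1], {"linear.weight"}))
# → {'linear.weight': [1.0, 2.0, 3.0, 4.0], 'norm.bias': [0.5]}
```

In a real run, each rank would contribute its slice of the state dict via an all-gather, and only rank 0 would call the save routine on the merged result.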