What does it mean when the training loss is nan.0? #17
Comments
Is your loss nan from the very start, or does it turn into nan partway through training? Try adamw.
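To answer the "from the start or partway through" question, it helps to log the per-step loss and find the first step where it stops being finite. A minimal sketch, assuming you can collect the losses as plain Python floats; `first_nonfinite_step` is a hypothetical helper, not something from this repo:

```python
import math
from typing import Iterable, Optional


def first_nonfinite_step(losses: Iterable[float]) -> Optional[int]:
    """Return the index of the first nan/inf loss, or None if all are finite.

    Hypothetical helper for illustration; not part of this project.
    """
    for step, loss in enumerate(losses):
        # math.isfinite is False for nan, +inf and -inf.
        if not math.isfinite(loss):
            return step
    return None
```

If this returns 0, the loss is already bad on the first batch, which points more toward precision or weight loading than toward the learning rate or a diverging run.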
OK. It is nan from the very start. I'll try adamw. Also, I can't find the ./zero_to_fp32.py file anywhere. Where is it?
With lora you no longer need to convert the weights. Conversion is only needed for full-parameter training with deepspeed enabled.
I'm using ptv2. Looking at your code, the int4 model doesn't seem to support lora.
ptv2 weights don't need to be converted either. Try changing the trainer precision to 32.
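For context on why raising the precision to 32 is worth trying: one common source of nan under 16-bit training is overflow, since fp16 cannot represent anything above 65504. This is only a guess at the cause here, not a diagnosis of this issue. A rough stdlib-only sketch follows; `to_fp16` is a hypothetical helper that emulates fp16 rounding via `struct`, and no training framework is involved:

```python
import math
import struct


def to_fp16(x: float) -> float:
    """Round x to the nearest IEEE 754 half-precision value (emulated).

    Hypothetical helper for illustration only.
    """
    try:
        # The "e" format code packs a float as IEEE 754 binary16.
        return struct.unpack("<e", struct.pack("<e", x))[0]
    except OverflowError:
        # struct refuses out-of-range values; real fp16 arithmetic
        # would saturate to +/-inf instead, so mimic that here.
        return math.copysign(math.inf, x)


largest = to_fp16(65504.0)       # largest finite fp16 value
overflowed = to_fp16(70000.0)    # outside the fp16 range
diff = overflowed - overflowed   # inf - inf has no defined value
print(largest, overflowed, diff)
```

Once an intermediate value becomes inf, a later subtraction or normalization can turn it into nan, and the loss follows. In 32-bit precision the same value of 70000.0 is represented without trouble.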
OK, I'll give it a try.
Is there a solution to this problem? I'm also getting a nan loss, and after setting precision to 32 I get an error: The config files I modified include: sft_config_ptv2.py: