    log_ratio = (logprobs - old_logprobs) * mask
    ratio = torch.exp(log_ratio.float())

Neither `logprobs` nor `old_logprobs` is guaranteed to be the larger one, so `log_ratio` here can be either positive or negative, right?

    pg_loss1 = -advantages * ratio
    pg_loss2 = -advantages * torch.clamp(
        ratio,
        1.0 - self.args.cliprange,
        1.0 + self.args.cliprange,
    )

So `pg_loss1` here can also be either positive or negative, right?
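
To check the signs concretely, here is a minimal scalar sketch of the two snippets quoted in this issue. It uses `math` instead of `torch` so it runs without dependencies. The function name `ppo_terms`, the default `cliprange=0.2`, and the sample values are assumptions made for illustration and do not come from the repository. It suggests that `log_ratio` can indeed take either sign, while `ratio = exp(log_ratio)` is always positive, so the sign of `pg_loss1` follows the sign of `advantages` rather than the sign of `log_ratio`.

```python
import math


def ppo_terms(logprob, old_logprob, advantage, cliprange=0.2, mask=1.0):
    """Scalar stand-in for the tensor code quoted above (hypothetical helper)."""
    # Either sign is possible: it depends on which log-probability is larger.
    log_ratio = (logprob - old_logprob) * mask
    # exp() of any real number is strictly positive.
    ratio = math.exp(log_ratio)
    # Unclipped term: its sign is the opposite of the advantage's sign.
    pg_loss1 = -advantage * ratio
    # Scalar equivalent of clamping ratio to [1 - cliprange, 1 + cliprange].
    clipped_ratio = min(max(ratio, 1.0 - cliprange), 1.0 + cliprange)
    pg_loss2 = -advantage * clipped_ratio
    return log_ratio, ratio, pg_loss1, pg_loss2


# New log-prob lower than the old one, positive advantage (assumed values).
case_a = ppo_terms(logprob=-1.0, old_logprob=-0.5, advantage=2.0)
# New log-prob higher than the old one, negative advantage (assumed values).
case_b = ppo_terms(logprob=-0.5, old_logprob=-1.0, advantage=-2.0)

for name, (log_ratio, ratio, pg_loss1, pg_loss2) in (("a", case_a), ("b", case_b)):
    print(name, log_ratio, ratio, pg_loss1, pg_loss2)
```

In case `a` the log ratio is negative and `pg_loss1` is negative; in case `b` the log ratio is positive and `pg_loss1` is positive. In both cases `ratio` stays above zero, so flipping only the advantage would flip the sign of `pg_loss1`.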