Question about the Actor part of SAC_Continuous #9

Open
lgmtxl opened this issue Aug 29, 2024 · 0 comments
lgmtxl commented Aug 29, 2024

In this line of your code:

logp_pi_a = dist.log_prob(u).sum(axis=1, keepdim=True) - (2 * (np.log(2) - u - F.softplus(-2 * u))).sum(axis=1, keepdim=True)

Why is the first term, dist.log_prob(u), summed over dim=1? According to the original formula, this term does not appear to be a sum.
Please correct me if I am wrong!
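
One possible reading, offered as a sketch rather than the maintainer's answer: if `dist` is a per-dimension `Normal` (a diagonal Gaussian, which is an assumption here, since the issue does not show how `dist` is built), then `dist.log_prob(u)` returns one log-density per action dimension, and the log-density of the whole action vector is the sum of those over the action axis. The pure-Python check below uses hypothetical numbers for a single 3-dimensional action, with plain lists standing in for tensors. It compares the per-dimension sum with the joint diagonal-Gaussian formula, and also checks that the second term in the quoted line equals the usual tanh correction `log(1 - tanh(u)^2)`.

```python
import math

def normal_log_prob(u, mu, std):
    """Log-density of a 1-D Normal(mu, std) at u (one element of a per-dimension log_prob)."""
    return -0.5 * ((u - mu) / std) ** 2 - math.log(std) - 0.5 * math.log(2 * math.pi)

def softplus(x):
    """Numerically stable log(1 + exp(x))."""
    return max(x, 0.0) + math.log1p(math.exp(-abs(x)))

def tanh_correction(u):
    """Per-dimension correction term, written as in the quoted line of code."""
    return 2 * (math.log(2) - u - softplus(-2 * u))

# Hypothetical values: one sample, 3 action dimensions (not taken from the repository).
mu = [0.1, -0.4, 0.7]
std = [0.5, 1.2, 0.8]
u = [0.3, -1.0, 0.2]

# Per-dimension log-densities, then the sum over the action axis (the dim=1 sum).
per_dim = [normal_log_prob(ui, mi, si) for ui, mi, si in zip(u, mu, std)]
log_prob_sum = sum(per_dim)

# Joint log-density of a Gaussian with diagonal covariance, computed directly.
k = len(u)
log_det = sum(math.log(si ** 2) for si in std)
maha = sum(((ui - mi) / si) ** 2 for ui, mi, si in zip(u, mu, std))
joint_log_prob = -0.5 * (maha + log_det + k * math.log(2 * math.pi))

# The correction term is also per-dimension and is summed the same way.
corrections = [tanh_correction(ui) for ui in u]
logp_pi_a = log_prob_sum - sum(corrections)

print(math.isclose(log_prob_sum, joint_log_prob))
print(all(math.isclose(c, math.log(1 - math.tanh(ui) ** 2)) for c, ui in zip(corrections, u)))
```

Under that assumption, the formula for the policy's log-probability is written for the full action vector, so the sum over dimensions is implicit in the formula and explicit in the code.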
