Hi, first of all, thanks for your awesome code. This is not a technical issue but a question about the algorithm in the DDPG code.
As far as I know, DDPG can update its parameters online, at every environment step, because it uses TD learning. In your code, however, the parameters are updated only after an episode is over.
Is there any theoretical background behind this parameter update interval?
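To make the question concrete, the two update schedules can be sketched roughly as below. This is only an illustration under stated assumptions: the function `run`, its `schedule` argument, the toy one-dimensional environment, and the placeholder `update` are all hypothetical and are not taken from the repository. In particular, doing one update per collected step at the end of each episode is an assumption about what a per-episode schedule might look like.

```python
import random
from collections import deque


def run(schedule, episodes=3, steps_per_episode=5, batch_size=2, seed=0):
    """Return the replay-buffer size observed at each update call.

    schedule: "per_step" updates after every transition (online TD style);
              "per_episode" defers all updates until the episode has ended.
    """
    rng = random.Random(seed)
    buffer = deque(maxlen=1000)  # replay buffer of (s, a, r, s_next, done)
    update_log = []

    def update():
        # Placeholder for one critic/actor gradient step on a minibatch.
        # A real DDPG update would compute TD targets from this batch.
        batch = rng.sample(list(buffer), batch_size)
        update_log.append(len(buffer))
        return batch

    for _ in range(episodes):
        state = 0.0
        for t in range(steps_per_episode):
            # Toy dynamics (assumed): the action shifts the state directly.
            action = rng.uniform(-1.0, 1.0)
            next_state = state + action
            reward = -abs(next_state)
            done = t == steps_per_episode - 1
            buffer.append((state, action, reward, next_state, done))
            state = next_state

            if schedule == "per_step" and len(buffer) >= batch_size:
                update()

        if schedule == "per_episode" and len(buffer) >= batch_size:
            # Assumed variant: as many updates as steps just collected.
            for _ in range(steps_per_episode):
                update()

    return update_log


if __name__ == "__main__":
    print("per_step   :", run("per_step"))
    print("per_episode:", run("per_episode"))
```

Both schedules sample minibatches from the same kind of replay buffer; in this sketch they differ only in when the update calls happen relative to data collection.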
Thank you in advance.