Hello, I am trying to implement ICM with PPO using a combination of extrinsic and intrinsic rewards. I have seen a few repos that weight the extrinsic reward much more heavily than the intrinsic one, i.e.
combine_reward = (1 - int_coef) * rewards + int_coef * intrinsic_reward
where int_coef = 0.01, which reduces the effect of the intrinsic reward significantly. I could not find an equation of this form for combining the two rewards anywhere in your paper. Could you tell me whether the equation above is appropriate for a dual-reward setting?
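To make the question concrete, here is a minimal sketch of the weighting I mean, assuming the per-step rewards are plain Python lists. The function name `combine_rewards` and the sample values are illustrative only; they are not taken from the paper or from any particular repo.

```python
def combine_rewards(extrinsic, intrinsic, int_coef=0.01):
    """Convex combination of per-step extrinsic and intrinsic rewards.

    int_coef is the weight on the intrinsic (curiosity) reward; the
    extrinsic reward gets the remaining weight (1 - int_coef).
    """
    if len(extrinsic) != len(intrinsic):
        raise ValueError("reward sequences must have the same length")
    return [
        (1.0 - int_coef) * r_ext + int_coef * r_int
        for r_ext, r_int in zip(extrinsic, intrinsic)
    ]


# Hypothetical rollout: sparse extrinsic reward, dense intrinsic reward.
rewards = [0.0, 0.0, 1.0]
intrinsic_reward = [0.5, 0.2, 0.1]
combined = combine_rewards(rewards, intrinsic_reward, int_coef=0.01)
```

With int_coef = 0.01 the intrinsic term contributes only 1% of its value at each step, so on the steps where the extrinsic reward is zero the combined reward is very small.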