Hello,

I want to run IRL on a task with some expert demonstrations. The demonstrations are a bit old, and since then, the action space has grown. For instance, the first version of the task had only 5 actions, whereas the new version adds 3 new actions.

Is it possible to train a reward net on the existing expert demonstrations (e.g. using AIRL) and then use the trained reward net to train a new policy that takes the added actions into account? If so, I'm not entirely sure what the RewardNet class would look like.

I would appreciate some help.

Thanks in advance.
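Concretely, here is a rough sketch of the kind of thing I have in mind, written in plain PyTorch rather than against the imitation library's actual RewardNet interface. All the names and dimensions (RewardMLP, widen_action_space, OBS_DIM, etc.) are made up for illustration: train a reward model on the old 5-action one-hot encoding, then copy its weights into a wider net whose extra one-hot columns start at zero, so it agrees with the trained net on the original actions.

```python
import torch
import torch.nn as nn

OLD_N_ACTIONS = 5   # actions available when the demos were recorded
NEW_N_ACTIONS = 8   # action space after the update (3 new actions)
OBS_DIM = 4         # hypothetical observation size


class RewardMLP(nn.Module):
    """Toy reward model over (observation, one-hot action) pairs."""

    def __init__(self, obs_dim: int, n_actions: int):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(obs_dim + n_actions, 32),
            nn.ReLU(),
            nn.Linear(32, 1),
        )

    def forward(self, obs: torch.Tensor, act_onehot: torch.Tensor) -> torch.Tensor:
        return self.net(torch.cat([obs, act_onehot], dim=-1)).squeeze(-1)


def widen_action_space(old_net: RewardMLP, obs_dim: int,
                       old_n: int, new_n: int) -> RewardMLP:
    """Copy trained weights into a net that accepts a wider action one-hot.

    The input columns for the 3 new actions are zero-initialised, so the
    widened net returns exactly the same reward as the old net for any of
    the original actions; the new columns could then be fine-tuned.
    """
    new_net = RewardMLP(obs_dim, new_n)
    old_first, new_first = old_net.net[0], new_net.net[0]
    with torch.no_grad():
        # Observation columns plus the old action columns are copied verbatim;
        # the trailing (new_n - old_n) action columns stay at zero.
        new_first.weight.zero_()
        new_first.weight[:, : obs_dim + old_n] = old_first.weight
        new_first.bias.copy_(old_first.bias)
        # The remaining layers do not depend on the input width.
        new_net.net[2].load_state_dict(old_net.net[2].state_dict())
    return new_net
```

If something like this is reasonable, I'd guess the real version would subclass the library's RewardNet and do the weight surgery in its constructor, but I'm not sure whether that is the intended extension point.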