A question about RLOOTrainer #2472
Labels
🙋 help from community wanted
Open invitation for community members to contribute
❓ question
Seeking clarification or more information
🏋 RLOO
Related to RLOO
Method description
RLOOTrainer does not seem to use the self.policy model in its train() function. I don't understand what self.policy, set in the __init__ function, is for.
Open source status
Provide useful links for the implementation
No response