Hello! May I ask if it is possible to extend the scenario to multiple agents? For example, a task involves two RL agents, with each agent's policy being an LLM. At each timestep, the two agents need to separately query their respective LLM servers to make decisions, interact with the environment, collect their own data, and ultimately update their own LLM policies.
For now, using multiple LLMs is not trivial: it would require significant changes to the distributed architecture (mostly in the server and the dispatcher).
There is, however, a workaround to achieve something similar (i.e., having multiple sets of weights for the same base LLM) using PEFT adapters.
As in some of our examples, you can use PEFT to add adapters to the LLM and train only those adapters. PEFT actually allows you to attach multiple adapters to a single model (https://huggingface.co/docs/peft/developer_guides/mixed_models). Note that you need to carefully set the right adapter every time you use a specific agent (which may not be convenient...), as sketched below.
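Here is a minimal sketch of that workaround. It assumes two agents sharing one base model, each owning its own LoRA adapter; the base model name (`gpt2`), the adapter names (`agent_0`, `agent_1`), and the LoRA hyperparameters are all illustrative, not prescribed by this project:

```python
# Minimal sketch: two agents as two LoRA adapters on one shared base model.
# Names and hyperparameters below are placeholders, not project defaults.
from transformers import AutoModelForCausalLM
from peft import LoraConfig, get_peft_model

base = AutoModelForCausalLM.from_pretrained("gpt2")  # placeholder base model

lora_cfg = LoraConfig(r=8, lora_alpha=16, target_modules=["c_attn"])

# Wrap the base model with the first agent's adapter, then add the second.
model = get_peft_model(base, lora_cfg, adapter_name="agent_0")
model.add_adapter("agent_1", lora_cfg)

# Before acting with (or updating) a given agent, activate its adapter:
model.set_adapter("agent_0")
# ... generate an action / compute loss / step the optimizer for agent_0 ...

model.set_adapter("agent_1")
# ... same for agent_1; only the active adapter's weights are used/trained ...
```

Since only the lightweight adapter weights differ between agents, this keeps a single copy of the base model in memory; the main caveat, as noted above, is that you must remember to call `set_adapter` on every switch between agents.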