Expand to multi-agent scenarios. #28

Open
ewanlee opened this issue Dec 6, 2023 · 2 comments

ewanlee commented Dec 6, 2023

Hello! Would it be possible to extend the setup to multiple agents? For example, a task might involve two RL agents, each using an LLM as its policy. At each timestep, each agent would query its own LLM server to make a decision, interact with the environment, collect its own data, and ultimately update its own LLM policy.

@ClementRomac
Collaborator

Hi,

For now, using multiple LLMs is not trivial. It would require major changes to the distributed architecture (mostly in the server and dispatcher).

There is, however, a workaround that achieves something similar (i.e. having multiple sets of weights for the same LLM): Peft adapters.
As in some of our examples, you can use Peft to add adapters to the LLM and train only these adapters. Peft also lets you add multiple adapters to a single model (https://huggingface.co/docs/peft/developer_guides/mixed_models). Note that you need to carefully set the right adapter every time you use a specific agent (which may not be convenient...).
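
To make the "set the right adapter every time" part less error-prone, one option is to wrap the switch in a small context manager. The following is only a rough sketch under assumptions, not something that exists in this project: `StubAdapterModel` and `use_agent` are hypothetical names, and the stub merely mimics the `set_adapter(name)` method that a Peft `PeftModel` exposes, so that the snippet runs without loading a real LLM. With Peft you would instead create the adapters with `get_peft_model(..., adapter_name=...)` and `model.add_adapter(...)`.

```python
from contextlib import contextmanager


class StubAdapterModel:
    """Hypothetical stand-in for a peft.PeftModel (assumption for illustration).

    It only mimics `set_adapter` and an `active_adapter` attribute.
    """

    def __init__(self, adapter_names):
        self.adapter_names = set(adapter_names)
        self.active_adapter = None

    def set_adapter(self, name):
        if name not in self.adapter_names:
            raise ValueError(f"unknown adapter: {name}")
        self.active_adapter = name

    def generate(self, prompt):
        # A real model would run the LLM with the active adapter's weights.
        return f"[{self.active_adapter}] {prompt}"


@contextmanager
def use_agent(model, agent_name):
    """Activate one agent's adapter, then restore the previous one on exit."""
    previous = model.active_adapter
    model.set_adapter(agent_name)
    try:
        yield model
    finally:
        if previous is not None:
            model.set_adapter(previous)


# One adapter per agent, all sharing the same base model.
model = StubAdapterModel(["agent_0", "agent_1"])

actions = {}
for agent in ("agent_0", "agent_1"):
    with use_agent(model, agent) as m:
        actions[agent] = m.generate("observation")
```

Every call made on behalf of an agent (scoring, generation, or an update step) would go inside its own `with use_agent(...)` block, so an adapter is never left active by accident for the other agent.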

@ClementRomac ClementRomac added the enhancement New feature or request label Jan 2, 2024
@ClementRomac ClementRomac self-assigned this Jan 2, 2024

ewanlee commented Jan 3, 2024

Thank you very much for the suggestion! I'll try using different LoRA adapters as a temporary way to handle multi-agent tasks 😊😊
