
Please add support to connect falcon and llama models with this. #33

Closed
hemangjoshi37a opened this issue Sep 9, 2023 · 6 comments

@hemangjoshi37a

Please add support to connect falcon and llama models with this.

@hemangjoshi37a (Author)

This has been referenced in #27

@j-loquat

One idea for this: allow up to two local models to be loaded and assigned to one or more agents. We could load one model onto the GPU and the other onto the CPU with some RAM allocated to it. So, say, Llama 2 on the GPU for most of the agents, and a smaller Python-optimized model on the CPU for the engineer agent.
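
A minimal sketch of that split, assuming llama-cpp-python as the local backend; the model paths, quantizations, and agent role names are only illustrative, not anything shipped with this project:

```python
# Illustrative sketch: two local GGUF models via llama-cpp-python, one offloaded
# to the GPU for most agents and a smaller one kept in CPU RAM for the engineer.
from llama_cpp import Llama

# Larger general-purpose model: n_gpu_layers=-1 offloads every layer to VRAM.
general_llm = Llama(
    model_path="models/llama-2-13b-chat.Q4_K_M.gguf",  # illustrative path
    n_gpu_layers=-1,
    n_ctx=4096,
)

# Smaller Python-oriented model: n_gpu_layers=0 keeps it entirely in system RAM.
coder_llm = Llama(
    model_path="models/codellama-7b-instruct.Q4_K_M.gguf",  # illustrative path
    n_gpu_layers=0,
    n_ctx=4096,
)

# Route each agent role to one of the two loaded models (role names are made up).
AGENT_MODELS = {
    "ceo": general_llm,
    "cto": general_llm,
    "reviewer": general_llm,
    "engineer": coder_llm,
}

def ask(agent: str, prompt: str) -> str:
    """Send a prompt to whichever model is assigned to the given agent role."""
    out = AGENT_MODELS[agent].create_chat_completion(
        messages=[{"role": "user", "content": prompt}],
        max_tokens=512,
    )
    return out["choices"][0]["message"]["content"]
```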

@andraz commented Sep 10, 2023

In theory a long running process could:

  1. accumulate a queue of prompts/tasks for a specific agent
  2. swap the engine to that agent and load it in 10-30s
  3. perform the actions and save the agent's queued results
  4. then prepare queues for other agents from those results
  5. repeat at 1.

This would let us use big models more efficiently, without accumulating a large time penalty from VRAM loading on every swap.
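
A rough sketch of that loop, again assuming llama-cpp-python as the local backend; the per-agent model paths are illustrative, not taken from this repo. The point is that each model is loaded once per batch of queued prompts, so the 10-30 s load cost is paid per swap rather than per prompt:

```python
# Illustrative queue-and-swap loop: group prompts per agent, load that agent's
# model once, drain its queue, then free VRAM before the next swap.
from collections import defaultdict
from llama_cpp import Llama

AGENT_MODEL_PATHS = {  # illustrative roles and paths
    "engineer": "models/codellama-7b-instruct.Q4_K_M.gguf",
    "reviewer": "models/llama-2-13b-chat.Q4_K_M.gguf",
}

def run_batched(tasks):
    """tasks: list of (agent_name, prompt); returns (agent_name, prompt, reply) triples."""
    # 1. Accumulate a queue of prompts per agent.
    queues = defaultdict(list)
    for agent, prompt in tasks:
        queues[agent].append(prompt)

    results = []
    for agent, prompts in queues.items():
        # 2. Swap the engine: load this agent's model once (the slow 10-30 s step).
        llm = Llama(model_path=AGENT_MODEL_PATHS[agent], n_gpu_layers=-1, n_ctx=4096)
        # 3. Drain the agent's queue and save the results.
        for prompt in prompts:
            out = llm.create_chat_completion(
                messages=[{"role": "user", "content": prompt}],
                max_tokens=512,
            )
            results.append((agent, prompt, out["choices"][0]["message"]["content"]))
        # 4. Drop the model so its VRAM is freed before the next agent's swap.
        del llm
    # 5. Follow-up tasks derived from these results would be queued for the next pass.
    return results
```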

@hemangjoshi37a (Author)

@andraz @j-loquat your suggestions look good to implement.

@j-loquat

One thing to consider with local LLM agents is that we should keep the prompts shorter than for OpenAI and reduce the temperature, perhaps to below 0.5. A lower temperature and shorter prompts make a huge difference in local response times, as seen in the GPT4All project.
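
For illustration, a single call with those two knobs turned down, assuming the same llama-cpp-python backend as in the sketches above; the prompt, path, and values are only examples, not project defaults:

```python
# Illustrative call: short, direct prompt and temperature below 0.5 to keep
# local generation fast and focused.
from llama_cpp import Llama

llm = Llama(
    model_path="models/llama-2-13b-chat.Q4_K_M.gguf",  # illustrative path
    n_gpu_layers=-1,
    n_ctx=2048,
)

reply = llm.create_chat_completion(
    messages=[
        {"role": "system", "content": "You are the Engineer. Reply with code only."},
        {"role": "user", "content": "Write a Python function that reverses a string."},
    ],
    temperature=0.3,  # lower than the 0.7+ commonly used with OpenAI models
    max_tokens=256,   # cap output length to keep local response times down
)
print(reply["choices"][0]["message"]["content"])
```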

@Alphamasterliu (Contributor)

Hello, regarding the use of other GPT models or local models, please refer to the discussion on our GitHub page: #27. Some of these models have corresponding configurations in this Pull Request: #53. You may consider forking the project and giving them a try. While our team currently lacks the time to test every model, they have received positive feedback and reviews. If you have any other questions, please don't hesitate to ask. We truly appreciate your support and suggestions, and we are continuously working on more significant features, so please stay tuned. 😊
