
Feature Request: Support for Multiple Simultaneous LLM AI API Endpoints for Self-Hosting and Model Selection #34

Closed
2good4hisowngood opened this issue Sep 28, 2023 · 10 comments

Comments


2good4hisowngood commented Sep 28, 2023

Description:
We would like to propose a new feature for AutoGen that enables users to configure and use multiple large language model (LLM) API endpoints, for self-hosting and for experimenting with different models. This would enhance AutoGen's flexibility and versatility for developers and researchers working with LLMs.

Feature Details:

  1. Endpoint Configuration:

    • Allow users to store API keys and configure multiple endpoints.
    • Implement support for an environment (env) file to securely store sensitive information locally, facilitating scripted pass-through of values, and add it to .gitignore.
  2. Custom Endpoint Names:

    • Provide the ability to assign user-friendly names to each configured endpoint. This helps users easily identify and differentiate between endpoints, and it also allows multiple models to be served from the same endpoint by giving each its own configuration. A check could validate that the endpoint has the expected model loaded and, if not, perform a quick unload/load of the desired model.
  3. Chat Parameters:

    • Integrate settings for chat parameters, such as temperature and other relevant options, that can be adjusted per endpoint. This allows users to fine-tune model behavior.
  4. Model Selection (if applicable):

    • If applicable to the specific LLM, enable users to preset a model for each endpoint. This feature can be especially useful when working with multiple LLMs simultaneously.
  5. API Key Management (if applicable):

    • For LLM services like OpenAI that require API keys, provide a dedicated parameter in each endpoint configuration for users to input and manage their API key.
  6. Endpoint Address:

    • Allow users to specify the endpoint address (URL) to which API requests should be sent. This flexibility is crucial for self-hosted instances or when working with different LLM providers.
  7. Optional - Endpoint Tagging:

    • Allowing tags like #code, #logic, or #budget would give key indicators of where a model's strengths lie and let users select from a pool of models with a particular benefit, allowing more diverse outcomes as well as side-by-side comparisons. It would also enable future result tracking/scoring to identify which models are best at particular tasks: by running multiple #code models and testing each one's results, under-performing models can be identified and retrained or replaced to build an optimal workflow (a purely illustrative configuration sketch follows this list).
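
A purely illustrative sketch of what such an endpoint registry could look like; the field names here are hypothetical, not an existing AutoGen schema:

import os

endpoints = [
    {
        "name": "local-codellama",                            # user-friendly endpoint name
        "base_url": "http://localhost:5000/v1",               # endpoint address requests are sent to
        "model": "codellama-13b",                             # preset model for this endpoint
        "api_key": os.environ.get("LOCAL_API_KEY", "NULL"),   # pulled from a local env file
        "temperature": 0.2,                                   # per-endpoint chat parameters
        "tags": ["#code"],                                    # optional strength tags
    },
    {
        "name": "openai-gpt4",
        "base_url": "https://api.openai.com/v1",
        "model": "gpt-4",
        "api_key": os.environ.get("OPENAI_API_KEY"),
        "temperature": 0.7,
        "tags": ["#logic"],
    },
]

# Example: select the endpoints tagged for code work when building a coding agent.
code_endpoints = [e for e in endpoints if "#code" in e["tags"]]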

Expected Benefits:
This feature will benefit developers, researchers, and users who work with LLMs by offering a centralized and user-friendly interface for managing multiple AI API endpoints. It enhances the ability to experiment with various models, configurations, and providers while maintaining security and simplicity. It could allow different agents to leverage specific fine-tuned models rather than the same model for each, and it could let self-hosted users expand the number of repeated, looped calls without drastically increasing the bill.

Additional Notes:
Consider implementing an intuitive user interface for configuring and managing these endpoints within the GitHub platform, making it accessible to both novice and experienced users.

References:
Include any relevant resources or references that support the need for this feature, such as the growing popularity of LLMs in various fields and the demand for flexible API management solutions.

Related Issues/Pull Requests:

Assignees:
If you permit this ticket to remain open, I will assemble some links and resources, as well as open another ticket to handle TextGenWebUI, with relevant links there on implementing it. I can try implementing it and submitting a PR if someone else doesn't get to it first.

Thank you for considering this feature request. I believe that this enhancement will greatly benefit the AutoGen community and its users working with Language Model AI API endpoints.

edit: 9.28
Looking through the repo, it looks like there's a standardized JSON config; I'm going to look into this next as a method for expanding and holding the features listed above. A page found while reading the documentation shows near the top how it loads the JSON and then references it further down: https://github.com/microsoft/autogen/blob/main/notebook/agentchat_groupchat_research.ipynb

Found it https://github.com/microsoft/autogen/blob/main/OAI_CONFIG_LIST_sample

Going to look into how it gets loaded.
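
For anyone following along, the sample appears to get loaded via autogen.config_list_from_json, roughly like this (the model names in the filter are just examples):

import autogen

# Reads the JSON list from the OAI_CONFIG_LIST environment variable or file
# and keeps only the entries whose "model" matches the filter.
config_list = autogen.config_list_from_json(
    "OAI_CONFIG_LIST",
    filter_dict={"model": ["gpt-4", "gpt-3.5-turbo"]},
)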

@2good4hisowngood 2good4hisowngood changed the title Feature Request: Support for Multiple LLM AI API Endpoints for Self-Hosting and Model Selection Feature Request: Support for Multiple Simultaneous LLM AI API Endpoints for Self-Hosting and Model Selection Oct 6, 2023

2good4hisowngood commented Oct 6, 2023

If using textgenwebui locally, it'd be great to be able to switch between models without having to host multiple models simultaneously.

Get models

import requests

HOST = '0.0.0.0:5000'

def model_api(request):
    response = requests.post(f'http://{HOST}/api/v1/model', json=request)
    return response.json()

model_api({'action': 'list'})['result']

Load models

def model_load(model_name):
    return model_api({'action': 'load', 'model_name': model_name})

So something like: get the currently loaded model; if it matches the model listed for the agent, continue; otherwise load the desired model for the agent. A rough sketch is below.
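
A minimal sketch of that logic, assuming the webui's model endpoint supports an 'info' action that reports the currently loaded model (as in its API example scripts; exact field names may vary by version):

def ensure_model_loaded(desired_model):
    # Ask the server which model is currently loaded ('info' action assumed).
    current = model_api({'action': 'info'})['result'].get('model_name')
    if current == desired_model:
        return current
    # Otherwise swap in the model this agent expects.
    return model_load(desired_model)

# e.g. ensure_model_loaded('codellama-13b') before handing control to a coding agent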

@sonichi sonichi added the llm label Oct 22, 2023

taoyiran commented Nov 2, 2023

Wow, I am deeply appreciative of your work. I am looking for a way to achieve something similar, such as connecting AutoGen to an LLM endpoint like Qwen-turbo's online service. May I join your team, or at least contribute something?

@Pakmandesign

Same! Happy to help.


ImagineL commented Nov 6, 2023

@taoyiran So am I! Supporting the Qwen online service is important to me. Feel free to contact me anytime if you need assistance.


taoyiran commented Nov 6, 2023

@ImagineL Glad to see your message! Right now I am studying this project and trying to connect AutoGen to Qwen's online service. I will update my status here and, if possible, share my code. Thanks, everyone!


ImagineL commented Nov 6, 2023

@taoyiran I'm looking forward to you sharing your code! I analyzed the source code, and it seems hard to achieve without modifying it. Good luck!


weldonla commented Nov 9, 2023

While looking for how to do this, I found this thread, and I also found the answer: you can just set multiple configurations. It seems this feature is already implemented, if I understand the feature request correctly.

from autogen import AssistantAgent, UserProxyAgent, GroupChat, GroupChatManager

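# Three OpenAI-compatible endpoints running locally on different ports; the
# api_key value is a placeholder since these local servers don't check it.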
config_list = [
    {
        "api_type": "open_ai",
        "api_base": "http://localhost:1234/v1",
        "api_key": "NULL"
    }
]
config_list2 = [
    {
        "api_type": "open_ai",
        "api_base": "http://localhost:1235/v1",
        "api_key": "NULL"
    }
]
config_list3 = [
    {
        "api_type": "open_ai",
        "api_base": "http://localhost:1236/v1",
        "api_key": "NULL"
    }
]

llm_config = {
    "config_list": config_list,
    "seed": 47,
    "temperature": 0.5,
    "max_tokens": -1,
    "request_timeout": 6000
}
llm_config2 = {
    "config_list": config_list2,
    "seed": 47,
    "temperature": 0.5,
    "max_tokens": -1,
    "request_timeout": 6000
}
llm_config3 = {
    "config_list": config_list3,
    "seed": 47,
    "temperature": 0.5,
    "max_tokens": -1,
    "request_timeout": 6000
}

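# Give each agent its own llm_config so it talks to its own endpoint.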
user_proxy = UserProxyAgent(
    name="user_proxy",
    system_message="A human admin.",
    max_consecutive_auto_reply=10,
    llm_config=llm_config,
    human_input_mode="ALWAYS"
)

person_1 = AssistantAgent(
    name="person_1",
    system_message="sys_message",
    llm_config=llm_config2,
)

person_2 = AssistantAgent(
    name="person_2",
    system_message="sys_message",
    llm_config=llm_config3,
)

person_3 = AssistantAgent(
    name="person_3",
    system_message="sys_message",
    llm_config=llm_config,
)

groupchat = GroupChat(
    agents=[user_proxy, person_1, person_2, person_3], messages=[]
)
manager = GroupChatManager(groupchat=groupchat, llm_config=llm_config)

user_proxy.initiate_chat(manager, message="""message""")

@ishaan-jaff

@2good4hisowngood @taoyiran @Pakmandesign @ImagineL @weldonla you can do this using the LiteLLM Proxy Server.
It can handle 500+ requests/second.

Here's the quick start:

Doc: https://docs.litellm.ai/docs/simple_proxy#load-balancing---multiple-instances-of-1-model

Step 1: Create a config.yaml

model_list:
  - model_name: gpt-4
    litellm_params:
      model: azure/chatgpt-v-2
      api_base: https://openai-gpt-4-test-v-1.openai.azure.com/
      api_version: "2023-05-15"
      api_key: 
  - model_name: gpt-4
    litellm_params:
      model: azure/gpt-4
      api_key: 
      api_base: https://openai-gpt-4-test-v-2.openai.azure.com/
  - model_name: gpt-4
    litellm_params:
      model: azure/gpt-4
      api_key: 
      api_base: https://openai-gpt-4-test-v-2.openai.azure.com/

Step 2: Start the litellm proxy:

litellm --config /path/to/config.yaml

Step 3: Make a request to the LiteLLM proxy:

curl --location 'http://0.0.0.0:8000/chat/completions' \
--header 'Content-Type: application/json' \
--data '{
      "model": "gpt-4",
      "messages": [
        {
          "role": "user",
          "content": "what llm are you"
        }
      ]
    }'
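
On the AutoGen side, pointing an agent at the proxy is then just another config entry. A sketch reusing the old-style fields from the example earlier in this thread (the api_key is a dummy value, since the real credentials live in the proxy's config.yaml):

from autogen import AssistantAgent

config_list_proxy = [
    {
        "api_type": "open_ai",
        "api_base": "http://0.0.0.0:8000",  # the LiteLLM proxy started above
        "api_key": "NULL",                  # dummy; the proxy holds the real keys
        "model": "gpt-4",                   # matched against the proxy's model_list
    }
]

assistant = AssistantAgent(
    name="assistant",
    llm_config={"config_list": config_list_proxy},
)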

@ImagineL

This looks very reliable, thank you! I'm going to try it!

@thinkall
Collaborator

We are closing this issue due to inactivity; please reopen if the problem persists.
