
Use exponential backoff and handle 429 status codes #58

Open
AaronFriel opened this issue Oct 4, 2023 · 2 comments
Labels
kind/enhancement Improvements or new features

Comments

AaronFriel commented Oct 4, 2023

The OpenAI backend may rate limit clients by API key or organization, returning a 429 status code.

Currently, a 429 bubbles up as an error in the CLI instead of being retried.
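For illustration, here is a minimal sketch of retrying on 429 with exponential backoff and jitter. This is not the project's actual API: the `send_request` callable and its `status_code` attribute are assumptions standing in for whatever HTTP client the CLI uses, and a production version would also honor a `Retry-After` header when the server sends one.

```python
import random
import time

def call_with_backoff(send_request, max_retries=5, base_delay=1.0):
    """Retry `send_request` on HTTP 429, sleeping with exponential backoff.

    `send_request` is any zero-argument callable returning an object with a
    `status_code` attribute (illustrative; the real client may differ).
    """
    response = send_request()
    for attempt in range(max_retries):
        if response.status_code != 429:
            return response
        # Full jitter: sleep a random amount up to base * 2^attempt seconds.
        delay = random.uniform(0, base_delay * (2 ** attempt))
        time.sleep(delay)
        response = send_request()
    return response  # may still be 429 after exhausting retries
```

The full-jitter variant spreads concurrent clients' retries out in time, which avoids synchronized retry storms against an already rate-limited backend.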

@krrishdholakia

Hey @AaronFriel, I'm the maintainer of LiteLLM. We let you create a Router to maximize throughput via load balancing + queuing (beta).

I'd love your feedback on whether this solves your issue.

Here's the quick start:

import asyncio
import os

from litellm import Router

model_list = [{ # list of model deployments
    "model_name": "gpt-3.5-turbo", # model alias
    "litellm_params": { # params for litellm completion/embedding call
        "model": "azure/chatgpt-v-2", # actual model name
        "api_key": os.getenv("AZURE_API_KEY"),
        "api_version": os.getenv("AZURE_API_VERSION"),
        "api_base": os.getenv("AZURE_API_BASE")
    }
}, {
    "model_name": "gpt-3.5-turbo",
    "litellm_params": {
        "model": "azure/chatgpt-functioncalling",
        "api_key": os.getenv("AZURE_API_KEY"),
        "api_version": os.getenv("AZURE_API_VERSION"),
        "api_base": os.getenv("AZURE_API_BASE")
    }
}, {
    "model_name": "gpt-3.5-turbo",
    "litellm_params": {
        "model": "gpt-3.5-turbo",
        "api_key": os.getenv("OPENAI_API_KEY"),
    }
}]

router = Router(model_list=model_list)

async def main():
    # openai.ChatCompletion.create replacement
    response = await router.acompletion(
        model="gpt-3.5-turbo",
        messages=[{"role": "user", "content": "Hey, how's it going?"}])
    print(response)

asyncio.run(main())

@AaronFriel (Author)

Hey @krrishdholakia! I appreciate the suggestion, but our backend is written in Node, so I don't think we can use your solution as-is. Great work on a useful product, though!

@mikhailshilkov mikhailshilkov added the kind/enhancement Improvements or new features label Sep 19, 2024