
Use exponential backoff and handle 429 status codes #58

Open
AaronFriel opened this issue Oct 4, 2023 · 2 comments
Labels
kind/enhancement Improvements or new features

Comments

AaronFriel commented Oct 4, 2023

The OpenAI backend may rate limit clients by API key or organization, returning a 429 status code.

Currently, a 429 bubbles up as an error in the CLI instead of being retried.
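For illustration, here is a minimal sketch of retrying on 429 with exponential backoff and jitter. This is not the project's actual API: the `send_request` callable and its `status_code` attribute are assumptions standing in for whatever HTTP client the CLI uses, and a production version would also honor a `Retry-After` header when the server sends one.

```python
import random
import time

def call_with_backoff(send_request, max_retries=5, base_delay=1.0):
    """Retry `send_request` on HTTP 429, sleeping with exponential backoff.

    `send_request` is any zero-argument callable returning an object with a
    `status_code` attribute (illustrative; the real client may differ).
    """
    response = send_request()
    for attempt in range(max_retries):
        if response.status_code != 429:
            return response
        # Full jitter: sleep a random amount up to base * 2^attempt seconds.
        delay = random.uniform(0, base_delay * (2 ** attempt))
        time.sleep(delay)
        response = send_request()
    return response  # may still be 429 after exhausting retries
```

The full-jitter variant spreads concurrent clients' retries out in time, which avoids synchronized retry storms against an already rate-limited backend.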

@krrishdholakia

Hey @AaronFriel, I'm the maintainer of LiteLLM. We let you create a Router to maximize throughput via load balancing + queuing (beta).

I'd love your feedback on whether this solves your issue.

Here's the quick start:

import asyncio
import os

from litellm import Router

model_list = [{ # list of model deployments
    "model_name": "gpt-3.5-turbo", # model alias
    "litellm_params": { # params for litellm completion/embedding call
        "model": "azure/chatgpt-v-2", # actual model name
        "api_key": os.getenv("AZURE_API_KEY"),
        "api_version": os.getenv("AZURE_API_VERSION"),
        "api_base": os.getenv("AZURE_API_BASE")
    }
}, {
    "model_name": "gpt-3.5-turbo",
    "litellm_params": {
        "model": "azure/chatgpt-functioncalling",
        "api_key": os.getenv("AZURE_API_KEY"),
        "api_version": os.getenv("AZURE_API_VERSION"),
        "api_base": os.getenv("AZURE_API_BASE")
    }
}, {
    "model_name": "gpt-3.5-turbo",
    "litellm_params": {
        "model": "gpt-3.5-turbo",
        "api_key": os.getenv("OPENAI_API_KEY"),
    }
}]

router = Router(model_list=model_list)

async def main():
    # openai.ChatCompletion.create replacement
    response = await router.acompletion(
        model="gpt-3.5-turbo",
        messages=[{"role": "user", "content": "Hey, how's it going?"}])
    print(response)

asyncio.run(main())

@AaronFriel (Author)

Hey @krrishdholakia! I appreciate the suggestion, but our backend is written in Node, so I don't think we can use your solution as-is. Great work on a useful product, though!

@mikhailshilkov mikhailshilkov added the kind/enhancement Improvements or new features label Sep 19, 2024