Describe the feature
The current Azure OpenAI binding takes in configuration for connecting to a single Azure OpenAI endpoint.
In a number of scenarios, it is useful to be able to work with multiple Azure OpenAI endpoints.
Scenario 1 - fail-over
For high-volume usage, customers may purchase a Provisioned Throughput Unit (PTU). In this scenario, the PTU capacity isn't always sufficient for peak load, so a customer might want to send a request to the PTU endpoint first and then re-send it to a Pay-As-You-Go (PAYG) endpoint if the PTU endpoint returns a 429 response.
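To make the fail-over behaviour concrete, here is a minimal sketch in Python. It is not the binding's implementation: `send_with_failover`, the `Send` callable, and the `example.invalid` URLs are all hypothetical names introduced for illustration. The sketch assumes endpoints are listed in priority order (PTU first) and that only a 429 response triggers a retry against the next endpoint.

```python
from typing import Callable, Sequence, Tuple

# Hypothetical transport signature (an assumption, not a real SDK API):
# takes an endpoint URL and a payload, returns (HTTP status code, body).
Send = Callable[[str, dict], Tuple[int, str]]


def send_with_failover(
    endpoints: Sequence[str], payload: dict, send: Send
) -> Tuple[str, int, str]:
    """Try each endpoint in priority order, moving on only when throttled (429)."""
    if not endpoints:
        raise ValueError("at least one endpoint is required")
    for endpoint in endpoints:
        status, body = send(endpoint, payload)
        if status != 429:
            # Success or a non-throttling error: return it rather than mask it.
            return endpoint, status, body
    # Every endpoint was throttled; surface the last 429 to the caller.
    return endpoint, status, body


# Usage with a stand-in transport: the PTU endpoint is throttled, PAYG is not.
def fake_send(endpoint: str, payload: dict) -> Tuple[int, str]:
    return (429, "throttled") if "ptu" in endpoint else (200, "ok")


failover_result = send_with_failover(
    ["https://ptu.example.invalid", "https://payg.example.invalid"],
    {"prompt": "hello"},
    fake_send,
)
```

In this sketch, errors other than 429 are returned to the caller instead of triggering fail-over, which matches the scenario as described; whether other status codes should also fail over is an open design question.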
Scenario 2 - round-robin
The limits for Azure OpenAI are per-region, so customers may set up multiple PAYG endpoints across regions and want to distribute requests between them.
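As a rough illustration of the round-robin behaviour, the following Python sketch rotates through a fixed list of endpoints. `RoundRobinSelector` and the `example.invalid` URLs are hypothetical and not part of the binding; the sketch assumes all endpoints are equally weighted and that selection may happen from multiple threads.

```python
import itertools
import threading
from typing import Sequence


class RoundRobinSelector:
    """Hands out endpoints in rotation; safe to call from multiple threads."""

    def __init__(self, endpoints: Sequence[str]) -> None:
        if not endpoints:
            raise ValueError("at least one endpoint is required")
        # itertools.cycle repeats the sequence indefinitely in order.
        self._cycle = itertools.cycle(list(endpoints))
        self._lock = threading.Lock()

    def next_endpoint(self) -> str:
        # The lock keeps concurrent callers from advancing the iterator at once.
        with self._lock:
            return next(self._cycle)


# Usage: four requests across three regional endpoints wrap back to the first.
selector = RoundRobinSelector(
    [
        "https://eastus.example.invalid",
        "https://westus.example.invalid",
        "https://northeurope.example.invalid",
    ]
)
picks = [selector.next_endpoint() for _ in range(4)]
```

A plain rotation like this does not account for an endpoint being throttled at the moment it is picked; combining it with a retry on the next endpoint would be one way to handle that.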
Proposal
Sometimes customers with either of the above requirements will set up a gateway in front of the Azure OpenAI endpoints and have that handle the load distribution, but in other cases they come back to the application code to add these capabilities in as the usage scales up.
The proposal is to update the Azure OpenAI binding to allow multiple endpoints to be configured along with a distribution mode (failover or round-robin).
Release Note
RELEASE NOTE: ADD Enable multiple endpoints to be configured in Azure OpenAI binding.