What will happen to the API when load balancing is enabled and rate limiting is enabled? #8314
Unanswered
yuyongkratos asked this question in Help
Replies: 0 comments
Self Checks
1. Is this request related to a challenge you're experiencing? Tell me about your story.
I want to enable the language model load balancing feature in Dify, but I don't know exactly how it works.
My current situation is that many different business workloads need to call the large language model, for example: Q&A, article translation, information extraction, etc. Because these workloads call it so frequently, the model service is often suspended (rate-limited).
So I plan to enable the Model Load Balancing feature, but I don't understand its working mode in detail.
I have deployed two local models, Model_1 and Model_2; both are qwen2-72b-chat, and many of the APIs provided through Dify use qwen2-72b-chat.
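For context, the behaviour I am expecting (purely my assumption, since the docs do not confirm the strategy) is something like round-robin rotation with failover across the two endpoints. A minimal sketch of that assumed behaviour, with hypothetical endpoint names taken from my setup:

```python
import itertools

class RoundRobinBalancer:
    """Rotate requests across model endpoints; skip endpoints that fail.

    Hypothetical illustration only -- Dify's actual load-balancing
    strategy is not documented in detail, so round-robin with failover
    is an assumption here, not Dify's confirmed behaviour.
    """

    def __init__(self, endpoints):
        self.endpoints = list(endpoints)
        self._cycle = itertools.cycle(self.endpoints)

    def call(self, prompt, invoke):
        # Try each endpoint at most once, starting from the next in rotation.
        for _ in range(len(self.endpoints)):
            endpoint = next(self._cycle)
            try:
                return endpoint, invoke(endpoint, prompt)
            except RuntimeError:
                # e.g. the endpoint is rate-limited; move on to the next one
                continue
        raise RuntimeError("all endpoints are unavailable")

# Example: Model_2 is rate-limited, so traffic falls back to Model_1.
def fake_invoke(endpoint, prompt):
    if endpoint == "Model_2":
        raise RuntimeError("429 rate limited")
    return f"{endpoint} answered: {prompt}"

balancer = RoundRobinBalancer(["Model_1", "Model_2"])
results = [balancer.call("hello", fake_invoke)[0] for _ in range(4)]
print(results)  # every successful call lands on Model_1
```

If that assumption holds, enabling rate limiting on top of load balancing should mean a throttled endpoint is skipped and the other keeps serving, but I would like confirmation of the actual mechanism.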
I have the following questions:
2. Additional context or comments
I found a page about load balancing, but it does not go into detail about what effect load balancing actually has.
The link is as follows:
[load-balancing](https://docs.dify.ai/v/zh-hans/guides/model-configuration/load-balancing)