Streaming Response #1498
Thanks @TapanKumarBarik for logging this enhancement request. We will forward it to our engineering team.
Hi @Vinay-Microsoft and @TapanKumarBarik, I was trying to understand why this is not supported, since in this example it is possible: https://github.com/Azure-Samples/azure-search-openai-demo/tree/main. Looking at code/backend/batch/utilities/helpers/llm_helper.py:
There also seems to be a reference to an environment variable called SHOULD_STREAM. However, this variable is not mentioned in any of the documented environment variables. I am a bit confused whether the streaming support comes from the callback mentioned in the code comments or whether something else is needed. I have no clue about Vite, but I'm fairly seasoned with Python and I think Copilot can do the rest. Could you point me in the direction to go to implement this? BR
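For reference, here is a minimal sketch of how a `SHOULD_STREAM`-style flag could gate streaming on the backend. It assumes the `openai` Python SDK (v1+) against Azure OpenAI; the variable name is taken from the comment above, but the helper itself is hypothetical and is not the repo's actual `llm_helper.py`:

```python
import os
from openai import AzureOpenAI

# Hypothetical sketch, not the repo's actual llm_helper.py.
# SHOULD_STREAM gates whether tokens are yielded as they arrive
# or the full completion is returned in one piece.
SHOULD_STREAM = os.environ.get("SHOULD_STREAM", "false").lower() == "true"

client = AzureOpenAI(
    azure_endpoint=os.environ["AZURE_OPENAI_ENDPOINT"],
    api_key=os.environ["AZURE_OPENAI_API_KEY"],
    api_version="2024-02-01",
)

def get_answer(messages):
    if SHOULD_STREAM:
        # stream=True makes the SDK return an iterator of chunks.
        response = client.chat.completions.create(
            model=os.environ["AZURE_OPENAI_MODEL"],
            messages=messages,
            stream=True,
        )
        for chunk in response:
            if chunk.choices and chunk.choices[0].delta.content:
                yield chunk.choices[0].delta.content
    else:
        # Non-streaming path: one completed message, yielded whole.
        response = client.chat.completions.create(
            model=os.environ["AZURE_OPENAI_MODEL"],
            messages=messages,
        )
        yield response.choices[0].message.content
```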
@edgBR I have also faced similar issues; the frontend also needs to be adjusted accordingly for streaming responses.
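To illustrate what "adjusting the frontend" implies on the wire: a common pattern (an assumption here, not confirmed as this repo's approach) is for the backend to return newline-delimited JSON so the UI can parse and render each chunk as it arrives. A Flask-style sketch with a hypothetical route name:

```python
import json
from flask import Flask, Response, request

app = Flask(__name__)

def get_answer(messages):
    # Stand-in for the streaming helper sketched in the previous comment.
    for token in ["Hello", ", ", "world", "!"]:
        yield token

# Hypothetical endpoint; the real route in this repo may differ.
@app.route("/api/conversation", methods=["POST"])
def conversation():
    messages = request.json["messages"]

    def generate():
        for token in get_answer(messages):
            # One JSON object per line (NDJSON) so the client can
            # decode each chunk without waiting for the full body.
            yield json.dumps({"content": token}) + "\n"

    # A streaming Response wraps the generator; Flask flushes each
    # yielded line to the client as it is produced.
    return Response(generate(), mimetype="application/x-ndjson")
```

The frontend side of the adjustment is then to read the response body incrementally (e.g. via a stream reader) instead of awaiting a single JSON payload.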
Hi @TapanKumarBarik, would you mind sharing what exactly you adjusted in the frontend? BR
Sure! Here’s a template filled out for a feature request to enable streaming responses in the RAG (Retrieval-Augmented Generation) chatbot on GitHub:
Motivation
The ability to stream responses in the RAG chatbot will improve the user experience by allowing real-time interaction, especially when handling large or complex queries. This feature is particularly useful when responses take time to generate, ensuring the user sees a response in parts rather than waiting for the entire completion. It would enhance usability by providing immediate feedback, preventing the user from feeling disconnected or uncertain about the status of their request.
Alternatives considered, such as waiting for the full response to be generated before returning it, do not provide the same level of interactivity and responsiveness, which could lead to frustration during long-running queries.
How would you feel if this feature request was implemented?
Requirements
Tasks
This template includes the key aspects for adding streaming functionality. Let me know if you'd like me to refine anything!