
Streaming Response #1498

Open · 6 tasks
TapanKumarBarik opened this issue Nov 18, 2024 · 4 comments
Labels: enhancement (New feature or request)

Comments

@TapanKumarBarik

This is a feature request to enable streaming responses in the RAG (Retrieval-Augmented Generation) chatbot.


Motivation

The ability to stream responses in the RAG chatbot will improve the user experience by allowing real-time interaction, especially when handling large or complex queries. This feature is particularly useful when responses take time to generate, ensuring the user sees a response in parts rather than waiting for the entire completion. It would enhance usability by providing immediate feedback, preventing the user from feeling disconnected or uncertain about the status of their request.

Alternatives considered, such as waiting for the full response to be generated before returning it, do not provide the same level of interactivity and responsiveness, which could lead to frustration during long-running queries.


How would you feel if this feature request was implemented?

Excited


Requirements

  • Implement a streaming API that allows the RAG chatbot to return parts of the response progressively.
  • Ensure that the frontend (chat interface) can handle partial responses and display them to the user in real-time.
  • The solution should support incremental content generation, including handling multiple chunks or tokens of the response being sent back.
  • The backend should support asynchronous streaming without blocking other processes or interactions (a minimal sketch of such an endpoint follows this list).
  • Add proper error handling for interruptions in streaming or incomplete data.
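
A minimal sketch of what such a streaming endpoint could look like, using Flask and the openai Python client purely for illustration (the accelerator's actual framework, route, and helper names may differ):

```python
# Hypothetical streaming endpoint; the route, framework (Flask), and
# environment variable names are illustrative assumptions, not the
# accelerator's actual code.
import json
import os

from flask import Flask, Response, request, stream_with_context
from openai import AzureOpenAI

app = Flask(__name__)
client = AzureOpenAI(
    azure_endpoint=os.environ["AZURE_OPENAI_ENDPOINT"],
    api_key=os.environ["AZURE_OPENAI_API_KEY"],
    api_version="2024-02-01",
)

@app.route("/api/chat/stream", methods=["POST"])
def chat_stream():
    messages = request.json["messages"]

    def generate():
        # stream=True makes the SDK yield incremental chunks instead of
        # blocking until the whole completion is ready.
        completion = client.chat.completions.create(
            model=os.environ["AZURE_OPENAI_MODEL"],
            messages=messages,
            stream=True,
        )
        for chunk in completion:
            if chunk.choices and chunk.choices[0].delta.content:
                # Emit newline-delimited JSON so the client can parse each
                # partial response as it arrives.
                yield json.dumps({"delta": chunk.choices[0].delta.content}) + "\n"

    return Response(stream_with_context(generate()), mimetype="application/x-ndjson")
```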

Tasks

  • Research streaming capabilities for the existing RAG model and framework.
  • Implement an API endpoint that supports streaming responses.
  • Modify the frontend chat interface to render streamed responses dynamically.
  • Test streaming with sample queries, ensuring the experience is smooth and efficient (see the client sketch after this list).
  • Ensure backward compatibility with non-streaming responses.
  • Document the new feature for both developers and users.
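
For the testing and frontend tasks, here is a minimal Python client sketch showing how streamed chunks can be consumed incrementally (the URL and NDJSON payload shape match the hypothetical endpoint above, not the accelerator's real API):

```python
# Hypothetical test client for the streaming endpoint sketched above;
# the URL and NDJSON wire format are assumptions.
import json

import requests

resp = requests.post(
    "http://localhost:5000/api/chat/stream",
    json={"messages": [{"role": "user", "content": "Summarize the uploaded documents"}]},
    stream=True,  # do not buffer the whole body; iterate as chunks arrive
)
resp.raise_for_status()

for line in resp.iter_lines():
    if line:
        delta = json.loads(line)["delta"]
        # A real chat UI would append each delta to the message being
        # rendered; printing without a newline simulates that behavior.
        print(delta, end="", flush=True)
print()
```

A browser frontend would do the equivalent with fetch() and a ReadableStream reader, appending each parsed delta to the message currently being rendered.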

This covers the key aspects of adding streaming functionality; let me know if anything needs refining.

TapanKumarBarik added the enhancement label on Nov 18, 2024
@Vinay-Microsoft

Thanks @TapanKumarBarik for logging this enhancement request. I'll forward it to our engineering team.

@edgBR

edgBR commented Dec 19, 2024

Hi @Vinay-Microsoft and @TapanKumarBarik

I was trying to understand why this is not supported, since streaming is possible in this example:

https://github.com/Azure-Samples/azure-search-openai-demo/tree/main

Looking at code/backend/batch/utilities/helpers/llm_helper.py:

[screenshot of llm_helper.py]

there seems to be a reference to the need to implement a custom callback in the UI.

There also seems to be a reference:

[screenshot of the SHOULD_STREAM reference]

to an environment variable called SHOULD_STREAM. However, this variable is not mentioned in any of the ones listed here:

https://github.com/Azure-Samples/chat-with-your-data-solution-accelerator/blob/main/docs/LOCAL_DEPLOYMENT.md

I am a bit confused whether streaming support comes from the callback mentioned in the code comments or whether something else is needed.
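
My assumption is that a flag like SHOULD_STREAM would typically be wired up along these lines (a hypothetical sketch, not the actual accelerator code):

```python
# Hypothetical sketch of how a SHOULD_STREAM-style flag is often read;
# only the variable name comes from the screenshot above, the rest is assumed.
import os

SHOULD_STREAM = os.environ.get("SHOULD_STREAM", "true").lower() == "true"

def get_chat_completion(client, model, messages):
    # When the flag is off, this degrades to a single blocking response.
    return client.chat.completions.create(
        model=model,
        messages=messages,
        stream=SHOULD_STREAM,
    )
```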

Even though I have no clue about Vite, I'm fairly seasoned with Python, and I think Copilot can do the rest. Could you point me in the right direction to implement this?

BR
E

@TapanKumarBarik (Author)

@edgBR I have faced similar issues; the frontend also needs to be adjusted to handle streaming responses.

@edgBR

edgBR commented Dec 30, 2024

Hi @TapanKumarBarik, would you mind sharing what exactly you adjusted in the frontend?

BR
E
