
Streaming Response #1498

Open · 6 tasks
TapanKumarBarik opened this issue Nov 18, 2024 · 4 comments
Labels: enhancement (New feature or request)

Comments

@TapanKumarBarik

This is a feature request to enable streaming responses in the RAG (Retrieval-Augmented Generation) chatbot.


Motivation

The ability to stream responses in the RAG chatbot will improve the user experience by allowing real-time interaction, especially when handling large or complex queries. This feature is particularly useful when responses take time to generate, ensuring the user sees a response in parts rather than waiting for the entire completion. It would enhance usability by providing immediate feedback, preventing the user from feeling disconnected or uncertain about the status of their request.

Alternatives considered, such as waiting for the full response to be generated before returning it, do not provide the same level of interactivity and responsiveness, which could lead to frustration during long-running queries.


How would you feel if this feature request was implemented?

Excited


Requirements

  • Implement a streaming API that allows the RAG chatbot to return parts of the response progressively.
  • Ensure that the frontend (chat interface) can handle partial responses and display them to the user in real-time.
  • The solution should support incremental content generation, including handling multiple chunks or tokens of the response being sent back.
  • The backend should support asynchronous streaming without blocking other processes or interactions (a minimal sketch of such an endpoint follows this list).
  • Add proper error handling for interruptions in streaming or incomplete data.
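
A minimal sketch of what such a streaming endpoint could look like, using Flask and the openai Python client purely for illustration (the accelerator's actual framework, route, and helper names may differ):

```python
# Hypothetical streaming endpoint; the route, framework (Flask), and
# environment variable names are illustrative assumptions, not the
# accelerator's actual code.
import json
import os

from flask import Flask, Response, request, stream_with_context
from openai import AzureOpenAI

app = Flask(__name__)
client = AzureOpenAI(
    azure_endpoint=os.environ["AZURE_OPENAI_ENDPOINT"],
    api_key=os.environ["AZURE_OPENAI_API_KEY"],
    api_version="2024-02-01",
)

@app.route("/api/chat/stream", methods=["POST"])
def chat_stream():
    messages = request.json["messages"]

    def generate():
        # stream=True makes the SDK yield incremental chunks instead of
        # blocking until the whole completion is ready.
        completion = client.chat.completions.create(
            model=os.environ["AZURE_OPENAI_MODEL"],
            messages=messages,
            stream=True,
        )
        for chunk in completion:
            if chunk.choices and chunk.choices[0].delta.content:
                # Emit newline-delimited JSON so the client can parse each
                # partial response as it arrives.
                yield json.dumps({"delta": chunk.choices[0].delta.content}) + "\n"

    return Response(stream_with_context(generate()), mimetype="application/x-ndjson")
```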

Tasks

  • Research streaming capabilities for the existing RAG model and framework.
  • Implement an API endpoint that supports streaming responses.
  • Modify the frontend chat interface to render streamed responses dynamically.
  • Test streaming with sample queries, ensuring the experience is smooth and efficient (see the client sketch after this list).
  • Ensure backward compatibility with non-streaming responses.
  • Document the new feature for both developers and users.
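
For the testing and frontend tasks, here is a minimal Python client sketch showing how streamed chunks can be consumed incrementally (the URL and NDJSON payload shape match the hypothetical endpoint above, not the accelerator's real API):

```python
# Hypothetical test client for the streaming endpoint sketched above;
# the URL and NDJSON wire format are assumptions.
import json

import requests

resp = requests.post(
    "http://localhost:5000/api/chat/stream",
    json={"messages": [{"role": "user", "content": "Summarize the uploaded documents"}]},
    stream=True,  # do not buffer the whole body; iterate as chunks arrive
)
resp.raise_for_status()

for line in resp.iter_lines():
    if line:
        delta = json.loads(line)["delta"]
        # A real chat UI would append each delta to the message being
        # rendered; printing without a newline simulates that behavior.
        print(delta, end="", flush=True)
print()
```

A browser frontend would do the equivalent with fetch() and a ReadableStream reader, appending each parsed delta to the message currently being rendered.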

This covers the key aspects of adding streaming functionality; let me know if anything needs refining.

TapanKumarBarik added the enhancement label on Nov 18, 2024
@Vinay-Microsoft

Thanks @TapanKumarBarik for logging this enhancement request. I'll forward it to our engineering team.

@edgBR

edgBR commented Dec 19, 2024

Hi @Vinay-Microsoft and @TapanKumarBarik

I was trying to understand why this is not supported, since streaming is possible in this example:

https://github.com/Azure-Samples/azure-search-openai-demo/tree/main

Looking at code/backend/batch/utilities/helpers/llm_helper.py:

[screenshot of llm_helper.py]

there seems to be a reference to the need to implement a custom callback in the UI.

There also seems to be a reference:

[screenshot of the SHOULD_STREAM reference]

to an environment variable called SHOULD_STREAM. However, this variable is not mentioned in any of the ones listed here:

https://github.com/Azure-Samples/chat-with-your-data-solution-accelerator/blob/main/docs/LOCAL_DEPLOYMENT.md

I am a bit confused whether streaming support comes from the callback mentioned in the code comments or whether something else is needed.
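
My assumption is that a flag like SHOULD_STREAM would typically be wired up along these lines (a hypothetical sketch, not the actual accelerator code):

```python
# Hypothetical sketch of how a SHOULD_STREAM-style flag is often read;
# only the variable name comes from the screenshot above, the rest is assumed.
import os

SHOULD_STREAM = os.environ.get("SHOULD_STREAM", "true").lower() == "true"

def get_chat_completion(client, model, messages):
    # When the flag is off, this degrades to a single blocking response.
    return client.chat.completions.create(
        model=model,
        messages=messages,
        stream=SHOULD_STREAM,
    )
```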

Even though I have no clue about Vite, I'm fairly seasoned with Python, and I think Copilot can do the rest. Could you point me in the right direction to implement this?

BR
E

@TapanKumarBarik (Author)

@edgBR I have faced similar issues; the frontend also needs to be adjusted to handle streaming responses.

@edgBR

edgBR commented Dec 30, 2024

Hi @TapanKumarBarik, would you mind sharing what exactly you adjusted in the frontend?

BR
E
