[RFC] Conversations and Generative AI in OpenSearch #1150
Comments
There is an important nuance to this statement: a conversation's access controls are at most the intersection of Alice's access rights at the time of the interaction. If Alice's permissions change after the interaction in a way that makes some of the captured information off-limits to her, this access control will no longer be appropriate. We don't currently have the security primitives/functionality needed to support this level of access control on derived data natively (it's on our radar, though!), so limiting access to the interactions to the owning user is the best we can do without them. With that said, am I correct in interpreting that the indices holding the Conversations and Interactions will be restricted to just the plugin, with all access to the data gated by the new API? If so, they are missing a field identifying the owning user, which we would need in order to enforce that access control.
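The owner-field check described above can be sketched in a few lines. This is an illustrative assumption about how the plugin might gate reads, not the plugin's actual code; the `user` field name and `get_conversation` helper are hypothetical.

```python
# Minimal sketch (assumption, not the plugin's implementation) of gating
# conversation access by an owner field stored alongside each conversation.

def get_conversation(doc, requesting_user):
    # Fail closed: only the owning user may read the conversation.
    if doc.get("user") != requesting_user:
        raise PermissionError("access denied")
    return doc["interactions"]

conversation = {"user": "alice", "interactions": ["q1", "a1"]}
```

With this shape, a request from any user other than the stored owner fails closed, which matches the "restrict to the owning user" behavior discussed above.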
This is a very thorough RFC. Thanks, Austin.
Confirming that Search Pipelines is only available in 2.9+. When you say "...hoping to get the pipelines themselves into Search-Processors..." do you mean the search-processor GH repo? That repo has two processors that we will eventually factor out into separate repos. Our current thinking on search processors is that they can be included in core (https://github.com/opensearch-project/OpenSearch) if they have no external dependencies. If there are dependencies, a separate repo as a self-install plugin is the right approach. Some of this may belong in ml-commons, but I would leave that up to the maintainers of this repo. You may also need to build a search processor that is ALSO a plugin in order to gain access to resources via the plugin interface -- for example, to access the conversation memory. One analogy for search pipelines is piping together *NIX commands. Each command (processor, in pipeline speak) can be as complex as needed but still really only does one thing, and you compose functionality by sending the stdin (request, in pipeline speak) or stdout (response, in pipeline speak) from one processor to the next. Some of the processors needed may end up in core; some may end up in a separate repo. cc: @msfroh
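The *NIX-pipe analogy above can be sketched in a few lines. Everything here is illustrative pseudocode, not the Search Pipelines API: the processor names, the dict shapes, and `run_pipeline` are all invented for the sketch.

```python
# Hypothetical sketch of the pipe analogy: each "processor" is a small
# function transforming a request or a response, and the pipeline simply
# composes them in order, like chaining commands with `|`.

def rewrite_query(request):
    # Request processor: analogous to transforming stdin.
    request = dict(request)
    request["query"] = request["query"].strip().lower()
    return request

def truncate_hits(response, max_hits=3):
    # Response processor: analogous to transforming stdout.
    response = dict(response)
    response["hits"] = response["hits"][:max_hits]
    return response

def run_pipeline(request, search_fn, request_processors, response_processors):
    for p in request_processors:
        request = p(request)
    response = search_fn(request)
    for p in response_processors:
        response = p(response)
    return response

def fake_search(request):
    # Toy stand-in for the actual search phase.
    return {"hits": [f"doc-{i}-{request['query']}" for i in range(5)]}

result = run_pipeline(
    {"query": "  Black Shoes  "},
    fake_search,
    request_processors=[rewrite_query],
    response_processors=[truncate_hits],
)
```

Each processor stays single-purpose; complexity lives inside a processor, while composition stays a flat list, which is the property the analogy is pointing at.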
About the Conversation API: it looks like it only wraps a search pipeline, and I think the search pipeline APIs already work well. Meanwhile, I think the conversation function can support not only search applications but also other applications, like chatbots. That is, the conversation function can help users build any conversational application.
About performance, we should consider not only traffic (for multiple users) but also latency (for a single user). The search experience is latency-sensitive; if we introduce LLM interactions in pre-processors and post-processors, the latency could be unacceptable. We should reduce the number of round-trips to the LLM to improve latency and save cost, as LLM API calls are expensive.
We also have an RFC about a conversation plugin in OpenSearch to support building conversational applications.
Thanks @davidlago. I agree that limiting access to the interactions to the owning user is the best we can do currently. We would love to collaborate on building the necessary security primitives to support access control on derived data. Please keep us informed of any future RFC on this. Yes, we're planning on restricting access to the conversational memory indices to the plugin / API. We'll be keeping track of the user under the covers in the index.
@macohen Thanks for the clarification and suggestions.
@jngz-es We agree that latency is important, and we will certainly look for ways to reduce unnecessary round-trips. That said, we have seen good results in some cases by using an LLM to both rewrite the query and summarize the response. We believe that some users will be okay with the additional latency for better results, and we want that to at least be an option.
In a conversational search application, I also think users may expect some latency relative to a keyword search. Think time in a conversation is generally acceptable, right?
Thanks to the folks who have responded to the RFC we posted a couple of days ago for Conversations and Generative AI in OpenSearch. @jngz-es - I noticed that you recently posted an RFC on the same topic (#1151). I'm concerned that the overlap will cause confusion in the community and make it difficult to align our development. We would love to find a process where we can work together. The process that I'm used to in open source communities is to start with one RFC and then iterate and add feedback, rather than creating multiple RFCs. This process has some benefits - it drives alignment in the open, enables the community to share and iterate on ideas, and makes the end product easy to understand and use. My suggestion is that we adopt this approach to work together on the RFC for conversational features in OpenSearch. We greatly appreciate the feedback you've already given this original RFC, and we'd be happy to do the work to update this RFC and continue to iterate to incorporate any other technical suggestions you have. Let us know what you think! We are excited to find ways to work together to make OpenSearch the best platform for building conversational applications.
Love seeing multiple proposals for similar outcomes! Personally I don't think there's anything wrong with two competing implementations that potentially converge into the best in class. Without diving too much into details, @austintlee and @jngz-es, what are the similarities and differences between the two proposals? What do you think is better in the one you didn't write?
Hi Austin, I love how you intuitively architected your application in the intended way with the new building blocks like search pipelines, AI connectors, and vector database capabilities. I expect we need to document this better. With that said, we are working on the next iteration of the framework to simplify and improve the developer experience. Some concepts that we're considering:
What are your thoughts? |
@macohen @msfroh Do you have any suggestions for how we might return answers generated by LLMs in the SearchResponse? I think there are largely three approaches. 1/ The most "intrusive" approach would be to introduce a new field in the SearchResponse, e.g.
2/ We can return it as one of the SearchHits by inserting the answer into the hits array in the response processor (which means we would need to reconstruct the response object on the way out). 3/ A middle ground would be an extension ("ext") section in the response that can be customized by Search Pipelines:
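A hypothetical shape for option 3 might look like the following. The section name `retrieval_augmented_generation` and field name `answer` are illustrative assumptions, not the actual plugin's schema; the dicts stand in for the JSON of a SearchResponse.

```python
# Sketch (assumed field names) of attaching a generated answer under the
# response "ext" section instead of mutating the hits array.

def attach_answer_ext(response, answer):
    response = dict(response)
    ext = dict(response.get("ext", {}))
    ext["retrieval_augmented_generation"] = {"answer": answer}
    response["ext"] = ext
    return response

search_response = {
    "took": 5,
    "hits": {"total": {"value": 2}, "hits": [{"_id": "1"}, {"_id": "2"}]},
}
augmented = attach_answer_ext(search_response, "Shoes ship within 2 days.")
```

The appeal of this middle ground is visible in the sketch: the hits array is untouched, so existing clients keep working, while RAG-aware clients can read the answer out of `ext`.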
Would this option be made possible as part of perhaps this work - opensearch-project/OpenSearch#8635?
I really like the proposal and have a few questions / comments, mostly around the conversation memory:

Data store

Is this a hard requirement? It does feel like the most obvious place for it (since we're already running on OpenSearch, it adds no additional dependencies), but maybe someone might benefit from some other data store? Each conversation is an append-only log, if I'm understanding correctly, so another data store might be a good fit. (Of course, I hear that a lot of people like storing their append-only logs in OpenSearch indices, so maybe it really is the best option.)

Metadata

If

Other uses?

At the risk of opening a can of worms, I'm wondering if such a proposal could help with other "session-based" search refinements. I'm thinking of an e-commerce application where someone searches for "black shoes", doesn't click on any search results, and then searches for "nike basketball shoes" -- you may want to rank the black shoes higher, on the assumption that the two queries are related. If you include a user identifier in the conversation metadata, the system could provide a more personalized experience based on prior conversations with the user (subject to the usual privacy concerns, where you would need to let the user delete some or all conversations). Probably out of scope, though. There's been some discussion around interaction logging, incorporating the user's "post-search" actions (see opensearch-project/OpenSearch#4619), which feels like it overlaps a bit, though a "one size fits all" solution probably wouldn't be ideal. Still, I'm wondering if there's some opportunity for reuse or at least sharing lessons learned.
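To make the append-only-log framing concrete, here is one plausible document shape for an interaction. This is an assumption for illustration; field names other than `create_time` and `additional_info` (which appear later in the thread's commit history) are guesses, and the actual plugin mapping may differ.

```python
# Sketch of a possible interaction document in a conversation memory
# index, treated as an append-only log keyed by conversation_id.

interaction = {
    "conversation_id": "conv-123",
    "user": "alice",  # owner, useful for access control and personalization
    "create_time": "2023-08-01T12:00:00Z",
    "input": "what is a search pipeline?",
    "response": "A search pipeline chains request and response processors.",
    "additional_info": {"model": "example-llm"},
}

def append_interaction(log, interaction):
    # Append-only: new interactions are added, never updated in place.
    return log + [interaction]

log = append_interaction([], interaction)
```

Because documents are only ever appended, the workload resembles a log store; that is what makes alternative append-friendly backends imaginable, even if an OpenSearch index is the obvious default here.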
Compared with RFC #1151, the common part is a new plugin to store chat history. The differences from RFC #1151 are:
Basically, I don't see major conflicts between these two RFCs from the implementation perspective; we can have both. We can have a new conversation plugin that stores chat history and also provides a chat API for applications. We can additionally have a new ML processor that runs conversation/ml-commons APIs in search pipelines. Users can build conversational search either way.
@jngz-es thanks for sharing this, and excited to get more into the details in the GenAI meeting on Friday. I would encourage the community to have one way to build a conversational application, unless we see a true need for multiple approaches. It'll make the developer experience easier to learn for users interested in building applications. It seems like proposal #1151 could perhaps be split (and renamed) to make the idea more crisp. For items that relate to building conversational search, we can use the comments on #1150 and iterate on that RFC to create the approach. The big, net new question in #1151 seems to be whether OpenSearch should add the ability to create multi-agent architectures (in a similar direction to what LangChain does). I think this warrants a deeper discussion, as I wonder whether OpenSearch should incorporate this versus having customers do it in their application stack, letting OpenSearch focus on a different set of primitives. By repurposing #1151 (and renaming it to channel this theme), I think we'd be able to more crisply outline each area and theme in the RFCs. Thoughts?
@jonfritz I agree we should have one way to build a conversational application, and I believe conversational search is one such application. It looks like #1150 is specific to conversational search; what about other applications, like chatbots? If customers want to build chatbots on OpenSearch, should we provide another framework to support that? I don't think so, as we should have one way to build conversational applications. What do you think?
@jngz-es clarifying question - how do you define a "chatbot", and how is that different from a conversational search interaction? From a customer perspective, I see customers wanting a natural-language way to interact with their data stored in OpenSearch and to leverage the generative aspects of LLMs to enrich and summarize those interactions and better understand the search query submitted (e.g. rewrites). We use the term "conversational search" to describe this, and a customer application could be considered a "chatbot" because it's a conversation with a natural-language application. What use cases for natural-language/chat interactions do you think would make sense for OpenSearch outside of this pattern?
Comparing with #1151, another thing we'd like to have in common: prompt template management. I'll flesh out what I'm imagining in a little more detail than I think either RFC gives.
example template: Am I missing anything here?
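One hypothetical rendering of such a prompt template might look like the following. The slot names (`context`, `chat_history`, `question`) and the `render` helper are illustrative assumptions, not part of either RFC.

```python
# Sketch of a managed prompt template with named slots, filled in at
# query time with retrieved context and conversation history.

TEMPLATE = (
    "Answer the question using only the context below.\n"
    "Context:\n{context}\n"
    "Conversation so far:\n{chat_history}\n"
    "Question: {question}\n"
)

def render(template, **slots):
    # Substitute each named slot; raises KeyError if a slot is missing,
    # which doubles as a basic template-validation check.
    return template.format(**slots)

prompt = render(
    TEMPLATE,
    context="Shoes ship within 2 days.",
    chat_history="user: hi",
    question="How fast is shipping?",
)
```

Template management would then mostly be CRUD over strings like `TEMPLATE`, plus validation that the slots a pipeline supplies match the slots a template declares.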
@jonfritz the use case I imagine is an e-commerce customer using OpenSearch who wants to build a chatbot for their own customers, to improve the experience on their e-commerce platform. It would be easy for OpenSearch users to build a chatbot if we supported conversation-based application building.
@HenryL27 I agree. Whether we support only conversational search or other conversational applications as well, we probably need something similar to LangChain as a framework for building conversational applications, including conversational search. So from the implementation perspective, I don't see major conflicts.
@jngz-es interesting idea. I'm more interested in the specifics of how you see this chatbot being different from a conversational search interaction, though. Can you share a more detailed vision of what an eCommerce chatbot would do (e.g. what questions or commands it would respond to, and with what information)? FWIW - for me, it feels like a general chatbot application platform is outside the scope of how most customers would want to use OpenSearch. An arbitrary chat application (e.g. one that generates poetry) that's decoupled from the core OpenSearch purpose (accessing unstructured data) may be best suited for a different application stack. On the other hand, conversational search is more closely tied to OpenSearch, because it's a different way for customers to interact with their data on the platform (through natural-language search queries). I'd love to learn more about what your customers are asking for (and get into the details of a "chatbot"), and whether they want to build these types of apps in OpenSearch versus other methods - it'll be a good discussion for Friday's meeting.
@jonfritz a chatbot could not only provide conversational search results but also improve the entire shopping experience from different angles. On top of search results, customers could ask questions about product comparisons, coupon combinations, bundle discounts, return/refund policy, etc. -- basically anything covered by the specific knowledge stored in OpenSearch.
… enable Retrieval Augmented Generation (RAG) (opensearch-project#1195)

* Use Search Pipeline processors, Remote Inference and HttpConnector to enable Retrieval Augmented Generation (RAG) (opensearch-project#1150)
  - Address test coverage; fix/update imports due to changes coming from core; update license header.
  - Use List for context fields so we can pull contexts from multiple fields when constructing contexts for LLMs.
  - Address review comments; fix spotless issue; update README.
  - Fix ml-client shadowJar implicit dependency issue; add a wrapper client for ML predict, with tests for the internal ML client.
  Signed-off-by: Austin Lee <[email protected]>

* Conversational Memory for GenAI Apps (#1196)
  - Moved code over; added actions to MLPlugin; fixed io lib issues and copyrights.
  - Fix NullPointerException in .equals; preserve thread context across action calls.
  - Remove MissingResourceException from CreateInteractionRequest in favor of IOException.
  - Move ConversationMeta, Interaction, and Constants to common/conversational.
  - Sequentialize createInteraction to remove a data race; allow disorder when conversations have the same timestamp.
  - Lombokify; add unit and integration tests; fail closed on missing conversation access.
  - Re-add prompt template and metadata fields at the interaction level; parse the request body, not params, for POST requests.
  - Restructure with "memory" as the higher-level term; rename interaction fields timestamp -> create_time and metadata -> additional_info; prepend plugin-ml- to index names.
  Signed-off-by: HenryL27 <[email protected]>

* Feature/conversation memory feature flag (#1271)
  - Add a feature flag and checks to the transport actions, with tests.
  - Rename the conversational-memory directory to memory; fix settings.gradle for the new name.
  - Fix the feature flag with updateConsumer; remove a redundant settings update; clean up feature variable initialization to avoid an unchecked conversion warning.
  Signed-off-by: HenryL27 <[email protected]>

* [Feature] Add Retrieval Augmented Generation search processors (#1275)
  - Put the RAG pipeline behind a feature flag.
  - Add support for chat history in RAG using the Conversational Memory API.
  - Add unit tests for MachineLearningPlugin.
  Signed-off-by: Austin Lee <[email protected]>

* Allow RAG pipeline feature flag to be enabled and disabled dynamically (#1293)
  - Add negative test cases for the RAG feature flag being turned off.
  - Improve error checking.
  Signed-off-by: Austin Lee <[email protected]>

Co-authored-by: Austin Lee <[email protected]>
Signed-off-by: HenryL27 <[email protected]>
@austintlee Should this issue be moved to 2.11? |
Let me just quickly highlight what is being released in 2.10.
So, most of what we described above in the RFC should be coming out in 2.10 as an experimental feature. It is being made available via the ml-commons plugin, so it should be fairly easy for people to try out. We will have a tutorial to go with this release on how to use this feature. Our work is not done: we want to make sure this feature goes GA by 2.11, and we have some improvements in mind. We are excited to make this available in 2.10 and are looking forward to feedback and suggestions. There are a lot of interesting things people are doing in the RAG space and we would love to work with the community to bring these ideas to OpenSearch! |
@austintlee, can you clarify the purpose of the "ext" query block in the example that you provided in the RFC? It's not clear why it repeats the query context. Reference: GET wiki-simple-paras/_search?search_pipeline=convo_qa_pipeline |
Oftentimes, you may want to customize your OpenSearch query (e.g. hybrid search) and feed the results as additional context to an LLM, so the current interface allows applications to construct the OpenSearch query and the LLM question as two separate inputs. In trying to keep the example simple, I may have made it a bit confusing since it repeats the same question twice. But let's say you want to ask a follow-up question: "when did he die?" In this case, you won't want to pass that question as-is to OpenSearch, as it won't know what you mean by "he". But the LLM will figure it out based on the chat history. Using the 2.10 Release Candidate, I made some sample queries to demonstrate the point: Query 1 (BM25 + KNN)
Query 2 (Term only)
We can introduce question rewriting (when did he die -> when did Abraham Lincoln die), but this may require some new work in SearchQueryBuilder, maybe an extension similar to what neural search and hybrid search did (e.g. ConversationalSearchQueryBuilder). |
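For illustration, a follow-up turn could send a retrieval query and the conversational question as two separate inputs. This is a hedged sketch: the index name, conversation ID, and the `generative_qa_parameters` field names are assumptions based on the 2.10 experimental API and may differ across versions.

```json
GET /wiki-simple-paras/_search?search_pipeline=convo_qa_pipeline
{
  "query": {
    "match": { "text": "Abraham Lincoln" }
  },
  "ext": {
    "generative_qa_parameters": {
      "llm_model": "gpt-3.5-turbo",
      "llm_question": "when did he die?",
      "conversation_id": "_ikaUooBXWBopfpa_K0o"
    }
  }
}
```

The `match` query keeps retrieval anchored on Lincoln, while the LLM resolves "he" from the chat history identified by `conversation_id`.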
@austintlee, so the query clause is the retriever part of the RAG workflow, correct? So, when a neural search query is being used with this pipeline, the initial query will probably be redundant. Is the idea that subsequent questions like "when did he die" will be passed via "llm_question" while the neural search query keeps the original query context like "what was Abraham Lincoln's life like?" What controls do I have around history context? I see examples where you can provide a conversation (session) id. Can I dynamically specify the context history, like "last=N" exchanges? Also, have you thought about extending the neural search interface so that we can avoid repeated questions in the query syntax? |
Yes, we want to tackle this in the next iteration. This will simplify the experience. I think the confusion here comes from the fact that you have to enter each question twice when it doesn't have to be that way. As I stated above, I am considering a new search query type that gives the user the flexibility to ask one question, or one question plus an OpenSearch query (I gave two examples of this above). |
Hi @austintlee, since conversational memory will grow over time with the number of conversational searches customers make, do we have any idea about conversational memory scalability? |
@ylwu-amzn how's the appsec review going? Are we gonna hit GA for 2.12? thx |
Folks, we know that the conversational memory feature is experimental because of internal AWS processes needed to test the integrity of the feature. Can we get a status on how your review and testing process is going? I believe these features were scheduled to be GA in 2.12. Since 2.12 is now delayed until 20 Feb, I am assuming that we have almost cleared the hurdle. |
@mashah yes, we are on track to GA the feature in 2.12. The pentest is scheduled to start 23 Jan and finish by 31 Jan. We still have a couple of weeks to fix any security issues caught in the test. |
Yes, hope we don't have many issues from the pentest. If they find any issues, we will share them with your team. |
Is the pentest on track for next week on 23 Jan for the conversational memory features? |
yes we are on track |
Can this be closed? |
It's GA in 2.12. Closing. |
Introduction
The recent advances in Large Language Models (LLMs) have enabled developers to utilize natural language in their applications with greater quality and capability. As ChatGPT has shown, these LLMs are strong enablers of use cases involving summarization and conversation. However, when prompting LLMs to answer fact-based questions (applications we call "conversational search"), we find significant shortcomings for enterprise-grade applications.
First, the major LLMs are trained only on datasets exposed to the internet, and therefore do not have the context to answer questions about private data. Most enterprise data falls into this category. Second, the way in which LLMs answer questions based on their training data gives rise to "hallucinations" and false answers, which are not acceptable in mission-critical use cases.
End-users love the ability to converse with an application in colloquial language to get answers to questions or find interesting search results, but they require up-to-date information and accuracy. One solution to this problem is Retrieval Augmented Generation (RAG), where an application sends an LLM a superset of correct information in response to a prompt, and the LLM summarizes and extracts information from this set (instead of probabilistically determining an answer).
We believe OpenSearch could be a great platform for building conversational search applications, and aligns well with the RAG approach. It already offers semantic search capabilities using its vector database and k-NN plug-in, alongside enterprise-grade security and scalability. This is a great building block for the “source of truth” information retrieval component of RAG. However, it currently lacks the primitives and crisp APIs to easily enable the conversational element.
Although there are libraries that allow for building this functionality at the application layer (e.g. LangChain), we believe the best developer experience would be to enable this directly in OpenSearch. We consider the “G” in a RAG pipeline as LLM-based post-processing to enable direct question answering, summarization, and a conversational experience on top of OpenSearch semantic search. This enables end-users to interact with their data in OpenSearch in new ways. Furthermore, we believe developers may want to use different LLMs, and that the choice of model should be pluggable.
Using plugins and search pipelines, we propose an architecture in this RFC that exposes easily consumable APIs for conversational search, history, and storage. We segment it into a few components: 1/ search query rewriting using generative AI and conversational context, 2/ question answering and summarization of OpenSearch semantic search results using generative AI, and 3/ a concept of "conversational memory" to easily store the state of conversations and append additional interactions. Conversational memory will also support conversational applications that have multiple agents operating together, giving a single source of truth for conversation state.
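To make the pipeline component concrete, here is a hedged sketch of a RAG search pipeline definition. The processor and parameter names reflect the experimental implementation that later shipped in 2.10 and are subject to change; the model ID is a placeholder for a remote LLM registered via ML-Commons remote inference.

```json
PUT /_search/pipeline/convo_qa_pipeline
{
  "response_processors": [
    {
      "retrieval_augmented_generation": {
        "tag": "conversation_demo",
        "description": "LLM post-processing of semantic search results",
        "model_id": "<remote-llm-model-id>",
        "context_field_list": ["text"],
        "system_prompt": "You are a helpful assistant.",
        "user_instructions": "Answer the question using only the search results provided."
      }
    }
  ]
}
```

`context_field_list` names the document fields whose contents are concatenated into the context passed to the LLM.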
Goals
1/ Developers can easily build conversational search applications (e.g. knowledge-base search, informational chatbot, etc.) using OpenSearch and their choice of generative AI model using well-defined REST APIs. Some of these applications will be an ongoing conversation, while others will be one-shot (and the history of interactions is not important).
2/ Developers can use OpenSearch to support multi-agent conversational architectures, which require a single "source of truth" for conversational history. Multi-agent architectures will include agents other than the OpenSearch semantic search agent (e.g. an agent that queries the public internet). These developers need an easy API to manage conversational history, both for adding interactions to conversations and for exploring the history of those conversations.
3/ Developers can easily obtain OpenSearch (semantic) search results alongside the generative AI question answering, so they can show the source documents and enable the end user to explore the source material.
Non-Goals
1/ Building a general LLM application toolkit in OpenSearch. Our goal is just to enable conversational search and the related dependency of conversational memory.
2/ LLM hosting. LLMs take significant resources and should be operated outside of an OpenSearch cluster. We also hope to use the ML-Commons remote inference feature rather than implement our own connectors.
3/ A conversational search application platform. Our goal is to expose crisp APIs to make building applications that use conversational search easy, but not create the end application itself.
Proposed Architecture
Conversational Memory API (Chat History)
Conversational memory is the storage for conversations, which are an ordered list of interactions. Conversational memory makes it easy to add new interactions to a conversation or explore previous interactions. For example, you would need conversational memory to write a chatbot, since it takes the previous interactions in a conversation as part of the context for generating a future response. At a high level, this mostly resembles a generic read/write store, and we will use an OpenSearch index for it. However, the interesting nuance is in the data itself, which we will describe next.
A conversation is represented as a list of interactions, ordered chronologically. Each conversation will also include some metadata, like the start time and the number of interactions.
The basic elements of an interaction are an input and a response, representing the human input to an AI agent and that agent’s response. We’ll also include any additional prompting used in the interaction, the agent involved, and possibly arbitrary metadata the agent may want to attach. For example, a conversational search agent may include the actual search results as metadata for a user search query (which is an interaction).
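As an illustration, the two record types might look like the following. These field names are illustrative sketches of the data model described above, not the final schema:

```python
# Hypothetical shapes for the two conversational-memory record types.
# All field names are illustrative; the final schema may differ.
conversation_metadata = {
    "conversationId": "conv-1",
    "createTime": "2023-07-01T12:00:00Z",
    "lastInteractionTime": "2023-07-01T12:05:00Z",
    "numInteractions": 2,
    "user": "alice",  # owning user, used for access control
}

interaction = {
    "interactionId": "int-2",
    "conversationId": "conv-1",
    "timestamp": "2023-07-01T12:05:00Z",
    "input": "When was he born?",  # human input to the agent
    "response": "Abraham Lincoln was born on February 12, 1809.",
    "prompt": "Answer using only the provided search results.",
    "agent": "conversational-search",
    # arbitrary agent-specific metadata, e.g. the hits behind an answer
    "metadata": {"searchResults": ["doc-17", "doc-42"]},
}
```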
Each `ConversationMetadata` and `Interaction` will have access controls linked to the specific user that creates them. Only Alice can add to and read from conversations that Alice owns. The main rationale for this is that Alice’s conversation will potentially include information from all documents Alice has access to, so her conversations’ access controls are maximally the intersection of Alice’s access rights. We plan to leverage OpenSearch’s existing access control mechanisms for this. The plan is to maintain two indices: one for `ConversationMetadata` and one for `Interaction`.
API
The operations for conversational memory are similar to the usual CRUD operations for a datastore. `CreateInteraction` will update the appropriate `ConversationMetadata` to have a correct `lastInteractionTime` and `numInteractions`.
We do not propose an update API for conversation metadata; we treat it as immutable. We believe users would prefer to create a new conversation rather than update parameters on an existing one.
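A minimal in-memory sketch of this bookkeeping, including the owner-only access check and the metadata update performed on each new interaction (class and method names here are assumptions, not the plugin's actual API):

```python
import time

class ConversationalMemory:
    """Toy model of the conversational-memory semantics described above:
    CreateInteraction updates lastInteractionTime and numInteractions on the
    owning conversation's metadata; metadata is otherwise treated as immutable."""

    def __init__(self):
        self.metadata = {}      # conversationId -> metadata dict
        self.interactions = {}  # conversationId -> ordered list of interactions

    def create_conversation(self, conversation_id, user):
        self.metadata[conversation_id] = {
            "user": user,
            "createTime": time.time(),
            "lastInteractionTime": None,
            "numInteractions": 0,
        }
        self.interactions[conversation_id] = []

    def create_interaction(self, conversation_id, user, input_text, response):
        meta = self.metadata[conversation_id]
        if meta["user"] != user:  # only the owning user may add
            raise PermissionError("not the conversation owner")
        self.interactions[conversation_id].append(
            {"input": input_text, "response": response}
        )
        meta["lastInteractionTime"] = time.time()
        meta["numInteractions"] += 1

    def get_interactions(self, conversation_id, user):
        if self.metadata[conversation_id]["user"] != user:
            raise PermissionError("not the conversation owner")
        return list(self.interactions[conversation_id])
```

In the actual plugin these records would live in the two OpenSearch indices, with the same invariants enforced by the API layer.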
Search Pipeline extension
The conversational search path essentially consists of an OpenSearch query with some pre- and post-processing. Search Pipelines, generally available as of OpenSearch 2.9, are a tool for pre- and post-processing in the query path, so we have chosen that mechanism to implement conversational search.
We have chosen to implement the question answering component of RAG in the form of query result rewrites. We are introducing a new response processor that sends the top search results, and optionally some previous conversation history, to the LLM to generate a response in the conversation. We are also introducing a new response processor that iterates over search hits and interacts with an LLM to produce an answer and score for each result. Finally, we are introducing a request processor to rephrase the user’s query, taking into account the conversation history. We will rely on the remote inference feature proposed in #882 for answer generation.
Based on different patterns we have seen with applications, we designed this API to support “one-off” and “multi-shot” conversations. Users can have “one-off” question answering interactions, where the prior context is not included, via a search pipeline that uses this new question answering processor. Users can also have “multi-shot” conversations where interactions are stored in conversational memory and are used as additional context that is sent to the model along with each search query. Users will need to use the Conversational Search plugin to create a conversation and pass the conversationId to the search pipeline in order to retain all the interactions associated with it.
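The two modes could be distinguished by the presence of a conversation ID in the search request's pipeline parameters. The request bodies below are a sketch; the `generative_qa_parameters` block and its field names are assumptions, not a finalized contract:

```python
# Hypothetical request bodies for a search pipeline configured with the
# RAG processors. Parameter names under "ext" are illustrative.
one_off_request = {
    "query": {"match": {"text": "Who was Abraham Lincoln?"}},
    "ext": {
        "generative_qa_parameters": {
            "llm_question": "Who was Abraham Lincoln?",
            # no conversation_id: prior context is neither used nor stored
        }
    },
}

multi_shot_request = {
    "query": {"match": {"text": "When was Abraham Lincoln born?"}},
    "ext": {
        "generative_qa_parameters": {
            "llm_question": "When was he born?",
            "conversation_id": "conv-1",  # interactions are stored and reused
            "prompt": "Answer concisely using the search results.",
        }
    },
}
```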
In addition to the conversation ID, users can also pass a “prompt” parameter for any prompt engineering alongside their search query.
The search pipeline includes pre- and post-processing steps. The pre-processing step uses generative AI to rewrite the search query submitted by the user, taking into account the conversation history if a conversation was specified. This allows things like antecedent replacement (“When was he born?” → “When was Abraham Lincoln born?”, if the prior question was “Who was Abraham Lincoln?”).
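One way the rewrite step could assemble its LLM prompt from the conversation history is sketched below; the actual prompt template would be an implementation detail:

```python
def build_rewrite_prompt(history, query):
    """Build an LLM prompt asking for a standalone rewrite of `query`,
    including prior interactions so pronouns can be resolved to their
    antecedents (e.g. "he" -> "Abraham Lincoln")."""
    lines = [
        "Rewrite the final question so it can be answered "
        "without the conversation history.",
        "",
    ]
    for turn in history:
        lines.append(f"User: {turn['input']}")
        lines.append(f"Assistant: {turn['response']}")
    lines.append(f"User: {query}")
    lines.append("Standalone question:")
    return "\n".join(lines)

history = [{"input": "Who was Abraham Lincoln?",
            "response": "The 16th president of the United States."}]
prompt = build_rewrite_prompt(history, "When was he born?")
```

The resulting prompt, together with the remote inference call, would yield a rewritten query like “When was Abraham Lincoln born?” for the retrieval step.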
The post-processing step is a processor that takes the search results, optionally performs a lookup against the conversational memory, and then sends this data to the LLM configured by the user. We believe different users will want to use different LLMs, so this will be pluggable.
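The pluggability could be modeled as a narrow interface that the response processor depends on, with one concrete connector per model service. This is a sketch under that assumption; the names below are not the plugin's actual classes:

```python
from abc import ABC, abstractmethod

class LLMConnector(ABC):
    """Narrow interface the response processor calls; each remote model
    (e.g. via ml-commons remote inference) would provide an implementation."""
    @abstractmethod
    def generate(self, prompt: str) -> str: ...

class EchoConnector(LLMConnector):
    """Stand-in connector for testing: returns a canned completion."""
    def generate(self, prompt):
        return "stub answer for: " + prompt.splitlines()[-1]

def answer(connector, question, hits, history=()):
    """Post-processing step: fold optional conversation history and the
    top search hits into one prompt, then call the configured LLM."""
    context = "\n".join(f"Q: {t['input']} A: {t['response']}" for t in history)
    passages = "\n".join(h["text"] for h in hits[:3])
    prompt = f"{context}\n{passages}\nQuestion: {question}"
    return connector.generate(prompt)
```

Swapping LLMs then means swapping the connector, with no change to the processor logic.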
Conversation API
The point of this API is to provide conversational search as a relatively simple endpoint, hooking the pieces together so that a user can easily build an application with it. It takes a search query (or some other kind of human input), performs a search against OpenSearch, feeds those search results into an LLM, and returns the answer. All of this work is done in the underlying search pipeline, so the API is just a wrapper, but we feel it will be helpful to developers who simply want an easy REST endpoint.
We would like to return search results as well as the LLM response. This differs from most existing systems that return only answers, and it allows clients to perform validations or additional downstream processing.
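A response from the Conversation API might then carry both pieces side by side. The shape below is illustrative only:

```python
# Illustrative response body: search hits returned alongside the generated
# answer, so clients can show sources or run their own validation.
conversation_response = {
    "answer": "Abraham Lincoln was born on February 12, 1809.",
    "interactionId": "int-3",  # the stored interaction, if a conversation was given
    "searchResults": [
        {"_id": "doc-17", "_score": 1.42, "text": "Lincoln was born ..."},
        {"_id": "doc-42", "_score": 1.10, "text": "The 16th president ..."},
    ],
}
```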
Discussion
Summary
In this RFC we proposed bringing conversational search to OpenSearch. Our proposal consists of three components: 1/ an API for conversational memory stored in OpenSearch, 2/ an OpenSearch search pipeline for Retrieval-Augmented Generation (RAG), and 3/ a Conversation API that wraps these pieces in a single, simple endpoint for conversational search applications. We would appreciate any feedback, suggestions, and comments on integrating this cleanly with the rest of the OpenSearch ecosystem and making it the best it can be.
Thanks!
Requested Feedback