"managed" Ragna #256
Comments
Shower thought: what if we create a new ... (see Line 150 in a437e7e)
Meaning, if a user supplies an already prepared corpus, they can just start chatting. On the ... Unless I'm missing something big, this could actually work and potentially fix all the issues that we saw above regarding the Python / REST API. If only my day had more than 24 hours ...
TL;DR I'm happy to continue using the issue tracker for contributing. I'm also happy to accept that not all issues will lead to a new feature being merged into the codebase. Perhaps we can see what a 'managed' Ragna would look like via a bunch of exploratory PRs, and keep the project momentum going?
I would be keen to keep the momentum going that the ...
I'm happy to contribute here! Perhaps the ...
When I proposed #246, I wanted to stay away from a first-class ... Writing in a sort of constructor-type pseudocode, and starting with the original atomic unit of Ragna, the ...
Here, the ...
I've made a start on this over here. My first attempt is to just extend the ... I'll come back to working on this later in the week. Feedback very welcome!
Most of my work has been hacking around the edges of Ragna rather than hacking on Ragna's code base directly. In doing so, I've written two scripts (both of which I'm happy to share with the community if there is interest):
--- AND ---
After writing these scripts, I realized that they're all I need for "managed" Ragna. Could we promote these functions into the UI? That is, what if we put a delete button and a clone button under each listed chat? Clicking the clone button would pop up the chat creation dialog, which would be prepopulated with all the information from the chat being cloned. A user could then modify the information to create a new chat based off of the old chat. Does this seem like a more doable plan for getting to "managed" Ragna than adding a ...? [NOTE: I will stipulate that the ...]
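A clone workflow along these lines could be scripted against the REST API. The sketch below is purely illustrative: the endpoint paths, payload fields, and response layout are assumptions, not the documented Ragna API.

```python
import httpx

API_URL = "http://127.0.0.1:31476"  # assumed deployment URL
TOKEN = "..."                       # assumed to come from the auth/token endpoint


def clone_chat(src_id: str, name: str, assistant: str, source_storage: str) -> dict:
    headers = {"Authorization": f"Bearer {TOKEN}"}
    with httpx.Client(base_url=API_URL, headers=headers) as client:
        # Read the existing chat and reuse its document references (assumed layout).
        src = client.get(f"/chats/{src_id}").json()
        payload = {
            "name": name,
            "documents": src["metadata"]["documents"],
            "source_storage": source_storage,
            "assistant": assistant,
            "params": src["metadata"].get("params", {}),
        }
        new = client.post("/chats", json=payload).json()
        # The new chat still goes through the prepare step before answering.
        client.post(f"/chats/{new['id']}/prepare")
        return new
```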
Re delete: This was originally discussed in #62. We ended up merging #67, which added a delete endpoint to the API; this is likely what you are using in your script (ragna/deploy/_api/core.py, lines 280 to 283 in 2b32fdd).
We always had the plan to add the delete button to the UI as well, but never got around to actually implementing it. Let me open an issue for this to track it. Edit: See #304. Re clone: I agree with you that cloning a chat is the inferior solution to using a ... In the meantime, I think the scripts you wrote can still be beneficial for the community. Do you want to start a "Show and tell" discussion for them?
Posting to "Show and tell" might be a little unwieldy as I have four scripts and one common code file. I could do a pull request to add them to https://github.com/Quansight/ragna/tree/main/scripts. I do think they would be pretty useful and I'm happy to contribute them. Let me know what you think.
Sure, a PR works as well. I never meant for you to post all the code in the discussion itself, but rather create a discussion and link the code from there. Like you said, these scripts can be useful for the community and I don't want them "buried" in this thread here.
I've spent a good deal of time thinking about this over the past few weeks and I think I found a proper solution that I'll propose here. This is going to be a long post, but there is no way around it. If the proposal is accepted, we can break it up into smaller issues for tracking. I've ditched the corpus abstraction and thus my idea from #256 (comment) in favor of a simpler but just as powerful workflow. The overall idea goes as follows:
Let's dive deep into the proposal now.
Python API
To be able to store additional document metadata, e.g. tags (#246), inside the source storage, we first need a way to put it on the ... (lines 27 to 37 in 1758816).
The constructor is usually not called directly, but rather through ... (lines 93 to 94 in 1758816).
Meaning, passing additional metadata is as easy as LocalDocument.from_path("/path/to/my/document.txt", metadata={"tag": "important"}). To store them alongside the embedding inside the ... (ragna/source_storages/_chroma.py, lines 59 to 65 in 1758816).
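For a rough picture of what storing metadata "alongside the embedding" means in Chroma terms, here is a self-contained sketch using the chromadb client directly. It is not the actual Ragna code path, just an illustration of the native metadata support being referred to:

```python
import chromadb

client = chromadb.Client()  # in-memory client, for illustration only
collection = client.get_or_create_collection("corpus")

# Store a chunk together with the metadata supplied at document creation,
# e.g. {"tag": "important"} from the from_path(...) call above.
collection.add(
    ids=["document-1-chunk-0"],
    documents=["The actual chunk text ..."],
    metadatas=[{"document_id": "document-1", "tag": "important"}],
)

# Retrieval can then filter on that metadata natively.
hits = collection.query(
    query_texts=["What is marked as important?"],
    n_results=5,
    where={"tag": "important"},
)
print(hits["documents"])
```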
With that in place we can use the native functionality to filter on this metadata, e.g. ... However, while all components have similar support for metadata filtering, they all use a different "dialect". Meaning we need to provide a layer of abstraction on top of it. This should be a relatively straightforward tree-like object that supports a few basic operators like "and", "or", and "equals". We can always add more later. Instead of ... With this we have the functionality down, but there are still two open questions:
The currently relevant part of the interface consists of these parts:
Both parts cannot stay like this if we want to support "managed" Ragna. To overcome this, I suggest replacing the ...
REST API
The REST API closely follows the Python API and thus we don't need to change a lot here:
Web UI
This section relies on #313. TL;DR: the "new chat" modal should be cleanly split into two "phases", namely "preparation" and "interrogation" (happy for other naming suggestions). For the remainder, I'll assume that we have a good UI design and will only deal with the background functionality here. Right now the functionality of these two phases is hardcoded:
Instead of hardcoding this behavior, which clashes with the ideas presented above, we could create two new abstract objects, e.g. ...
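A minimal sketch of what such abstract objects could look like; the names PreparationPhase and InterrogationPhase are placeholders I made up, not part of the proposal:

```python
import abc
import uuid


class PreparationPhase(abc.ABC):
    """Hook the web UI runs when a new chat is set up (placeholder name)."""

    @abc.abstractmethod
    def run(self, chat_id: uuid.UUID, documents: list[str]) -> None:
        ...


class InterrogationPhase(abc.ABC):
    """Hook the web UI runs for each prompt of an existing chat (placeholder name)."""

    @abc.abstractmethod
    def run(self, chat_id: uuid.UUID, prompt: str) -> str:
        ...


class DefaultPreparation(PreparationPhase):
    """Default implementation retaining today's behavior: upload, then prepare."""

    def run(self, chat_id: uuid.UUID, documents: list[str]) -> None:
        # Upload the documents and trigger the prepare step, as the UI does now.
        ...
```

Admins could then swap in their own implementations, for example a preparation phase that only lets users pick from already prepared corpora.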
With this we can retain our current behavior by providing default implementations for the two objects. At the same time this leaves admins with the ability to specify the workflow they want for their users.
Feedback
Although I carefully considered all known use cases right now, I might have missed something or simply made a mistake. If you have a use case that will break the design above or you notice an inconsistency in it, feel free to speak up. Implementing the proposal will be a major effort. I would hate to only find out about some blockers when we are already deep in the weeds.
Thanks for thinking so thoroughly about the problem and posting such a detailed response. I'm going to have to read it a few times to fully understand everything and make sure what you suggest aligns with our codebase. We've actually written a fair bit of code against the existing interfaces, so I'll be honest that I'm a little nervous about the proposed changes. I hope you will hold off on implementation until we can provide you with some feedback. Cheers!
I'd love to see or hear about all workarounds you have in place to see if this is something Ragna would benefit from in general. If you cannot post this publicly, I'm available at ... The only breaking change in my proposal is how the source storage stores and retrieves data. Right now, the builtin ...
I don't know how you have set up your ... What we need for this proposal: ...
There is no rush yet. We have a deadline for early April, but we are not going to start on this before the ...
@pmeier Thanks for the write-up. 😍 I have some questions on a couple of the topics:
Does this mean that it's not possible to create multiple tables for the same embedding model at all? If I have multiple pre-existing tables that I would like to chat over in separate conversations, is this no longer possible? I think this is quite restrictive if so (but perhaps I have misunderstood).
This sounds like a lot of work (including continuous maintenance). It also seems possible that we will not be able to support the full functionality of every ...
I'm getting into the weeds a bit here (and perhaps I have not understood correctly), but I don't find this behaviour intuitive. Isn't it also possible that I would like to pass a metadata filter object and 'prepare' the data at the same time? I know you have mentioned something about this in a later paragraph, but I don't think it covers all possibilities - what if I add some documents, and want to use a filter over a larger superset of documents already existing in a table?
Yes, but only for the builtin source storages. For custom ones, users can do whatever they want. If they have a use case for multiple tables, their custom ...
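To make the point about custom source storages concrete, here is an illustrative sketch (not Ragna code) of keeping one Chroma collection per corpus instead of one per chat, so multiple pre-existing tables remain addressable:

```python
import chromadb

client = chromadb.PersistentClient(path="./corpora")


def store(corpus_name: str, chunks: list[str], metadatas: list[dict]) -> None:
    # One collection ("table") per corpus rather than per chat.
    collection = client.get_or_create_collection(corpus_name)
    collection.add(
        ids=[f"{corpus_name}-{i}" for i in range(len(chunks))],
        documents=chunks,
        metadatas=metadatas,
    )


def retrieve(corpus_name: str, prompt: str, n_results: int = 5) -> list[str]:
    # Each chat can point at whichever corpus (collection) it wants.
    collection = client.get_or_create_collection(corpus_name)
    return collection.query(query_texts=[prompt], n_results=n_results)["documents"][0]
```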
This is currently also not possible for the builtin source storages and thus no regression.
I'm happy to be enlightened here, but I don't think this is easy or doable. How would you define a "search string" for the following two scenarios:
Here is an example for the first scenario for a ...:

```python
ids = [...]
filter = MetadataFilter.or_([MetadataFilter.eq("id", id) for id in ids])
print(filter)
```
And here is the filter translated into the Chroma dialect:
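For reference, a filter like the one above expressed in Chroma's where syntax (which supports $or and $eq operators) would look roughly like this, with placeholder IDs:

```python
where = {
    "$or": [
        {"id": {"$eq": "11111111-1111-1111-1111-111111111111"}},  # placeholder ID
        {"id": {"$eq": "22222222-2222-2222-2222-222222222222"}},  # placeholder ID
    ]
}
```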
Yes, certainly. We need to decide at some point if we are ok with adding more advanced operators to the metadata filter that only some source storages support. I would start with a minimal set as proposed above and only expand if a use case arises. However, this is exactly the same for a "search string". Unless I misunderstand what you are saying, a "search string" would have exactly the same features and limitations as a ...
Hmm, I'm a little on the fence for this. We could potentially also support your use case by allowing both a ... The reason I'm not sure is that this is mixing two use cases. Adding documents to an existing index is an admin task, while filtering / chatting is a user task. I'll think about this more.
Here is my sample implementation of a ...
Why can't we just pass a string that contains the query according to the ...? (Sorry, not grokking this...)
Two reasons: ...
Summarizing an offline discussion I had with @nenb: His concerns regarding the ... I agree that these are valid concerns. However, not providing an abstraction here would mean that the chat gets less "portable", i.e. we now have dependent parameters where they are currently independent. Plus, and this is the strongest counter argument for me, we would push learning the correct filter dialect for the source storage to the user. That is probably ok for custom source storages, but quite bad UX for the builtin ones. Unless the mismatch cannot be justified, I value UX higher than DevX. For me, maintaining a fairly standard tree implementation plus a few translators is preferable over telling users to do it.
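To make "a fairly standard tree implementation plus a few translators" concrete, here is a minimal sketch; the class and method names mirror the examples above but are otherwise assumptions, not the final API:

```python
from __future__ import annotations

import dataclasses
from typing import Any


@dataclasses.dataclass
class MetadataFilter:
    """Tiny tree: leaves compare a key to a value, inner nodes combine children."""

    operator: str  # "and", "or", or "eq"
    children: list[MetadataFilter] = dataclasses.field(default_factory=list)
    key: str | None = None
    value: Any = None

    @classmethod
    def eq(cls, key: str, value: Any) -> MetadataFilter:
        return cls("eq", key=key, value=value)

    @classmethod
    def and_(cls, children: list[MetadataFilter]) -> MetadataFilter:
        return cls("and", children=children)

    @classmethod
    def or_(cls, children: list[MetadataFilter]) -> MetadataFilter:
        return cls("or", children=children)


def to_chroma(filter: MetadataFilter) -> dict:
    """Translator for one dialect; other backends would get their own."""
    if filter.operator == "eq":
        return {filter.key: {"$eq": filter.value}}
    combinator = {"and": "$and", "or": "$or"}[filter.operator]
    return {combinator: [to_chroma(child) for child in filter.children]}
```

Each builtin source storage would ship one such translator, so users only ever write MetadataFilter expressions while the dialects stay an internal concern.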
I've read this proposal over and reflected on it. I guess my concern is that I don't know if every ... Beyond that, I doubt I'll have the time or inclination to update old ... I certainly don't want to stand in the way of the project advancing, and I do think that "managed" Ragna is a great feature. My instinct is to say that the proposed solution is a little too "fragile" for my tastes. Another thing to think about is the ramifications of commingling documents in the same underlying index. Right now, I can pretty easily find and delete a large index backing a deleted chat. It seems like in this model the single collection / table / index grows unbounded. Is that correct? Anyway, I'm happy to discuss more either online or offline.
That is a valid concern. This coincides with another use case I haven't touched on above: what if I don't want to do any filtering, but rather want to use the full index? Let's enumerate the four possible cases on how to create a new chat assuming we can pass both ...
I've used ... To come back to the use case of the ... Of course the implementation of the ...
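For orientation, the four combinations that the enumeration above refers to collapse into a dispatch roughly like this; the documents/metadata_filter argument names are assumptions taken from this discussion:

```python
def resolve_retrieval_scope(documents, metadata_filter):
    """Sketch of the four chat-creation cases when both arguments are allowed."""
    if documents is None and metadata_filter is None:
        # Neither given: chat over the full, already prepared index.
        return "full-index"
    if documents is not None and metadata_filter is None:
        # Documents only: today's behavior, prepare them and chat over them.
        return "documents-only"
    if documents is None and metadata_filter is not None:
        # Filter only: the "managed" case, chat over a prepared subset.
        return "filter-only"
    # Both given: add the documents and chat over the combined selection;
    # whether to allow this mixed case is still an open question above.
    return "documents-and-filter"
```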
I feel you, but unfortunately this is the harsh reality of beta products ... The only thing that breaks existing ... (ragna/source_storages/_chroma.py, line 126 in c1c159d; ragna/source_storages/_lancedb.py, line 111 in c1c159d). For our builtin ... (ragna/core/_components.py, line 93 in c1c159d), we probably need to refactor this to only store the document ID and metadata there rather than the actual ... So indeed this is a breaking change. The only upside is that we can achieve the same behavior as before with a few changes. Whether or not this is worth it in your case is not for me to judge.
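A rough sketch of the refactor described above; the field names are assumptions, only the shift from holding the full Document to holding its ID and metadata comes from the comment itself:

```python
import dataclasses
import uuid
from typing import Any


@dataclasses.dataclass
class Source:
    """Sketch: reference the originating document instead of embedding it."""

    id: str
    document_id: uuid.UUID              # instead of the full Document object
    document_metadata: dict[str, Any]   # e.g. name, tags
    location: str
    content: str
    num_tokens: int
```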
Could you elaborate on what is "too fragile" here? Do you mean anything but the fact that some ...? Does the point that @nenb raised in #256 (comment) that our custom ...
Yes, that is correct. My only argument here is that "managed" Ragna basically creates two roles: admin and user. Users would lose the ability to store new documents and thus the index can only grow by admin action. Also note that for your custom implementations, you don't need to follow the paradigm of one index only. You can happily create indices based on the ...
I would love to. I think getting some insights into how you have set up your ...
I'm going to keep commenting in public, but if I feel that I need to provide more depth, I'll contact you, @pmeier, and we can talk offline. First, let me address this part of your reply above:
I think, technically, I mean that the proposed solution has a "code smell" known as Control Coupling. If we're fully aware of this coupling and its ramifications, and it is still the best option, I suppose I can't object to its implementation as I'm not writing the code. However, my gut instinct is that this kind of coupling is going to introduce unforeseen maintenance and modification costs. Second, let me expound upon this part of your reply:
I've implemented "managed" Ragna through my scripts and specialty classes that kind of act as soft links (see example below). More specifically, say I have a corpus of ... With this ...:

```python
import uuid

from ragna.core import Document, Source
from ragna.source_storages import Vectara  # or wherever your Vectara source storage lives


class WikipediaVectara(Vectara):
    # Acts as a "soft link": store() is a no-op and retrieve() always reads
    # from the corpus that was prepared under a fixed, pre-existing chat ID.

    def store(
        self,
        documents: list[Document],
        *,
        chat_id: uuid.UUID,
        chunk_size: int = 500,
        chunk_overlap: int = 250,
    ) -> None:
        # Nothing to do: the corpus behind the fixed chat ID is already prepared.
        pass

    def retrieve(
        self,
        documents: list[Document],
        prompt: str,
        *,
        chat_id: uuid.UUID,
        chunk_size: int = 500,
        num_tokens: int = 4000,
    ) -> list[Source]:
        # Ignore the current chat's ID and redirect the query to the prepared corpus.
        return super().retrieve(
            documents,
            prompt,
            chat_id=uuid.UUID("f5eeb599-46f8-4e9e-8753-3a1d2b80b26f"),
            chunk_size=chunk_size,
            num_tokens=num_tokens,
        )
```

Now, I can use my ./clone.py script:

```shell
./clone.py \
    --src-id "f5eeb599-46f8-4e9e-8753-3a1d2b80b26f" \
    --dest-name "New Chat" \
    --dest-assistant "Bedrock/anthropic.claude-v2" \
    --dest-source-storage "WikipediaVectara"
```

Maybe there is some way we can operationalize my evolved solution as I know it works and it is non-breaking with respect to existing interfaces.
It's been roughly two months since the initial release of Ragna. While the overall feedback is positive, we had feature requests (I'll link stuff below) for a "managed" version of Ragna. By "managed" I don't mean managed by the Ragna team as a paid service, but rather managed by an admin team inside an organization for regular users inside the same organization. If someone has a better name for this, please come forward.
Let's have a look at how Ragna currently works and why this is the case, what exactly is requested, and what ultimately needs to change to make these requests a reality.
Status quo
Ragna's original goal was to make the research case, i.e. evaluating all different RAG components, easy. We made the following assumptions:
With that we made the chat the "atomic unit" of Ragna. Each chat can be about different documents and all components can be configured freely.
What is missing?
As explained above, the use case that we built Ragna for requires a large degree of agency from the user. In an actual production setting, the assumptions we made above might not hold:
Thus, Ragna currently falls flat for this use case or at least makes it difficult. This is evident from the bunch of feature requests and comments we received. I'll try to link everything here. If I missed something, feel free to follow up on this comment.
Basically, this uncouples the Chat.prepare and Chat.answer methods. The former will be executed by the administrator and the latter by the users.
Cc @peachkeel, @nenb, @NetDevAutomate who contributed the requests linked above.
What needs to change?
There are plenty of changes needed. I'll put my thoughts below without any claim that the list is exhaustive. There might be other changes needed that we only find when we start proper planning. The list below is meant as a starting point.
Python / REST API
- While the Assistant class can stay as it is, we need to make some changes to the SourceStorage class. SourceStorage.store is used in the preparation stage, while SourceStorage.retrieve is used in the answer stage. At the very least, we need to switch away from storing each chat in an individual collection / table / etc. to doing this on a corpus level. Furthermore, we also need a way to select a subset of documents from a corpus, for example through tagging.
- A new ManagedChat (name TBD) is needed. Instead of documents, this takes a corpus and optionally tags for a sub-selection of documents. The new ManagedChat has no notion of preparation as the corpus is already prepared (see the sketch after this list).
- ... SourceStorage.
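As a sketch of the second item above, a ManagedChat could look roughly like this; everything here, including the constructor arguments, is a placeholder rather than an implemented API:

```python
import dataclasses
from typing import Any, Optional


@dataclasses.dataclass
class ManagedChat:
    """Placeholder for the proposed chat type (name TBD above).

    Takes a corpus plus optional tags instead of documents, and has no
    prepare() step because an admin already prepared the corpus.
    """

    corpus: str
    source_storage: Any
    assistant: Any
    tags: Optional[list[str]] = None

    def answer(self, prompt: str) -> str:
        # Retrieve sources for the prompt restricted to the corpus/tags,
        # then hand them to the assistant; details are left to the proposal.
        raise NotImplementedError
```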
Web UI
How do we move forward?
This is a massive feature request since it breaks some core assumptions that we made. We agree that this is useful and would be beneficial to support. That being said, we (Ragna core team funded by Quansight) currently lack the resources to tackle this holistically. Given that there are so many design decisions to be made, I think this will be almost impossible by relying on OSS contributions alone. At the very least we need to have some intense sprint planning before we can ask the community for help.
If you have a corporate interest in this feature being added to Ragna, please reach out to Quansight, either by contact form or email.