-
Notifications
You must be signed in to change notification settings - Fork 117
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Fetch data source view in agent configuration #6708
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't think adding a DB roundtrip here is desirable, we use this function a lot and latency needs to stay as low as possible
agentTablesConfigurationTables, | ||
dustApps, | ||
] = await Promise.all([ | ||
retrievalDatasourceConfigurationsPromise, | ||
processDatasourceConfigurationsPromise, | ||
fetchAgentRetrievalConfigurationsActions({ configurationIds, variant }), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why is it ConfigurationActions
? Do you mean ActionConfigurations
?
`Couldn't find topK for retrieval configuration ${retrievalConfig.id}} with 'custom' topK mode` | ||
); | ||
} | ||
const retrievalActions = |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
these are not actions, they are action configurations (naming is quite confusing IMO)
import { DataSourceViewResource } from "@app/lib/resources/data_source_view_resource"; | ||
import { DataSourceViewModel } from "@app/lib/resources/storage/models/data_source_view"; | ||
|
||
export async function fetchAgentProcessConfigurationsActions({ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This will add latency. Before, we were getting the action config with the rest of action configs, and had a second step for all the "secondary" resources linked to action configs. Now we have an extra step
Seen IRL, I'll just go all the way in and fully migrate all action configurations to this pattern to maintain a consistent latency. Henry's concern is not about the additional database calls but rather the fact that these two functions are slightly slower compared to the others. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks, much nicer like this
let's make sure it's thoroughly tested because it's very easy to break this function by mistake 🙏
@@ -37,6 +37,12 @@ import { | |||
DEFAULT_TABLES_QUERY_ACTION_NAME, | |||
DEFAULT_WEBSEARCH_ACTION_NAME, | |||
} from "@app/lib/api/assistant/actions/names"; | |||
import { fetchBrowseActionsConfigurations } from "@app/lib/api/assistant/configuration/browse"; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: the double plural here feels a little weird to me, I would rather fetchBrowseActionConfigurations
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Agree, updated.
configurationIds: ModelId[]; | ||
variant: "light" | "full"; | ||
}): Promise<Map<ModelId, BrowseConfigurationType[]>> { | ||
if (variant !== "full") { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
TBH I feel it would be better to not pass variant to these functions, and just handle that in parent (i.e, not call the function at all if variant is not full)
But I can live with this
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Will give it some more love in a follow up PR.
Tested with all the dust apps locally (except table query). |
* Fetch data source view in agent configuration * Setup Sequelize relationship * Return both data source id and data source view id. * Refactor process action * Address naming comments from review * Refactor dust app fetching logic * Refactor table query actions fetch * Refactor websearch and browse fetch logic * ✨ * ✨
Description
This PR refactors the fetch agent configuration. Now, we fetch all data sources and any associated data source views for both retrieval and process action configurations. This change is necessary to pass these data source views to the core when performing any of these actions. Data source views are essential for conditioning access to a data source when reading documents from it. The code that calls core will be modified in a subsequent PR.
Additionally, I started splitting the fetch agent configuration code by action, with each file corresponding to a specific action’s fetch logic. Parallelization is still maintained. To keep the same "latency" we should move all the actions fetching logic to this pattern.
I left several TODOs since the UI is not yet aware of data source views. However, considering that the types are shared for both posting a configuration and fetching, this PR will break the builder UI for users currently on it. Given the current load, this should be manageable.
Risk
Safe to rollback.
Deploy Plan
No SQL migration, only some stuff to please Sequelize.