Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

enh: remove viz action and replace by viz content and system prompt #6599

Merged
merged 3 commits into from
Aug 1, 2024

Conversation

fontanierh
Copy link
Contributor

@fontanierh fontanierh commented Jul 31, 2024

Description

The current implementation of visualization uses a dust app which is responsible for generating the code.
The dust app generates event that are interpreted by our frontend.

This approach has some downsides:

  • a bunch of new events that need to be handled
  • the sequence of messages is "un-natural" for the LLM, as it requires doing a function call with no arguments first to enter "visualization" mode
  • the main "agent loop" sometimes wants to iterate on visualizations or generate new ones, but that doesn't work because we only interpret the viz when it is being generated from the viz dust app (the model thinks it can do that, because it sees the previously generated code as if he generated it)
  • the model tends to over-generate, as it wants to reply to the user from the message generated in the dust app. We extract the code and ignore that content, but the model then needs to generate the response again.
  • the model sees the generated code as a function result (which is weird), so it wants to repeat that code to the user. We need a lot of prompting to avoid that

This change removes the dust app and the viz action entirely. Instead, viz is now a flag on the agent configuration model. If this flag is true:

  • we inject a system prompt to let the model know it can use viz, and explain how it can do so (this prompt used to be part of the dust app)
  • we inject a model delimiter config to parse the <visualization> tag in the model output. Tokens generated between these tags are classified as visualization

This also adds a new visualizations field on the agent message (so you have the viz already parsed when you come back to a conversation that has viz)

The system prompt is now in the codebase. We also inject the list of available files in that prompt (used to be done right before calling the dust app)

Note: I have not done the assistant builder part yet. there's an open question regarding how we expose that (do we masquerade it as a tool ? or have a checkbox somewhere ?)

Risk

Shouldn't impact any prod features. Still there is a DB migration and some changes to the file resource to support batch get (removed a promise.all)

Deploy Plan

  • Apply migration (prodbox)
  • deploy code

@fontanierh fontanierh force-pushed the enh/refactor-delimiters-parsing branch 2 times, most recently from f07c50a to 5baa734 Compare July 31, 2024 15:50
Copy link

github-actions bot commented Jul 31, 2024

Warnings
⚠️

Files in **/lib/models/ have been modified and the PR has the migration-ack label. Don't forget to run the migration from prodbox.

Generated by 🚫 dangerJS against 5cf0bce

@fontanierh fontanierh changed the title WIP enh: refact agent msg content parsing and add visualization token class enh: remove viz action and replace by viz content and system prompt Jul 31, 2024
@fontanierh fontanierh requested a review from flvndvd July 31, 2024 17:08
Copy link
Contributor

@flvndvd flvndvd left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This looks way better 🔥 , left a few comments. Also I'd challenge the overfit around visualization to accommodate for more usecases like this in the future.

Comment on lines 224 to 231
const lastViz = v[v.length - 1];
if (lastViz) {
return [
...v.slice(0, v.length - 1),
{ ...lastViz, complete: true },
];
}
return v;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
const lastViz = v[v.length - 1];
if (lastViz) {
return [
...v.slice(0, v.length - 1),
{ ...lastViz, complete: true },
];
}
return v;
v.map((item, index) =>
index === v.length - 1 ? { ...item, complete: true } : item
);

Copy link
Contributor Author

@fontanierh fontanierh Aug 1, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

fair

@@ -86,7 +77,7 @@ function useVisualizationDataHandler(
if (
!isVisualizationRPCRequest(data) ||
!isOriginatingFromViz ||
data.actionId !== action.id
data.index !== visualization.index
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How does this work if you have two visualizations displayed on two different messages? We might have to prefix the index with the agent message to not process events multiple times.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You are right, we should prefix

streamedCode: string | null;
isStreaming: boolean;
visualization: {
visualization: string;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we create an interface for this type to avoid duplicating 🙏 ?

className="transition-height relative min-h-96 w-full overflow-hidden duration-500 ease-in-out"
className={classNames(
"transition-height relative w-full overflow-hidden duration-500 ease-in-out",
codeFullyGenerated ? "min-h-96" : ""
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

IIRC changing style like this won't rely on the CSS height animation

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I didn't notice a change but maybe you are right, I'm going to check. The main thing is I wanted to avoid the empty white space below the code block at the beginning of the generation

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yep def makes sense to remove it.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think the animation still works fine, because the style is applied before the iframe gets rendered, so when it needs to grow / shrink to switch to iframe content, the min-h-96 is already here

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🙌

@@ -1502,6 +1504,7 @@ export async function* retryAgentMessage(
actions: [],
content: null,
chainOfThought: null,
visualizations: [],
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm wondering whether we should create a more generic solution that can support capabilities beyond just visualization. I don't have specific ideas for other capabilities that might not need a dedicated action implementation right now, but I'm confident we will face this need in the future. Maybe we could have a dedicated table linking an AgentMessageId with the capability type and its result. WDYT?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I understand the point. For now I feel a bit antsy about creating an abstraction I'm not yet sure we'll really need

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I also don't fully understand the separate table thing ? this is all stored in AgentMessageContent

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

(which is already a separate table -- we don't store the extracted viz directly)

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok I missed the AgentMessageContent, which is good then wanted to make sure it does not live in the agent_messages table.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah it's just a field on the agent message type (as in output / API type) for convenience (so consumers of the API don't need to do the content parsing themselves)

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's just that boolean values in SQL tables don't scale well, and we don't want to have many columns with this type. Hence, the proposal to simply use an array of strings instead.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah for the visualizationEnabled part it's another topic, array of strings could be better. I just wonder if we'll ever need to have other strings in that array. IMO it might be better to keep it as a boolean for now and only migrate to array of strings once we have the use-case for a second boolean

@@ -502,3 +504,80 @@ function getTextContentFromMessage(
})
.join("\n");
}

async function getVisualizationPrompt({
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit Can we still have a dedicated file for visualization logic?

@@ -0,0 +1,263 @@
export const visualizationSystemPrompt = `
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🔥

@@ -122,6 +123,10 @@ AgentConfiguration.init(
type: DataTypes.INTEGER,
allowNull: true,
},
visualizationEnabled: {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should we store an array of string representing capabilities that don't require an action?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah fair 👍

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am worried we're going to end up with an array for only one capability. I think we should wait until we need it before migrating

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Alright, it's up to you. It seems like it would save time to do it now, but I suppose you'll be the one to run this migration when the time comes 😛.

Comment on lines 72 to 75
const owner = auth.workspace();
if (!owner) {
throw new Error("Unexpected unauthenticated call to `fetchByIds`");
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
const owner = auth.workspace();
if (!owner) {
throw new Error("Unexpected unauthenticated call to `fetchByIds`");
}
const owner = auth.getNonNullableWorkspace();

@fontanierh fontanierh force-pushed the enh/refactor-delimiters-parsing branch from b148629 to e12e68d Compare August 1, 2024 10:08
@fontanierh fontanierh force-pushed the enh/refactor-delimiters-parsing branch from e12e68d to ead44c0 Compare August 1, 2024 10:08
Copy link
Contributor

@flvndvd flvndvd left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🚀

@@ -122,6 +123,10 @@ AgentConfiguration.init(
type: DataTypes.INTEGER,
allowNull: true,
},
visualizationEnabled: {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Alright, it's up to you. It seems like it would save time to do it now, but I suppose you'll be the one to run this migration when the time comes 😛.

@@ -122,6 +123,10 @@ AgentConfiguration.init(
type: DataTypes.INTEGER,
allowNull: true,
},
visualizationEnabled: {
type: DataTypes.BOOLEAN,
allowNull: false,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Don't we need to set the default value here as well? Cause I see in the migration but not here.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not sure if we do, in the end it doesn't matter since it's in the migration (it will backfill), but explicit passing still required at typescript level.
But will add it here as well to be safe 👍

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah I'm not sure locally people run migrations, so in case they init through Sequelize, let's make sure it works.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Indeed

@fontanierh fontanierh added the migration-ack 📁 Label to acknowledge that a migration is required. label Aug 1, 2024
@fontanierh fontanierh merged commit 441519d into main Aug 1, 2024
5 checks passed
@fontanierh fontanierh deleted the enh/refactor-delimiters-parsing branch August 1, 2024 12:26
albandum pushed a commit that referenced this pull request Aug 28, 2024
…6599)

* enh: remove viz action and replace by viz content and system prompt

* review

* review2

---------

Co-authored-by: Henry Fontanier <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
migration-ack 📁 Label to acknowledge that a migration is required.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants