Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[connectors] feature: Update fullsync with signals to avoid complete resync #7494

Merged
merged 7 commits into from
Sep 24, 2024

Conversation

tdraier
Copy link
Contributor

@tdraier tdraier commented Sep 18, 2024

fixes: #6368

Description

Starts fullSync with signals allowing to add or remove nodes from the current sync.
Based on #6711

Risk

may break microsoft fullsync, can be rollbacked

Deploy Plan

deploy connectors

Copy link
Contributor

@philipperolet philipperolet left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

👋 not sure I get the use of signals here, let's talk irl

@@ -276,7 +283,8 @@ export async function markNodeAsSeen(connectorId: ModelId, internalId: string) {
);

if (!node) {
throw new Error(`Node ${internalId} not found`);
logger.warn(`Node ${internalId} not found`);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

2 coms:

  • sure we wanna switch from breaking to logging?
  • would log a bit more in that case, at least the connector id; using e.g. a childLogger if you're familiar with the pattern?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yes, if you remove an item while doing a fullsync , since we don't stop anymore the workflow, it's normal that we fall into this case. The node should just be skipped.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok clear ! would there be a simple way to detect that? skip if not found because item removed, otherwise throw?
Also, is it fine with you to update the logging?

@@ -310,7 +318,11 @@ export async function syncFiles({
);

if (!parent) {
throw new Error(`Unexpected: parent node not found: ${parentInternalId}`);
logger.warn(`Unexpected: parent node not found: ${parentInternalId}`);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

same com as above

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

same same :-)

@@ -363,9 +363,13 @@ export class MicrosoftConnectorManager extends BaseConnectorManager<null> {
internalId: id,
}));

await MicrosoftRootResource.batchMakeNew(newResourcesBlobs);
const addedResources =
await MicrosoftRootResource.batchMakeNew(newResourcesBlobs);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

if we just change the front end logic so here we only pass the 'changed' permissions, wouldn't that and the activities.ts changes below be enough?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

taking care of removed folders too ofc

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The removal is handled same way as before (deletion workflow is launched before starting new workflow/sending signal)

if (nodeIdsToSync === undefined) {
nodeIdsToSync = await getRootNodesToSync(connectorId);
}
setHandler(folderUpdatesSignal, (folderUpdates: FolderUpdatesSignal[]) => {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

not sure I get why we add signal handling to the workflow; can't see where signals are sent and why they are needed VS sending the permission diff to setPermissions when doing add/remove?
Will ask you IRL so I better understand

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We're using signalWithStart , which starts the workflow if it's not running, or send a signal if it does.

Copy link
Contributor

@philipperolet philipperolet left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the help understanding! Getting there, still have a few coms; happy to discuss
Also we had talked about handling the "multiple click" case in add / remove for which you apparently had a good, quick solution in mind? worth adding IMO

// If we get a signal, update the workflow state by adding/removing folder ids.
for (const { action, folderId } of folderUpdates) {
switch (action) {
case "added":
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have multiple questions here:

  • populateDeltas not executed for added node?
  • if node is added during the syncSucceed activity, node will be skipped?
  • if node is removed while it was already traversed by the while loop, then it will splice(-1,1) which will remove the last element? (not what we'd want)

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

maybe on restart we could somehow restart the workflow with the right nodeIdsToSync, so it would avoid issues above + the need to skip of previous comments?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

also if folders were already synced how are they removed? I think you had told me but can't remember

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

  • populateDeltas is called directly in setPermissions. I don't know what's the use of the 2nd populateDelta in the workflow - that one will indeed not be called when adding nodes.
  • if node is added during the syncSucceed : I've added a check after the syncSucceeded, in case it happens
  • fixed the splice. on the other hand, I notice that "removed" signal is never sent ... I implemented it in a first version but finally only rely on the removal done in setPermissions(). btw i think we have the same issue with google connector. We should probably get rid of the microsoftDeletionWorkflow and handle all deletion inside the fullSync.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Refactored to include microsoftDeletionWorkflow inside fullSync. This also has the benefits of avoiding "parent node not found" logs, as it's done in sequence in the same workflow. However I think it's better to keep logs and not throw as a "missing parent" is not something recoverable by retrying the activity, and maight still happen with incremental sync running at the same time.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok but in that case with panic flag set to true right?
Because throwing, by making the activity retry, was triggering a monitor if the issue wasn't transient, that was prompting us to act; this is why I was inclined to keep the throw

@@ -276,7 +283,8 @@ export async function markNodeAsSeen(connectorId: ModelId, internalId: string) {
);

if (!node) {
throw new Error(`Node ${internalId} not found`);
logger.warn(`Node ${internalId} not found`);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok clear ! would there be a simple way to detect that? skip if not found because item removed, otherwise throw?
Also, is it fine with you to update the logging?

Copy link
Contributor

@philipperolet philipperolet left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Great! Left a final com, but approving already. Thanks 🙏

Copy link
Contributor

@philipperolet philipperolet left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

👍 😌

@tdraier tdraier merged commit cce1d46 into main Sep 24, 2024
3 checks passed
@tdraier tdraier deleted the microsoft-update branch September 24, 2024 09:16
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

[MS Connector] [Post-ship stabilization] Selective fullsync on add/remove data
2 participants