github - add new workflows with the folders upsert #9562
) {
  await githubSaveStartSyncActivity(dataSourceConfig);

  const queue = new PQueue({ concurrency: MAX_CONCURRENT_REPO_SYNC_WORKFLOWS });
any reason why we don't use the concurrentExecutor here?
If the reason was to not change the workflow code, now is a good time
any idea why #3379 was reverted?
- workflows are executed by temporal servers, we use as few libs as possible since they can break it (but maybe it'd work)
- I remember somebody (flavien I think) implemented it, then went back, then implemented it again, but I couldn't say why
AFAICT concurrentExecutor is only a thin layer on top of p-queue so that'd really be the same here, only less code and we'd be using the cool stuff other eng made for the sake of our happiness
but well, as seen IRL let's not bother with this, we'll deal with this separately if we want to (I'm not sure I really really do :p)
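For readers following the p-queue vs concurrentExecutor thread, here is a minimal dependency-free sketch of what a bounded-concurrency executor does. It is not the project's concurrentExecutor and not p-queue: `runWithConcurrency`, `demo`, and the repo names are hypothetical, chosen only to illustrate the idea of capping how many repo syncs are in flight at once.

```typescript
// Hypothetical helper (NOT the project's concurrentExecutor): runs `task` over
// `items` with at most `concurrency` tasks in flight, preserving result order.
async function runWithConcurrency<T, R>(
  items: readonly T[],
  task: (item: T, index: number) => Promise<R>,
  concurrency: number
): Promise<R[]> {
  if (!Number.isInteger(concurrency) || concurrency < 1) {
    throw new Error("concurrency must be a positive integer");
  }
  const results: R[] = new Array(items.length);
  let next = 0;

  // Each worker claims the next unclaimed index until the list is exhausted.
  async function worker(): Promise<void> {
    while (next < items.length) {
      const index = next++;
      results[index] = await task(items[index] as T, index);
    }
  }

  const workerCount = Math.min(concurrency, items.length);
  await Promise.all(Array.from({ length: workerCount }, () => worker()));
  return results;
}

// Usage sketch: "sync" five made-up repos, two at a time, and record the
// highest number of tasks that were ever running simultaneously.
async function demo(): Promise<{ results: string[]; maxInFlight: number }> {
  let inFlight = 0;
  let maxInFlight = 0;
  const repos = ["repo-a", "repo-b", "repo-c", "repo-d", "repo-e"];
  const results = await runWithConcurrency(
    repos,
    async (repo) => {
      inFlight++;
      maxInFlight = Math.max(maxInFlight, inFlight);
      await new Promise<void>((resolve) => setTimeout(resolve, 5));
      inFlight--;
      return `synced:${repo}`;
    },
    2
  );
  return { results, maxInFlight };
}
```

Whether a helper like this is safe inside workflow code is exactly the concern raised in the thread, so treat it as an illustration of the pattern rather than a drop-in replacement.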
@@ -68,7 +68,7 @@ export async function launchGithubFullSyncWorkflow({
     return;
   }

-  await client.workflow.start(githubFullSyncWorkflow, {
+  await client.workflow.start(githubSyncAllReposWorkflow, {
IMO we should keep the name "full sync" that's commonly used across the eng team to refer to... full syncs
we could go githubFullSync, ghFullSyncWorkflow, fullSyncWorkflow which are used elsewhere
for the record, even if it's a long shot I am in favor of slowly phasing out the terminology FullSync when appropriate as there can be mixups between fullSync === !incrementalSync and fullSync === syncs everything (not true for some connectors)
in this file since we have some other workflows untouched that start with github and end with workflow, I want to keep that
let's do v2 👍
Great 👍 Minor coms
Thanks for handling this 😱 situation
@@ -159,6 +230,51 @@ export async function githubReposSyncWorkflow(
   await githubSaveSuccessSyncActivity(dataSourceConfig);
 }

+export async function githubSyncReposWorkflow(
+  dataSourceConfig: DataSourceConfig,
+  connectorId: ModelId,
did we say we need to copy if the workflow does not change, but only the child workflow?
i don't recall (but I'd support if we're unsure 👍 )
as seen IRL, let's not risk it, duplicating it is safe with no cost
* add new workflows that are duplicates of the old ones with the new upsert activities
* fix backfill script
* rename workflows with v2
* 📝
Description
Workflows (old/new):
- githubFullSyncWorkflow / githubSyncAllReposWorkflow fetches the repositories and spawns githubRepoSyncWorkflow / githubSyncRepoWorkflow.
- githubRepoSyncWorkflow / githubSyncRepoWorkflow spawns githubRepoIssuesSyncWorkflow / githubSyncRepoIssuesWorkflow and githubRepoDiscussionsSyncWorkflow / githubSyncRepoDiscussionsWorkflow, and then runs githubCodeSyncActivity.
- githubCodeSyncActivity upserts the "Code" folder itself (this activity has the same level of responsibility as the 2 workflows for the discussions and issues).
- githubSyncRepoIssuesWorkflow runs the activity to upsert the "Issues" folder (only diff with githubRepoIssuesSyncWorkflow).
- githubSyncRepoDiscussionsWorkflow runs the activity to upsert the "Discussions" folder (only diff with githubRepoDiscussionsSyncWorkflow).
Risk
Deploy Plan
20241219_backfill_github_folders.
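The old/new pairs and the spawn order listed in the description can be summarized as plain data. This is only a reading aid: the workflow and activity names are taken from the description as written, while `WORKFLOW_RENAMES`, `SPAWNS`, and `describeTree` are hypothetical names that do not exist in the codebase.

```typescript
// Old -> new workflow names, as listed in the PR description.
const WORKFLOW_RENAMES = {
  githubFullSyncWorkflow: "githubSyncAllReposWorkflow",
  githubRepoSyncWorkflow: "githubSyncRepoWorkflow",
  githubRepoIssuesSyncWorkflow: "githubSyncRepoIssuesWorkflow",
  githubRepoDiscussionsSyncWorkflow: "githubSyncRepoDiscussionsWorkflow",
} as const;

// What each new workflow starts, in the order given by the description.
// Entries absent from this map are leaves (they start nothing listed here).
const SPAWNS: Record<string, readonly string[]> = {
  githubSyncAllReposWorkflow: ["githubSyncRepoWorkflow"],
  githubSyncRepoWorkflow: [
    "githubSyncRepoIssuesWorkflow",
    "githubSyncRepoDiscussionsWorkflow",
    "githubCodeSyncActivity",
  ],
};

// Hypothetical helper: renders the hierarchy as indented lines, depth-first.
function describeTree(root: string, depth = 0): string[] {
  const lines = [`${"  ".repeat(depth)}${root}`];
  for (const child of SPAWNS[root] ?? []) {
    lines.push(...describeTree(child, depth + 1));
  }
  return lines;
}
```

Calling `describeTree("githubSyncAllReposWorkflow")` yields one line per workflow or activity, indented by how deep it sits under the top-level sync.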