Trigger path concatenator from within the matcher/merger subsystem #2527
Comments
This is a non-trivial refactor, so I'll put in some of my thinking now so I don't have to relearn it later.
Perhaps this can be better expressed without being a separate stage, and instead being run as part of the merger?
It needs to be a (sub)stage after merger, as it relies on the right data being in the merged database.
There is a minor conflict between efficiency and purity here. The current incarnation knows that there is a collectionPath in each record it messes with, so it can notify the relationEmbedder via the pathSender. However, I want us to be able to describe each full stage (either individual standalone apps, or a subsystem like relation_embedder or matcher_merger) as accepting and sending work ids. As a result, the concatenator, as the final stage within the matcher_merger subsystem, should notify downstream with work ids, which will then be used by the merger to retrieve the work and notify the next stage in its own subsystem with a path. This inefficiency is pretty minor, and it's better to be clear about boundaries.
I think it also needs to accept a work id. That way it knows what to notify downstream about if nothing changes.
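To make the proposed boundary concrete, here is a rough Python sketch of a concatenator that accepts a work id and sends work ids, with the downstream subsystem turning an id back into a path. It is an illustration only, not the pipeline's actual code: `Work`, `PathConcatenator`, `downstream_entry`, the in-memory `works` dict standing in for the merged database, and the simplified concatenation rule are all assumptions made for this example.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Optional


@dataclass
class Work:
    # Hypothetical stand-in for a record in the merged database.
    id: str
    collection_path: Optional[str]


class PathConcatenator:
    """Hypothetical final sub-stage of matcher_merger: work id in, work ids out."""

    def __init__(self, works: Dict[str, Work], send_id: Callable[[str], None]):
        self.works = works      # assumed stand-in for the merged database
        self.send_id = send_id  # assumed downstream sender, called with work ids

    def _concatenate(self, work: Work) -> List[str]:
        # Simplified rule (an assumption): if another work's path ends with the
        # first segment of this work's path, prefix this path with that parent.
        if work.collection_path is None:
            return []
        first = work.collection_path.split("/")[0]
        for other in self.works.values():
            if other.id == work.id or other.collection_path is None:
                continue
            segments = other.collection_path.split("/")
            if len(segments) > 1 and segments[-1] == first:
                work.collection_path = "/".join(segments[:-1] + [work.collection_path])
                return [work.id]
        return []

    def process(self, work_id: str) -> None:
        changed = self._concatenate(self.works[work_id])
        # Always include the incoming id, so downstream is notified
        # even when nothing changed.
        for wid in sorted({work_id, *changed}):
            self.send_id(wid)


def downstream_entry(works: Dict[str, Work], work_id: str,
                     send_path: Callable[[str], None]) -> None:
    # Hypothetical first stage of the next subsystem: it re-reads the work by id
    # and emits a path internally. This is the extra lookup described above.
    work = works.get(work_id)
    if work is not None and work.collection_path is not None:
        send_path(work.collection_path)
```

The extra lookup in `downstream_entry` is the minor inefficiency being traded for a uniform "work ids in, work ids out" contract at subsystem boundaries.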
The path concatenator is currently in the relation_embedder subsystem, but that is the wrong place for it: it writes to works-merged, so it belongs in the matcher/merger subsystem.
It could be triggered as part of the sendWorkOrImage function in the merger.