[BACK-2702] Allow closing datasets created by jellyfish so we can recalculate summaries correctly #677
base: master
LGTM. I think we are okay with the Jellyfish migration/deprecation plan.
return
// Allow closing datasets without deduplicator (i.e. those created by jellyfish)
// so we can mark summaries as outdated.
if dataSet != nil && dataSet.HasDeduplicatorName() {
If I'm understanding things correctly, I don't think this will cause issues with the current Jellyfish migration/deprecation plan. The plan does not currently include any updates to the data set (just data). Thus, a Jellyfish data set will not have a deduplicator name and so this code will not be executed.
If, however, we do need to update the data set with the deduplicator name, then this will cause a couple of potential issues:
- The Jellyfish data set and data may include some fields normally only present in Platform data (written by DataRepository.UpdateDataSet and DataRepository.ActiveDataSetData). We'd need to survey exactly which fields are added by Jellyfish and which by Platform, then compare the two. Even if there are field differences, they may not cause any real issue, just inconsistencies.
- The Platform code that performs data deduplication will execute on Jellyfish data (where other Jellyfish data is deduplicated based upon this Jellyfish data using the Platform deduplication mechanism). I don't think this will cause an issue, though, since Jellyfish will have already deduplicated the data (by dropping any new duplicates) prior to closing the Jellyfish data set. Thus, I think this data deduplication step would end up just being a NOP.
In either case, we should verify this definitely won't cause issues and add it as one or more test cases, but at first glance there doesn't appear to be any (or if there are some they would be minor).
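As a starting point for such a test case, here is a minimal sketch of the "deduplication is a NOP on already-deduplicated data" expectation. It uses a generic hash-based deduplicator, not Platform's actual mechanism; the `datum` type, its `hash` field, and the `dedupe` function are all hypothetical names introduced for illustration.

```go
package main

import "fmt"

// datum is a hypothetical record. hash stands in for whatever identity
// a real deduplicator would compare; it is an assumption, not a Platform field.
type datum struct {
	hash  string
	value float64
}

// dedupe keeps the first datum seen for each hash and preserves order.
// This is a generic sketch, not the Platform deduplication mechanism.
func dedupe(data []datum) []datum {
	seen := make(map[string]struct{}, len(data))
	out := make([]datum, 0, len(data))
	for _, d := range data {
		if _, ok := seen[d.hash]; ok {
			continue // drop later duplicates
		}
		seen[d.hash] = struct{}{}
		out = append(out, d)
	}
	return out
}

func main() {
	// Data that an upstream step has already deduplicated.
	alreadyClean := []datum{{"a", 5.1}, {"b", 6.3}, {"c", 7.0}}
	result := dedupe(alreadyClean)
	fmt.Println(len(result) == len(alreadyClean)) // true: nothing was dropped
}
```

A real test would replace `dedupe` with the Platform deduplicator and the literal slice with data written by Jellyfish.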
cc: @jh-bate
/query qa5
/query qa2
Currently, jellyfish doesn't close legacy datasets after upload completion. Due to this, we can't detect when the upload has finished.
When a new batch is uploaded, jellyfish marks the summary as outdated with a 2-minute buffer, meaning the summary will be recalculated after 2 minutes. If an upload doesn't finish within this timeframe, the summary may be calculated mid-upload and may not be accurate.
To fix the issue above, and to shorten the delay before a summary is recalculated after an upload finishes and updated reports are then pushed to the EHR, this PR allows Uploader to "close" jellyfish datasets (which currently don't have a deduplicator) so summaries can be recalculated right after upload. This is important for in-clinic uploads.
This may have some unexpected side-effects when tidepool-org/jellyfish#195 is merged (e.g. running the deduplicator on jellyfish datasets).
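The intended close behavior can be sketched roughly as below. This is not the PR's actual implementation: the `dataSet` struct, its fields, and `closeDataSet` are simplified, hypothetical stand-ins, and only the `HasDeduplicatorName` method name comes from the diff. The sketch also shows why the side effect mentioned above could appear: once a jellyfish dataset carries a deduplicator name, the deduplication branch would run for it too.

```go
package main

import (
	"errors"
	"fmt"
)

// dataSet is a simplified, hypothetical stand-in for the Platform data set type.
type dataSet struct {
	deduplicatorName string // empty for datasets created by jellyfish today
	closed           bool
	deduplicated     bool
	summaryOutdated  bool
}

// HasDeduplicatorName reports whether a deduplicator is assigned.
func (d *dataSet) HasDeduplicatorName() bool {
	return d.deduplicatorName != ""
}

// closeDataSet closes the data set and marks its summary as outdated.
// Deduplication runs only when a deduplicator is assigned (assumed behavior).
func closeDataSet(d *dataSet) error {
	if d == nil {
		return errors.New("data set is missing")
	}
	if d.HasDeduplicatorName() {
		d.deduplicated = true // stands in for running the named deduplicator
	}
	d.closed = true
	d.summaryOutdated = true // allows recalculation right after upload
	return nil
}

func main() {
	jellyfish := &dataSet{}
	platform := &dataSet{deduplicatorName: "example-deduplicator"}

	_ = closeDataSet(jellyfish)
	_ = closeDataSet(platform)

	fmt.Println(jellyfish.closed, jellyfish.deduplicated, jellyfish.summaryOutdated) // true false true
	fmt.Println(platform.closed, platform.deduplicated, platform.summaryOutdated)    // true true true
}
```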