Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[BACK-2702] Allow closing datasets created by jellyfish so we can recalculate summaries correctly #677

Draft
wants to merge 18 commits into
base: master
Choose a base branch
from

Conversation

toddkazakov
Copy link
Contributor

@toddkazakov toddkazakov commented Oct 27, 2023

Currently, jellyfish doesn't close legacy datasets after upload completion. Due to this, we can't detect when the upload has finished.

When a new batch is uploaded, jellyfish marks the summary as oudated with a 2 minute buffer, meaning the summary will be recalculated after 2 minutes. If an upload doesn't finish within this timeframe, the summary may be calculated mid-upload and may not be accurate.

In order to fix the issue above and to reduce the time required to recalculate a summary after an upload has finished and the subsequent push of updated reports to the EHR, this PR allows Uploader to "close" jellyfish datasets (which currently don't have a deduplicator) so summaries can be recalculated right after upload. This is important for in-clinic uploads.

This may have some unexpected side-effects when tidepool-org/jellyfish#195 is merged (e.g. running the deduplicator on jellyfish datasets).

@toddkazakov toddkazakov changed the title [BACK-2702] Allow closing datasets created by jellyfish we can recalculate summaries correctly [BACK-2702] Allow closing datasets created by jellyfish so we can recalculate summaries correctly Oct 30, 2023
darinkrauss
darinkrauss previously approved these changes Oct 30, 2023
Copy link
Contributor

@darinkrauss darinkrauss left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM. I think we are okay with the Jellyfish migration/deprecation plan.

return
// Allow closing datasets without deduplicator (i.e. those created by jellyfish)
// so we can mark summaries as outdated.
if dataSet != nil && dataSet.HasDeduplicatorName() {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If I'm understanding things correctly, I don't think this will cause issues with the current Jellyfish migration/deprecation plan. The plan does not currently include any updates to the data set (just data). Thus, a Jellyfish data set will not have a deduplicator name and so this code will not be executed.

If, however, we do need to update the data set with the deduplicator name, then this will cause a couple of potential issues:

  1. The Jellyfish data set and data may include some fields normally only present in Platform data (written by DataRepository.UpdateDataSet and DataRepository.ActiveDataSetData). We'd need to do a survey of exactly what fields are added by Jellyfish or Platform and compare the two. Even if there are field differences they may not cause any real issue, just inconsistencies.
  2. The Platform code that performs data duplication will execute on Jellyfish data (where other Jellyfish data is duplicated based upon this Jellyfish data using the Platform deduplication mechanism). I don't think this will cause an issue, though, since Jellyfish will have already deduplicated the data (by dropping any new duplicates) prior to closing of the Jellyfish data set. Thus, I think that this data deduplication step would end up just being a NOP.

In either case, we should verify this definitely won't cause issues and add it as one or more test cases, but at first glance there doesn't appear to be any (or if there are some they would be minor).

cc: @jh-bate

@Roukoswarf Roukoswarf changed the base branch from ehr-triggers-spike to master November 6, 2023 04:18
@Roukoswarf Roukoswarf dismissed darinkrauss’s stale review November 6, 2023 04:18

The base branch was changed.

@Roukoswarf Roukoswarf changed the base branch from master to ehr-triggers-spike November 6, 2023 04:19
Base automatically changed from ehr-triggers-spike to master November 16, 2023 18:03
@toddkazakov
Copy link
Contributor Author

/query qa5

@toddkazakov
Copy link
Contributor Author

/query qa2

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants