[BACK-2702] Allow closing datasets created by jellyfish so we can recalculate summaries correctly #677

toddkazakov · 2023-10-27T16:17:14Z

Currently, jellyfish doesn't close legacy datasets after upload completion. Due to this, we can't detect when the upload has finished.

When a new batch is uploaded, jellyfish marks the summary as outdated with a 2-minute buffer, meaning the summary will be recalculated after 2 minutes. If an upload doesn't finish within this timeframe, the summary may be calculated mid-upload and may not be accurate.

To fix the issue above, and to shorten the time between the end of an upload and both the summary recalculation and the subsequent push of updated reports to the EHR, this PR allows Uploader to "close" jellyfish datasets (which currently don't have a deduplicator) so summaries can be recalculated right after upload. This is important for in-clinic uploads.
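The intended behavior can be sketched roughly as follows. This is an illustration under assumptions, not the handler in `datasets_update.go`: the `DataSet` struct, `closeResult`, `closeDataSet`, and the `"closed"` state string are pared-down stand-ins, and only the `HasDeduplicatorName()` check is taken from the diff in this PR.

```go
package main

import (
	"errors"
	"fmt"
)

// DataSet is a hypothetical, pared-down stand-in for the Platform data set.
type DataSet struct {
	ID               string
	DeduplicatorName string // empty for data sets created by jellyfish
	State            string
}

// HasDeduplicatorName mirrors the check used in the diff.
func (d *DataSet) HasDeduplicatorName() bool {
	return d.DeduplicatorName != ""
}

// closeResult records which steps a close performed (illustration only).
type closeResult struct {
	RanDeduplicator bool
	MarkedOutdated  bool
}

// closeDataSet runs the deduplicator only when one is configured, but
// closes the data set and marks the summary outdated in every case.
func closeDataSet(d *DataSet) (closeResult, error) {
	if d == nil {
		return closeResult{}, errors.New("data set is missing")
	}
	var res closeResult
	if d.HasDeduplicatorName() {
		// A real implementation would invoke the deduplicator here.
		res.RanDeduplicator = true
	}
	d.State = "closed"        // assumed state value
	res.MarkedOutdated = true // summary can be recalculated right away
	return res, nil
}

func main() {
	jellyfish := &DataSet{ID: "jf-1"}
	res, _ := closeDataSet(jellyfish)
	fmt.Println(res.RanDeduplicator, res.MarkedOutdated) // false true

	platform := &DataSet{ID: "pf-1", DeduplicatorName: "example-deduplicator"}
	res, _ = closeDataSet(platform)
	fmt.Println(res.RanDeduplicator, res.MarkedOutdated) // true true
}
```
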

This may have some unexpected side-effects when tidepool-org/jellyfish#195 is merged (e.g. running the deduplicator on jellyfish datasets).

darinkrauss

LGTM. I think we are okay with the Jellyfish migration/deprecation plan.

darinkrauss · 2023-10-30T17:04:55Z

data/service/api/v1/datasets_update.go

-			return
+		// Allow closing datasets without deduplicator (i.e. those created by jellyfish)
+		// so we can mark summaries as outdated.
+		if dataSet != nil && dataSet.HasDeduplicatorName() {


If I'm understanding things correctly, I don't think this will cause issues with the current Jellyfish migration/deprecation plan. The plan does not currently include any updates to the data set (just data). Thus, a Jellyfish data set will not have a deduplicator name and so this code will not be executed.

If, however, we do need to update the data set with the deduplicator name, then this will cause a couple of potential issues:

The Jellyfish data set and data may include some fields normally only present in Platform data (written by DataRepository.UpdateDataSet and DataRepository.ActiveDataSetData). We'd need to do a survey of exactly what fields are added by Jellyfish or Platform and compare the two. Even if there are field differences they may not cause any real issue, just inconsistencies.

The Platform code that performs data deduplication will execute on Jellyfish data (where other Jellyfish data is deduplicated based upon this Jellyfish data using the Platform deduplication mechanism). I don't think this will cause an issue, though, since Jellyfish will have already deduplicated the data (by dropping any new duplicates) prior to closing of the Jellyfish data set. Thus, I think that this data deduplication step would end up just being a NOP.

In either case, we should verify this definitely won't cause issues and add it as one or more test cases, but at first glance there doesn't appear to be any (or if there are some they would be minor).
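One shape such a test case could take is a check that a second deduplication pass over already-deduplicated data changes nothing. The sketch below is purely illustrative: `Datum`, its `Hash` field, and `dropDuplicates` are hypothetical, and the actual Platform deduplicators may work quite differently (this only demonstrates the "second pass is a NOP" expectation).

```go
package main

import "fmt"

// Datum is a hypothetical record identified by a content hash.
type Datum struct {
	ID   string
	Hash string
}

// dropDuplicates keeps the first datum seen for each hash and drops the rest.
func dropDuplicates(data []Datum) []Datum {
	seen := make(map[string]struct{}, len(data))
	out := make([]Datum, 0, len(data))
	for _, d := range data {
		if _, dup := seen[d.Hash]; dup {
			continue
		}
		seen[d.Hash] = struct{}{}
		out = append(out, d)
	}
	return out
}

func main() {
	uploaded := []Datum{
		{ID: "1", Hash: "aaa"},
		{ID: "2", Hash: "bbb"},
		{ID: "3", Hash: "aaa"}, // duplicate of ID 1
	}

	// First pass stands in for Jellyfish dropping duplicates during upload.
	afterJellyfish := dropDuplicates(uploaded)

	// Second pass stands in for Platform deduplication on close.
	afterPlatform := dropDuplicates(afterJellyfish)

	fmt.Println(len(uploaded), len(afterJellyfish), len(afterPlatform)) // 3 2 2
}
```
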

cc: @jh-bate

toddkazakov · 2024-02-21T17:47:21Z

/query qa5

toddkazakov · 2024-02-21T17:47:37Z

/query qa2

toddkazakov and others added 18 commits October 19, 2023 18:50

Add reason for setting last updated and outdated since

7c4a31b

add more specific conditions for SetOutdated triggers

c42af18

fix some unit tests

376b05b

update to ginkgo v2 and add some unit tests

c267c7f

fix build and unit tests, still incomplete testing

c535a59

fix formatting

dcbe1fc

possibly fix make test

99d13a5

more test fixes

3b67506

travis use go 1.21 for real

c554a3b

update vendor

99611b2

more unit tests

8fa2e8d

more unit tests

15dc174

formatting

f407953

add outdatedSinceLimit and minor bugfixes

4c70f63

zero out OutdatedSinceLimit on update

b5e7ec5

formatting

020518c

update deps

325b5f8

Allow closing datasets created by jellyfish we can recalculate summaries correctly

1cea4b9

toddkazakov requested a review from Roukoswarf October 27, 2023 16:18

toddkazakov changed the title ~~[BACK-2702] Allow closing datasets created by jellyfish we can recalculate summaries correctly~~ [BACK-2702] Allow closing datasets created by jellyfish so we can recalculate summaries correctly Oct 30, 2023

darinkrauss previously approved these changes Oct 30, 2023

Roukoswarf changed the base branch from ehr-triggers-spike to master November 6, 2023 04:18

Roukoswarf changed the base branch from master to ehr-triggers-spike November 6, 2023 04:19

Base automatically changed from ehr-triggers-spike to master November 16, 2023 18:03
