Quidel Data patch for 2022-07-13/2022-07-24 #2063

Open
minhkhul opened this issue Sep 18, 2024 · 0 comments
Labels
data quality Missing data, weird data, broken data
Context

This issue was copied from the old Jira PO board. The comments under the original task follow, newest first:

Kathryn Mazaitis
March 31, 2023 at 6:40 PM
Edited

Mitch and I looked into this in more detail offline.

No patch is possible for 7/18.

A patch for 7/29 is possible, using the files in 2022_07_30_08_30.

The files in 2022_07_30_08_30 cover the entire expected time period for this patch (end: 20220724 start: 20220614).

As-is, the indicator cannot be configured to interpret the 2022_07_30_08_30 directory as if it had been posted on 7/29.

For the 7/29 patch, rather than try to make the indicator more flexible, we decided to:

Copy the 2022_07_30_08_30 drop (Rosenfeld_MyVirenaSARSData_07-28-2022_to_07-28-2022.csv) as 2022_07_29_00_00 and upload the copy to the S3 drop bucket

Add a readme to 2022_07_29_00_00 explaining that it’s a copy of 2022_07_30_08_30

Run the indicator configured to pick up files from 2022_07_29_00_00

We confirmed in the code that the indicator is already configured to ignore duplicates, so future runs covering this drop period shouldn’t encounter serious problems.
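The copy-and-readme steps above can be sketched as a small planning helper. This is only an illustration, not the procedure that was actually run: the function name `plan_drop_copy`, the readme file name `README.txt`, and the assumption that each drop is a top-level prefix in the bucket are all hypothetical. It only computes which keys would be copied where; the upload itself would go through whatever S3 client is in use.

```python
from pathlib import PurePosixPath


def plan_drop_copy(keys, src_dir, dst_dir, readme_name="README.txt"):
    """Plan a copy of one drop directory to another (hypothetical helper).

    keys: object keys in the drop bucket, assumed to look like
          "<drop_dir>/<file name>".
    Returns (copies, readme_key, readme_text), where copies is a list of
    (source_key, destination_key) pairs.
    """
    copies = []
    for key in keys:
        parts = PurePosixPath(key).parts
        # Only files that live under the source drop directory are copied.
        if parts and parts[0] == src_dir:
            copies.append((key, str(PurePosixPath(dst_dir, *parts[1:]))))

    readme_key = str(PurePosixPath(dst_dir, readme_name))
    readme_text = (
        f"This directory is a copy of {src_dir}, "
        f"created so the drop can be processed as {dst_dir}.\n"
    )
    return copies, readme_key, readme_text


if __name__ == "__main__":
    bucket_keys = [
        "2022_07_30_08_30/Rosenfeld_MyVirenaSARSData_07-28-2022_to_07-28-2022.csv",
        "2022_07_30_09_45/Rosenfeld_MyVirenaSARSData_07-29-2022_to_07-29-2022.csv",
    ]
    copies, readme_key, readme_text = plan_drop_copy(
        bucket_keys, "2022_07_30_08_30", "2022_07_29_00_00"
    )
    for src, dst in copies:
        print(f"copy {src} -> {dst}")
    print(f"write {readme_key}: {readme_text.strip()}")
```

Planning the copy separately from performing it makes it easy to review the key mapping before anything is written to the bucket.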

Mitchell Skidmore
March 30, 2023 at 6:32 PM

Two discrepancies noticed in the AWS bucket during this timeframe:

No 7/18 drop (Rosenfeld_MyVirenaSARSData_07-17-2022_to_07-17-2022.csv does not exist)

Two 7/30 drops:

2022_07_30_08_30 (Rosenfeld_MyVirenaSARSData_07-28-2022_to_07-28-2022.csv)

2022_07_30_09_45 (Rosenfeld_MyVirenaSARSData_07-29-2022_to_07-29-2022.csv)

7/31 drop looks fine.
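A check like the one described above (a day with no drop, a day with two drops) could be automated roughly as follows. This is a sketch under assumptions: it takes a plain list of drop directory names in the `YYYY_MM_DD_HH_MM` form seen in this thread, and the function name `audit_drops` is made up for illustration. In the usage example only the two 7/30 directory names come from the thread; the others are invented.

```python
from collections import defaultdict
from datetime import date, datetime, timedelta


def audit_drops(dir_names, first_day, last_day):
    """Report days without a drop and days with more than one drop.

    dir_names: drop directory names such as "2022_07_30_08_30".
    first_day, last_day: inclusive date range to check for missing drops.
    Returns (missing_days, duplicated), where duplicated maps a date to
    the sorted directory names posted on that date.
    """
    by_day = defaultdict(list)
    for name in dir_names:
        stamp = datetime.strptime(name, "%Y_%m_%d_%H_%M")
        by_day[stamp.date()].append(name)

    missing_days = []
    day = first_day
    while day <= last_day:
        if day not in by_day:
            missing_days.append(day)
        day += timedelta(days=1)

    duplicated = {d: sorted(n) for d, n in by_day.items() if len(n) > 1}
    return missing_days, duplicated


if __name__ == "__main__":
    # Only the two 7/30 names appear in the thread; the rest are made up.
    listing = [
        "2022_07_17_08_30",
        "2022_07_19_08_30",
        "2022_07_30_08_30",
        "2022_07_30_09_45",
    ]
    missing, dupes = audit_drops(listing, date(2022, 7, 17), date(2022, 7, 19))
    print("missing:", missing)
    print("duplicated:", dupes)
```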

Mitchell Skidmore
March 24, 2023 at 1:40 PM

Here is what is missing in this time frame:

date: 20220718 end: 20220713 start: 20220603

date: 20220729 end: 20220724 start: 20220614

However, the indicator isn’t exporting files for 6/3, 7/13, or 7/24 (the 6/14 start date did work), so these issue dates can’t be created. I am working with the params to see whether this is due to the complexities of date specification, but these days seem to be skipped consistently.

Need to touch base to figure out if something else is going on here.
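The two missing rows above share the same arithmetic: in both, `end` is 5 days before `date`, and `start` is 40 days before `end`. The sketch below reproduces that relationship. The two offsets are inferred from these two rows only, not taken from the indicator's code or params, and the names `EXPORT_LAG_DAYS`, `WINDOW_SPAN_DAYS`, and `export_window` are hypothetical.

```python
from datetime import date, timedelta

# Offsets inferred from the two rows in this comment (assumptions,
# not values read from the indicator's configuration).
EXPORT_LAG_DAYS = 5
WINDOW_SPAN_DAYS = 40


def export_window(issue_date):
    """Return (start, end) of the export window for a given issue date."""
    end = issue_date - timedelta(days=EXPORT_LAG_DAYS)
    start = end - timedelta(days=WINDOW_SPAN_DAYS)
    return start, end


if __name__ == "__main__":
    for issue in (date(2022, 7, 18), date(2022, 7, 29)):
        start, end = export_window(issue)
        print(f"date: {issue:%Y%m%d} end: {end:%Y%m%d} start: {start:%Y%m%d}")
```

Computing the window this way makes it straightforward to check that the start and end dates passed to the indicator match the issue date being patched.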

Mitchell Skidmore
March 13, 2023 at 5:12 PM

Looking at the DB, only a few dates are missing.

Issue dates 7/18 and 7/29 are missing.

minhkhul added the "data quality" label Sep 18, 2024
minhkhul added this to the "Data patching" milestone Sep 18, 2024