nssp patching code #2000
base: main
Conversation
from .run import run_module
...
def good_patch_config(params, logger):
I wonder if we can bake this into the read_params() in delphi_utils... @nmdefries thoughts?
pydantic doesn't have full-fledged support for json schema validation, but seeing how read_params has no validation whatsoever, we could look into it sometime in the future.
I agree. I wrote this method, and halfway through I was like hmm, this should probably be generalized since it may be applicable to other indicators too.
yeahh, I'm like oh, I'd also like that for mine 👀 and I also saw that the current read_params isn't doing any validation at all
Yeah, could make sense to add this or a similar fn to delphi_utils. Probably we'd want it and other params validation to be separate fns from read_params, since we don't always want params validation.
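As a rough sketch of what a standalone validation fn (separate from read_params) might look like; the required keys here are hypothetical, not the actual delphi_utils API or this PR's implementation:

```python
import logging

def good_patch_config(params: dict, logger: logging.Logger) -> bool:
    """Return True iff the patch section of params has the keys patching needs.

    Hypothetical sketch; the real required keys may differ.
    """
    patch = params.get("patch", {})
    ok = True
    for key in ("start_issue", "end_issue", "patch_dir"):
        if key not in patch:
            logger.error("patch config is missing %s", key)
            ok = False
    return ok
```

Keeping validation in a fn like this, rather than inside read_params, means indicators that don't patch never pay for (or fail on) patch-specific checks.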
I mentioned it in #2002, and I think for right now it's better to remove this and deal with it in another issue
@minhkhul it's up to you to keep or remove the good_patch_config function. While we do intend to move it to utils at some later point, we don't know when that will be. Pros of leaving it here: it performs the check now, and we can use it as a starting point for the future fn. Cons: we have to remember it's here and use it, and then also replace it with the utils version once that's available.
Co-authored-by: nmdefries <[email protected]>
@nmdefries No other extra steps. The code chunk in the readme was not generating the issue-specific dataset all at once, just for one issue. I used that code plus daily scheduling to make a daily copy of what the source API returns and store that copy as a csv on
I originally put the code chunk in the readme to show how, if there's a patching need, one can set up their own daily/weekly backups from the start of an outage to its end. But that's really not needed now that I made the relevant csv files on bigchunk-dev-02.delphi.cmu.edu readable to anyone with access to that server, and then added auto-download of the source backup csv files. So now patching isn't a multi-step process.
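A minimal sketch of the kind of self-serve daily backup described above, assuming a caller-supplied fetch() that returns the source csv as bytes; the function name and directory layout are illustrative, not the actual setup on the server:

```python
from datetime import date
from pathlib import Path

def backup_source(fetch, backup_dir, day=None):
    """Save a dated copy of what the source API returns as a csv.

    fetch: callable returning the raw csv bytes from the source API.
    Run once per day (e.g. via cron) to build an issue-by-issue archive.
    """
    day = day or date.today()
    out = Path(backup_dir) / f"{day:%Y%m%d}.csv"
    out.parent.mkdir(parents=True, exist_ok=True)
    out.write_bytes(fetch())  # one snapshot per day, named by date
    return out
```

With an archive like this in place, a later patch run can replay each day's snapshot instead of needing the source to still serve old revisions.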
…e_dir not exist/empty + adjust tests accordingly
Note: about the custom_run flag: a non-patch custom run doesn't really exist in nssp like it does in google-symptoms, due to differences in how the two sources handle revision data and how we grab those data. We'll keep the flag in nssp anyway, to maintain consistency and to disambiguate params.json.
Description
Add patching feature to nssp.
One detail that sets this implementation apart from most of our other indicators is that nssp data comes into our db in weekly form instead of daily.
Basically, all weekly data coming into our db has the `issue_date` column marked as the first date of the epiweek that the reporting date falls into, rather than the reporting date itself. For example, I queried nssp data from our api that was just put into the db on July 5th 2024 and got something like this:

Note the issue is `2024-06-30` despite this data coming in on July 5th. That's because `2024-06-30` is the start date of epiweek `202427`, so any data coming in that week will be marked under that issue date.

Furthermore, the reason we use this directory format is the `batch_issue_format` implementation in acquisition, which demands the folder format be `issue_yyyymmdd`, including for patches of weekly data.

Fixes
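The issue-date and directory-naming convention described above can be sketched in Python; `epiweek_start` and `issue_dirname` are illustrative helpers, not the indicator's actual code:

```python
from datetime import date, timedelta

def epiweek_start(d):
    """First day of the (MMWR-style, Sunday-starting) epiweek containing d."""
    # weekday() is 0=Monday .. 6=Sunday, so shift back to the previous Sunday
    return d - timedelta(days=(d.weekday() + 1) % 7)

def issue_dirname(d):
    """Patch directory name in the issue_yyyymmdd format acquisition expects."""
    return f"issue_{epiweek_start(d):%Y%m%d}"

# Data arriving Friday, July 5th 2024 gets the epiweek-start issue date:
print(epiweek_start(date(2024, 7, 5)))   # 2024-06-30
print(issue_dirname(date(2024, 7, 5)))   # issue_20240630
```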