reported_cases_opts() #346
Comments
Perhaps |
A downside of this approach is that, if we want a common An alternative solution would be to keep the |
Trying to separate the steps involved in closing the issue if going with the approach suggested in the previous comment:
Once this issue is closed we can then make the data column argument more flexible, addressing #505 |
Makes sense to me. Would bullet 4 require us to deprecate the current names? |
Yes, I think so. |
I can work on this if it's good to go. |
Yes, it should be good to go - I think ideally addressing each of the bullet points in sequence using separate PRs. |
I might be wrong but the proposed |
Good question. There are two options:
Option (2) might be the easier one as it doesn't require updating any internal logic (e.g. where some internal processing is done before calling |
|
If going down that route we probably want to rename it to |
Two options seem to be apparent here:
|
In particular, the |
I really like those suggestions. I think they might take a bit more thinking though so would suggest to push them into a future release in order to get 1.5.0 out asap. |
Agreed. They're not user-facing so not a priority for this release. |
Proposed set up for data handling / cleaning would be to
In principle the horizon stuff could also be separate in an |
We probably still want a |
I like this idea a lot |
@jamesmbaazam what do you think? |
I like all the points.
I think the forecasting stuff should probably be done internally using Maybe,
Maybe, more specifically, |
I'm generally pretty sceptical of this as an idea. If users are going to get useful things out of the package they likely have thoughts on how to change the name of a variable already. Or is the proposal you then track their column name through the code base? Again I am a bit sceptical of the value added here to most users. |
I generally agree, just flagging that if ever going ahead with the suggestion in #371 (comment) we might need a way to point out which column in a passed data frame corresponds to which observation model (though that'll likely look different from a Somewhere we might also want users to specify what the data represent, i.e. #505 |
It seems we broadly have two options here:

```r
obs |>
  rename(value = confirm) |>
  filter_leading_zeroes() |>
  apply_zero_threshold(threshold = 10) |>
  add_horizon(n = 3, frequency = 7) |>
  estimate_infections()
```

or

```r
obs |>
  estimate_infections(
    data = data_opts(col = "confirm", zero_threshold = 10),
    forecast = forecast_opts(n = 3, frequency = 7)
  )
```
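For concreteness, the explicit cleaning steps in the first option could look roughly like the sketch below. The function names come from the proposal, but the bodies are assumptions made for illustration: in particular, the interpretation of the threshold (a zero is treated as missing when the mean of a centred 7-day window exceeds it) and the `col` and `window` arguments are hypothetical, not the package's implementation.

```r
# Hypothetical sketch of two of the proposed cleaning helpers (base R only).

# Drop all rows before the first strictly positive observation.
filter_leading_zeroes <- function(data, col = "value") {
  nonzero <- which(!is.na(data[[col]]) & data[[col]] > 0)
  if (length(nonzero) == 0) {
    return(data[0, , drop = FALSE])
  }
  data[seq(nonzero[1], nrow(data)), , drop = FALSE]
}

# Replace a zero with NA when the mean of the surrounding window
# (assumed: centred, 7 days) is above `threshold`.
apply_zero_threshold <- function(data, threshold = Inf, col = "value",
                                 window = 7) {
  x <- data[[col]]
  n <- length(x)
  half <- window %/% 2
  for (i in which(!is.na(x) & x == 0)) {
    idx <- seq(max(1, i - half), min(n, i + half))
    if (mean(data[[col]][idx], na.rm = TRUE) > threshold) {
      x[i] <- NA
    }
  }
  data[[col]] <- x
  data
}

# Toy data: two leading zeroes and one implausible zero mid-series.
obs <- data.frame(
  date = as.Date("2024-01-01") + 0:9,
  value = c(0, 0, 40, 50, 0, 60, 55, 45, 50, 65)
)

cleaned <- obs |>
  filter_leading_zeroes() |>
  apply_zero_threshold(threshold = 10)
```

Written this way, each step returns a data frame that the user can inspect before passing it on, which is the main argument made for the explicit approach.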
I think the second cannot be piped that way as, by definition, the pipe passes the data to the first argument. So it will rather be

```r
estimate_infections(
  data = data_opts(obs, col = "confirm", zero_threshold = 10),
  forecast = forecast_opts(n = 3, frequency = 7)
)
```

Until now, users have not had to do any data cleaning using exported functions here, so I am more inclined to vote for the second option.
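As a rough illustration of the second option, a `data_opts()` constructor might simply validate its inputs and return a classed list. This is a minimal sketch under stated assumptions: the argument names follow the call above, while the validation rules, the default values, and the returned structure are hypothetical.

```r
# Hypothetical sketch of a data_opts() constructor (not the package's API).
data_opts <- function(data, col = "confirm", zero_threshold = Inf) {
  stopifnot(
    is.data.frame(data),
    is.character(col), length(col) == 1, col %in% names(data),
    is.numeric(zero_threshold), length(zero_threshold) == 1
  )
  opts <- list(data = data, col = col, zero_threshold = zero_threshold)
  # Assumed class layout, so downstream code can check what it received.
  class(opts) <- c("data_opts", "list")
  opts
}

obs <- data.frame(
  date = as.Date("2024-01-01") + 0:2,
  confirm = c(5, 8, 13)
)
opts <- data_opts(obs, col = "confirm", zero_threshold = 10)
```

With this shape, a misspelled column name fails at construction time rather than inside model fitting.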
If the case data set becomes part of |
That is a valid point. A counterpoint would be that with the explicit functions users can actually see what happens (e.g. which values get filtered out / changed, or which dates will be forecast) whereas if it's all internal to |
We already have a
Valid point. Alternatively, we could improve the logging and messaging in the current setup to report all of this. |
Currently, all inputs except for `reported_cases` are managed via a helper function (e.g. `delay_opts()`) that enables specifying options. It would make sense to standardize `reported_cases` to be in line with this approach. This would also make sense as there are currently several data processing steps (e.g. to deal with missing dates) that occur inside package code and are not well surfaced to the user. Putting these in a `reported_cases_opts()` function would help resolve this and bring the user closer to the data being used for model fitting, which is generally a good idea.
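To illustrate the kind of processing step that could be surfaced to the user, here is a minimal sketch of filling in missing dates while reporting what was changed. The function name `complete_dates()` and its `date_col` argument are hypothetical, and the sketch assumes one row per date; it is not how the package currently handles missing dates.

```r
# Hypothetical sketch: make missing dates explicit and tell the user about it.
complete_dates <- function(data, date_col = "date") {
  # Full daily sequence between the first and last observed date.
  all_dates <- data.frame(
    seq(min(data[[date_col]]), max(data[[date_col]]), by = "day")
  )
  names(all_dates) <- date_col
  # Left join: dates absent from `data` get NA in every other column.
  out <- merge(all_dates, data, by = date_col, all.x = TRUE)
  n_added <- nrow(out) - nrow(data)
  if (n_added > 0) {
    message("Added ", n_added, " missing date(s) with NA observations.")
  }
  out
}

obs <- data.frame(
  date = as.Date(c("2024-01-01", "2024-01-02", "2024-01-05")),
  confirm = c(5, 8, 13)
)
completed <- complete_dates(obs)
```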