Issue 793: Improve documentation for those specifying a non-parametric delay #799
Thanks @kaitejohnson, this is great and definitely a very good idea to clarify. I might be misremembering, but I think generation intervals are in fact zero-indexed, with the 0-component set to zero as of version 1.4.0 (see EpiNow2/inst/stan/functions/delays.stan, line 64 in 95c3643).
The point that this is not clear and easy to get wrong by accident obviously stands regardless. I think it would be great to clarify also in
I agree.
I see, I missed that change; I can adjust the language accordingly. To make sure my interpretation is now correct: if a user passes in a GI pmf, it should be 0-indexed (just like the other delay pmfs), because under the hood the left truncation and renormalization occurs (in the function you sent).
Yes, exactly.
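For readers following along, the left truncation and renormalization confirmed here can be sketched in base R. This is an illustration only, not the Stan code in `delays.stan`; the helper name `left_truncate_pmf` is hypothetical and the input values are made up.

```r
# Hypothetical helper (not part of EpiNow2): zero the day-0 bin of a
# 0-indexed PMF and rescale the remaining mass so it sums to one.
left_truncate_pmf <- function(pmf) {
  stopifnot(is.numeric(pmf), length(pmf) >= 2, all(pmf >= 0))
  pmf[1] <- 0               # R element 1 holds the mass at delay 0
  stopifnot(sum(pmf) > 0)   # otherwise there is nothing left to rescale
  pmf / sum(pmf)
}

# Illustrative 0-indexed PMF with some mass on day 0.
gi <- left_truncate_pmf(c(0.2, 0.4, 0.3, 0.1))
print(gi)
```

The first element of the result is zero and the remaining elements keep their relative proportions.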
Looking at the code this does appear to be the case and I must say I had no idea. I think we should throw an information message in
If a user wishes to specify a PMF bin-by-bin, it's reasonable to ask them to decide explicitly how to handle any mass in the 0th bin. So failing if a user falsely assumes 1-indexing is, in my opinion, worth the cost of forcing drop-and-renormalize users to perform that operation manually.
That, I think, is a really good suggestion and would do away with the need for an explicit warning in the case that the first bin is zero.
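A rough base-R sketch of the fail-fast idea follows: reject a PMF with mass in the 0th bin and leave drop-and-renormalize to the user. The function `check_gi_pmf`, its tolerance, and the error message are all hypothetical and are not the EpiNow2 implementation.

```r
# Hypothetical check (not EpiNow2 code): refuse a generation interval PMF
# whose day-0 bin carries mass, so the user has to handle it explicitly.
check_gi_pmf <- function(pmf, tol = 1e-8) {
  if (abs(pmf[1]) > tol) {
    stop("The first (day 0) element of the generation interval PMF must be zero.")
  }
  invisible(pmf)
}

raw_pmf <- c(0.1, 0.3, 0.4, 0.2)   # illustrative values only

# Drop-and-renormalize done manually by the user before passing the PMF on.
manual_pmf <- c(0, raw_pmf[-1]) / sum(raw_pmf[-1])
check_gi_pmf(manual_pmf)

# The unadjusted vector is rejected instead of being silently modified.
rejected <- inherits(try(check_gi_pmf(raw_pmf), silent = TRUE), "try-error")
print(rejected)
```

The design trade-off is the one described above: a user who wrongly assumes 1-indexing gets an error rather than a silently shifted distribution.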
The vignette is updated to explain that the GI should be 0-indexed with a mass of 0 on the first element. I didn't adjust any documentation in If this feels like too much detail for a vignette, feel free to ignore. I think the warning will flag for most using a fixed non-parametric GI.
Content-wise this looks great now.
Just one request: could you move these changes over to EpiNow2.Rmd.orig? The .Rmd files are generated from these with pre-computed results (so they don't have to be re-rendered every time we build the package, which takes quite a lot of time/computation). We can then build the Rmd via the corresponding action in https://github.com/epiforecasts/EpiNow2/blob/main/.github/workflows/render-EpiNow2.yaml
And add yourself to the DESCRIPTION as a contributor.
Co-authored-by: Sebastian Funk <[email protected]>
I am guessing it's ok that I didn't remove the changes from
You’re guessing right, I think.
If this is not the case, a warning will indicate that the vector is being left-truncated and renormalized.

```r
example_non_parametric_gi <- NonParametric(pmf = c(0, 0.3, 0.5, 0.2))
```
I know this has been merged, but I was catching up and noticed the following: given the text preceding the chunk, I would have expected to see the warning showcased here, i.e., the example should use gt_opts() instead of the direct call to NonParametric(), like so:
> gt_opts(NonParametric(pmf = c(0.1, 0.3, 0.5, 0.2)))
- nonparametric distribution
PMF: [0.091 0.27 0.45 0.18]
Warning message:
Specifying nonparametric generation times with nonzero first element was deprecated in
EpiNow2 1.6.0.
ℹ Since zero generation times are not supported by the model, the generation time will be
left-truncated at one.
ℹ In future versions this will cause an error. Please ensure that the first element of
the nonparametric generation interval is zero.
ℹ The deprecated feature was likely used in the EpiNow2 package.
Please report the issue at <https://github.com/epiforecasts/EpiNow2/issues>.
This warning is displayed once every 8 hours.
Call `lifecycle::last_lifecycle_warnings()` to see where this warning was generated.
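The printed PMF is consistent with the input first being normalized to sum to one; the left truncation named in the warning would be a further step. A base-R sketch of that arithmetic is below. Rounding to two significant digits is an assumption chosen to match the printed values, and none of this is EpiNow2 code.

```r
pmf_in <- c(0.1, 0.3, 0.5, 0.2)   # the input from the call above; sums to 1.1

# Normalizing alone is consistent with the printed PMF.
normalized <- pmf_in / sum(pmf_in)
print(signif(normalized, 2))

# Left truncation at one: zero the day-0 bin, then renormalize.
truncated <- c(0, normalized[-1]) / sum(normalized[-1])
print(truncated)
```

For this input the truncated result coincides with the PMF `c(0, 0.3, 0.5, 0.2)` used in the vignette chunk.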
@jamesmbaazam Do you think it makes sense to include a non-parametric pmf that produces this warning as you have shown?
The one currently included has a value of 0 on day 0, so it won't produce the warning! Perhaps the preceding text isn't clear enough.
I think the text is fine, but personally I would have expected the code sample to showcase the warning thrown when the delay is not 0-indexed. Additionally, I would suggest explicitly printing the results of example_non_parametric_gi and example_non_parametric_delay in the vignette.
I can open an issue and subsequent PR to add this.
Do you think there should be an example for both the correct specification (so with the 0 on day 0) and the incorrect specification that results in a warning (as you demonstrated)? My only concern is about bloating the vignette, otherwise I think it could make sense to show both.
I'm not convinced we want to show a way of specifying that we're discouraging in the warning (and that will cause an error in the future). Perhaps we should just remove the highlighted sentence if it's confusing?
@sbfnk You're right about not encouraging the wrong specification.
Maybe, we can reword this part "If this is not the case, a warning will indicate that the vector is being left-truncated and normalized." -> "If this is not the case, the vector will be left-truncated and normalized."
I do see that the doc of gt_opts() has the following wording: "Because the discretised renewal equation used in the package does not support zero generation times, any distribution specified here will be left-truncated at one, i.e. the first element of the nonparametric or discretised probability distribution used for the generation time is set to zero and the resulting distribution renormalised." Shouldn't we just reuse that here?
I still think that in the vignette we should only tell users how to specify this correctly and not what happens if they fail to do so (which they'll find out with the warning anyway). So I'd vote for removing the sentence altogether.
@kaitejohnson Would you like to take this one 😃 ?
Description
This PR closes #793.
It expands on the vignette details describing how to specify a generation interval and reporting delay. Specifically, it gives an example of how to provide a fixed non-parametric vector delay distribution. I added language explaining that the generation interval should be indexed starting at day 1 of an infection whereas the reporting delays should be indexed starting at day 0 (as is consistent with the model definition, but might not be clear to a user).
I looked through generation_time_opts() and the distributions handling but wasn't sure if this distinction was appropriate in either of those. I'm happy to add some language about this in the documentation if there's a specific place you all think is most appropriate. The context for this was that it's easy to accidentally pass in a generation interval distribution shifted by one if assuming the GI is 0-indexed.
I think a separate issue could be to use primarycensoreddist to generate a GI pmf in data-raw and pass that in as an example, either instead of or in addition to the current example_generation_time.
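As a rough idea of what such a generated PMF could look like, the sketch below builds a 0-indexed GI pmf in base R. It uses a plain daily discretisation of a gamma distribution as a simple stand-in and does not call primarycensoreddist or account for primary censoring; the mean, standard deviation, and maximum day are assumed values for illustration only.

```r
# Assumed, illustrative gamma parameters (not taken from the package).
gi_mean <- 3.6
gi_sd <- 3.1
shape <- (gi_mean / gi_sd)^2
rate <- gi_mean / gi_sd^2
max_day <- 14

# Naive discretisation: mass in [d, d + 1) is assigned to day d, d = 0..max_day.
cdf <- pgamma(0:(max_day + 1), shape = shape, rate = rate)
pmf <- diff(cdf)

# Make the day-0 handling explicit: zero the first bin and renormalize.
pmf[1] <- 0
pmf <- pmf / sum(pmf)

print(round(pmf, 3))
# With EpiNow2 loaded, this vector could then be passed as NonParametric(pmf = pmf),
# in the same way as the vignette example.
```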
Initial submission checklist
devtools::test() and devtools::check()
devtools::document()
lintr::lint_package()
After the initial Pull Request