{EpiNow} has been depreciated in favour of {EpiNow2}
This package estimates the time-varying reproduction number, rate of
spread, and doubling time using a range of open-source tools and current
best practices. It aims to help users avoid some of the limitations of
naive implementations in a framework that is informed by community
feedback and is under active development. It assumes that only limited
data is available on cases by date of onset and instead uses cases by
date of report. These are then imputed to case counts by date of
infection using an uncertain reporting delay and incubation period.
Right truncation of cases is dealt with internally by {EpiNow}
, as is
propogating uncertainty from all inputs into the final parameter
estimates (helping to mitigate spurious findings). Time-varying
estimates of the reproduction number are estimated using the
{EpiEstim}
package by date of
infection with a generation time estimate that includes uncertainty.
Time-varying estimates of the rate of growth are derived using a
quasipoisson GLM with a sliding window, which are then used to estimate
the doubling time. Optimal windows are chosen by using one day ahead
case prediction. Optionally, the time-varying reproduction number can be
forecast forwards in time using an integration with the
{EpiSoon}
package and converted to
a case forecast using a branching process. See the
methods section of our
Covid-19 site for a detailed discussion of the approach.
Install the stable version of the package using
{drat}
:
install.packages("drat")
drat:::add("epiforecasts")
install.packages("EpiNow")
Install the development version of the package with:
remotes::install_github("epiforecasts/EpiNow")
For simple deployment/development a prebuilt docker image is also available (see documentation here).
{EpiNow}
is designed to be used at scale with few changes to the
defaults and a single function call or to be used in an ad-hoc fashion
via individual function calls. In the following section we give an
overview of the simple use case. For more on using each function see the
function
documentation and
introductory
vignette.
A working implementation for COVID-19 can be found
here.
This quick start requires the following packages:
library(EpiNow)
library(EpiSoon)
library(data.table)
Reporting delays can either be fitted using package functionality or
determined elsewhere and then defined with uncertainty for use in
{EpiNow}
. When data is supplied an interval censored gamma and
exponential distribution will be fit and then compared using the {loo}
package. Note that in this example a single bootstrap is used (i.e no
bootstrap) but in real scenarios multiple bootstraps should be used to
represent the uncertainty in the reported distribution (and to
approximate stochastic change over time). The commented code (requires
the {future}
package) can be used to parallelise long running sections
of code.
# future::plan("multiprocess")
example_delays <- rexp(25, 1/10)
delay_dist <- EpiNow::get_dist_def(example_delays,
samples = 5, bootstraps = 1)
delay_dist
#> model max_value params
#> 1: exp 51 <list>
#> 2: exp 51 <list>
#> 3: exp 51 <list>
#> 4: exp 51 <list>
#> 5: exp 51 <list>
Alternatively an uncertain distribution can be defined (for example based on literature estimates). Currently supported distributions are the log normal and gamma distributions.
delay_dist <- EpiNow::lognorm_dist_def(mean = 5, mean_sd = 1, sd = 3, sd_sd = 1,
max_value = 30, samples = 5, to_log = TRUE)
delay_dist
#> model params max_value
#> 1: lognorm <list> 30
#> 2: lognorm <list> 30
#> 3: lognorm <list> 30
#> 4: lognorm <list> 30
#> 5: lognorm <list> 30
This wraps the core functionality of the package and includes results
reporting. It requires a data frame of cases by date of report and a
dist_skel
compatible data frame of reporting delay distributions (as
produced by get_dist_def
or lognorm_dist_def
etc.). Internally
current best estimates of the incubation period, generation time and
reproduction number are then used but these can also be manually
specified (see
here for
the code that generates these estimates). Whilst defaults are likely to
work for most users the
documentation
provides additional options. For regions with high cases loads users
should consider using approximate sampling (approx_delay
). Forecasting
is supported via EpiSoon
and companion packages (see documentation for
an example). If data by date of onset is available this can be passed to
linelist
and used rather than sampling dates of onsets - note this is
currently only partially supported functionality. Please open an issue
if this functionality is needed for your use case.
Save everything to a temporary directory - change this (rt_pipeline
will create the directory for you) to inspect the results
target_dir <- file.path(tempdir(), "test")
Load example case data from {EpiSoon}
and convert to have the required
variable names (note imported cases are supported and should be
delinated with import_status = "imported"
).
cases <- data.table::setDT(EpiSoon::example_obs_cases)
cases <- cases[, `:=`(confirm = as.integer(cases), import_status = "local")][,
cases := NULL]
tail(cases)
#> date confirm import_status
#> 1: 2020-03-17 296 local
#> 2: 2020-03-18 343 local
#> 3: 2020-03-19 399 local
#> 4: 2020-03-20 454 local
#> 5: 2020-03-21 605 local
#> 6: 2020-03-22 367 local
Run the complete pipeline that includes nowcasting cases, estimating the time-varying reproduction number, the rate of growth and the doubling time. See the documentation for an example that incorporates forecasting.
# future::plan("multiprocess")
rt_pipeline(cases = cases,
delay_defs = delay_dist,
target_date = max(cases$date),
target_folder = target_dir)
List available output files.
list.files(target_dir, recursive = TRUE)
#> [1] "2020-03-22/adjusted_r_latest.rds"
#> [2] "2020-03-22/bigr_eff_latest.rds"
#> [3] "2020-03-22/bigr_eff_max_estimate.rds"
#> [4] "2020-03-22/bigr_eff_plot.png"
#> [5] "2020-03-22/bigr_eff_plot.rds"
#> [6] "2020-03-22/bigr_estimates.rds"
#> [7] "2020-03-22/case_forecast.rds"
#> [8] "2020-03-22/cases_by_report.rds"
#> [9] "2020-03-22/cases_plot.png"
#> [10] "2020-03-22/current_cases.rds"
#> [11] "2020-03-22/delays.rds"
#> [12] "2020-03-22/doubling_time_latest.rds"
#> [13] "2020-03-22/incubation.rds"
#> [14] "2020-03-22/latest_date.rds"
#> [15] "2020-03-22/nowcast.rds"
#> [16] "2020-03-22/plot_cases.rds"
#> [17] "2020-03-22/prob_control_latest.rds"
#> [18] "2020-03-22/rate_spread_estimates.rds"
#> [19] "2020-03-22/rate_spread_latest_summary.rds"
#> [20] "2020-03-22/rate_spread_latest.rds"
#> [21] "2020-03-22/rate_spread_overall_summary.rds"
#> [22] "2020-03-22/rate_spread_plot.png"
#> [23] "2020-03-22/rate_spread_plot.rds"
#> [24] "2020-03-22/region_summary.rds"
#> [25] "2020-03-22/rt_cases_plot.png"
#> [26] "2020-03-22/rt_cases_plot.rds"
#> [27] "2020-03-22/summarised_littler.rds"
#> [28] "2020-03-22/summarised_nowcast.rds"
#> [29] "2020-03-22/summarised_reff.rds"
#> [30] "2020-03-22/time_varying_littler.rds"
#> [31] "2020-03-22/time_varying_params.rds"
#> [32] "latest/adjusted_r_latest.rds"
#> [33] "latest/bigr_eff_latest.rds"
#> [34] "latest/bigr_eff_max_estimate.rds"
#> [35] "latest/bigr_eff_plot.png"
#> [36] "latest/bigr_eff_plot.rds"
#> [37] "latest/bigr_estimates.rds"
#> [38] "latest/case_forecast.rds"
#> [39] "latest/cases_by_report.rds"
#> [40] "latest/cases_plot.png"
#> [41] "latest/current_cases.rds"
#> [42] "latest/delays.rds"
#> [43] "latest/doubling_time_latest.rds"
#> [44] "latest/incubation.rds"
#> [45] "latest/latest_date.rds"
#> [46] "latest/nowcast.rds"
#> [47] "latest/plot_cases.rds"
#> [48] "latest/prob_control_latest.rds"
#> [49] "latest/rate_spread_estimates.rds"
#> [50] "latest/rate_spread_latest_summary.rds"
#> [51] "latest/rate_spread_latest.rds"
#> [52] "latest/rate_spread_overall_summary.rds"
#> [53] "latest/rate_spread_plot.png"
#> [54] "latest/rate_spread_plot.rds"
#> [55] "latest/region_summary.rds"
#> [56] "latest/rt_cases_plot.png"
#> [57] "latest/rt_cases_plot.rds"
#> [58] "latest/summarised_littler.rds"
#> [59] "latest/summarised_nowcast.rds"
#> [60] "latest/summarised_reff.rds"
#> [61] "latest/time_varying_littler.rds"
#> [62] "latest/time_varying_params.rds"
Read in and examine the output nowcast cases.
summarised_nowcast <- readRDS(paste0(target_dir, "/latest/summarised_nowcast.rds"))
tail(summarised_nowcast)
#> date type bottom top lower upper median mean
#> 1: 2020-03-17 Observed by report date NA NA NA NA 296 NA
#> 2: 2020-03-18 Observed by report date NA NA NA NA 343 NA
#> 3: 2020-03-19 Observed by report date NA NA NA NA 399 NA
#> 4: 2020-03-20 Observed by report date NA NA NA NA 454 NA
#> 5: 2020-03-21 Observed by report date NA NA NA NA 605 NA
#> 6: 2020-03-22 Observed by report date NA NA NA NA 367 NA
#> confidence
#> 1: 1
#> 2: 1
#> 3: 1
#> 4: 1
#> 5: 1
#> 6: 1
Read in and examine the output time-varying estimates.
time_varying_params <- readRDS(paste0(target_dir, "/latest/time_varying_params.rds"))
names(time_varying_params)
#> [1] "R0" "rate_of_spread" "raw_R0"
Examine the summarised Rt estimates.
tail(time_varying_params$R0)
#> type date rt_type bottom top lower upper median
#> 1: nowcast 2020-03-09 nowcast 1.201786 1.482101 1.219509 1.300683 1.289869
#> 2: nowcast 2020-03-10 nowcast 1.163689 1.493488 1.163689 1.279485 1.279485
#> 3: nowcast 2020-03-11 nowcast 1.227604 1.502693 1.227604 1.267199 1.267199
#> 4: nowcast 2020-03-12 nowcast 1.179222 1.342219 1.250780 1.307949 1.265503
#> 5: nowcast 2020-03-13 nowcast 1.124467 1.350155 1.253722 1.350155 1.255616
#> 6: nowcast 2020-03-14 nowcast 1.054925 1.302339 1.086232 1.214977 1.184348
#> mean std prob_control mean_window sd_window mean_crps sd_crps
#> 1: 1.318159 0.10372410 0.00 3.4 0.8164966 5.744 1.846546
#> 2: 1.327028 0.10624934 0.00 4.2 1.6329932 3.672 1.038717
#> 3: 1.306971 0.10383570 0.00 4.2 2.4494897 4.288 1.207035
#> 4: 1.262450 0.05103133 0.00 3.6 1.6583124 7.736 4.043480
#> 5: 1.241697 0.07963535 0.00 4.0 2.3273733 10.296 9.058793
#> 6: 1.176528 0.08920745 0.04 1.8 0.7637626 7.552 3.564492
#> R0_range
#> 1: <list>
#> 2: <list>
#> 3: <list>
#> 4: <list>
#> 5: <list>
#> 6: <list>
Examine the Rt and case plot.
knitr::include_graphics(paste0(target_dir, "/latest/rt_cases_plot.png"))
This function provides a wrapper to {rt_pipeline}
that allows it to be
run on multiple regions at once with the same assumed report delay.
Define a new target directory.
target_dir <- file.path(tempdir(), "test-regional")
Define cases in multiple regions delineated by the region variable.
cases <- data.table::rbindlist(list(
data.table::copy(cases)[, region := "testland"],
cases[, region := "realland"]))
Run the pipeline on each region in turn. The commented code (requires
the {future}
package) can be used to run regions in parallel (see
here
for an optimised nested example).
# future::plan("multiprocess")
regional_rt_pipeline(cases = cases,
delay_defs = delay_dist,
target_folder = target_dir)
List output folders.
list.files(target_dir, recursive = FALSE)
#> [1] "realland" "testland"
Summarise the results accross all regions.
EpiNow::regional_summary(results_dir = file.path(tempdir(), "test-regional"),
summary_dir = file.path(tempdir(), "test-summary"),
target_date = "latest",
region_scale = "Country",
csv_region_label = "country",
log_cases = TRUE)
An example of the summary output can be seen here.
Rmarkdown templates are provided in the package (templates
) for
semi-automated reporting of the estimates. These are currently
undocumented but an example integration can be seen
here.
If using these templates to report your results please highlight our
limitations as these are key to
understanding our results.
File an issue here if you have identified an issue with the package. Please note that due to operational constraints priority will be given to users informing government policy or offering methodological insights. We welcome all contributions, in particular those that improve the approach or the robustness of the code base.