---
output: github_document
---
<!-- README.md is generated from README.Rmd. Please edit that file -->
```{r, include = FALSE}
options(width = 76)
knitr::opts_chunk$set(
  collapse = TRUE,
  comment = "#>",
  fig.path = "man/figures/README-",
  out.width = "100%"
)
```
# epipredict
<!-- badges: start -->
[![R-CMD-check](https://github.com/cmu-delphi/epipredict/actions/workflows/R-CMD-check.yaml/badge.svg)](https://github.com/cmu-delphi/epipredict/actions/workflows/R-CMD-check.yaml)
<!-- badges: end -->
**Note:** This package is currently in development and may not work as expected. Please file bug reports as issues in this repo, and we will do our best to address them quickly.
## Installation
Install the stable version unless you are making changes to the package itself, in which case use the dev version:
```r
# Stable version
pak::pkg_install("cmu-delphi/epipredict@main")
# Dev version
pak::pkg_install("cmu-delphi/epipredict@dev")
```
## Documentation
You can view documentation for the `main` branch at <https://cmu-delphi.github.io/epipredict>.
## Goals for `epipredict`
**We hope to provide:**
1. A set of basic, easy-to-use forecasters that work out of the box. You should be able to customize them to a modest degree. The basic forecasters currently provided are:
    * Baseline flatline forecaster
    * Autoregressive forecaster
    * Autoregressive classifier
    * CDC FluSight flatline forecaster
2. A framework for creating custom forecasters out of modular components. There are four types of components:
    * Preprocessor: do things to the data before model training
    * Trainer: train a model on data, resulting in a fitted model object
    * Predictor: make predictions, using a fitted model object
    * Postprocessor: do things to the predictions before returning
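
As a rough sketch of how the four component types fit together, the block below assembles a forecaster by hand using the built-in `covid_case_death_rates` data (shown in the example further down). The particular steps and layers are one plausible arrangement chosen for illustration, not a description of what any built-in forecaster does internally, and the date filter is an assumption made only to keep the fit small.

```r
library(epipredict)

# Assumption: restrict to the last two months of data to keep the example small.
recent <- covid_case_death_rates |>
  dplyr::filter(time_value >= as.Date("2021-11-01"))

# Preprocessor: lagged predictors and a 14-day-ahead outcome.
preprocessor <- epi_recipe(recent) |>
  step_epi_lag(case_rate, lag = c(0, 7, 14)) |>
  step_epi_lag(death_rate, lag = c(0, 7, 14)) |>
  step_epi_ahead(death_rate, ahead = 14) |>
  step_epi_naomit()

# Trainer: an ordinary linear regression specification from parsnip.
trainer <- parsnip::linear_reg()

# Postprocessor: predict, add residual-based quantiles and dates,
# then threshold the prediction columns (lower bound defaults to 0).
postprocessor <- frosting() |>
  layer_predict() |>
  layer_residual_quantiles() |>
  layer_add_forecast_date() |>
  layer_add_target_date() |>
  layer_threshold(dplyr::starts_with(".pred"))

# Predictor: the fitted workflow, which forecast() applies to the latest data.
fitted_workflow <- epi_workflow(preprocessor, trainer) |>
  fit(recent) |>
  add_frosting(postprocessor)

custom_preds <- forecast(fitted_workflow)
custom_preds
```

Swapping any one piece (different lags, another parsnip model, extra layers) leaves the other three untouched, which is the point of the modular design.
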
**Target audiences:**
* Basic. Has data, calls forecaster with default arguments.
* Intermediate. Wants to examine changes to the arguments and take advantage of
  built-in flexibility.
* Advanced. Wants to write their own forecasters. May be willing to build up
  from some components.
The Advanced user should find their task to be relatively easy. Examples of
these tasks are illustrated in the [vignettes and articles](https://cmu-delphi.github.io/epipredict).
See also the (in progress) [Forecasting Book](https://cmu-delphi.github.io/delphi-tooling-book/).
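
For the Basic audience, a call with default arguments might look like the sketch below. It uses the built-in `covid_case_death_rates` data described in the next section; choosing `death_rate` as the outcome is only for illustration.

```r
library(epipredict)

# Flatline baseline: uses the most recent observation as the forecast.
baseline <- flatline_forecaster(
  covid_case_death_rates,
  outcome = "death_rate"
)
baseline$predictions

# Autoregressive forecaster with its default lags and forecast horizon.
arx_default <- arx_forecaster(
  covid_case_death_rates,
  outcome = "death_rate",
  predictors = c("case_rate", "death_rate")
)
arx_default$predictions
```
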
## Intermediate example
The package comes with some built-in historical data for illustration.
Up-to-date versions of this data can be downloaded with the
[`{epidatr}` package](https://cmu-delphi.github.io/epidatr/)
and processed using
[`{epiprocess}`](https://cmu-delphi.github.io/epiprocess/).[^1]
[^1]: Other epidemiological signals for non-COVID-related illnesses are also
available with [`{epidatr}`](https://github.com/cmu-delphi/epidatr), which
interfaces directly with Delphi's
[Epidata API](https://cmu-delphi.github.io/delphi-epidata/).
```{r epidf, message=FALSE}
library(epipredict)
covid_case_death_rates
```
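
The built-in data is a fixed snapshot. A fresh copy could be pulled along the lines of the sketch below, which is not part of the package: `fetch_rate()` and `fetch_case_death_rates()` are hypothetical helpers, and the `jhu-csse` source, the two signal names, and the date range are assumptions about what you might want to download. It needs network access to the Epidata API.

```r
library(epidatr)
library(epiprocess)

# Hypothetical helper: download one state-level signal and keep only the
# key columns, renaming `value` to `name`.
fetch_rate <- function(signal, name, dates) {
  raw <- pub_covidcast(
    source = "jhu-csse",
    signals = signal,
    geo_type = "state",
    time_type = "day",
    geo_values = "*",
    time_values = dates
  )
  out <- raw[, c("geo_value", "time_value", "value")]
  names(out)[3] <- name
  out
}

# Hypothetical helper: join case and death rates into a single epi_df.
fetch_case_death_rates <- function(dates = epirange(20211101, 20211231)) {
  cases <- fetch_rate("confirmed_7dav_incidence_prop", "case_rate", dates)
  deaths <- fetch_rate("deaths_7dav_incidence_prop", "death_rate", dates)
  dplyr::full_join(cases, deaths, by = c("geo_value", "time_value")) |>
    as_epi_df()
}
```

Calling `fetch_case_death_rates()` should then return an `epi_df` with `case_rate` and `death_rate` columns that can be passed to the forecasters in place of the built-in data.
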
To create and train a simple autoregressive forecaster that predicts the death rate two weeks into the future using past (lagged) deaths and cases, we could call `arx_forecaster()` as follows.
```{r make-forecasts, warning=FALSE}
two_week_ahead <- arx_forecaster(
  covid_case_death_rates,
  outcome = "death_rate",
  predictors = c("case_rate", "death_rate"),
  args_list = arx_args_list(
    lags = list(c(0, 1, 2, 3, 7, 14), c(0, 7, 14)),
    ahead = 14
  )
)
two_week_ahead
```
In this case, we have used a number of different lags for the case rate
(0, 1, 2, 3, 7, and 14 days), while using only three weekly lags (0, 7, and
14 days) for the death rate as predictors. The result is both a fitted model
object, which could be used any time in the future to create different
forecasts, and a set of predicted values (with prediction intervals) for each
location 14 days after the last available time value in the data.
```{r print-model}
two_week_ahead$epi_workflow
```
Fitting the model here involved preprocessing the data to generate the
lagged predictors, estimating a linear model with `stats::lm()`, and then
postprocessing the results to be meaningful for epidemiological tasks. We can
also examine the predictions.
```{r show-preds}
two_week_ahead$predictions
```
The results above show a distributional forecast produced using data through
the end of 2021 for the 14th of January 2022. A prediction for the death rate
per 100K inhabitants is available for every state (`geo_value`) along with a
90% predictive interval.
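
If a different interval is wanted, one option is to request explicit quantile levels and then unpack the distributional column. The sketch below assumes that the `quantile_levels` argument of `arx_args_list()` and the `pivot_quantiles_wider()` helper behave as described in the current package documentation; the levels chosen (an 80% interval plus the median) are illustrative.

```r
library(epipredict)

wider_levels <- arx_forecaster(
  covid_case_death_rates,
  outcome = "death_rate",
  predictors = c("case_rate", "death_rate"),
  args_list = arx_args_list(
    lags = list(c(0, 1, 2, 3, 7, 14), c(0, 7, 14)),
    ahead = 14,
    quantile_levels = c(0.1, 0.5, 0.9) # assumed levels, for illustration
  )
)

# Spread the distributional column into one column per requested level.
quantile_table <- wider_levels$predictions |>
  pivot_quantiles_wider(.pred_distn)
quantile_table
```
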