The main code for running our analysis is in analyze.py. The ExponentialGrowthRateEstimator takes the following hyperparameters. cumulative (Boolean) indicates whether the counts that will be passed in are cumulative counts or new cases. approximate_beta is used to interpolate missing data (days where the cumulative count is unknown). family is the generalized linear model family to use: either Poisson or NegativeBinomial. We recommend using NegativeBinomial, as the variance estimates from the Poisson seem to be a little optimistic, given how messy COVID-19 count data is. We recommend a default alpha value (for the Negative Binomial distribution) of 0.10 or 0.15.
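To see why the Poisson intervals can look optimistic, it helps to compare the variance each family assigns to a count with mean mu. The sketch below assumes the common NB2 parameterization, in which the variance is mu + alpha * mu^2; whether analyze.py uses exactly this convention is an assumption here. The function names are illustrative and are not part of analyze.py.

```python
def poisson_variance(mu):
    """Variance of a Poisson count with mean mu (equal to the mean)."""
    return mu


def negative_binomial_variance(mu, alpha):
    """Variance under the NB2 parameterization (assumed): mu + alpha * mu**2."""
    return mu + alpha * mu ** 2


if __name__ == "__main__":
    # Compare the two families at a few mean counts, using alpha = 0.10.
    for mu in (10, 100, 1000):
        ratio = negative_binomial_variance(mu, 0.10) / poisson_variance(mu)
        print(f"mean={mu}: NB variance / Poisson variance = {ratio:.1f}")
```

Under this parameterization the ratio is 1 + alpha * mu, so the gap between the two families widens as counts grow, which is where overdispersed count data would make Poisson intervals too narrow.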
Then, pass the numpy vectors day, indicating the day of each measurement, and cases, indicating the respective count of new or cumulative cases, to fit. The growth_rate and growth_rate_confint methods return the growth rate and 95% confidence intervals, respectively.
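For readers who want to see the underlying idea without the estimator class, here is a minimal, standard-library sketch of a Poisson regression with a log link, fitted by Newton's method. It is not the code in analyze.py: it covers only the Poisson family, it assumes the growth rate is the coefficient on day in the log-linear model, and fit_poisson_growth is a hypothetical name. The interval is a plain 95% Wald interval.

```python
import math


def fit_poisson_growth(day, cases, n_iter=25):
    """Fit cases ~ Poisson(exp(b0 + b1 * day)) by Newton's method.

    Returns (b1, (lower, upper)): the daily growth rate and a 95% Wald interval.
    """
    n = len(day)
    # Starting point: ordinary least squares on log(cases + 0.5).
    logs = [math.log(c + 0.5) for c in cases]
    t_bar = sum(day) / n
    l_bar = sum(logs) / n
    sxx = sum((t - t_bar) ** 2 for t in day)
    b1 = sum((t - t_bar) * (l - l_bar) for t, l in zip(day, logs)) / sxx
    b0 = l_bar - b1 * t_bar

    def information(b0, b1):
        """Fitted means and the entries of the 2x2 Fisher information matrix."""
        mu = [math.exp(b0 + b1 * t) for t in day]
        i00 = sum(mu)
        i01 = sum(t * m for t, m in zip(day, mu))
        i11 = sum(t * t * m for t, m in zip(day, mu))
        return mu, i00, i01, i11

    for _ in range(n_iter):
        mu, i00, i01, i11 = information(b0, b1)
        # Score vector (gradient of the log-likelihood).
        g0 = sum(c - m for c, m in zip(cases, mu))
        g1 = sum(t * (c - m) for t, c, m in zip(day, cases, mu))
        det = i00 * i11 - i01 ** 2
        # Newton step: add (information matrix)^-1 times the score.
        step0 = (i11 * g0 - i01 * g1) / det
        step1 = (-i01 * g0 + i00 * g1) / det
        b0 += step0
        b1 += step1
        if abs(step0) + abs(step1) < 1e-12:
            break

    # Variance of b1 is the lower-right entry of the inverse information.
    _, i00, i01, i11 = information(b0, b1)
    se = math.sqrt(i00 / (i00 * i11 - i01 ** 2))
    return b1, (b1 - 1.96 * se, b1 + 1.96 * se)


if __name__ == "__main__":
    day = list(range(20))
    cases = [10 * math.exp(0.2 * t) for t in day]  # noise-free synthetic counts
    rate, (lower, upper) = fit_poisson_growth(day, cases)
    print(f"growth rate per day: {rate:.4f} (95% CI {lower:.4f} to {upper:.4f})")
```

Because the synthetic counts follow the model exactly, the fitted rate recovers the value used to generate them; real data will be noisier, and the Negative Binomial family would give wider intervals.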
A simple example of calling the analysis code is available in Simple Example.ipynb. See the Jupyter notebooks Process States.ipynb and Main figures.ipynb for more complex examples of how we used this analysis to analyze and plot results for many regions at once.
jhu_process_data.py reads the JHU data into a nested dictionary of the form dict['country']['city'] = array. The JHU repository is currently being reformatted, so there may be some inconsistencies.
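The nested layout described above can be illustrated with a small sketch. This is not the code in jhu_process_data.py: the column layout, the placeholder names, and the counts are all made up for illustration, read_nested_counts is a hypothetical helper, and plain lists stand in for arrays to keep the example dependency-free.

```python
import csv
import io
from collections import defaultdict

# Made-up sample in a wide layout: one row per location, one column per date.
SAMPLE_CSV = """Region,Country,day1,day2,day3
RegionX,CountryA,1,2,4
RegionY,CountryA,0,1,1
,CountryB,0,1,3
"""


def read_nested_counts(handle):
    """Return {country: {region: [counts...]}} from a wide CSV file object."""
    data = defaultdict(dict)
    reader = csv.reader(handle)
    next(reader)  # skip the header row
    first_count_col = 2  # columns before this hold the location labels
    for row in reader:
        region, country = row[0], row[1]
        counts = [int(value) for value in row[first_count_col:]]
        # Rows with no region label are stored under the key "all".
        data[country][region or "all"] = counts
    return dict(data)


data = read_nested_counts(io.StringIO(SAMPLE_CSV))
print(data["CountryA"]["RegionX"])
```

Keying first by country and then by region keeps lookups for a single location to two dictionary accesses, and a missing region label is handled explicitly rather than producing an empty-string key.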
To pull the CSV files for global death data and state-level data, run the following two scripts:

    sh pull_global.sh
    sh pull_states.sh

You'll probably want to run these roughly daily to keep the sources up to date.