Merge pull request #59 from poissonconsulting/reviewjoe

review
poissonconsulting · Apr 30, 2024 · eff6d6d · eff6d6d
2 parents 95b09cf + ec16bcb
commit eff6d6d
Show file tree

Hide file tree

Showing 8 changed files with 65 additions and 30 deletions.
diff --git a/DESCRIPTION b/DESCRIPTION
@@ -12,9 +12,11 @@ Authors@R: c(
     person("Environment and Climate Change Canada", role = "cph")
   )
 Description: Estimates annual survival, recruitment and population growth
-    for boreal caribou populations using hierarchical Bayesian models.
+    for boreal caribou populations using Bayesian and Maximum Likelihood
+    models with fixed and random effects.
 License: Apache License (== 2.0) | file LICENSE
-URL: https://poissonconsulting.github.io/bboutools/
+URL: https://github.com/poissonconsulting/bboutools,
+    https://poissonconsulting.github.io/bboutools/
 BugReports: https://github.com/poissonconsulting/bboutools/issues
 Depends:
     nimble,

diff --git a/R/devtools-helpers.R b/R/devtools-helpers.R
@@ -0,0 +1,19 @@
+# Copyright 2023 Province of Alberta
+#
+# Licensed under the Apache License, Version 2.0 (the "License");
+# you may not use this file except in compliance with the License.
+# You may obtain a copy of the License at
+#
+# http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+
+release_questions <- function() {
+  message(
+    "Have you confirmed Apache 2.0 license at the top of all code files?"
+  )
+}
diff --git a/README.Rmd b/README.Rmd
@@ -23,15 +23,16 @@ library(bboutools)
 [![Codecov test coverage](https://codecov.io/gh/poissonconsulting/bboutools/branch/main/graph/badge.svg)](https://app.codecov.io/gh/poissonconsulting/bboutools?branch=main)
 <!-- badges: end -->
 
-`bboutools` is an R package to estimate the annual survival, recruitment and population growth for boreal caribou using Bayesian models to facilitate direct comparison of estimates across jurisdictions.
+`bboutools` is an R package to estimate the annual survival, recruitment and population growth for boreal caribou populations using Bayesian and Maximum Likelihood models with fixed and random effects.
+It was developed to facilitate direct comparison of estimates across jurisdictions.
 
 ## Installation
 
 To install the latest development version:
 
 ``` r
-# install.packages("pak")
-pak::pak("poissonconsulting/bboutools")
+# install.packages("remotes")
+remotes::install_github("poissonconsulting/bboutools")
 ```
 
 ## Introduction

diff --git a/README.md b/README.md
@@ -13,16 +13,18 @@ coverage](https://codecov.io/gh/poissonconsulting/bboutools/branch/main/graph/ba
 <!-- badges: end -->
 
 `bboutools` is an R package to estimate the annual survival, recruitment
-and population growth for boreal caribou using Bayesian models to
-facilitate direct comparison of estimates across jurisdictions.
+and population growth for boreal caribou populations using Bayesian and
+Maximum Likelihood models with fixed and random effects. It was
+developed to facilitate direct comparison of estimates across
+jurisdictions.
 
 ## Installation
 
 To install the latest development version:
 
 ``` r
-# install.packages("pak")
-pak::pak("poissonconsulting/bboutools")
+# install.packages("remotes")
+remotes::install_github("poissonconsulting/bboutools")
 ```
 
 ## Introduction

diff --git a/man/bboutools-package.Rd b/man/bboutools-package.Rd
diff --git a/vignettes/articles/methods.Rmd b/vignettes/articles/methods.Rmd
@@ -33,21 +33,20 @@ The use of random effects is especially beneficial when some months/years have s
 In the case of sparse data or extreme values, estimates will tend to be pulled toward the mean (i.e., 'shrinkage'). 
 For missing data, the estimate will be equal to the mean. 
 Shrinkage may not be desired if extreme values are likely to represent the true value (e.g., numerous wolf attacks in one year). 
-In this case, a fixed effect model would yield better estimates. 
+In this case, a fixed effect model would yield more reliable estimates. 
 
 Fixed and random effects can be used in Bayesian or frequentist models. 
 
 ## Maximum Likelihood vs Posterior Probability
 
 The frequentist approach simply identifies the parameter values that maximize the likelihood, i.e., have the greatest probability of having produced the data if they were true.
 It does this by searching parameter space for the combination of parameter values with the Maximum Likelihood.
-Parameter estimates for random effects can be estimated using the Laplace approximation (i.e., with software packages [TMB](https://arxiv.org/pdf/1509.00660.pdf) or [Nimble](https://r-nimble.org/html_manual/cha-AD.html#how-to-use-laplace-approximation)). 
 The CIs are calculated using the standard errors, assuming that the likelihood is normally distributed.
 This approach has the advantage of being fast.
 
-The Bayesian approach multiplies the likelihood by the probability of the parameter values being true based on prior knowledge to get the posterior probability of the parameter values being true based on the data.
+The Bayesian approach multiplies the likelihood by the prior probability of the parameter values being true to get the posterior probability of the parameter values being true based on the data.
 Bayesian methods repeatedly sample from the posterior probability distributions using MCMC (Monte Carlo Markov Chain) methods.
-This approach has the advantage of allowing derived parameters such as the population growth rate to be accurately estimated from the primary survival and recruitment parameters.
+This approach has the advantage of allowing derived parameters such as the population growth rate to be easily estimated with full uncertainty from the primary survival and recruitment parameters.
 
 To demonstrate, we use an anonymized data set to compare annual survival estimates from a:  
 
@@ -138,7 +137,9 @@ This is a more straightforward task with Bayesian models.
 
 ## bboutools
 
-`bboutools` provides the option to estimate parameter values using a Maximum Likelihood or a fully Bayesian approach. Random effects are used where appropriate by default. The Bayesian approach also uses biologically reasonable, weakly informative priors by default. 
+`bboutools` provides the option to estimate parameter values using a Maximum Likelihood or a fully Bayesian approach. 
+Random effects are used where appropriate by default. 
+The Bayesian approach also uses biologically reasonable, weakly informative priors by default. 
 `bboutools` provides relatively simple general models that can be used to compare survival, recruitment and population growth estimates across jurisdictions.
 
 By default, the `bboutools` Bayesian method saves 1,000 MCMC samples from each of three chains (after discarding the first halves). 
@@ -213,6 +214,7 @@ for(i in 1:nAnnual) {
 ```
 
 ### Predicted Survival, Recruitment and Population Growth
+
 As ungulate populations are generally polygynous survival and recruitment are estimated with respect to the number of adult (mature) females.
 
 To estimate recruitment following @decesare_estimating_2012, the predicted annual calves per female adult is first divided by two to give the expected number of female calves per adult female (under the assumption of a 1:1 sex ratio).

diff --git a/vignettes/articles/priors.Rmd b/vignettes/articles/priors.Rmd
@@ -11,17 +11,17 @@ knitr::opts_chunk$set(
 )
 ```
 
-A prior distribution represents our prior understanding or belief in the value of an unknown parameter. 
+A prior distribution represents the existing uncertainty in the value of an unknown parameter. 
 The influence of a prior on the posterior distribution will decrease as the number of observations increases. 
-In general, `bboutools` uses biologically reasonable, weakly informative priors by default [@gelman_prior_2017; @mcelreath_statistical_2016].  
+`bboutools` uses weakly informative priors by default [@gelman_prior_2017; @mcelreath_statistical_2016].  
 
-If the user is interested in fitting models without priors, see `bb_fit_recruitment_ml()`/`bb_fit_survival_ml()`, which have identical models but use a frequentist approach (Maximum Likelihood) to parameter estimation. 
+If the user is interested in fitting models without priors, see `bb_fit_recruitment_ml()` and `bb_fit_survival_ml()`, which have identical models but use a frequentist approach (Maximum Likelihood) to parameter estimation. 
 Models fit with Maximum Likelihood are equivalent to Bayesian models with completely uninformative priors [@mcelreath_statistical_2016].  
 
 ### Survival
 
 Given the full model, the expected survival probability for the $i^{th}$ year and $j^{th}$ month is
-$$logit(Survival[i,j]) = b0 + bAnnual[i] + bMonth[j] + bYear \cdot Year[i]$$
+$$\text{logit}(Survival[i,j]) = b0 + bAnnual[i] + bMonth[j] + bYear \cdot Year[i]$$
 Where $bAnnual$ can be a fixed or random effect of categorical year on the intercept on the log-odds scale and $Year$ is the scaled continuous year.  
 
 This model has the following default priors in `bboutools`
@@ -194,6 +194,7 @@ ggplot(data = df) +
 ```
 
 
-As with the above example, the adult female proportion estimate for a Bayesian model with vague priors closely matches the Maximum Likelihood estimate, whereas the Bayesian model with informative prior pulls the estimate toward the prior belief.  
+As with the above example, the adult female proportion estimate for a Bayesian model with vague priors closely matches the Maximum Likelihood estimate, whereas the Bayesian model with informative prior gives less weight to the data.
 
-The $adult\_female\_proportion$ can also simply be fixed. See `bb_fit_recruitment()` for details.
+The $adult\_female\_proportion$ can also simply be fixed. 
+See `bb_fit_recruitment()` for details.
diff --git a/vignettes/bboutools.Rmd b/vignettes/bboutools.Rmd
@@ -41,8 +41,8 @@ In order to install R [@r] the appropriate binary for the users operating system
 
 Once R is installed, the `bboutools` package can be installed from GitHub by executing the following code at the R console
 ```{r, eval=FALSE}
-install.packages("pak")
-pak::pak("poissonconsulting/bboutools")
+install.packages("remotes")
+remotes::install_github("poissonconsulting/bboutools")
 ```
 
 The `bboutools` package can then be loaded into the current session using
@@ -62,7 +62,8 @@ If you discover a bug in `bboutools` please file an issue with a [reprex](https:
 ## Providing Data
 
 Once the `bboutools` package has been loaded, the next task is to provide some data.
-An easy way to do this is to create a comma separated file (`.csv`) with spreadsheet software. Recruitment and survival data should be formatted as in the following anonymized datasets and can be checked to confirm it is in the correct format by using the `bboudata` functions:
+An easy way to do this is to create a comma separated file (`.csv`) with spreadsheet software. 
+Recruitment and survival data should be formatted as in the following anonymized datasets and can be checked to confirm it is in the correct format by using the `bboudata` functions:
 
 ```{r}
 # Recruitment data
@@ -107,7 +108,8 @@ If `adult_female_proportion = NULL`, the adult female proportion is estimated fr
 By default, a biologically informative prior of `Beta(65,35)` is used. 
 This corresponds to an expected value of 65%.  
 
-The yearling female proportion can be set with `yearling_female_proportion`. The default value is 0.5.  
+The yearling female proportion can be set with `yearling_female_proportion`. 
+The default value is 0.5.  
 
 The model can be fit with random effect of year, fixed effect of year and/or continuous effect of year (i.e., year trend). 
 The `min_random_year` argument dictates the minimum number of years in the dataset required to fit a random year effect; otherwise a fixed year effect is fit. 
@@ -132,12 +134,14 @@ glance(recruitment)
 
 Model convergence provides an indication of whether the parameter estimates are reliable. 
 
-Convergence is successful if `ess` > 33% of the number of iterations and `rhat` < 1.1. 
+Convergence is considered successful if `ess` > 33% of the number of iterations and `rhat` < 1.1. 
 `ess` (Effective Sample Size) represents the length of a chain (i.e., number of iterations) if each sample was independent of the one before it. 
 `rhat` evaluates whether the chains agree on the same values. 
 As the total variance of all the chains shrinks to the average variance within chains, r-hat approaches 1.  
 
-By default, the `bboutools` Bayesian method saves 1,000 MCMC samples from each of three chains (after discarding the first halves). The number of samples saved can be adjusted with the `niters` argument. With `niters` set, the user can simply increment the thinning rate as required to achieve convergence (i.e., by increasing `ess` and/or decreasing `rhat`).
+By default, the `bboutools` Bayesian method saves 1,000 MCMC samples from each of three chains (after discarding the first halves). 
+The number of samples saved can be adjusted with the `niters` argument. 
+With `niters` set, the user can simply increment the thinning rate as required to achieve convergence (i.e., by increasing `ess` and/or decreasing `rhat`).
 
 ### Summary
 Various generic functions in `bboutools` can be used to summarize or interrogate the output of model fitting functions.
@@ -157,7 +161,8 @@ Keep in mind that any reference to 'Year' or 'Annual' in these summary outputs r
 
 ### Priors
 In general, weakly informative priors are used by default [@gelman_prior_2017; @mcelreath_statistical_2016]. 
-The default prior distribution parameter values can be accessed from `bb_priors_recruitment()` and `bb_priors_survival()`. See the priors vignette for more information.
+The default prior distribution parameter values can be accessed from `bb_priors_recruitment()` and `bb_priors_survival()`. 
+See the priors vignette for more information.
 
 ```{r}
 bb_priors_recruitment()
@@ -170,7 +175,7 @@ For example, less informative priors for `adult_female_proportion` (e.g., `Beta(
 recruitment <- bb_fit_recruitment(bboudata::bbourecruit_a, priors = c(adult_female_proportion_alpha = 1, adult_female_proportion_beta = 1, b0_mu = 0, b0_sd = 5))
 ```
 
-If the user is interested in fitting models without priors, see `bb_fit_recruitment_ml()`/`bb_fit_survival_ml()`, which use a Maximum Likelihood approach (see more details below). 
+If the user is interested in fitting models without any prior information, see `bb_fit_recruitment_ml()` and `bb_fit_survival_ml()`, which use a Maximum Likelihood approach (see more details below). 
 
 ## Survival
 The annual survival in boreal caribou population is typically estimated from the monthly fates of collared adult females. 
@@ -199,7 +204,8 @@ tidy(survival, include_random_effects = FALSE)
 ## Predictions
 A user can generate and plot predictions of recruitment, survival and population growth.
 
-Recruitment is the adjusted recruitment using  methods from [@decesare_estimating_2012]. See the 'analytical methods' vignette for details. 
+Recruitment is the adjusted recruitment using  methods from [@decesare_estimating_2012]. 
+See the 'analytical methods' article for details. 
 
 Predictions of calf-cow ratio can also be made using `bb_predict_calf_cow_ratio()`. 
 
@@ -258,6 +264,7 @@ bb_plot_year_population_change(predict_change)
 ```
 
 ## Maximum Likelihood
+
 Maximum Likelihood (ML) models can be fit using the `bb_fit_recruitment_ml()` and `bb_fit_survival_ml()` functions. 
 These functions take a few seconds to execute because Nimble must compile the model into C++ code.
 See the [Nimble documentation](https://r-nimble.org/html_manual/cha-AD.html#how-to-use-laplace-approximation) for more information and comparison to TMB.