Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

add error message for missing strata #227

Open
wants to merge 94 commits into
base: main
Choose a base branch
from
Open
Show file tree
Hide file tree
Changes from all commits
Commits
Show all changes
94 commits
Select commit Hold shift + click to select a range
88a24bd
add error message for missing strata
chrisorwa Aug 1, 2024
5ec8ec0
run document() for est.incidence.by
chrisorwa Aug 1, 2024
b82e3b5
add any() to is.element()
chrisorwa Aug 4, 2024
3c271ee
add test
chrisorwa Aug 7, 2024
ee3d7ca
Merge branch 'main' into add-user-error-message-for-missing-strata
chrisorwa Aug 12, 2024
39ea75b
Merge branch 'main' into add-user-error-message-for-missing-strata
chrisorwa Aug 14, 2024
0cb88d6
modify error messaging
chrisorwa Aug 14, 2024
97efcf3
Merge branch 'add-user-error-message-for-missing-strata' of https://g…
chrisorwa Aug 14, 2024
b3523c7
make changes to tests and est.incidence.by
chrisorwa Aug 21, 2024
c09df9b
add purrr to DESCRIPTION
chrisorwa Aug 22, 2024
42c7af6
modify error message
chrisorwa Aug 26, 2024
0f36e94
correct errors
chrisorwa Aug 26, 2024
f97218a
remove testthat from DESCRIPTION
chrisorwa Aug 26, 2024
757676e
Merge branch 'main' into add-user-error-message-for-missing-strata
chrisorwa Aug 27, 2024
13be640
in progress
d-morrison Aug 27, 2024
87625a1
Merge branch 'main' into add-user-error-message-for-missing-strata
chrisorwa Sep 10, 2024
3bf79b3
lint file
chrisorwa Sep 10, 2024
927911a
lint tests
chrisorwa Sep 11, 2024
655985b
Merge branch 'main' into add-user-error-message-for-missing-strata
chrisorwa Sep 12, 2024
8302735
increment version
chrisorwa Sep 17, 2024
47dd63a
line break
d-morrison Sep 23, 2024
2ef8e8e
Merge branch 'main' into add-user-error-message-for-missing-strata
d-morrison Sep 23, 2024
6b365c0
increment version
d-morrison Sep 24, 2024
7cac80b
Merge branch 'main' into add-user-error-message-for-missing-strata
chrisorwa Sep 29, 2024
cc000f7
Merge branch 'main' into add-user-error-message-for-missing-strata
chrisorwa Sep 30, 2024
d3a8b8c
correct conflicts
chrisorwa Oct 13, 2024
c22ed3c
requested changes
chrisorwa Oct 13, 2024
1eb1441
push additional changes to tests
chrisorwa Oct 15, 2024
58e4598
Merge branch 'main' into add-user-error-message-for-missing-strata
chrisorwa Oct 18, 2024
bc7fe55
Merge branch 'main' into add-user-error-message-for-missing-strata
chrisorwa Oct 22, 2024
98c881d
reduce tests
Oct 22, 2024
474d54c
Merge branch 'add-user-error-message-for-missing-strata' of https://g…
Oct 22, 2024
a2fc79a
reduce num_cores to 1
chrisorwa Oct 22, 2024
05aabc1
add snapshots
chrisorwa Oct 22, 2024
c73eb38
linting
chrisorwa Oct 22, 2024
a78d82a
correct tests
chrisorwa Oct 23, 2024
23c8ec2
Increment version number to 1.2.0.9019
chrisorwa Oct 23, 2024
5107025
more edits
d-morrison Nov 4, 2024
c0bd521
in progress
d-morrison Nov 4, 2024
3c8b8b1
lots
d-morrison Nov 5, 2024
cebb5be
more cleanup
d-morrison Nov 5, 2024
a5f4ff6
update documentation
d-morrison Nov 5, 2024
7dc8a6c
- added {and} package
d-morrison Nov 5, 2024
7cb6af9
cleanup
d-morrison Nov 5, 2024
41435d6
add `glue` and `stringr`
d-morrison Nov 5, 2024
737a1c6
add test to `check_strata()`
d-morrison Nov 5, 2024
7ba202b
remove redundant example data
d-morrison Nov 5, 2024
4232c42
wrap
d-morrison Nov 5, 2024
78af1c6
Merge branch 'simplifications' into use-test-data
d-morrison Nov 5, 2024
4b496de
Merge pull request #316 from UCD-SERG/use-test-data
d-morrison Nov 5, 2024
8478415
remove redundant files
d-morrison Nov 5, 2024
9c9626f
delete unused files
d-morrison Nov 5, 2024
28cdd72
Merge remote-tracking branch 'origin/use-test-data' into simplifications
d-morrison Nov 5, 2024
154fb5e
fixes
d-morrison Nov 5, 2024
a73c8db
Merge branch 'lint-changed-faster' into simplifications
d-morrison Nov 5, 2024
0fa2294
Merge pull request #318 from UCD-SERG/use-test-data
d-morrison Nov 5, 2024
6b362f6
Merge branch 'main' into add-user-error-message-for-missing-strata
d-morrison Nov 5, 2024
cb9c6fd
Merge branch 'add-user-error-message-for-missing-strata' into simplif…
d-morrison Nov 5, 2024
2dcff93
fix
d-morrison Nov 5, 2024
f2cce72
increase snapshot tolerance
d-morrison Nov 5, 2024
543788f
lint
d-morrison Nov 5, 2024
3682fac
document
d-morrison Nov 5, 2024
2d251c4
Merge remote-tracking branch 'origin/main' into simplifications
d-morrison Nov 5, 2024
7ad67e7
Increment version number to 1.2.0.9022
d-morrison Nov 5, 2024
d8385bb
remove duplicate line
d-morrison Nov 5, 2024
c017571
try to fix lint
d-morrison Nov 5, 2024
3ae6bed
Merge branch 'main' into add-user-error-message-for-missing-strata
d-morrison Nov 5, 2024
201853d
more linting
d-morrison Nov 5, 2024
c83e0cc
Merge branch 'add-user-error-message-for-missing-strata' into simplif…
d-morrison Nov 5, 2024
266bae1
lint
d-morrison Nov 6, 2024
c3d18bf
Merge branch 'simplifications' of https://github.com/UCD-SERG/serocal…
d-morrison Nov 6, 2024
2587245
new test to hopefully complete patch coverage
d-morrison Nov 6, 2024
c3dfcbd
skip parallel cores test if only one core available
d-morrison Nov 6, 2024
6fba6a8
added test of verbose output
d-morrison Nov 6, 2024
2e395ad
cleanup
d-morrison Nov 6, 2024
491cda6
test verbose and multicore
d-morrison Nov 6, 2024
8742288
more checks
d-morrison Nov 6, 2024
23a8baa
cleaning up time output
d-morrison Nov 6, 2024
5c98bd5
fixes
d-morrison Nov 6, 2024
1a73533
formatting
d-morrison Nov 6, 2024
ae4fc68
Merge branch 'main' into add-user-error-message-for-missing-strata
d-morrison Nov 6, 2024
d99d4da
Merge branch 'add-user-error-message-for-missing-strata' into simplif…
d-morrison Nov 6, 2024
cf0070b
Update R/est.incidence.by.R
chrisorwa Nov 6, 2024
33dc191
Update R/est.incidence.by.R
chrisorwa Nov 6, 2024
c984c18
additional changes
chrisorwa Nov 6, 2024
9122ae8
Merge branch 'add-user-error-message-for-missing-strata' of https://g…
chrisorwa Nov 6, 2024
177d334
change version number
chrisorwa Nov 6, 2024
f953b3b
make changes to NEWS.md
chrisorwa Nov 6, 2024
8dc73c8
Increment version number to 1.2.0.9022
chrisorwa Nov 6, 2024
15a32d8
Merge branch 'add-user-error-message-for-missing-strata' into simplif…
d-morrison Nov 6, 2024
15f2bc2
Merge pull request #245 from UCD-SERG/simplifications
chrisorwa Nov 6, 2024
4727731
Merge branch 'main' into add-user-error-message-for-missing-strata
d-morrison Nov 6, 2024
4e2b8e9
make changes
chrisorwa Nov 7, 2024
1cddf51
Merge branch 'add-user-error-message-for-missing-strata' of https://g…
chrisorwa Nov 7, 2024
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
1 change: 1 addition & 0 deletions .Rbuildignore
Original file line number Diff line number Diff line change
Expand Up @@ -29,4 +29,5 @@ allpopsamples_hlye.csv$
^vignettes/\.quarto$
^vignettes/methodology\.qmd$
^\.quarto$
^man/check_strata\.Rd$
^man/df_to_array\.Rd$
7 changes: 5 additions & 2 deletions DESCRIPTION
Original file line number Diff line number Diff line change
@@ -1,7 +1,7 @@
Type: Package
Package: serocalculator
Title: Estimating Infection Rates from Serological Data
Version: 1.2.0.9021
Version: 1.2.0.9022
Authors@R: c(
person("Peter", "Teunis", , "[email protected]", role = c("aut", "cph"),
comment = "Author of the method and original code."),
Expand Down Expand Up @@ -39,7 +39,10 @@ Imports:
tidyr,
tidyselect,
utils,
purrr
purrr,
and,
glue,
stringr
Suggests:
bookdown,
devtag (>= 0.0.0.9000),
Expand Down
1 change: 1 addition & 0 deletions NEWS.md
Original file line number Diff line number Diff line change
Expand Up @@ -27,6 +27,7 @@

## Internal changes

* Add test for missing strata in `est.incidence.by` (#227)
* Added `snapshot_value` test for `est.incidence()` (#315)

* Sped up `lint-changed-files` GitHub Action (#317)
Expand Down
64 changes: 64 additions & 0 deletions R/check_strata.R
Original file line number Diff line number Diff line change
@@ -0,0 +1,64 @@
#' @title Check a `pop_data` object for requested strata variables
#' @param pop_data a `pop_data` object
#' @param strata a [character] vector
#' @returns [NULL], invisibly
#' @examples
#' sees_pop_data_pk_100 |>
#' check_strata(strata = c("ag", "catch", "Count"))
#' @dev
check_strata <- function(pop_data, strata) {
if (!is.character(strata)) {
cli::cli_abort(
class = "strata are not strings",
message = c(
"x" = "Argument `strata` is not a character vector.",
"i" = "Provide a character vector with names of stratifying variables."
)
)
}

present_strata_vars <- intersect(strata, names(pop_data))
missing_strata_vars <- setdiff(strata, present_strata_vars)

if (length(missing_strata_vars) > 0) {
message0 <- c(
"Can't stratify provided {.arg pop_data}
with the provided {.arg strata}:",
"i" = "variable {.var {missing_strata_vars}}
{?is/are} missing in {.arg pop_data}."
)

partial_matches <-
purrr::map(missing_strata_vars, function(x) {
stringr::str_subset(string = names(pop_data), pattern = x) |>
glue::backtick() |>
and::or()
}) |>
rlang::set_names(missing_strata_vars) |>
purrr::keep(~ length(.x) > 0)

inputs_with_partial_matches <- names(partial_matches) # nolint: object_usage_linter

if (length(partial_matches) > 0) {
partial_matches <-
glue::glue("\"{names(partial_matches)}\": {partial_matches}")

message0 <- c(
message0,
"i" = "The following input{?s} to {.arg strata}
might be misspelled:
{.str {inputs_with_partial_matches}}",
"i" = "Did you mean:",
partial_matches |> rlang::set_names("*")
)
}

cli::cli_abort(
class = "missing_var",
call = rlang::caller_env(),
message = message0
)
}

invisible(NULL)
}
169 changes: 97 additions & 72 deletions R/est.incidence.by.R
Original file line number Diff line number Diff line change
@@ -1,14 +1,22 @@
#' Estimate Seroincidence
#'
#' @description
#' Function to estimate seroincidences based on cross-section serology data and longitudinal
#' Function to estimate seroincidences based on cross-sectional
#' serology data and longitudinal
#' response model.
#'
#' @param pop_data a [data.frame] with cross-sectional serology data per antibody and age, and additional columns corresponding to each element of the `strata` input
#' @param strata a [character] vector of stratum-defining variables. Values must be variable names in `pop_data`.
#' @param curve_strata_varnames A subset of `strata`. Values must be variable names in `curve_params`. Default = "".
#' @param noise_strata_varnames A subset of `strata`. Values must be variable names in `noise_params`. Default = "".
#' @param num_cores Number of processor cores to use for calculations when computing by strata. If set to more than 1 and package \pkg{parallel} is available, then the computations are executed in parallel. Default = 1L.
#' @param pop_data a [data.frame] with cross-sectional serology data per
#' antibody and age, and additional columns corresponding to
#' each element of the `strata` input
#' @param strata a [character] vector of stratum-defining variables.
#' Values must be variable names in `pop_data`.
#' @param curve_strata_varnames A subset of `strata`.
#' Values must be variable names in `curve_params`. Default = "".
#' @param noise_strata_varnames A subset of `strata`.
#' Values must be variable names in `noise_params`. Default = "".
#' @param num_cores Number of processor cores to use for
#' calculations when computing by strata. If set to
#' more than 1 and package \pkg{parallel} is available,
#' then the computations are executed in parallel. Default = 1L.
d-morrison marked this conversation as resolved.
Show resolved Hide resolved
d-morrison marked this conversation as resolved.
Show resolved Hide resolved

#' @details
#'
Expand All @@ -17,7 +25,8 @@
#' and then the data will be passed to [est.incidence()].
#' If for some reason you want to use [est.incidence.by()]
#' with no strata instead of calling [est.incidence()],
#' you may use `NA`, `NULL`, or `""` as the `strata` argument to avoid that warning.
#' you may use `NA`, `NULL`, or `""` as the `strata`
#' argument to avoid that warning.
d-morrison marked this conversation as resolved.
Show resolved Hide resolved
#'
#'
#' @inheritParams est.incidence
Expand All @@ -26,7 +35,9 @@
#'
#' @return
#' * if `strata` has meaningful inputs:
#' An object of class `"seroincidence.by"`; i.e., a list of `"seroincidence"` objects from [est.incidence()], one for each stratum, with some meta-data attributes.
#' An object of class `"seroincidence.by"`; i.e., a list of
#' `"seroincidence"` objects from [est.incidence()], one for each stratum,
#' with some meta-data attributes.
#' * if `strata` is missing, `NULL`, `NA`, or `""`:
#' An object of class `"seroincidence"`.
#'
Expand All @@ -39,7 +50,8 @@
#'
#' curve <- load_curve_params("https://osf.io/download/rtw5k/") %>%
#' filter(antigen_iso %in% c("HlyE_IgA", "HlyE_IgG")) %>%
#' slice(1:100, .by = antigen_iso) # Reduce dataset for the purposes of this example
#' # Reduce dataset for the purposes of this example:
#' slice(1:100, .by = antigen_iso)
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Remove comment to make the line less than 80 characters.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

instead of removing the comment, why not move it to a new line?

#'
#' noise <- load_noise_params("https://osf.io/download//hqy4v/")
#'
Expand All @@ -49,13 +61,13 @@
#' curve_params = curve,
#' noise_params = noise %>% filter(Country == "Pakistan"),
#' antigen_isos = c("HlyE_IgG", "HlyE_IgA"),
#' #num_cores = 8 # Allow for parallel processing to decrease run time
#' # num_cores = 8 # Allow for parallel processing to decrease run time
#' iterlim = 5 # limit iterations for the purpose of this example
#' )
#'
#' summary(est2)
#'
est.incidence.by <- function(

Check warning on line 70 in R/est.incidence.by.R

View workflow job for this annotation

GitHub Actions / lint-changed-files

file=R/est.incidence.by.R,line=70,col=1,[cyclocomp_linter] Functions should have cyclomatic complexity of less than 15, this has 20.

Check warning on line 70 in R/est.incidence.by.R

View workflow job for this annotation

GitHub Actions / lint-changed-files

file=R/est.incidence.by.R,line=70,col=1,[object_name_linter] Variable and function name style should match snake_case or symbols.
pop_data,
curve_params,
noise_params,
Expand All @@ -71,22 +83,27 @@
verbose = FALSE,
print_graph = FALSE,
...) {
if (missing(strata)) {
warning(
"The `strata` argument to `est.incidence.by()` is missing.",
"\n\n If you do not want to stratify your data, ",
"consider using the `est.incidence()` function to simplify your code and avoid this warning.",
"\n\n Since the `strata` argument is empty, `est.incidence.by()` will return a `seroincidence` object, instead of a `seroincidence.by` object.\n"
)
}

strata_is_empty <-
missing(strata) ||
is.null(strata) ||
setequal(strata, NA) ||
setequal(strata, "")
is.null(strata) ||
setequal(strata, NA) ||
setequal(strata, "")
d-morrison marked this conversation as resolved.
Show resolved Hide resolved

if (strata_is_empty) {
cli::cli_warn(
class = "strata_empty",
c(
"The {.arg strata} argument to {.fn est.incidence.by} is missing.",
"i" = "If you do not want to stratify your data,
consider using the {.fn est.incidence} function to
simplify your code and avoid this warning.",
"i" = "Since the {.arg strata} argument is empty,
{.fn est.incidence.by} will return a {.cls seroincidence} object,
instead of a {.cls seroincidence.by} object."
)
)

to_return <-
est.incidence(
pop_data = pop_data,
Expand All @@ -101,7 +118,7 @@
return(to_return)
}

.checkStrata(data = pop_data, strata = strata)
check_strata(pop_data, strata = strata)

.errorCheck(
data = pop_data,
Expand All @@ -110,7 +127,7 @@
)

# Split data per stratum
stratumDataList <- stratify_data(
stratum_data_list <- stratify_data(
antigen_isos = antigen_isos,
data = pop_data %>% filter(.data$antigen_iso %in% antigen_isos),
curve_params = curve_params %>% filter(.data$antigen_iso %in% antigen_isos),
Expand All @@ -120,18 +137,25 @@
noise_strata_varnames = noise_strata_varnames
)

strata_table <- stratumDataList %>% attr("strata")
strata_table <- stratum_data_list %>% attr("strata")

if (verbose) {
message("Data has been stratified.")
message("Here are the strata that will be analyzed:")
print(strata_table)
cli::cli_inform(
c(
"i" = "Data has been stratified.",
"i" = "Here are the strata that will be analyzed:",
""
),
body = strata_table |> capture.output()
)
}

if (num_cores > 1L && !requireNamespace("parallel", quietly = TRUE)) {
warning(
"The `parallel` package is not installed, so `num_cores > 1` has no effect.",
"To install `parallel`, run `install.packages('parallel')` in the console."
cli::cli_warn(
"The `parallel` package is not installed,
so `num_cores > 1` has no effect.",
"To install `parallel`, run `install.packages('parallel')`
in the console."

Check warning on line 158 in R/est.incidence.by.R

View check run for this annotation

Codecov / codecov/patch

R/est.incidence.by.R#L154-L158

Added lines #L154 - L158 were not covered by tests
d-morrison marked this conversation as resolved.
Show resolved Hide resolved
)
}

Expand All @@ -142,11 +166,11 @@
num_cores <- num_cores %>% check_parallel_cores()

if (verbose) {
message("Setting up parallel processing with `num_cores` = ", num_cores, ".")
cli::cli_inform("Setting up parallel processing with
`num_cores` = {num_cores}.")
}


libPaths <- .libPaths()
lib_paths <- .libPaths()
cl <-
num_cores %>%
parallel::makeCluster() %>%
Expand All @@ -155,17 +179,22 @@
parallel::stopCluster(cl)
})

parallel::clusterExport(cl, c("libPaths"), envir = environment())
# Export library paths to the cluster
parallel::clusterExport(cl, "lib_paths", envir = environment())

# Evaluate library loading on the cluster
parallel::clusterEvalQ(cl, {
.libPaths(libPaths)
require(serocalculator) # note - this gets out of sync when using load_all() in development
.libPaths(lib_paths)

Check warning on line 187 in R/est.incidence.by.R

View check run for this annotation

Codecov / codecov/patch

R/est.incidence.by.R#L187

Added line #L187 was not covered by tests
# note - this gets out of sync when using load_all() in development
require(serocalculator)

Check warning on line 189 in R/est.incidence.by.R

View check run for this annotation

Codecov / codecov/patch

R/est.incidence.by.R#L189

Added line #L189 was not covered by tests
require(dplyr)
})

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why?

{
# Perform parallel computation and record execution time
time <- system.time({
fits <- parallel::parLapplyLB(
cl = cl,
X = stratumDataList,
X = stratum_data_list,
fun = function(x) {
do.call(
what = est.incidence,
Expand All @@ -183,63 +212,59 @@
)
}
)
} %>% system.time() -> time
})

if (verbose) {
message("Elapsed time for parallelized code: ")
print(time)
cli::cli_inform(c("i" = "Elapsed time for parallelized code:"),
body = capture.output(time)
)
}
} else {
# fits <- lapply(
# X = stratumDataList,
# FUN = function(x) est.incidence(dataList = x, verbose = verbose, ...))

fits <- list()
# Time progress:
time <- system.time({
fits <- list() # Initialize an empty list for fits

{ # time progress

for (cur_stratum in names(stratumDataList))
{
cur_stratum_vars <-
strata_table %>%
for (cur_stratum in names(stratum_data_list)) {
cur_stratum_vars <- strata_table %>%
dplyr::filter(.data$Stratum == cur_stratum)

if (verbose) {
message("starting new stratum: ", cur_stratum)
cli::cli_inform("starting new stratum: {cur_stratum}")
print(cur_stratum_vars)
}

fits[[cur_stratum]] <-
do.call(
what = est.incidence,
args = c(
stratumDataList[[cur_stratum]],
list(
lambda_start = lambda_start,
antigen_isos = antigen_isos,
build_graph = build_graph,
print_graph = print_graph,
verbose = verbose,
...
)
fits[[cur_stratum]] <- do.call(
what = est.incidence,
args = c(
stratum_data_list[[cur_stratum]],
list(
lambda_start = lambda_start,
antigen_isos = antigen_isos,
build_graph = build_graph,
print_graph = print_graph,
verbose = verbose,
...
)
)
)
}
} %>% system.time() -> time
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

we don't want to lose the system.time() call; we need it below for verbose messaging.

I understand that this was changed to avoid the right-hand assignment operator ->, but there needs to be a left-hand assignment at the beginning of this whole expression instead.

})

if (verbose) {
message("Elapsed time for loop over strata: ")
print(time)
cli::cli_inform(
c("i" = "Elapsed time for loop over strata: "),
body = capture.output(time)
)
}
}

incidenceData <- structure(
incidence_data <- structure(
fits,
antigen_isos = antigen_isos,
Strata = strata_table,
graphs_included = build_graph,
class = "seroincidence.by" %>% union(class(fits))
)

return(incidenceData)
return(incidence_data)
}
Loading
Loading