Translate data sqids #34

Merged: 17 commits merged on Oct 21, 2024
Changes from 8 commits
5 changes: 4 additions & 1 deletion DESCRIPTION
@@ -21,4 +21,7 @@ Imports:
httr,
jsonlite,
dplyr,
-stringr
+stringr,
+data.table,
+magrittr,
+rlang
4 changes: 4 additions & 0 deletions NAMESPACE
@@ -16,7 +16,11 @@ export(parse_meta_filter_columns)
export(parse_meta_filter_item_ids)
export(parse_meta_location_ids)
export(parse_meta_time_periods)
export(parse_sqids_dataset)
export(parse_sqids_filter)
export(parse_sqids_indicator)
export(query_dataset)
export(validate_ees_id)
export(validate_page_size)
export(warning_max_pages)
importFrom(data.table,":=")
7 changes: 7 additions & 0 deletions R/eesyapi-package.R
@@ -0,0 +1,7 @@
#' @keywords internal
"_PACKAGE"

## usethis namespace: start
#' @importFrom data.table :=
## usethis namespace: end
NULL
10 changes: 9 additions & 1 deletion R/example_id.R
@@ -21,12 +21,14 @@ example_id <- function(
"dataset",
"location_id",
"location_code",
"filter",
"filter_item",
"indicator",
"publication",
"dataset",
"location_id",
"location_code",
"filter",
"filter_item",
"indicator"
),
@@ -42,6 +44,8 @@ example_id <- function(
"dev",
"dev",
"dev",
"dev",
"dev",
"dev"
),
example_group = c(
@@ -51,6 +55,8 @@ example_id <- function(
"attendance",
"attendance",
"attendance",
"attendance",
"public-api-testing",
"public-api-testing",
"public-api-testing",
"public-api-testing",
@@ -63,12 +69,14 @@ example_id <- function(
"7c0e9201-c7c0-ff73-bee4-304e731ec0e6",
"NAT|id|dP0Zw",
"NAT|code|E92000001",
"hl2Gy",
"4kdUZ",
"5UNdi",
"bqZtT",
"d823e4df-626f-4450-9b21-08dc8b95fc02",
"830f9201-9e11-ad75-8dcd-d2efe2834457",
"LA|id|ml79K",
"NAT|code|E92000001",
"5mvdi",
"HsQzL",
"h8fyW"
)
6 changes: 4 additions & 2 deletions R/parse_api_dataset.R
@@ -45,8 +45,10 @@ parse_api_dataset <- function(
api_data_result$timePeriod,
data.frame(geographic_level = api_data_result$geographicLevel),
api_data_result$locations,
-api_data_result$filters,
-api_data_result$values
+api_data_result$filters |>
+  dplyr::rename_with(~ paste0("filter-", .x)),
+api_data_result$values |>
+  dplyr::rename_with(~ paste0("indicator-", .x))
)
# Next aim here is to pull in the meta data automatically at this point to translate
# all the API codes...
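Reviewer note: the change above prefixes every filter and indicator column so the two kinds of sqid can be told apart later. A minimal sketch of the effect, assuming toy data frames in place of `api_data_result$filters` and `api_data_result$values` (the sqids `abc12` and `def34` are made up, and `dplyr::bind_cols()` is only an assumption about how the pieces are combined, since that call sits outside the hunk):

```r
library(dplyr)

# Toy stand-ins for api_data_result$filters and api_data_result$values.
# The sqids used as column names here are invented for illustration.
filters <- data.frame(abc12 = c("x1", "x2"))
values <- data.frame(def34 = c("10", "20"))

# Prefix each set of columns, then combine them side by side.
prefixed <- dplyr::bind_cols(
  filters |> dplyr::rename_with(~ paste0("filter-", .x)),
  values |> dplyr::rename_with(~ paste0("indicator-", .x))
)

colnames(prefixed)
```

The prefixed names are non-syntactic in R (they contain a hyphen), so downstream code has to refer to them as strings or with backticks, which is what the new `parse_sqids_*()` functions do.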
126 changes: 126 additions & 0 deletions R/parse_sqids.R
@@ -0,0 +1,126 @@
#' Parse IDs in returned API data
#'
#' @description
#' The API uses unique IDs (sqids) to identify each filter column, filter item and indicator
#' column. This provides continuity: data creators may need to change the col_name and
#' labeling in their data files whilst keeping the same fundamental content. Results from the
#' API are given using these IDs, but it's expected that analysts will want to translate these
#' back into the col_names assigned by analysts. `parse_sqids_dataset()` takes a data frame
#' extracted from the API and translates all the sqids in that data frame to their assigned
#' col_names based on the meta data available from the API.
#'
#' @param data Data frame containing data as returned from the API by `get_dataset()` or
#' `post_dataset()`
#' @param dataset_id String containing the data set ID
#' @param verbose Output status messaging for user
#'
#' @return Data frame
#' @export
#'
#' @examples
#' get_dataset(example_id(), indicators = example_id("indicator"), page = 1) |>
#' parse_sqids_dataset(example_id())
parse_sqids_dataset <- function(
    data,
    dataset_id,
    verbose = FALSE) {
  meta <- get_meta(dataset_id)
  # Indicators present in the data, with col_id prefixed to match the API column names
  indicators <- meta |>
    magrittr::use_series("indicators") |>
    dplyr::mutate(col_id = paste0("indicator-", !!rlang::sym("col_id"))) |>
    dplyr::filter(!!rlang::sym("col_id") %in% colnames(data))
  # Named vector (new name = old name) used to rename the indicator columns
  indicator_lookup <- indicators |>
    dplyr::pull("col_id")
  names(indicator_lookup) <- indicators |> dplyr::pull("col_name")
  data <- data |>
    dplyr::rename(dplyr::all_of(indicator_lookup))
  # Filter columns present in the data, translated one at a time
  filters <- meta |>
    magrittr::use_series("filter_columns") |>
    dplyr::mutate(col_id = paste0("filter-", !!rlang::sym("col_id"))) |>
    dplyr::filter(!!rlang::sym("col_id") %in% colnames(data)) |>
    dplyr::pull("col_id")
  for (column in filters) {
    # Pass the caller's verbose setting through rather than forcing TRUE
    data <- data |>
      parse_sqids_filter(meta, column, verbose = verbose)
  }
  return(data)
}

#' Parse IDs in a filter column
#'
#' @description
#' The API uses unique IDs (sqids) to identify each filter column and its contents (filter items).
#' This function translates those into the data creator's col_name and item labels based on the
#' meta data stored on the API for the data set.
#'
#' @inheritParams parse_sqids_dataset
#' @param meta Meta data for the data set as provided by `get_meta()`
#' @param column_sqid The filter col_id
#'
#' @return Data frame
#' @export
#'
#' @examples
#' parse_sqids_filter(
#' get_dataset(example_id(), indicators = example_id("indicator"), page = 1),
#' get_meta(example_id()),
#' example_id("filter")
#' )
parse_sqids_filter <- function(data, meta, column_sqid, verbose = FALSE) {
  # Add the "filter-" prefix if the bare sqid was supplied
  if (!grepl("^filter-", column_sqid)) {
    column_sqid <- paste0("filter-", column_sqid)
  }
  col_name <- meta |>
    magrittr::use_series("filter_columns") |>
    dplyr::mutate(col_id = paste0("filter-", !!rlang::sym("col_id"))) |>
    dplyr::filter(!!rlang::sym("col_id") == column_sqid) |>
    dplyr::pull("col_name")
  if (verbose) {
    message("Matched ", column_sqid, " to ", col_name)
  }
  # Lookup table: item IDs (named after the sqid column) against item labels
  lookup <- meta |>
    magrittr::use_series("filter_items") |>
    dplyr::mutate(col_id = paste0("filter-", !!rlang::sym("col_id"))) |>
    dplyr::filter(!!rlang::sym("col_id") == column_sqid) |>
    dplyr::select("item_label", "item_id") |>
    dplyr::rename(
      !!rlang::sym(col_name) := "item_label",
      !!rlang::sym(column_sqid) := "item_id"
    )
  # Join the labels on, then drop the sqid column by name
  data <- data |>
    dplyr::left_join(lookup, by = column_sqid) |>
    dplyr::select(-dplyr::all_of(column_sqid))
  return(data)
}

#' Parse IDs in an indicator column
#'
#' @description
#' The API uses unique IDs (sqids) to identify each indicator column. This function translates
#' an indicator sqid into the data creator's col_name based on the meta data stored on the API
#' for the data set.
#'
#' @inheritParams parse_sqids_dataset
#' @inheritParams parse_sqids_filter
#' @param column_sqid The indicator col_id
#'
#' @return String containing the indicator col_name
#' @export
#'
#' @examples
#' parse_sqids_indicator(
#'   example_id("indicator"),
#'   get_meta(example_id())
#' )
parse_sqids_indicator <- function(column_sqid, meta, verbose = FALSE) {
  col_name <- meta |>
    magrittr::use_series("indicators") |>
    dplyr::filter(!!rlang::sym("col_id") == column_sqid) |>
    dplyr::pull("col_name")
  if (verbose) {
    message("Matched ", column_sqid, " to ", col_name)
  }
  return(col_name)
}
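Reviewer note: the two translation steps in this file (renaming indicator columns from a named vector, then swapping filter item IDs for labels via a lookup join) can be tried without calling the API. The sketch below is a simplified illustration, not the package code: the `meta` list and `data` frame are hypothetical stand-ins for what `get_meta()` and `get_dataset()` return, and every ID, col_name and label in them is invented.

```r
library(dplyr)

# Hypothetical meta data and API result; all IDs and labels are invented.
meta <- list(
  indicators = data.frame(col_id = "ind01", col_name = "sess_possible"),
  filter_columns = data.frame(col_id = "flt01", col_name = "school_type"),
  filter_items = data.frame(
    col_id = c("flt01", "flt01"),
    item_id = c("itmA", "itmB"),
    item_label = c("Primary", "Secondary")
  )
)
data <- data.frame(
  time_period = c("2024", "2024"),
  `filter-flt01` = c("itmA", "itmB"),
  `indicator-ind01` = c("100", "200"),
  check.names = FALSE
)

# Step 1: rename indicator columns with a named vector (new name = old name).
indicator_lookup <- stats::setNames(
  paste0("indicator-", meta$indicators$col_id),
  meta$indicators$col_name
)
data <- data |> dplyr::rename(dplyr::all_of(indicator_lookup))

# Step 2: replace filter item IDs with labels via a lookup join,
# then drop the sqid column.
lookup <- meta$filter_items |>
  dplyr::filter(col_id == "flt01") |>
  dplyr::transmute(`filter-flt01` = item_id, school_type = item_label)
translated <- data |>
  dplyr::left_join(lookup, by = "filter-flt01") |>
  dplyr::select(-dplyr::all_of("filter-flt01"))

colnames(translated)
```

Wrapping the column name in `dplyr::all_of()` when dropping it makes the selection unambiguous: tidyselect treats a bare name as a column in the data first, so passing a character variable without `all_of()` relies on a fallback that newer versions warn about.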
1 change: 1 addition & 0 deletions _pkgdown.yml
@@ -33,6 +33,7 @@ reference:
contents:
- get_dataset
- parse_api_dataset
- starts_with("parse_sqid")

- title: Validation functions
desc: These functions are used across the package to validate elements being passed as part of an API url or query.
34 changes: 34 additions & 0 deletions man/eesyapi-package.Rd

Some generated files are not rendered by default. Learn more about how customized files appear on GitHub.

32 changes: 32 additions & 0 deletions man/parse_sqids_dataset.Rd

33 changes: 33 additions & 0 deletions man/parse_sqids_filter.Rd

29 changes: 29 additions & 0 deletions man/parse_sqids_indicator.Rd
