-
Notifications
You must be signed in to change notification settings - Fork 0
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Translate data sqids #34
Conversation
…icators and filters potentially having the same sqid)
…g it over with a cup of tea. Think this works cleaner.
dplyr::filter(!!rlang::sym("col_id") == column_sqid) |> | ||
dplyr::select("item_label", "item_id") |> | ||
dplyr::rename( | ||
!!rlang::sym(col_name) := "item_label", |
Check warning
Code scanning / lintr
no visible global function definition for ':=' Warning
dplyr::filter(!!rlang::sym("col_id") == column_sqid) |> | ||
dplyr::select("item_label", "item_id") |> | ||
dplyr::rename( | ||
!!rlang::sym(col_name) := "item_label", |
Check warning
Code scanning / lintr
no visible global function definition for ':=' Warning
…it a bit more accurate and clear
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Few comments inline in the code, and then some bigger ones, which you would probably have been expecting, happy for these to go into separate issues to pick up at a later point if they're not quick to add in this PR:
- We should add parsing for time, including a lookup data set for codes to identifiers
- Locations parsing needs more refinement, would suggest a lookup data set for geographic level to code, including the prefix for each level's locations columns and also special handling for new_la_code / old_la_code and other less standard locations (RSC regions come to mind)
- Should we give an option for whether or not to parse the output from
query_dataset()
? It should be the default to parse it, though might be nice to have the option to see the raw version too, will also be a good way to demonstrate how the IDs work by showing both versions as examples to users - Performance feels like it's going very backwards here, will definitely want a rewrite at a later point, though agree it is best to just get something out for testing and then optimise once we're more certain this will definitely be the approach we're running with
query_dataset(example_id(), indicators = c("E40qF", "QuVwb", "9Aw4v"))
Microbenchmarking before:
min lq mean median uq max neval
72.3144 75.9789 82.20526 77.926 82.88525 251.242 100
After
min lq mean median uq max neval
184.6963 193.3727 203.332 199.0265 209.6245 343.5782 100
Welcome to Codecov 🎉Once you merge this PR into your default branch, you're all set! Codecov will compare coverage reports and display results in all future pull requests. Thanks for integrating Codecov - We've got you covered ☂️ |
Brief overview of changes
Adding in some translation of SQIDs into understandable labels and codes based on a data set's meta data.
Why are these changes being made?
SQIDs don't make any sense to end users, they're going to want them translating and we can save them the effort.
Detailed description of changes
I've added some parsing functions to handle the different types of objects. These are called in by
parse_api_dataset()
. Basic structure is along the lines:query_dataset()
, which then runspost_dataset()
orget_datatset()
post_dataset()
andget_datatset()
translate the IDs on the fly usingparse_api_dataset()
parse_api_dataset()
then calls the following sub-functions:parse_sqids_locations()
parse_sqids_filters()
parse_sqids_indicators()
Additional information for reviewers
Related to discussion earlier, this isn't great for performance for large (paged) data sets as it runs this for every page. It would ultimately be better to run at the end of the process, but there's some not-insignificant rejigging in order to do that. Happy to look at that, but might be something for a separate issue after the initial bit of internal testing has started.
Issue ticket number/s and link
Resolves #29