Creating base get_meta() function to retrieve data set meta data #19
Could we change the format of the parsed ones a little so they're a bit less nested? Happy with the general list format, though might be nice to do some more clean up so there's only ever one layer of nesting and all data frames in the list are easier to use (might make our own lives ees-ier for other functions too)
- $locations currently gives level.code and level.label as well as locations. We already have $geographicLevels, so could we drop level.code / level.label and just have a single data frame with code, name and id cols for locations? That will make it print nicer in the console and be easier to reuse as a lookup.
- Could we split $filters into a filter_columns table and a filter_options table? Again, flattening this out a bit will make it easier to reuse as a lookup and print in a more friendly way. For filter options, I'd imagine one flat table with all options, and a column for what filters they apply to. I guess if you do that, you could leave it called filters and not need a separate columns table, if it's one big data frame with cols like filter_column_id, filter_column_label, filter_option_id and filter_option_label.
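As a rough illustration of the flat filters table suggested here, the sketch below stacks a nested list into one data frame with the four columns named above. The shape of filters_nested (id, label and options elements) and the helper name flatten_filters() are assumptions made for the example, not the actual API response or package code.

```r
# Hypothetical nested input; the real API response may be shaped differently.
filters_nested <- list(
  list(
    id = "sex", label = "Sex",
    options = list(
      list(id = "f1", label = "Female"),
      list(id = "m1", label = "Male")
    )
  ),
  list(
    id = "phase", label = "School phase",
    options = list(
      list(id = "p1", label = "Primary")
    )
  )
)

# Build one data frame per filter column, then stack them into a single table.
flatten_filters <- function(filters) {
  per_filter <- lapply(filters, function(f) {
    n <- length(f$options)
    data.frame(
      filter_column_id = rep(f$id, n),
      filter_column_label = rep(f$label, n),
      filter_option_id = vapply(f$options, function(o) o$id, character(1)),
      filter_option_label = vapply(f$options, function(o) o$label, character(1)),
      stringsAsFactors = FALSE
    )
  })
  do.call(rbind, per_filter)
}

filters_flat <- flatten_filters(filters_nested)

# Reuse as a lookup: all options belonging to one filter column.
sex_options <- filters_flat[filters_flat$filter_column_id == "sex", ]
```

With one row per option, the same table can serve both as the printed summary and as the lookup for other functions, without a separate columns table.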
…r::select and rename
Nice solution / code generally - a neat way to structure the functions so that the main user query doesn't repeat 5 calls just to get the metadata out. Few comments in the code specifically, plus:
- Should we separate out the functions we expect users to use in the _pkgdown.yml file for the reference list? Currently it's one flat list, and I expect most users will only need a couple of the functions to start with.
- Should we have before / after test data saved in the test folder to check the parse_meta... functions against?
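One possible shape for a before / after check is sketched below. The parser parse_locations_sketch(), the file names and the temporary testdata folder are all stand-ins for illustration, and the location values are example data only. In the package itself this would more likely sit inside a testthat test; plain stopifnot() is used here so the sketch runs on its own.

```r
# Stand-in parser (hypothetical): flattens a list of locations to a data frame.
parse_locations_sketch <- function(locations) {
  data.frame(
    code = vapply(locations, function(l) l$code, character(1)),
    name = vapply(locations, function(l) l$name, character(1)),
    stringsAsFactors = FALSE
  )
}

# A temporary folder stands in for the package's test data folder.
test_dir <- file.path(tempdir(), "testdata")
dir.create(test_dir, showWarnings = FALSE)

# "Before": raw input roughly as it might arrive from the API.
raw_locations <- list(
  list(code = "E92000001", name = "England"),
  list(code = "E12000001", name = "North East")
)
saveRDS(raw_locations, file.path(test_dir, "locations_raw.rds"))

# "After": the expected parsed output, saved once after checking it by hand.
expected <- data.frame(
  code = c("E92000001", "E12000001"),
  name = c("England", "North East"),
  stringsAsFactors = FALSE
)
saveRDS(expected, file.path(test_dir, "locations_parsed.rds"))

# The check: parse the saved input and compare with the saved output.
parsed <- parse_locations_sketch(readRDS(file.path(test_dir, "locations_raw.rds")))
check <- all.equal(parsed, readRDS(file.path(test_dir, "locations_parsed.rds")))
stopifnot(isTRUE(check))
```

Saving both sides with saveRDS() keeps column types and attributes intact, so the comparison is against exactly the object that was checked by hand.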
I've tried some sort of structuring that makes rough sense to me as something that could be extended sensibly as we add more functionality.
I guess so... I've added meta test data into a testdata/ folder and written a test for each of the parsing functions. For the data format, I've picked RDS as:
Brief overview of changes
Analysts will need a function to retrieve the basic meta data for a given data set. This adds that in the form of get_meta(). To support this, I've also taken a first step towards an error handling script that helps translate HTTP status codes.
Why are these changes being made?
We need a function to connect to the meta data held on data sets on the EES API. The meta data holds the column and indicator info on a given data set, including the filter item and indicator codes required to query the dataset via the API.
Detailed description of changes
I've added the following functions:
- get_meta_response()
- http_request_error()
get_meta_response() will take a dataset_id, dataset_version and api_version and deliver the meta data associated with that dataset. This can be returned as the basic query result provided by the API (parse = FALSE) or as an initial R-friendly structured list containing the results (parse = TRUE).
http_request_error() will translate any HTTP return codes (e.g. 200, 404, 504 etc) into a broad-brush error message. This could be expanded in the future to be more fine-grained and informative, but I've kept it fairly top level for now (i.e. it only picks up whether it's 2XX, 4XX or 5XX).
And following comments, I've created an extra bunch of functions to do the additional parsing I'd been saving for later PRs:
- get_meta()
- parse_meta_filter_columns()
- parse_meta_filter_item_ids()
- parse_meta_indicator_columns()
- parse_meta_location_ids()
parse_meta_filter_columns(), parse_meta_filter_item_ids(), parse_meta_indicator_columns() and parse_meta_location_ids() tidy up the individual outputs in the structured list returned by get_meta_response() into individual data frames. Finally, get_meta() is the function I'm intending most end users to actually use, and it is just a wrapper that runs get_meta_response() and then applies the 4 parse functions to it to create a single structured list of data frames.
Issue ticket number/s and link
#1
And now #9, #10, #11, #12 and #13 as well.
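For reference, the broad-brush status handling described for http_request_error() could look something like the sketch below. It is not the function added in this PR: the name status_class_message() and the message wording are made up for illustration, and only the 2XX / 4XX / 5XX split comes from the description.

```r
# Hypothetical helper: map an HTTP status code to a broad-brush message
# based only on its class (2XX, 4XX or 5XX).
status_class_message <- function(status_code) {
  # Integer division keeps only the leading digit of a three-digit code.
  status_class <- as.character(status_code %/% 100)
  detail <- switch(status_class,
    "2" = "Successful API request.",
    "4" = "Client error: check the request, e.g. the data set ID or version.",
    "5" = "Server error: the API may be unavailable, try again later.",
    "Unhandled status code class."
  )
  paste0("HTTP ", status_code, ". ", detail)
}

msg_ok <- status_class_message(200)
msg_not_found <- status_class_message(404)
msg_timeout <- status_class_message(504)
```

Returning a message string rather than raising the error directly leaves the caller free to decide whether a given class should stop execution or only warn.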