
Commit

add extra notes on when to use our function v when to use boundr
cjrace committed Oct 20, 2024
1 parent 4c5e68b commit 306c9ed
Showing 1 changed file with 1 addition and 1 deletion.
2 changes: 1 addition & 1 deletion .github/CONTRIBUTING.md
@@ -275,7 +275,7 @@ Where we can, we use their API to get the data, so that we have completely repro

On the [ONS Open Geography portal](https://geoportal.statistics.gov.uk/), you will usually be looking for data published as a feature or feature layer, as these are the ones made available via the API connection. You'll be able to preview the data in the browser and do basic searching / filtering on the table if you want to visualise it. Any feature data should have an option somewhere for 'I want to use this data' (or something similar if they update their website design) where you can get to an API explorer that allows you to run a basic query in the browser. In here you can usually find the dataset_id and also the parameters you want to use to get the data you need.

-We have a `get_ons_api_data()` function that acts as a wrapper to the ONS API, it does things like converting readable parameters into a query string and also handles batching and multiple requests if needed, so you get all of the data in one nice neat data frame (there's a limit on the rows per single query for the API). If you're looking to expand on this function at all, you should first check if the [boundr package](https://github.com/francisbarton/boundr) does what you need, as that gives a number of methods for extracting data from the portal as well.
+We have a `get_ons_api_data()` function that acts as a wrapper to the ONS API, it does things like converting readable parameters into a query string and also handles batching and multiple requests if needed, so you get all of the data in one nice neat data frame (there's a limit on the rows per single query for the API). Make use of this function first, though if you're looking to expand on this function at all or there's anything the `get_ons_api_data()` function doesn't do that you'd like it to, you should check if the [boundr package](https://github.com/francisbarton/boundr) does what you need, as that gives a number of methods for extracting data from the portal as well. If neither our existing function or the boundr package do what you need, then we can look at raising a suggestion on dfeR if it's a DfE specific request, or on boundr if it's a more general request.

The way ONS publish has varied over their first few years of publishing, and on top of that each data set has an individual API connection for every year of boundaries. As there's no link over time from the ONS side we have helper functions defined in R/datasets_utils.R that wrap these up into a single neat time series bundle for us. Given the likelihood of further variations, don't be too surprised if adding new years to the data sets results in errors first time around, some manual fudgery is often needed so roll up your sleeves and prepare to get elbow deep into the murky depths of the R/datasets_utils.R file!
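
As a rough illustration of the two jobs the changed paragraph attributes to `get_ons_api_data()` (turning readable parameters into a query string, and batching requests around the per-query row limit), here is a minimal base R sketch. It is not the dfeR implementation: `build_query_string()`, `fetch_all_rows()`, `fake_fetch()`, the batch size of 200 and the example parameter names are all assumptions made for illustration, and the "API" is a local data frame so the code runs without network access.

```r
# Hypothetical sketch only: not how dfeR's get_ons_api_data() is written.

# Turn a named list of readable parameters into a URL query string.
build_query_string <- function(params) {
  encoded <- vapply(
    params,
    function(value) utils::URLencode(as.character(value), reserved = TRUE),
    character(1)
  )
  paste0(names(params), "=", encoded, collapse = "&")
}

# Request rows in batches until a batch comes back short, then bind them
# into one data frame. `fetch_batch` stands in for a single API request:
# it takes an offset and a batch size and returns a data frame with at
# most `batch_size` rows.
fetch_all_rows <- function(fetch_batch, batch_size = 200) {
  batches <- list()
  offset <- 0
  repeat {
    batch <- fetch_batch(offset, batch_size)
    batches[[length(batches) + 1]] <- batch
    if (nrow(batch) < batch_size) break
    offset <- offset + batch_size
  }
  do.call(rbind, batches)
}

# Fake "API" holding 450 rows (an assumed figure for the demonstration).
fake_table <- data.frame(id = seq_len(450))
fake_fetch <- function(offset, batch_size) {
  rows <- seq_len(nrow(fake_table))
  keep <- rows[rows > offset & rows <= offset + batch_size]
  fake_table[keep, , drop = FALSE]
}

# Three requests (200 + 200 + 50 rows) end up in a single data frame.
all_rows <- fetch_all_rows(fake_fetch, batch_size = 200)

# Example parameter names are illustrative, not taken from the ONS docs.
query <- build_query_string(list(where = "1=1", outFields = "*", f = "json"))
```

Passing the request function in as an argument keeps the batching loop separate from the HTTP call, which makes the loop easy to check against a fake source like the one above before pointing it at a real endpoint.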
