Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add importer and support for ONS data #171

Merged
merged 17 commits into from
Sep 27, 2023
Merged

Add importer and support for ONS data #171

merged 17 commits into from
Sep 27, 2023

Conversation

KludgeKML
Copy link
Contributor

@KludgeKML KludgeKML commented Dec 8, 2022

(Rendered view of ADR: https://github.com/alphagov/locations-api/blob/ons-import/docs/adr/0001-add-ons-import.md)

Adds a rake task that will kick off a worker to retrieve the current ONSPD datafile from a given address, then extract the multi_csv version of the files into an S3 bucket and kick off one new worker for each file. Workers import ONS data and create Postcode records for anything missing from the database. Returned information from client now also includes source (either "Ordnance Survey" or "Office for National Statistics"), and retired info (true/false, and retired_on date).

Todo:

  • Handle situations where we think a postcode should exist in OS Places API (small, non-retired), but hasn't yet been imported into Locations API
  • Add tests for ONS data details into client_spec
  • Free up importer to read all 2.6 m ONS postcode records
  • Update pact tests (also needs work in gds-api-adapter) to handle returning retired status and source.

Trello:
https://trello.com/c/nLhD5e2N/1872-spike-into-using-onspd-data-to-fill-in-postcode-gaps-in-locations-api

⚠️ This repo is Continuously Deployed: make sure you follow the guidance ⚠️

Copy link

@leenagupte leenagupte left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The approach for storing these postcodes sounds good. I just have a few inline questions.

docs/adr/0001-add-ons-import.md Outdated Show resolved Hide resolved
docs/adr/0001-add-ons-import.md Outdated Show resolved Hide resolved
docs/adr/0001-add-ons-import.md Outdated Show resolved Hide resolved
Copy link

@leenagupte leenagupte left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The ADR sounds good. 👍

Copy link
Contributor

@1pretz1 1pretz1 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Proposal and general plan looks good to me, nice work putting this together @KludgeKML!

I've left a few comments on the ADR either suggesting clarifications or asking questions. Have also left a few drive by comments on the code but do acknowledge that it is currently draft.

docs/adr/0001-add-ons-import.md Outdated Show resolved Hide resolved
docs/adr/0001-add-ons-import.md Outdated Show resolved Hide resolved
docs/adr/0001-add-ons-import.md Outdated Show resolved Hide resolved
docs/adr/0001-add-ons-import.md Show resolved Hide resolved
docs/adr/0001-add-ons-import.md Outdated Show resolved Hide resolved
docs/adr/0001-add-ons-import.md Outdated Show resolved Hide resolved
app/workers/ons_download_worker.rb Outdated Show resolved Hide resolved
app/workers/ons_download_worker.rb Outdated Show resolved Hide resolved
lib/os_places_api/client.rb Outdated Show resolved Hide resolved
lib/os_places_api/client.rb Outdated Show resolved Hide resolved
@KludgeKML KludgeKML force-pushed the ons-import branch 3 times, most recently from 599e081 to 24fbdbb Compare September 20, 2023 10:32
@KludgeKML KludgeKML marked this pull request as ready for review September 20, 2023 13:28
@KludgeKML KludgeKML force-pushed the ons-import branch 8 times, most recently from de253e1 to 736cb0d Compare September 25, 2023 15:41
Copy link
Contributor

@1pretz1 1pretz1 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Great work Keith! Following on from our mob review, I've gone through and left a few comments but nothing major!

app/models/postcode.rb Outdated Show resolved Hide resolved
app/models/postcode_manager.rb Outdated Show resolved Hide resolved
lib/tasks/import_ons_data.rake Outdated Show resolved Hide resolved
app/workers/ons_import_worker.rb Outdated Show resolved Hide resolved
app/workers/ons_import_worker.rb Outdated Show resolved Hide resolved
app/presenters/onspd_locations_presenter.rb Outdated Show resolved Hide resolved
spec/lib/tasks/populate_postcodes_table_spec.rb Outdated Show resolved Hide resolved
app/workers/ons_download_worker.rb Outdated Show resolved Hide resolved
- Defaults to safe values for existing postcodes
- Add scopes
- Manager handles creation and update of postcode records
- Presenter handles rendering of OS Places API postcodes
- removes logic around postcode management now handled by PostcodeManager
- refactors logic specific to OS Places API location sets into a new class
- New sidekiq queue (queue_ons) for these tasks.
- DownloadWorker task retrieves an OSNPD zip file (by URL), extracts the individual CSV files to an S3 bucket, then kicks off as many ImportWorkers as needed.
- ImportWorker reads a CSV file of postcodes, creating records for any postcode not already in the database.
Copy link
Contributor

@1pretz1 1pretz1 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for addressing the comments - LGTM 🎉

@KludgeKML KludgeKML merged commit 57fb5e4 into main Sep 27, 2023
4 checks passed
@KludgeKML KludgeKML deleted the ons-import branch September 27, 2023 10:35
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants