Create scraper for pulling new rows out of data.dc.gov #200
Comments
This seems to be a prerequisite, so researching this one instead.
http://maps2.dcgis.dc.gov/dcgis/rest/services/DCGIS_DATA/Public_Service_WebMercator/MapServer/34 - Campaign Contributions
http://maps2.dcgis.dc.gov/dcgis/rest/services/DCGIS_DATA/Public_Service_WebMercator/MapServer/35 - Campaign Expenditures
Use of https://github.com/openaddresses/pyesridump to grab the latest data.
esri2geojson http://maps2.dcgis.dc.gov/dcgis/rest/services/DCGIS_DATA/Public_Service_WebMercator/MapServer/34 ocf-contributions.geojson
esri2geojson http://maps2.dcgis.dc.gov/dcgis/rest/services/DCGIS_DATA/Public_Service_WebMercator/MapServer/35 ocf-expenditures.geojson
Added ocf-expenditures.geojson to http://data.codefordc.org/dataset/dc-campaign-expenditures-ocf
Attempted to add ocf-contributions, but the upload failed with a 413 response; will try to split the file and upload.
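A 413 means the request body was larger than the server accepts, so one option is to split the dump into smaller files before uploading. A minimal sketch, assuming the dump is a standard GeoJSON FeatureCollection; the function names and the default chunk size are illustrative and not part of any tool mentioned in this thread:

```python
import json


def split_feature_collection(collection, chunk_size):
    """Yield smaller FeatureCollections with at most chunk_size features each."""
    if chunk_size < 1:
        raise ValueError("chunk_size must be at least 1")
    features = collection.get("features", [])
    for start in range(0, len(features), chunk_size):
        yield {
            "type": "FeatureCollection",
            "features": features[start:start + chunk_size],
        }


def write_chunks(source_path, chunk_size=50000):
    """Write numbered chunk files next to the source file; return the new paths."""
    with open(source_path, encoding="utf-8") as handle:
        collection = json.load(handle)
    paths = []
    for index, chunk in enumerate(split_feature_collection(collection, chunk_size)):
        path = f"{source_path}.part{index:03d}.geojson"
        with open(path, "w", encoding="utf-8") as out:
            json.dump(chunk, out)
        paths.append(path)
    return paths
```

Something like `write_chunks("ocf-contributions.geojson")` would then produce files that can be uploaded one at a time. The right chunk size depends on the portal's upload limit, which is not stated here.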
Looking good. I would take a look at the datastore API, which can handle pushing a lot of rows more gracefully.
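For reference, CKAN's DataStore exposes a `datastore_upsert` action that accepts rows as JSON, so rows can be pushed in batches instead of as one large file. A rough sketch using only the standard library; the portal URL, resource id, API key, and batch size are placeholders, and it assumes the DataStore extension is enabled on the portal:

```python
import json
import urllib.request


def build_upsert_payloads(resource_id, records, batch_size=1000):
    """Split records into datastore_upsert payloads of at most batch_size rows."""
    payloads = []
    for start in range(0, len(records), batch_size):
        payloads.append({
            "resource_id": resource_id,
            "records": records[start:start + batch_size],
            # "insert" appends rows without requiring a unique key on the resource.
            "method": "insert",
        })
    return payloads


def push_records(portal_url, api_key, resource_id, records, batch_size=1000):
    """POST each batch to the CKAN action API (needs a live portal and a valid key)."""
    endpoint = portal_url.rstrip("/") + "/api/3/action/datastore_upsert"
    for payload in build_upsert_payloads(resource_id, records, batch_size):
        request = urllib.request.Request(
            endpoint,
            data=json.dumps(payload).encode("utf-8"),
            headers={"Content-Type": "application/json", "Authorization": api_key},
            method="POST",
        )
        with urllib.request.urlopen(request) as response:
            result = json.load(response)
        if not result.get("success"):
            raise RuntimeError(f"datastore_upsert failed: {result}")
```

Keeping each request small should also sidestep the 413 seen with the single-file upload, though the actual size limit on the portal is unknown.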
Upload idea using existing tools from esri to geojson to csv to data portal
Does that have an API?
The dataset is too big to be proxied and will not be queryable. Currently, the data has been pushed through 1/1/2017, but a scraper needs to be written to do this regularly.
Tasks:
Upload idea using existing tools from esri to geojson to csv to data portal
https://www.npmjs.com/package/esri-dump
https://www.npmjs.com/package/json2csv
https://www.npmjs.com/package/ckan
Get the last imported data in data portal
Check if esri data exist beyond the last imported data
If data exist, attempt to get a dump from that start point to the end
Since the data in CKAN is currently CSV, convert the data to CSV
Upload data to data portal
This script could then run to sync data portal info with esri.
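The middle steps of the task list (keep only rows newer than the last import, then convert them to CSV) could be sketched as follows. This assumes the dump is GeoJSON and that dates arrive as epoch milliseconds, as ArcGIS REST services commonly return them; the date field name passed in is hypothetical, since the actual column is not named in this issue:

```python
import csv
import io


def features_after(features, date_field, cutoff_ms):
    """Keep features whose date_field (epoch milliseconds) is later than cutoff_ms."""
    kept = []
    for feature in features:
        value = (feature.get("properties") or {}).get(date_field)
        if value is not None and value > cutoff_ms:
            kept.append(feature)
    return kept


def features_to_csv(features):
    """Flatten GeoJSON feature properties into CSV text; geometry is dropped."""
    rows = [feature.get("properties") or {} for feature in features]
    # Collect the column names in first-seen order so the header is stable.
    fieldnames = []
    for row in rows:
        for key in row:
            if key not in fieldnames:
                fieldnames.append(key)
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=fieldnames, lineterminator="\n")
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()
```

The cutoff would come from the first step (the latest date already in the data portal). Filtering locally after a full dump is the simplest option; asking the esri service only for rows past the cutoff would be lighter, but that query is not shown here.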