Working on data folder hierarchy:
- raw data
- GeoSheet: the raw geocoded geosheet file from Google Team Drive
- ancillary file from source
- processing
- geographic: the raw geojson files of geocoded locations
- ancillary: modified/standardized version of projects/transactions
- merged files
- geographic: buffer the line/road features by a chosen distance; the final shapefile should include only polygon features.
- shapefile with standardized attribute tables
- geojson format
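A minimal sketch of creating this folder layout, assuming the list above nests as raw data / processing / merged files with the items under each as subfolders. The folder names (raw_data, geosheet, merged, etc.) and the create_layout helper are hypothetical and not taken from the project.

```python
from pathlib import Path

# Hypothetical folder names; adjust to match the real project layout.
LAYOUT = {
    "raw_data": ["geosheet", "ancillary"],
    "processing": ["geographic", "ancillary"],
    "merged": ["shapefile", "geojson"],
}


def create_layout(root):
    """Create every folder in LAYOUT under root and return the created paths."""
    root = Path(root)
    created = []
    for parent, children in LAYOUT.items():
        for child in children:
            path = root / parent / child
            # exist_ok=True makes the call safe to repeat on an existing tree
            path.mkdir(parents=True, exist_ok=True)
            created.append(path)
    return created
```

Calling create_layout("data") would build the tree under a local data folder and can be re-run without error.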
Working on GeoSheet
- Download the final GeoSheet from Google Docs: https://gist.github.com/cspickert/1650271#file-googlespreadsheets-py
- Enable Google Doc API
- Create a credential
- Install python library to access google doc: pip install --upgrade google-api-python-client
- Standardize columns: project_id, Location Name, Identified Location Type, Source URL, Geoparsing Notes, Geocoded Location Type, GeoJSON Link or Feature ID, Geocoding and Review Notes
- Remove blank rows
- Add column names
- Check the availability of location information (feature ID or geojson link)
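The cleanup steps above can be sketched with the standard library alone. This assumes the downloaded sheet arrives as a list of rows (each a list of cell values) with no header row, and that rows may be shorter than the column list. The function names standardize_rows and missing_location are hypothetical.

```python
# Column names as listed in these notes.
COLUMNS = [
    "project_id",
    "Location Name",
    "Identified Location Type",
    "Source URL",
    "Geoparsing Notes",
    "Geocoded Location Type",
    "GeoJSON Link or Feature ID",
    "Geocoding and Review Notes",
]


def standardize_rows(raw_rows):
    """Drop blank rows and attach column names to the remaining ones."""
    records = []
    for row in raw_rows:
        cells = [str(cell).strip() for cell in row]
        if not any(cells):
            continue  # blank row: every cell is empty
        # Pad short rows and cut extra cells so each record has all columns.
        cells = (cells + [""] * len(COLUMNS))[: len(COLUMNS)]
        records.append(dict(zip(COLUMNS, cells)))
    return records


def missing_location(records):
    """Return project_ids that have neither a feature ID nor a geojson link."""
    return [r["project_id"] for r in records if not r["GeoJSON Link or Feature ID"]]
```

Running missing_location on the standardized records gives a list of projects to follow up on before the spatial download step.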
Working on spatial info
- Read geojson file and download it to a local folder; name it based on project_id (something that uniquely identifies the feature) (done)
- Read feature ID and download the geojson to a local folder; name it based on project_id and feature ID (done)
- Create one big geojson file containing all location geojson
- check geometry
- cannot have multiple lines/polygons
- no point features
- no mixed geometry types
- Convert geojson to shapefile
- run script1 to get GeoSheet from google doc
- run script2 to retrieve all geojson data into destination folder (run the script through terminal)
- run script3 to combine the geojson files and check their geometry
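A rough sketch of the combine and geometry-check step, working on already-parsed GeoJSON FeatureCollections keyed by project_id. It reads the geometry rules as: reject Point/MultiPoint, reject Multi* geometries, reject GeometryCollection. That reading is an assumption, and check_feature and combine are hypothetical names, not the actual script3.

```python
def check_feature(feature):
    """Return a short problem description, or None if the geometry is acceptable."""
    geometry = feature.get("geometry") or {}
    gtype = geometry.get("type")
    if gtype is None:
        return "missing geometry"
    if gtype in ("Point", "MultiPoint"):
        return "point feature"
    if gtype == "GeometryCollection":
        return "mixed geometry combination"
    if gtype.startswith("Multi"):
        return "multiple lines/polygons"
    if gtype not in ("LineString", "Polygon"):
        return "unexpected geometry type"
    return None


def combine(collections):
    """Merge FeatureCollections into one and report rejected features.

    collections maps project_id -> parsed GeoJSON FeatureCollection (a dict).
    Returns (merged_collection, problems), where problems is a list of
    (project_id, description) tuples.
    """
    merged, problems = [], []
    for project_id, collection in collections.items():
        for feature in collection.get("features", []):
            problem = check_feature(feature)
            if problem:
                problems.append((project_id, problem))
            else:
                merged.append(feature)
    return {"type": "FeatureCollection", "features": merged}, problems
```

Keeping rejected features out of the merged file and returning them as a separate list is a design choice here; the list could feed the process report mentioned under "add later".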
add later:
- process report, for example, fix geometry of project_location_id xx, etc.
- make separate classes: geometry (check and fix), data download (Google Sheet), data processing
*** Unzip all geoboundaries to a destination that is online, which will be used to retrieve feature geojson ***
http://stream.princeton.edu/AWCM/LIBRARIES/pyhdf-0.8.3/doc/webpage/source/install.html
*** The download of raw geojson files should be standalone and run at the very beginning, in case someone deletes files from GitHub! ***