Dataset Alignment
This page documents the dataset alignment process performed for this project.
We preprocess the datasets used in this project onto a common 150m grid. The grid is generated using the Bing Maps Tile System. Its main advantage is its use of "quadkeys", an indexing scheme that lets us determine the x/y location of each grid tile directly from its quadkey ID.
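The quadkey-to-x/y relationship can be illustrated with a minimal sketch of the bit-interleaving scheme used by the Bing Maps Tile System. The function names below are illustrative and are not taken from this project's code, and the zoom level used for the 150m grid is not stated on this page, so none is assumed here.

```python
from typing import Tuple


def tile_to_quadkey(tile_x: int, tile_y: int, level: int) -> str:
    """Interleave the bits of tile X/Y into a base-4 quadkey string."""
    digits = []
    for i in range(level, 0, -1):
        mask = 1 << (i - 1)
        digit = 0
        if tile_x & mask:
            digit += 1  # X contributes the low bit of each digit
        if tile_y & mask:
            digit += 2  # Y contributes the high bit of each digit
        digits.append(str(digit))
    return "".join(digits)


def quadkey_to_tile(quadkey: str) -> Tuple[int, int, int]:
    """Recover (tile_x, tile_y, level) from a quadkey string."""
    tile_x = tile_y = 0
    level = len(quadkey)
    for i, ch in enumerate(quadkey):
        mask = 1 << (level - 1 - i)
        if ch == "1":
            tile_x |= mask
        elif ch == "2":
            tile_y |= mask
        elif ch == "3":
            tile_x |= mask
            tile_y |= mask
        elif ch != "0":
            raise ValueError(f"invalid quadkey digit: {ch!r}")
    return tile_x, tile_y, level


# Round trip: the quadkey length equals the zoom level.
print(tile_to_quadkey(3, 5, 3))
print(quadkey_to_tile("213"))
```

Because the quadkey length encodes the zoom level, a single string column is enough to identify both the position and the resolution of a grid tile.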
Each dataset, whether raster or vector, is aligned to the grid using the following process:
Raster data and gridded vector data are aligned by taking the zonal statistics of the dataset (i.e. min, max, mean, median, and count) over each 150m grid tile.
For categorical vector datasets such as soil type and lithology, we also assign each grid tile the value of the polygon with the largest intersection with that tile.
For some datasets such as rivers and roads, we take the distance from each grid tile to the nearest feature.
For census data, which is given in block form, we disaggregate the block data according to the number of households present in each grid tile.
For healthcare centers, we calculate isochrones from each center, indicating the areas that can be reached within 15, 30, 45, and 60 minutes of travel time.
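Two of the steps above, the categorical assignment and the census disaggregation, come down to simple arithmetic that can be sketched in plain Python. This is only an illustration under stated assumptions: "highest intersection" is read as the largest total intersection area, the disaggregation is assumed to be proportional to household counts, and the function names and input shapes are hypothetical rather than the project's actual code.

```python
from collections import defaultdict


def dominant_category(intersections):
    """Return the category with the largest total intersection area.

    `intersections` is an iterable of (category, area) pairs describing how
    the polygons of a categorical dataset overlap a single grid tile.
    """
    area_by_category = defaultdict(float)
    for category, area in intersections:
        area_by_category[category] += area
    if not area_by_category:
        return None  # no polygon touches this tile
    return max(area_by_category, key=area_by_category.get)


def disaggregate_block(block_total, households_by_tile):
    """Split a census block total across tiles in proportion to households.

    Assumption: each tile receives block_total * (its households / all
    households in the block). Tiles without households receive zero.
    """
    total_households = sum(households_by_tile.values())
    if total_households == 0:
        return {tile: 0.0 for tile in households_by_tile}
    return {
        tile: block_total * count / total_households
        for tile, count in households_by_tile.items()
    }


# Hypothetical inputs for one tile and one census block.
print(dominant_category([("clay", 0.4), ("loam", 0.35), ("clay", 0.1)]))
print(disaggregate_block(1000, {"a": 30, "b": 10, "c": 0}))
```

Under the proportional assumption the tile values always sum back to the block total, so the disaggregation does not change the overall census counts.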
After dataset alignment, we also calculate lattice features, which are the average feature values of the surrounding grid tiles. This allows the susceptibility model to take into account the properties of the surrounding area.
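A lattice feature can be sketched as a neighbourhood mean over tile coordinates. The sketch below assumes a square neighbourhood of the eight adjacent tiles, excludes the centre tile, and ignores neighbours with no data. The page does not specify these choices, so treat the `radius` parameter and the function name as hypothetical.

```python
from statistics import mean


def lattice_feature(values, tile_x, tile_y, radius=1):
    """Average a feature over the tiles surrounding (tile_x, tile_y).

    `values` maps (tile_x, tile_y) to the feature value of that tile.
    Returns None when no surrounding tile has a value.
    """
    neighbours = []
    for dx in range(-radius, radius + 1):
        for dy in range(-radius, radius + 1):
            if dx == 0 and dy == 0:
                continue  # skip the centre tile itself
            key = (tile_x + dx, tile_y + dy)
            if key in values:
                neighbours.append(values[key])
    return mean(neighbours) if neighbours else None


# A 3x3 patch of tiles; the centre value does not affect its own lattice feature.
patch = {
    (0, 0): 1.0, (1, 0): 2.0, (2, 0): 3.0,
    (0, 1): 4.0, (1, 1): 100.0, (2, 1): 6.0,
    (0, 2): 7.0, (1, 2): 8.0, (2, 2): 9.0,
}
print(lattice_feature(patch, 1, 1))
```

Since quadkeys decode to tile x/y coordinates, the neighbouring tiles can be located by offsetting those coordinates, with no spatial join needed.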