UFRM: preserve source AOI fields in output vector #1626

Open
wants to merge 4 commits into
base: main
3 changes: 3 additions & 0 deletions HISTORY.rst
@@ -48,6 +48,9 @@ Unreleased Changes
* Fixed a bug that, in certain scenarios, caused a datastack to be saved
with relative paths when the Relative Paths checkbox was left unchecked
(https://github.com/natcap/invest/issues/1609)
* Urban Flood Risk
* Fields present on the input AOI vector are now retained in the output.
(https://github.com/natcap/invest/issues/1600)

3.14.2 (2024-05-29)
-------------------
48 changes: 19 additions & 29 deletions src/natcap/invest/urban_flood_risk_mitigation.py
@@ -553,21 +553,11 @@ def _write_summary_vector(
``None``
"""
source_aoi_vector = gdal.OpenEx(source_aoi_vector_path, gdal.OF_VECTOR)
source_aoi_layer = source_aoi_vector.GetLayer()
source_geom_type = source_aoi_layer.GetGeomType()
source_srs_wkt = pygeoprocessing.get_vector_info(
source_aoi_vector_path)['projection_wkt']
source_srs = osr.SpatialReference()
source_srs.ImportFromWkt(source_srs_wkt)

esri_driver = gdal.GetDriverByName('ESRI Shapefile')
target_watershed_vector = esri_driver.Create(
target_vector_path, 0, 0, 0, gdal.GDT_Unknown)
layer_name = os.path.splitext(os.path.basename(
target_vector_path))[0]
LOGGER.debug(f"creating layer {layer_name}")
target_watershed_layer = target_watershed_vector.CreateLayer(
layer_name, source_srs, source_geom_type)
esri_driver.CreateCopy(target_vector_path, source_aoi_vector)
target_watershed_vector = gdal.OpenEx(target_vector_path,
gdal.OF_VECTOR | gdal.GA_Update)
target_watershed_layer = target_watershed_vector.GetLayer()

target_fields = ['rnf_rt_idx', 'rnf_rt_m3', 'flood_vol']
if damage_per_aoi_stats is not None:
@@ -579,39 +569,39 @@ def _write_summary_vector(
field_def.SetPrecision(11)
target_watershed_layer.CreateField(field_def)

target_layer_defn = target_watershed_layer.GetLayerDefn()
for base_feature in source_aoi_layer:
feature_id = base_feature.GetFID()
target_feature = ogr.Feature(target_layer_defn)
base_geom_ref = base_feature.GetGeometryRef()
target_feature.SetGeometry(base_geom_ref.Clone())
base_geom_ref = None
target_watershed_layer.ResetReading()
for target_feature in target_watershed_layer:
# Target vector is SHP, where FIDs start at 0, but stats were
# generated based on GPKG reprojection, where FIDs start at 1.
# Therefore, we need to reference stats at SHP FID + 1.
stat_key = target_feature.GetFID() + 1
@emlys (Member), Oct 2, 2024:

Thanks for researching this difference between GeoPackage and Shapefile! This solves the problem of converting from SHP to GPKG specifically, but I'm concerned that it's not generalizable to other combinations of formats.

One solution we've used in other models is to add a unique ID attribute that we can rely on being consistent, unlike the FID, and use that as the stat key.

Another solution might be to keep the original structure of iterating over features in the source layer (like on original line 583) and creating the new fields manually (forgive me if you've already tried this). The following seems to work for me:

source_layer_defn = source_aoi_layer.GetLayerDefn()
for field_index in range(source_layer_defn.GetFieldCount()):
    source_field_defn = source_layer_defn.GetFieldDefn(field_index)
    target_watershed_layer.CreateField(source_field_defn)
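
The unique-ID option mentioned in this comment could look roughly like the following. This is a minimal sketch, not the model's code: plain dicts stand in for OGR features, and the field name `ws_id` and both helper functions are hypothetical.

```python
def assign_unique_ids(features, id_field="ws_id"):
    """Return copies of the features, each tagged with a stable integer ID.

    The ID is stored as an ordinary attribute, so it does not depend on
    how a given vector format numbers its FIDs.
    """
    return [{**feature, id_field: index}
            for index, feature in enumerate(features)]


def attach_stats(features, stats_by_id, id_field="ws_id"):
    """Copy each feature and add the 'sum' stat looked up by its unique ID."""
    joined = []
    for feature in features:
        stats = stats_by_id[feature[id_field]]
        joined.append({**feature, "flood_vol": float(stats["sum"])})
    return joined


# Hypothetical AOI features carrying a user-supplied attribute.
aoi = assign_unique_ids([{"name": "north"}, {"name": "south"}])
# Stats keyed by the unique ID rather than by a driver-specific FID.
stats = {0: {"sum": 12.5}, 1: {"sum": 3.0}}
result = attach_stats(aoi, stats)
```

Because the key is an attribute rather than the FID, the lookup does not change if either vector is later written with a different driver.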

Contributor Author:

That's a great point, but I wonder if we need to worry about generalizability here? The formats for these particular vectors are not user-defined; they're constrained by the model itself. The target vector is a shapefile, because it is generated with the ESRI Shapefile driver. In this version, that's via CreateCopy (line 557); on main, it's via Create (line 564). Meanwhile, the "source" vector is actually the reprojection, which is a GeoPackage generated by invoking pygeoprocessing.reproject_vector with the GPKG driver (see the execute method, lines 413 and 462). So, as long as we don't change those implementation details, we should be OK (and if we do change them, the regression tests should let us know if there's a problem).

What do you think? Happy to go back to adding the fields manually if you still think that's a better approach here.
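
The FID offset discussed in this thread can be illustrated without GDAL. This is a minimal sketch, assuming (as the code comment in this PR states) that the shapefile's FIDs start at 0 while the stats derived from the GPKG reprojection are keyed from 1. The function name and the sample numbers are hypothetical.

```python
def stat_key_for_shp_fid(shp_fid, fid_offset=1):
    """Map a 0-based shapefile FID to the 1-based key used by the stats."""
    return shp_fid + fid_offset


# Stats as they might come from zonal statistics over the GPKG reprojection.
runoff_ret_stats = {1: {"sum": 8.0, "count": 4}, 2: {"sum": 0.0, "count": 0}}

mean_by_shp_fid = {}
for shp_fid in range(len(runoff_ret_stats)):
    stats = runoff_ret_stats[stat_key_for_shp_fid(shp_fid)]
    # Skip features with no valid pixels, mirroring the pixel_count check.
    if stats["count"] > 0:
        mean_by_shp_fid[shp_fid] = stats["sum"] / float(stats["count"])
```

The offset is hard-coded to the SHP/GPKG pairing, which is the generalizability concern raised in the review.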


pixel_count = runoff_ret_stats[feature_id]['count']
pixel_count = runoff_ret_stats[stat_key]['count']
if pixel_count > 0:
mean_value = (
runoff_ret_stats[feature_id]['sum'] / float(pixel_count))
runoff_ret_stats[stat_key]['sum'] / float(pixel_count))
target_feature.SetField('rnf_rt_idx', float(mean_value))

target_feature.SetField(
'rnf_rt_m3', float(
runoff_ret_vol_stats[feature_id]['sum']))
runoff_ret_vol_stats[stat_key]['sum']))

if damage_per_aoi_stats is not None:
pixel_count = runoff_ret_vol_stats[feature_id]['count']
pixel_count = runoff_ret_vol_stats[stat_key]['count']
if pixel_count > 0:
damage_sum = damage_per_aoi_stats[feature_id]
damage_sum = damage_per_aoi_stats[stat_key]
target_feature.SetField('aff_bld', damage_sum)

# This is the service_built equation.
target_feature.SetField(
'serv_blt', (
damage_sum * runoff_ret_vol_stats[feature_id]['sum']))
damage_sum * runoff_ret_vol_stats[stat_key]['sum']))

target_feature.SetField(
'flood_vol', float(flood_volume_stats[feature_id]['sum']))
'flood_vol', float(flood_volume_stats[stat_key]['sum']))

target_watershed_layer.SetFeature(target_feature)

target_watershed_layer.CreateFeature(target_feature)
target_watershed_layer.SyncToDisk()
target_watershed_layer = None
target_watershed_vector = None
32 changes: 25 additions & 7 deletions tests/test_ufrm.py
@@ -59,20 +59,27 @@ def test_ufrm_regression(self):
"""UFRM: regression test."""
from natcap.invest import urban_flood_risk_mitigation
args = self._make_args()
input_vector = gdal.OpenEx(args['aoi_watersheds_path'],
gdal.OF_VECTOR)
input_layer = input_vector.GetLayer()
input_fields = [field.GetName() for field in input_layer.schema]

urban_flood_risk_mitigation.execute(args)

result_vector = gdal.OpenEx(os.path.join(
args['workspace_dir'], 'flood_risk_service_Test1.shp'),
gdal.OF_VECTOR)
result_layer = result_vector.GetLayer()

# Check that all four expected fields are there.
# Check that all expected fields are there.
output_fields = ['aff_bld', 'serv_blt', 'rnf_rt_idx',
'rnf_rt_m3', 'flood_vol']
output_fields += input_fields
self.assertEqual(
set(('aff_bld', 'serv_blt', 'rnf_rt_idx', 'rnf_rt_m3',
'flood_vol')),
set(output_fields),
set(field.GetName() for field in result_layer.schema))

result_feature = result_layer.GetFeature(0)
result_feature = result_layer.GetNextFeature()
for fieldname, expected_value in (
('aff_bld', 187010830.32202843),
('serv_blt', 13253546667257.65),
@@ -85,6 +92,11 @@ def test_ufrm_regression(self):
self.assertAlmostEqual(
result_val, expected_value, places=-places_to_round)

input_feature = input_layer.GetNextFeature()
for fieldname in input_fields:
self.assertEqual(result_feature.GetField(fieldname),
input_feature.GetField(fieldname))

result_feature = None
result_layer = None
result_vector = None
@@ -94,6 +106,11 @@ def test_ufrm_regression_no_infrastructure(self):
from natcap.invest import urban_flood_risk_mitigation
args = self._make_args()
del args['built_infrastructure_vector_path']
input_vector = gdal.OpenEx(args['aoi_watersheds_path'],
gdal.OF_VECTOR)
input_layer = input_vector.GetLayer()
input_fields = [field.GetName() for field in input_layer.schema]

urban_flood_risk_mitigation.execute(args)

result_raster = gdal.OpenEx(os.path.join(
@@ -115,9 +132,11 @@ def test_ufrm_regression_no_infrastructure(self):
result_layer = result_vector.GetLayer()
result_feature = result_layer.GetFeature(0)

# Check that only the two expected fields are there.
# Check that only the expected fields are there.
output_fields = ['rnf_rt_idx', 'rnf_rt_m3', 'flood_vol']
output_fields += input_fields
self.assertEqual(
set(('rnf_rt_idx', 'rnf_rt_m3', 'flood_vol')),
set(output_fields),
set(field.GetName() for field in result_layer.schema))

for fieldname, expected_value in (
@@ -218,7 +237,6 @@ def test_ufrm_explicit_zeros_in_table(self):
except ValueError:
self.fail('unexpected ValueError when testing curve number row with all zeros')


def test_ufrm_string_damage_to_infrastructure(self):
"""UFRM: handle str(int) structure indices.
