Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

UFRM: preserve source AOI fields in output vector #1626

Open
wants to merge 4 commits into
base: main
Choose a base branch
from

Conversation

emilyanndavis
Copy link
Contributor

@emilyanndavis emilyanndavis commented Sep 11, 2024

Description

Fixes #1600 by using CreateCopy to generate target vector based on input vector.

Checklist

  • Updated HISTORY.rst and link to any relevant issue (if these changes are user-facing)
    - [ ] Updated the user's guide (if needed)
  • Tested the Workbench UI (if relevant)

@emilyanndavis emilyanndavis marked this pull request as ready for review September 11, 2024 16:47
Copy link
Member

@emlys emlys left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is a good start @emilyanndavis! While this does satisfy the test by copying over the fields from the source AOI, I believe it doesn't copy over the values in those fields. You could do this manually with SetField for each field in the source, but if we can get away with using an existing function like CreateCopy, I'd recommend that.

Something like this might work:

    source_aoi_vector = gdal.OpenEx(source_aoi_vector_path, gdal.OF_VECTOR)
    esri_driver = gdal.GetDriverByName('ESRI Shapefile')
    esri_driver.CreateCopy(target_vector_path, source_aoi_vector)
    target_watershed_vector = gdal.OpenEx(target_vector_path, gdal.OF_VECTOR | gdal.GA_Update)
    target_watershed_layer = target_watershed_vector.GetLayer()
    ...

@emilyanndavis emilyanndavis changed the title Bugfix/1600 ufrm preserve source aoi fields UFRM: preserve source AOI fields in output vector Sep 30, 2024
@emilyanndavis
Copy link
Contributor Author

Thanks, @emlys for pointing out my (in retrospect) obvious mistake here! I finally managed to get this to work with CreateCopy (though not without a detour into some of the oddities of converting from shp to gpkg and back again), and I updated the test as well. Please take another look when you have a chance!

# Target vector is SHP, where FIDs start at 0, but stats were
# generated based on GPKG reprojection, where FIDs start at 1.
# Therefore, we need to reference stats at SHP FID + 1.
stat_key = target_feature.GetFID() + 1
Copy link
Member

@emlys emlys Oct 2, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for researching into this difference between Geopackage and Shapefile! This solves the problem of converting from SHP to GPKG specifically, but I'm concerned that it's not generalizable to other combinations of formats.

One solution we've used in other models is to add a unique ID attribute that we can rely on being consistent, unlike the FID, and use that as the stat key.

Another solution might be to keep the original structure of iterating over features in the source layer (like on original line 583) and creating the new fields manually (forgive me if you've already tried this). The following seems to work for me:

source_layer_defn = source_aoi_layer.GetLayerDefn()
for field_index in range(source_layer_defn.GetFieldCount()):
    source_field_defn = source_layer_defn.GetFieldDefn(field_index)
    target_watershed_layer.CreateField(source_field_defn)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That's a great point, but I wonder if we need to worry about generalizability here? The formats for these particular vectors are not user-defined; they're constrained by the model itself. The target vector is a shapefile, because it is generated with the ESRI Shapefile driver. In this version, that's via CreateCopy (line 557); on main, it's via Create (line 564). Meanwhile, the "source" vector is actually the reprojection, which is a Geopackage generated by invoking pygeoprocessing.reproject_vector with the GPKG driver (see the execute method, lines 413 and 462). So, as long as we don't change those implementation details, we should be OK (and if we do change them, the regression tests should let us know if there's a problem).

What do you think? Happy to go back to adding the fields manually if you still think that's a better approach here.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Urban Flood Risk drops AOI attribute table fields
2 participants