-
Notifications
You must be signed in to change notification settings - Fork 67
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Move validation out of execute methods #1375
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@emlys thanks for helping us standardize our approaches across models. I made a number of comments here that pretty much all boil down to this one opinion:
execute
should behave as expected (or raise an exception) whether or not the args
have been validated. Because Python API users aren't required to call validate
before calling execute
.
I might be okay with penalizing them with more cryptic errors or longer waits before an exception, but we should avoid cases where invalid args could yield unexpected results.
And if we don't think Python users should be penalized that way, then we could consider adopting the approach of always calling validate
at the start of execute
. Of course the cost there is a duplicated validate
call for Workbench users or anyone else who already validated.
Let me know what you think. I know the rest of the team already discussed this.
src/natcap/invest/hra.py
Outdated
@@ -509,13 +509,9 @@ def execute(args): | |||
|
|||
if args['risk_eq'].lower() == 'multiplicative': | |||
max_pairwise_risk = max_rating * max_rating | |||
elif args['risk_eq'].lower() == 'euclidean': | |||
else: # euclidean |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In my opinion this code now only makes sense when we know for certain that validate
was called first. And we don't know that for python API users. I don't see any downside to keeping in the error-handling here. It would catch a typo like 'mltiplicative'.
I think this is different from other cases where not doing validation in execute
just means the user has to wait longer before hitting an error. So it's like, "don't call validate
at your own risk". But in this case something different & unexpected could happen.
@@ -795,7 +791,7 @@ def _valuation(distance, visibility): | |||
weight * visibility[valid_pixels])) | |||
return valuation | |||
|
|||
elif valuation_method == 'exponential': | |||
else: # exponential |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In my opinion it's better to keep this as elif valuation_method == 'exponential'
and let the subsequent NameError
arise when trying to call an undefined _valuation()
. Otherwise exponential
would be applied to a typo like logarthmic
. Or the other option is to keep in the validation step that happens much earlier.
@@ -922,19 +922,13 @@ def execute(args): | |||
# Build an iterable of plain tuples: (lucode, search_radius_m) | |||
lucode_to_search_radii = list( | |||
urban_nature_attrs[['search_radius_m']].itertuples(name=None)) | |||
elif args['search_radius_mode'] == RADIUS_OPT_POP_GROUP: | |||
else: # population group mode |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Same issue as in a couple other place: I don't think we can assume that validate
was called, and we don't want to run the wrong block of code instead of raising an exception.
pop_group_table = utils.read_csv_to_dataframe( | ||
args['population_group_radii_table'], | ||
MODEL_SPEC['args']['population_group_radii_table']) | ||
search_radii = set(pop_group_table['search_radius_m'].unique()) | ||
# Build a dict of {pop_group: search_radius_m} | ||
search_radii_by_pop_group = pop_group_table['search_radius_m'].to_dict() | ||
else: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This goes with line 925
} | ||
|
||
RouteDEMTests._make_dem(args['dem_path']) | ||
with self.assertRaises(RuntimeError) as cm: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It looks like this test would still make sense if it checked for KeyError
instead of RuntimeError
. I think it's important that a bad algorithm raises some kind of error, so might as well keep this test around in my opinion.
'n_workers': -1, | ||
} | ||
|
||
with self.assertRaises(ValueError): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Similar comment as in test_routedem
. This would now raise a NameError
, I think. And it would be bad if it raised no error at all, so it might be nice to just keep the test.
@@ -807,22 +807,6 @@ def test_polygon_overlap(self): | |||
[polygon_1, polygon_2, polygon_3], vector_path, wkt, 'GeoJSON') | |||
self.assertTrue(urban_nature_access._geometries_overlap(vector_path)) | |||
|
|||
def test_invalid_search_radius_mode(self): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Related to my comment in the model, we may want to keep this test.
"One or more of the projections in the table did not match the " | ||
"projection of the base vector") | ||
return ("One or more of the projections in the table did not match " | ||
"the projection of the base vector") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could we include base_vector_path
in the message instead of saying "base vector"? I'm worried it's not obvious that "base vector" is the AOI vector. And maybe the same for using table_path
instead of "table"?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thank you @emlys ; lots of useful improvements here!
Description
Fixes #1373
validate
fromexecute
execute
that's redundant to whatvalidate
doesexecute
intovalidate
. I briefly tested the UI for these models to confirm that the new validation messages show up.I did not do anything about custom validation that happens in functions other than
execute
- trying to keep the scope small.Checklist