Add validation function that returns invalid items rather than logging+raising #302
Comments
Thanks Jan Ivar for that comment. Can you explain a bit further what your vision for the "working further in Python with the invalid items" actually is? I'm a bit worried that defining a usable return-type is not obvious...

Also, as you write yourself, getting the invalid items is only one-and-a-half lines of code, like

    for dim in dimensions:
        invalid = getattr(dsd, dim).validate_items(getattr(df, dim))

... so you might as well include that line in your processing workflow rather than having to parse the returned object from a new validation-method.
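For readers who want to try the per-dimension workaround, here is a minimal, self-contained sketch that collects the results of the loop into a dict. `StubCodelist`, `collect_invalid_items`, and the sample data are hypothetical stand-ins invented for illustration so the snippet runs without the package installed; the only behaviour assumed from the thread is that `validate_items` returns the items that are not in the codelist.

```python
from types import SimpleNamespace


class StubCodelist:
    """Hypothetical stand-in for a codelist; it only mimics validate_items."""

    def __init__(self, allowed):
        self.allowed = set(allowed)

    def validate_items(self, items):
        # Return the items that are not in the codelist, preserving order.
        return [item for item in items if item not in self.allowed]


def collect_invalid_items(dsd, df, dimensions):
    """Return {dimension: [invalid items]} for dimensions with invalid items."""
    invalid = {}
    for dim in dimensions:
        bad = getattr(dsd, dim).validate_items(getattr(df, dim))
        if bad:
            invalid[dim] = bad
    return invalid


# Made-up example data: "Asia" is missing from the region codelist.
dsd = SimpleNamespace(
    region=StubCodelist(["World", "Europe"]),
    variable=StubCodelist(["Primary Energy"]),
)
df = SimpleNamespace(region=["World", "Asia"], variable=["Primary Energy"])

result = collect_invalid_items(dsd, df, ["region", "variable"])
print(result)  # {'region': ['Asia']}
```

Dimensions without invalid items are left out of the dict here, so an empty dict means everything passed; returning empty lists instead would be an equally reasonable choice.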
Yes, that's what I would do for dimensions. But I had to search through the source code to find that solution. And it's not sufficient for checking variable/unit combos (see below). A dedicated method would be more convenient for anyone who isn't already familiar with the codebase.

For the dimensions, it could just return

Checking variable/unit combinations is less straightforward. The core part of

It could either be a separate item in the dict returned by the method that checks individual dimensions (e.g., with key

If you want, I can add the necessary code myself and submit a pull request (I would probably make a branch and do that for my own use anyway). But in that case it would be good to get your feedback on the solution I've sketched here.
I'd still be curious to better understand your intended use case. But yes, happy to review a PR and provide feedback - please create a fork and tinker away...
The current processing workflow, and in particular the `validation.validate` function, currently raises a `ValueError` and writes invalid item names to the log if any item in any dimension is not contained in the corresponding codelist. That is great for a pipeline that just validates input and rejects invalid input. But sometimes you would rather have a function that returns the invalid items in a form that you can work with further in Python.

Could you add such a function? E.g., `validate.get_invalid_items` or something like that? Basically the same as `validation.validate`, but which returns, e.g., a dict with the dimension names as keys and lists of the invalid items as values, or something similar.

`Codelist.validate_items` already does that for a single dimension, so a workaround is to call that manually for each dimension. But using a single function to get the invalid items for all or for a specified list of dimensions would be a lot more convenient.

Variable/unit combos would be a special case here. If any of those fail the check, they could maybe be returned in a dict item with key `'invalid_units'` and values that are tuples of (variable name, invalid unit), or something like that.
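To make the proposed return shape concrete, here is a rough sketch of how the variable/unit part could look. It is not the package's code: `get_invalid_units`, `get_invalid_items`, the plain-dict representation of allowed units, and the sample data are all assumptions made for illustration. Only the `'invalid_units'` key and the (variable name, invalid unit) tuples come from the request itself.

```python
def get_invalid_units(allowed_units, variable_units):
    """Return (variable, unit) tuples whose unit is not allowed for the variable.

    allowed_units: dict mapping a variable name to its allowed units (assumed shape)
    variable_units: iterable of (variable, unit) pairs found in the data
    """
    invalid = []
    for variable, unit in variable_units:
        # Unknown variables are skipped; the per-dimension check reports those.
        if variable in allowed_units and unit not in allowed_units[variable]:
            invalid.append((variable, unit))
    return invalid


def get_invalid_items(invalid_by_dimension, allowed_units, variable_units):
    """Merge per-dimension results with the variable/unit check into one dict."""
    report = dict(invalid_by_dimension)
    bad_units = get_invalid_units(allowed_units, variable_units)
    if bad_units:
        report["invalid_units"] = bad_units
    return report


# Made-up example data.
allowed_units = {
    "Primary Energy": ["EJ/yr"],
    "Emissions|CO2": ["Mt CO2/yr"],
}
pairs = [
    ("Primary Energy", "EJ/yr"),
    ("Emissions|CO2", "Gt CO2/yr"),
    ("Not A Variable", "x"),
]

report = get_invalid_items({"region": ["Asia"]}, allowed_units, pairs)
print(report)
# {'region': ['Asia'], 'invalid_units': [('Emissions|CO2', 'Gt CO2/yr')]}
```

Keeping the unit mismatches under their own key means the dimension entries stay simple lists of names, while callers who care about units can look up `'invalid_units'` separately.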