[BUG] Data validation when update_data components are not present in input_data #715

Open
nitbharambe opened this issue Sep 9, 2024 · 3 comments
Labels
bug (Something isn't working), good first issue (Indicates a good issue for first-time contributors)

Comments

@nitbharambe
Member

Describe the bug

Calling assert_valid_batch_data with update_data that contains a component not present in input_data raises a KeyError.

To Reproduce

from power_grid_model import initialize_array
from power_grid_model.validation import assert_valid_batch_data

input_data = {"node": initialize_array("input", "node", 1)}
update_data = {"sym_load": initialize_array("update", "sym_load", (1,1))}
assert_valid_batch_data(input_data, update_data)

Expected behavior

Instead of a KeyError, a ValidationError with a clear error message should be raised.
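Until the validator handles this case, a caller could check for it before validating. The sketch below is only an illustration: `components_missing_from_input` is a hypothetical helper, not part of power_grid_model. It compares dictionary keys only, so it works on any mapping of component name to array.

```python
from typing import Any, Mapping


def components_missing_from_input(
    input_data: Mapping[str, Any], update_data: Mapping[str, Any]
) -> list[str]:
    """Return the update components that have no counterpart in input_data.

    Hypothetical helper for illustration; not part of power_grid_model.
    """
    return sorted(component for component in update_data if component not in input_data)


# Same component layout as the reproduction above; the array contents do not matter here,
# so empty lists stand in for the real arrays.
missing = components_missing_from_input({"node": []}, {"sym_load": []})
if missing:
    print(f"update_data contains components that are absent from input_data: {missing}")
```

Running such a check before assert_valid_batch_data turns the KeyError into a readable message, although it does not replace a fix inside the validator itself.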

Screenshots

Error:

Cell In[3], line 3
      1 input_data = {"node": initialize_array("input", "node", 1)}
      2 update_data = {"sym_load": initialize_array("update", "sym_load", (1,1))}
----> 3 assert_valid_batch_data(input_data, update_data)

File z:\zzz\zzz\.venv\Lib\site-packages\power_grid_model\validation\assertions.py:90, in assert_valid_batch_data(input_data, update_data, calculation_type, symmetric)
     60 def assert_valid_batch_data(
     61     input_data: SingleDataset,
     62     update_data: BatchDataset,
     63     calculation_type: Optional[CalculationType] = None,
     64     symmetric: bool = True,
     65 ):
     66     """
     67     The input dataset is validated:
     68 
   (...)
     88         ValidationException: if the contents are invalid.
     89     """
---> 90     validation_errors = validate_batch_data(
     91         input_data=input_data, update_data=update_data, calculation_type=calculation_type, symmetric=symmetric
     92     )
     93     if validation_errors:
     94         raise ValidationException(validation_errors, "update_data")
...
--> 690     invalid = np.isin(data[component]["id"], ref_data[component]["id"], invert=True)
    691     if invalid.any():
    692         ids = data[component]["id"][invalid].flatten().tolist()
    
KeyError: 'sym_load'
@nitbharambe added the bug and good first issue labels on Sep 9, 2024
@petersalemink95
Member

I agree that raising a ValidationError here is necessary.

@TonyXiang8787
Member

@nitbharambe maybe we need to think about this. Is raising an error always the logical choice?

Users may, for whatever reason, treat a non-existent component as a zero-length array. Perhaps the logic should be: if a component exists in the batch dataset but not in the input, we raise an error only if the width of that batch component array is non-zero.
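A minimal sketch of this rule, assuming dense batch arrays of shape (n_scenarios, n_elements_per_scenario). The function name `missing_component_errors` and the stand-in dtype are hypothetical and do not reflect the actual validation code, which returns its own error objects rather than strings.

```python
import numpy as np


def missing_component_errors(input_data: dict, update_data: dict) -> list[str]:
    """Report update components absent from input_data, ignoring zero-width ones.

    Hypothetical sketch of the proposed rule; assumes dense arrays only.
    """
    errors = []
    for component, batch_array in update_data.items():
        if component in input_data:
            continue  # the existing id checks would run for this component
        # Last axis = number of elements per scenario (the "width" of the batch array).
        width = batch_array.shape[-1]
        if width != 0:
            errors.append(f"{component}: present in update_data but not in input_data")
    return errors


id_dtype = np.dtype([("id", np.int32)])  # stand-in dtype, not the real component dtype
input_data = {"node": np.zeros(1, dtype=id_dtype)}

# One scenario with one sym_load update: reported as an error.
print(missing_component_errors(input_data, {"sym_load": np.zeros((1, 1), dtype=id_dtype)}))
# Three scenarios with zero sym_load updates each: treated as empty, no error.
print(missing_component_errors(input_data, {"sym_load": np.zeros((3, 0), dtype=id_dtype)}))
```

The sketch leaves sparse batch data out of scope; a real fix would need an equivalent emptiness check for that representation.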

@nitbharambe
Member Author

> @nitbharambe maybe we need to think about this. Is raising an error always the logical choice?
>
> Users may, for whatever reason, treat a non-existent component as a zero-length array. Perhaps the logic should be: if a component exists in the batch dataset but not in the input, we raise an error only if the width of that batch component array is non-zero.

Yes, good point! It's better to cover that situation too.
