Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

support pydantic (de)serialization on MetadataFilter #476

Merged
merged 2 commits into from
Aug 6, 2024

Conversation

pmeier
Copy link
Member

@pmeier pmeier commented Aug 5, 2024

The is a prerequisite for using MetadataFilter as part of FastAPI. I've scrapped the actual JSON serialization, because that would create JSON inside JSON with pydantic and pydantic knows how to serialize dicts anyway.

Copy link
Contributor

@nenb nenb left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I was working through this, and I was a little bit dissatisfied that dict_schema was only able to do a very basic check on input schemas eg Model(mf={"IN": {}}) passes the input schema check, but then throws an obscure validation error.

It then occurred to me, do we really need the children key?

For example, instead of

{'OR': {'children': [{'RAW': {'filter': 'raw'}}, {'GE': {'key': 1}}]}}

could we have

{'OR': [{'RAW': {'filter': 'raw'}}, {'GE': {'key': 1}}]}

If so, perhaps this could simplify some of the pydantic logic here?

@nenb
Copy link
Contributor

nenb commented Aug 5, 2024

Also, I haven't looked into it, but a lot of tests seem to fail. Not sure what is going on...

@pmeier
Copy link
Member Author

pmeier commented Aug 5, 2024

It then occurred to me, do we really need the children key?

Nope

class MetadataFilter:
# These are just to be consistent. The actual values have no effect.
_RAW_KEY = "filter"
_CHILDREN_KEY = "children"

Per comment, the same for the raw filter. So you can break down your example even more to

{'OR': [{'RAW': 'raw'}, {'GE': {'key': 1}}]}

That is as concise as it gets. However we also need to consider the use case of translating it. As discussed offline, I dislike the bits you highlighted as well. I invite everyone to come up with a better scheme and I'm happy to adopt it.

Also, I haven't looked into it, but a lot of tests seem to fail. Not sure what is going on...

Just minor stuff that should be fixed by #461.

@pmeier
Copy link
Member Author

pmeier commented Aug 6, 2024

Thinking about this more, I realized that there is no requirement for the primitive version to have a 1-to-1 relationship to the Python object. Meaning, we can go for the simplifications proposed in #476 (review) and #476 (comment) and still keep one Python object that always has the operator, key, and value fields.

I implemented that in 70ac0b5. Here is the example from above:

import json

from ragna.core import MetadataFilter

metadata_filter = MetadataFilter.or_(
    [
        MetadataFilter.raw("raw"),
        MetadataFilter.ge("key", 1),
    ]
)

print(metadata_filter)
print(json.dumps(metadata_filter.to_primitive()))
OR(
  RAW('raw'),
  GE('key', 1),
)
{"OR": [{"RAW": "raw"}, {"GE": {"key": 1}}]}

@pmeier pmeier merged commit f2dac6d into corpus-dev Aug 6, 2024
14 of 21 checks passed
@pmeier pmeier deleted the metdata-filter-pydantic branch August 6, 2024 07:15
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
Status: Done
Development

Successfully merging this pull request may close these issues.

2 participants