Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Number of results does not match with size #181

Closed
b1rger opened this issue Jan 8, 2025 · 4 comments · Fixed by #194
Closed

Number of results does not match with size #181

b1rger opened this issue Jan 8, 2025 · 4 comments · Fixed by #194
Assignees

Comments

@b1rger
Copy link
Collaborator

b1rger commented Jan 8, 2025

If I have an empty model:

class Person(BaseModel):
    ...

then I get an empty result set:

{"items":[],"page":1,"size":100,"total":7489,"pages":75}

but I would expect to get 100 empty items

If I have a model with an attribute that is not part of the query:

class Person(BaseModel):
   foo: str = "foobar"

the I get a result set with only one result:

{"items":[{"foo":"foobar"}],"page":1,"size":100,"total":7489,"pages":75}

but I would expect to get 100 items with the attribute "foo: foobar", not only one

@lu-pl
Copy link
Contributor

lu-pl commented Jan 8, 2025

Please provide the full setup including the SPARQL query so that I can reproduce this case.

@lu-pl
Copy link
Contributor

lu-pl commented Jan 8, 2025

Possibly related to #176 .

@b1rger
Copy link
Collaborator Author

b1rger commented Jan 9, 2025

Please provide the full setup including the SPARQL query so that I can reproduce this case.

https://github.com/b1rger/omnipot-api

@lu-pl
Copy link
Contributor

lu-pl commented Jan 9, 2025

Ok, there are two interconnected issues here, both are already documented in #170 and #176 .

Explanation for the above behavior

The ModelBindingsMapper determines the existential condition for a model using the model_bool model config. The default for model_bool is, that a model is considered true if at least one of its fields is truthy. The rationale behind this is documented in #110.

Now, since the empty model has no fields at all, no fields are truthy, so the model is determined to not exist at all.

This behaves differently if one defines a model to be unconditionally true like so:

class Person(BaseModel):
    model_config = ConfigDict(model_bool=lambda x: True)

Note that the option for passing a callabe to model_bool will be discontinued shortly, see #176.

Then the route returns

{
    "items": [ {} ],
    "page": 1,
    "size": 100,
    "total": 7489,
    "pages": 75
}

This shows another thing: The mapper collects only unique models, so since {} is the only unique value that gets generated, it is the only value that appears under items. The result in the second model example (foo: str = "foobar") has the same issue.

This behavior is correct for aggregation, but can probably be considered a bug for top-level models.
I believe that the pending ModelBindingsMapper update/rewrite will remedy this, as it will utilize a DataFrame for SPARQL result sets and not just pure recursion.

Solution

The main thing that needs doing is to rewrite the ModelBindingsMapper using a DataFrame. The ModelBindingsMapper is an essential part of RDFProxy and the current implementation is basically the result of rapid prototyping and is actually now the oldest part of 0.2.0.

This will take some time though since dynamic and recursive DataFrame compution with grouping and aggregations based on Pydantic models is (at least for me) not an entirely trivial thing to do.

Also #176, which requires dynamic FILTER injections, is potentially interconnected with the required ModelBindingsMapper changes, so I need to carefully think about things and figure everything out.

lu-pl added a commit that referenced this issue Jan 20, 2025
This change introduces a refactor of the important ModelBindingsMapper
class using a pandas DataFrame for effecting grouping and aggregation
and a CurryModel utility for partially instantiating a given Pydantic
model with checks running for every partial application, allowing for
fast validation failure.

Closes #170.

Aggregation behavior is now only triggered for actually aggregated
fields, and not for top-level models as well. Therefore this also closes #181.
lu-pl added a commit that referenced this issue Jan 20, 2025
This change introduces a refactor of the important ModelBindingsMapper
class using a pandas DataFrame for effecting grouping and aggregation
and a CurryModel utility for partially instantiating a given Pydantic
model with checks running for every partial application, allowing for
fast validation failure.

Closes #170.

Aggregation behavior is now only triggered for actually aggregated
fields, and not for top-level models as well. Therefore this also closes #181.
lu-pl added a commit that referenced this issue Jan 20, 2025
This change introduces a refactor of the important ModelBindingsMapper
class using a pandas DataFrame for effecting grouping and aggregation
and a CurryModel utility for partially instantiating a given Pydantic
model with checks running for every partial application, allowing for
fast validation failure.

Closes #170.

Aggregation behavior is now only triggered for actually aggregated
fields, and not for top-level models as well. Therefore this also closes #181.
@lu-pl lu-pl self-assigned this Jan 21, 2025
lu-pl added a commit that referenced this issue Jan 21, 2025
This change introduces a refactor of the important ModelBindingsMapper
class using a pandas DataFrame for effecting grouping and aggregation
and a CurryModel utility for partially instantiating a given Pydantic
model with checks running for every partial application, allowing for
fast validation failure.

Closes #170.

Aggregation behavior is now only triggered for actually aggregated
fields, and not for top-level models as well. Therefore this also closes #181.
lu-pl added a commit that referenced this issue Jan 21, 2025
This change introduces a refactor of the important ModelBindingsMapper
class using a pandas DataFrame for effecting grouping and aggregation
and a CurryModel utility for partially instantiating a given Pydantic
model with checks running for every partial application, allowing for
fast validation failure.

Closes #170.

Aggregation behavior is now only triggered for actually aggregated
fields, and not for top-level models as well. Therefore this also closes #181.
lu-pl added a commit that referenced this issue Jan 21, 2025
This change introduces a refactor of the important ModelBindingsMapper
class using a pandas DataFrame for effecting grouping and aggregation
and a CurryModel utility for partially instantiating a given Pydantic
model with checks running for every partial application, allowing for
fast validation failure.

Closes #170.

Aggregation behavior is now only triggered for actually aggregated
fields, and not for top-level models as well. Therefore this also closes #181.
lu-pl added a commit that referenced this issue Jan 21, 2025
This change introduces a refactor of the important ModelBindingsMapper
class using a pandas DataFrame for effecting grouping and aggregation
and a CurryModel utility for partially instantiating a given Pydantic
model with checks running for every partial application, allowing for
fast validation failure.

Closes #170.

Aggregation behavior is now only triggered for actually aggregated
fields, and not for top-level models as well. Therefore this also closes #181.
lu-pl added a commit that referenced this issue Jan 21, 2025
This change introduces a refactor of the important ModelBindingsMapper
class using a pandas DataFrame for effecting grouping and aggregation
and a CurryModel utility for partially instantiating a given Pydantic
model with checks running for every partial application, allowing for
fast validation failure.

Closes #170.

Aggregation behavior is now only triggered for actually aggregated
fields, and not for top-level models as well. Therefore this also closes #181.
lu-pl added a commit that referenced this issue Jan 22, 2025
This change introduces a refactor of the important ModelBindingsMapper
class using a pandas DataFrame for effecting grouping and aggregation
and a CurryModel utility for partially instantiating a given Pydantic
model with checks running for every partial application, allowing for
fast validation failure.

Closes #170.

Aggregation behavior is now only triggered for actually aggregated
fields, and not for top-level models as well. Therefore this also closes #181.
lu-pl added a commit that referenced this issue Jan 22, 2025
This change introduces a refactor of the important ModelBindingsMapper
class using a pandas DataFrame for effecting grouping and aggregation
and a CurryModel utility for partially instantiating a given Pydantic
model with checks running for every partial application, allowing for
fast validation failure.

Closes #170.

Aggregation behavior is now only triggered for actually aggregated
fields, and not for top-level models as well. Therefore this also closes #181.
@lu-pl lu-pl closed this as completed in a24c5a1 Jan 22, 2025
lu-pl added a commit that referenced this issue Jan 22, 2025
This change introduces a refactor of the important ModelBindingsMapper
class using a pandas DataFrame for effecting grouping and aggregation
and a CurryModel utility for partially instantiating a given Pydantic
model with checks running for every partial application, allowing for
fast validation failure.

Closes #170.

Aggregation behavior is now only triggered for actually aggregated
fields, and not for top-level models as well. Therefore this also closes #181.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging a pull request may close this issue.

2 participants