Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Pagination count bug related to model_bool #176

Open
lu-pl opened this issue Dec 20, 2024 · 3 comments
Open

Pagination count bug related to model_bool #176

lu-pl opened this issue Dec 20, 2024 · 3 comments
Assignees
Labels
bug Something isn't working

Comments

@lu-pl
Copy link
Contributor

lu-pl commented Dec 20, 2024

Problem outline

The total count slot of the Page object returned by SPARQLModelAdapter.query is computed by a generated SPARQL count query, which returns the count of all result rows.

Obviously and especially in the context of model_bool predicates, the number of total rows of the SPARQL result set does not necessarily coincide with the number of total models that are computable for a given query/model constellation.

This needs investigation. I think that this is practically mainly a problem for ungrouped models, although I think it is possible that the problem can cause faulty total counts also for grouped models.

Example

Here is a minimal example for problem reproduction:

query = """
select ?parent ?child ?name
where {
    values (?parent ?child ?name) {
        ('x' 'c' 'foo')
        ('y' 'd' UNDEF)
        ('y' 'e' UNDEF)
        ('z' UNDEF UNDEF)
    }
}
"""

class SomeModel(BaseModel):
    name: str | None = None

app = FastAPI()

@app.get("/")
def base_route(
    query_parameters: Annotated[QueryParameters, Query()],
) -> Page[SomeModel]:
    adapter = SPARQLModelAdapter(
        target="https://graphdb.r11.eu/repositories/RELEVEN",
        query=query,
        model=SomeModel,
    )
    return adapter.query(query_parameters)

This returns the following serialization:

{
      "items": [
            {
                  "name": "foo"
            }
      ],
      "page": 1,
      "size": 100,
      "total": 4,
      "pages": 1
}

The total count refers to the number of SPARQL result rows, which obviously does not align with the actual number of instantiated Pydantic objects.

I think it is safe to say, that the problem only occurs, if the top model is affected by the model_bool predicate; only then, the count query result does not necessarily coincide with the number computed models.

@lu-pl lu-pl self-assigned this Dec 20, 2024
@lu-pl lu-pl added the bug Something isn't working label Dec 20, 2024
@lu-pl
Copy link
Contributor Author

lu-pl commented Dec 20, 2024

A possible solution could be to generate and inject a FILTER clause based on the currently applied model_bool predicate.

Mapping the model_bool logic to corresponding FILTER clause constructors could also improve IO-bound performance, i.e. potentially reduce the size of result sets retrieved from a remote store.

@lu-pl
Copy link
Contributor Author

lu-pl commented Dec 20, 2024

A possible solution could be to generate and inject a FILTER clause based on the currently applied model_bool predicate.

Mapping the model_bool logic to corresponding FILTER clause constructors could also improve IO-bound performance, i.e. potentially reduce the size of result sets retrieved from a remote store.

Here is an example of a FILTER clause for the above query:

select ?parent ?child ?name
where {
    values (?parent ?child ?name) {
        ('x' 'c' 'foo')
        ('y' 'd' UNDEF)
        ('y' 'e' UNDEF)
        ('z' UNDEF UNDEF)
    }
    filter (bound(?name))
}

This correctly retrieves a single result row and a count of 1 from the derived count query.

It might not be entirely trivial to compute FILTER clauses based on model_bool predicates though, because this might require at least some amount of smart-ish model inspection.

E.g. in the above example, the clause FILTER (BOUND(?name)) is the result of knowing the default mode_bool predicate and the fact that name is the only (top-level + scalar type + optional) field of the model.

@lu-pl
Copy link
Contributor Author

lu-pl commented Dec 20, 2024

Note: There are currently 2 xfails in place for SPARQLModelAdapter tests:

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something isn't working
Projects
None yet
Development

No branches or pull requests

1 participant