Pagination count bug related to model_bool #176

lu-pl · 2024-12-20T07:53:07Z

Problem outline

The total count slot of the Page object returned by SPARQLModelAdapter.query is computed by a generated SPARQL count query, which returns the count of all result rows.

Obviously and especially in the context of model_bool predicates, the number of total rows of the SPARQL result set does not necessarily coincide with the number of total models that are computable for a given query/model constellation.

This needs investigation. I think that this is practically mainly a problem for ungrouped models, although I think it is possible that the problem can cause faulty total counts also for grouped models.

Example

Here is a minimal example for problem reproduction:

query = """
select ?parent ?child ?name
where {
    values (?parent ?child ?name) {
        ('x' 'c' 'foo')
        ('y' 'd' UNDEF)
        ('y' 'e' UNDEF)
        ('z' UNDEF UNDEF)
    }
}
"""

class SomeModel(BaseModel):
    name: str | None = None

app = FastAPI()

@app.get("/")
def base_route(
    query_parameters: Annotated[QueryParameters, Query()],
) -> Page[SomeModel]:
    adapter = SPARQLModelAdapter(
        target="https://graphdb.r11.eu/repositories/RELEVEN",
        query=query,
        model=SomeModel,
    )
    return adapter.query(query_parameters)

This returns the following serialization:

{
      "items": [
            {
                  "name": "foo"
            }
      ],
      "page": 1,
      "size": 100,
      "total": 4,
      "pages": 1
}

The total count refers to the number of SPARQL result rows, which obviously does not align with the actual number of instantiated Pydantic objects.

I think it is safe to say, that the problem only occurs, if the top model is affected by the model_bool predicate; only then, the count query result does not necessarily coincide with the number computed models.

The text was updated successfully, but these errors were encountered:

lu-pl · 2024-12-20T07:58:40Z

A possible solution could be to generate and inject a FILTER clause based on the currently applied model_bool predicate.

Mapping the model_bool logic to corresponding FILTER clause constructors could also improve IO-bound performance, i.e. potentially reduce the size of result sets retrieved from a remote store.

lu-pl · 2024-12-20T08:05:57Z

A possible solution could be to generate and inject a FILTER clause based on the currently applied model_bool predicate.

Mapping the model_bool logic to corresponding FILTER clause constructors could also improve IO-bound performance, i.e. potentially reduce the size of result sets retrieved from a remote store.

Here is an example of a FILTER clause for the above query:

select ?parent ?child ?name
where {
    values (?parent ?child ?name) {
        ('x' 'c' 'foo')
        ('y' 'd' UNDEF)
        ('y' 'e' UNDEF)
        ('z' UNDEF UNDEF)
    }
    filter (bound(?name))
}

This correctly retrieves a single result row and a count of 1 from the derived count query.

It might not be entirely trivial to compute FILTER clauses based on model_bool predicates though, because this might require at least some amount of smart-ish model inspection.

E.g. in the above example, the clause FILTER (BOUND(?name)) is the result of knowing the default mode_bool predicate and the fact that name is the only (top-level + scalar type + optional) field of the model.

lu-pl · 2024-12-20T08:14:29Z

Note: There are currently 2 xfails in place for SPARQLModelAdapter tests:

rdfproxy/tests/tests_adapter/test_adapter_grouped_pagination.py

Line 224 in 46c4675

@pytest.mark.xfail

lu-pl self-assigned this Dec 20, 2024

lu-pl added the bug Something isn't working label Dec 20, 2024

lu-pl mentioned this issue Jan 8, 2025

Number of results does not match with size #181

Open

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Pagination count bug related to model_bool #176

Pagination count bug related to model_bool #176

lu-pl commented Dec 20, 2024 •

edited

Loading

lu-pl commented Dec 20, 2024

lu-pl commented Dec 20, 2024 •

edited

Loading

lu-pl commented Dec 20, 2024

Pagination count bug related to model_bool #176

Pagination count bug related to model_bool #176

Comments

lu-pl commented Dec 20, 2024 • edited Loading

Problem outline

Example

lu-pl commented Dec 20, 2024

lu-pl commented Dec 20, 2024 • edited Loading

lu-pl commented Dec 20, 2024

lu-pl commented Dec 20, 2024 •

edited

Loading

lu-pl commented Dec 20, 2024 •

edited

Loading