lupl/mapper refactor #194

lu-pl · 2025-01-20T09:01:57Z

Closes #170, closes #181.

This change refactors the central ModelBindingsMapper class. Grouping and aggregation now use a pandas DataFrame, and a new CurryModel utility partially instantiates a given Pydantic model, running checks on every partial application so that validation fails fast. Closes #170. Aggregation behavior is now triggered only for fields that are actually aggregated, and no longer for top-level models as well. This therefore also closes #181.
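To illustrate the idea of partial model instantiation with a check on every partial application, here is a minimal sketch. It is not the CurryModel implementation from this PR: the class name `CurriedModel`, its call interface, and the use of `pydantic.TypeAdapter` for per-field checks are assumptions made for illustration. It also only checks the bare field annotation, so constraints declared via `Field(...)` are only enforced at final instantiation.

```python
from __future__ import annotations

from typing import Any

from pydantic import BaseModel, TypeAdapter, ValidationError


class CurriedModel:
    """Hypothetical stand-in for a curry-style model builder (not the PR's CurryModel)."""

    def __init__(self, model: type[BaseModel]) -> None:
        self.model = model
        self._kwargs: dict[str, Any] = {}

    def __call__(self, **kwargs: Any) -> CurriedModel | BaseModel:
        for name, value in kwargs.items():
            if name not in self.model.model_fields:
                raise KeyError(f"unknown field: {name}")
            annotation = self.model.model_fields[name].annotation
            # Validate each value as soon as it is supplied, so a bad value
            # raises ValidationError before the remaining fields are known.
            self._kwargs[name] = TypeAdapter(annotation).validate_python(value)

        required = {
            name
            for name, field in self.model.model_fields.items()
            if field.is_required()
        }
        # Instantiate the model once every required field has been supplied.
        if required <= self._kwargs.keys():
            return self.model(**self._kwargs)
        return self


class Point(BaseModel):
    x: int
    y: int


partial = CurriedModel(Point)(x=1)  # still partial: y is missing
point = partial(y=2)                # all required fields present

try:
    CurriedModel(Point)(x="not a number")
    failed_fast = False
except ValidationError:
    failed_fast = True  # rejected on the first partial application
```

The point of the design is that an invalid binding is rejected when it is applied, not after a whole row has been collected.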

The old ModelBindingsMapper implemented some model sanity checking, but all such checking should be done in a dedicated model sanity checking pipeline, see issue #108. So these tests - for now - are xfails.

The expected data was actually wrong and only passed because of buggy behavior in the old ModelBindingsMapper. The change fixes the test to expect the correct result. Note that this test case is somewhat contrived, because the binding data is unordered, whereas actual data from an RDFProxy-modified query for grouped models would be ordered. The refactored mapper uses a pd.DataFrame for grouping, so (unlike a possible low-level solution with itertools.groupby) this case can be handled, because dataframes implement efficient grouping across unordered rows as well. This is quite a powerful feature and makes the ModelBindingsMapper class generally useful, not just in the context of RDFProxy.
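The difference between the two grouping approaches on unordered rows can be sketched as follows. The row data and the column names `author` and `work` are made up for illustration and are not taken from the test suite.

```python
from itertools import groupby

import pandas as pd

# Unordered binding rows: the two rows for author "a" are not adjacent.
rows = [
    {"author": "a", "work": "w1"},
    {"author": "b", "work": "w2"},
    {"author": "a", "work": "w3"},
]

# itertools.groupby only merges *consecutive* rows with equal keys,
# so author "a" ends up in two separate groups.
fragmented = [
    (key, [row["work"] for row in group])
    for key, group in groupby(rows, key=lambda row: row["author"])
]

# DataFrame.groupby collects all rows per key regardless of row order.
df = pd.DataFrame(rows)
grouped = {
    key: group["work"].tolist()
    for key, group in df.groupby("author", sort=False)
}
```

With itertools.groupby the input would have to be sorted by the grouping key first; the DataFrame-based grouping has no such precondition.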

As indicated in the docstring for ModelBindingsMapper, the class is somewhat coupled to SPARQLModelAdapter, because ModelBindingsMapper does not run model sanity checks itself - sanity checking should happen in SPARQLModelAdapter, i.e. as early as possible. See issue #108. Once sanity checking is implemented, RDFProxy will make a public ModelBindingsMapper class available which will run model sanity checking itself.

Concerns #181.

This test relates to a bug, now fixed, in _ModelBindingsMapper._instantiate_ungrouped_model_from_row, where default values were consulted on falsy field values, i.e. also on empty strings and None field values.
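The class of bug described here can be sketched in isolation. The helper names below are hypothetical and this is not the mapper code. The sketch assumes the intended behavior is that a default applies only when a field is absent from the row, not when it is present with a falsy value.

```python
_MISSING = object()  # sentinel to tell "absent" apart from None or ""


def value_or_default_buggy(row: dict, field: str, default: object) -> object:
    # `or` falls back to the default on every falsy value,
    # including legitimate empty strings and None.
    return row.get(field) or default


def value_or_default_fixed(row: dict, field: str, default: object) -> object:
    # Only fall back to the default when the field is absent from the row.
    value = row.get(field, _MISSING)
    return default if value is _MISSING else value


row = {"name": "", "title": None}
```

Here `value_or_default_buggy(row, "name", "n/a")` replaces the empty string with the default, while the fixed variant keeps both `""` and `None` and uses the default only for a key that is missing from `row`.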

lu-pl force-pushed the lupl/mapper-refactor branch 2 times, most recently from 65ce2de to aaaba46 Compare January 20, 2025 12:30

chore(deps): install pandas

e5eca85

lu-pl force-pushed the lupl/mapper-refactor branch from aaaba46 to 576b0cf Compare January 20, 2025 16:00

lu-pl marked this pull request as ready for review January 21, 2025 15:01

lu-pl requested a review from kevinstadler January 21, 2025 15:01

lu-pl force-pushed the lupl/mapper-refactor branch 6 times, most recently from 57b246e to bc91244 Compare January 22, 2025 08:08

lu-pl added 8 commits January 22, 2025 10:26

test: mark model sanity tests xfail

f1403d9

The old ModelBindingsMapper implemented some model sanity checking, but all such checking should be done in a dedicated model sanity checking pipeline, see issue #108. So these tests - for now - are xfails.

test: introduce tests for grouped models with non-aggregated nesting

6713b91

refactor: remove obsolete functions from mapper_utils

933da4a

test: implement empty model and default-only model mapper tests

bcbb4c0

Concerns #181.

test: add tests for empty/falsy string and None fields

3b338b0

This test relates to a bug, now fixed, in _ModelBindingsMapper._instantiate_ungrouped_model_from_row, where default values were consulted on falsy field values, i.e. also on empty strings and None field values.

lu-pl force-pushed the lupl/mapper-refactor branch from bc91244 to 3b338b0 Compare January 22, 2025 09:28

kevinstadler approved these changes Jan 22, 2025

lu-pl merged commit 1accdca into main Jan 22, 2025
9 checks passed
