An addendum to the above. In these examples when I use In the cases where the range of |
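Currently in the LinkML instance metamodel there is no distinction
between four forms of a JSON object: one where `foo` is explicitly
null (form 1), one where `foo` is absent (form 2), and (in the case of
`foo` being multivalued) two where `foo` is a zero-length collection
(forms 3 and 4).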
This may seem odd coming from the perspective of a programming
language such as Python or frameworks like Pydantic, which distinguish
the above. The same goes for object serializations modeled after
programmatic constructs, such as JSON and YAML, where all of the above
are different.
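To make the contrast concrete, here is a minimal Python sketch of how a JSON parser sees these forms. The slot name `foo` comes from the discussion; the exact shapes used for forms 3 and 4 (an empty list and an empty object) are an assumption, and `collapse` is a hypothetical helper that only illustrates the LinkML view. It is not part of any LinkML API.

```python
import json

# Form 1: explicit null. Form 2: key absent.
# Forms 3 and 4 (assumed shapes): empty list and empty object.
FORMS = ['{"foo": null}', '{}', '{"foo": []}', '{"foo": {}}']

parsed = [json.loads(text) for text in FORMS]

# JSON and Python keep all of these apart.
distinct_in_python = len({json.dumps(p, sort_keys=True) for p in parsed})


def collapse(obj: dict) -> dict:
    """Hypothetical helper: drop slots that are None or an empty collection."""
    return {k: v for k, v in obj.items() if v is not None and v != [] and v != {}}


# Under the "null means missing" reading, every form reduces to the same object.
distinct_after_collapse = len({json.dumps(collapse(p), sort_keys=True) for p in parsed})
```

Here `distinct_in_python` counts the forms as the parser sees them, while `distinct_after_collapse` counts them once nulls and empty collections are treated as missing.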
The reason for this is that LinkML is primarily concerned with the
semantics of data, and there is no semantic distinction between a null
value and a missing value.
However, LinkML is also intended to be a pragmatic and user-friendly
framework, prizing practicality over ideology. We are open to
extending the framework to allow for a distinction here (with the
current behavior remaining the default).
If we were to do this, though, it would introduce complications in
how LinkML is used in combination with other frameworks, such as
relational frameworks and RDF.
In relational systems, the first two cannot be distinguished without
introducing some kind of special value to mark "unset values" as
distinct from NULLs. (NULL itself has long been contentious in
relational theory: Date, for one, has consistently argued against it.)
In the case where `foo` is multivalued, a linking table would be
introduced; for example, if the parent class is C, then the linking
table might be `C_foo`. Either this table has rows or it does not, so
there would be no way of distinguishing nulls (form 1) from unset
(form 2) from a zero-length collection (forms 3 or 4) without
introducing some other kind of marker.
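The linking-table point can be sketched with Python's built-in `sqlite3` module. The table names `C` and `C_foo` follow the example above; the column names, the `store` helper and the instance ids are assumptions made for illustration, not the output of any LinkML generator.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE C (id TEXT PRIMARY KEY);
    CREATE TABLE C_foo (
        C_id TEXT NOT NULL REFERENCES C(id),
        foo  TEXT NOT NULL
    );
""")


def store(obj_id: str, obj: dict) -> None:
    """Insert one instance of C; each element of foo becomes a row in C_foo."""
    conn.execute("INSERT INTO C (id) VALUES (?)", (obj_id,))
    # None, a missing key and an empty collection all yield zero rows.
    for value in obj.get("foo") or []:
        conn.execute("INSERT INTO C_foo (C_id, foo) VALUES (?, ?)", (obj_id, value))


store("null_form", {"foo": None})        # form 1
store("unset_form", {})                  # form 2
store("empty_form", {"foo": []})         # form 3 (assumed shape)
store("two_values", {"foo": ["a", "b"]})

rows = conn.execute("""
    SELECT C.id, COUNT(C_foo.foo)
    FROM C LEFT JOIN C_foo ON C_foo.C_id = C.id
    GROUP BY C.id
""").fetchall()
counts = dict(rows)  # instance id -> number of linking rows
```

Once the data is loaded, the first three instances look identical from the database side: each has a row in `C` and no rows in `C_foo`.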
In RDF systems there is not even a concept of NULL. However, none is
needed: RDF can be seen as structurally similar to a normalized
RDBMS, where each triple is a distinct assertion, and the absence of a
triple means the same as NULL. So in RDF there is no way to
distinguish the 4 forms above without introducing some kind of ad-hoc
non-standard marker.
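The same effect shows up in a toy triple model. The sketch below uses plain Python tuples in place of a real RDF library, and the subject name `ex:c1` and the `to_triples` helper are hypothetical; the shapes used for forms 3 and 4 are again an assumption.

```python
def to_triples(subject: str, obj: dict) -> set:
    """Emit one (subject, predicate, value) tuple per asserted slot value."""
    triples = set()
    for slot, value in obj.items():
        if value is None:
            continue  # nothing to assert: there is no NULL node to point at
        if isinstance(value, dict):
            values = list(value.values())
        elif isinstance(value, list):
            values = value
        else:
            values = [value]
        for v in values:
            triples.add((subject, slot, v))
    return triples


forms = [{"foo": None}, {}, {"foo": []}, {"foo": {}}]
graphs = [to_triples("ex:c1", form) for form in forms]
populated = to_triples("ex:c1", {"foo": ["a", "b"]})
```

Every entry in `graphs` is the empty set, so nothing in the resulting graph records which form the data started from.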
Note that this disconnect has always existed for JSON-LD. JSON-LD
provides a way to model data in a natural JSON form, but with a
mapping to RDF.
The JSON-LD 1.1 specification has this to say in section 1.4
regarding `null` in JSON:
"A map entry in the body of a JSON-LD document whose value is null has
the same meaning as if the map entry was not defined"
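As a rough illustration of that sentence only (not of the JSON-LD expansion algorithm), the following sketch removes null-valued map entries from the body of a document. The function name `drop_null_entries` is hypothetical. The `@context` entry is left untouched, since the quoted sentence is about the body of the document.

```python
def drop_null_entries(node):
    """Recursively remove map entries whose value is None, outside @context."""
    if isinstance(node, dict):
        return {
            key: value if key == "@context" else drop_null_entries(value)
            for key, value in node.items()
            if key == "@context" or value is not None
        }
    if isinstance(node, list):
        return [drop_null_entries(item) for item in node]
    return node


with_null = {"@id": "ex:c1", "foo": None}
without_foo = {"@id": "ex:c1"}
same_meaning = drop_null_entries(with_null) == drop_null_entries(without_foo)
```

Under this reading, a document carrying `"foo": null` and one that omits `foo` reduce to the same map.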
This means that if we were to introduce a distinction into LinkML,
then this could only be used in combination with a subset of
frameworks. If the user tried to generate SQL DDL or JSON-LD Contexts,
then the framework should throw an error by default. Similarly, if
converting from RDF or similar, the default behavior should also be
error-throwing.
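A purely hypothetical sketch of what "error by default" could look like follows. None of these names (`NullDistinctionError`, `LOSSY_TARGETS`, `check_target`, `allow_lossy`) exist in LinkML; they are invented here to illustrate the proposed behavior, including an explicit opt-out for users who accept the loss.

```python
class NullDistinctionError(ValueError):
    """Raised when a target cannot represent distinct null forms."""


# Hypothetical labels for the targets named in the discussion.
LOSSY_TARGETS = {"sql_ddl", "jsonld_context", "rdf"}


def check_target(target: str, distinguishes_nulls: bool, allow_lossy: bool = False) -> None:
    """Fail by default when a schema that distinguishes null forms meets a lossy target."""
    if distinguishes_nulls and target in LOSSY_TARGETS and not allow_lossy:
        raise NullDistinctionError(
            f"{target} cannot distinguish null from missing values"
        )


check_target("sql_ddl", distinguishes_nulls=False)                   # default schema: fine
check_target("sql_ddl", distinguishes_nulls=True, allow_lossy=True)  # explicit opt-out
```

Calling `check_target("sql_ddl", distinguishes_nulls=True)` without the opt-out raises `NullDistinctionError`.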
While it might be possible to extend support for RDF and relational
frameworks via the use of special marker slots or values, this
would introduce a lot of complication and potential impedance
mismatches. For example, in SQLAlchemy models the special marker
values would leak through, making the SQLAlchemy models no longer
isomorphic to the Pydantic models.
Distinguishing NULL forms may still be added to a later version of the
metamodel, but the tradeoffs should be understood.