Replies: 17 comments 26 replies
-
Related issues: |
-
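I believe this discussion is quickly going to point towards package managers and registries + versioning. Managing dependencies between LinkML models like code libraries would be great. The idea has already been discussed elsewhere, but this discussion may be a good place to reopen it. If so, it might be worth restating the title of the discussion in those terms. A few pointers:
- LinkML registry: https://linkml.io/linkml-registry/registry/
- plow: https://registry.field33.com/
- Chris' 2014 blog entry: https://douroucouli.wordpress.com/2014/03/30/the-perils-of-managing-owl-in-a-version-control-system/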
I'm curious about using established tools such as pip and PyPI for that purpose. Is it sensible? What would be missing? |
-
I like the idea of piggybacking off an existing system but I'm not sure Python
package management is the best model. FHIR makes use of npm tooling, which
might be worth looking into. I have been meaning to look more into plow.
…On Sat, Nov 18, 2023 at 6:07 AM Mathieu Tulpinck ***@***.***> wrote:
I believe this discussion is quickly going to point towards package
managers. Managing dependencies between LinkML models like code libraries
would be great. Idea has already been discussed elsewhere, but this
discussion may be good place to reopen it. If so, it might be worth
restating the title of the discussion in those terms.
A few pointers:
- LinkML registry: https://linkml.io/linkml-registry/registry/
- plow: https://registry.field33.com/
- Chris' 2014 blog entry:
https://douroucouli.wordpress.com/2014/03/30/the-perils-of-managing-owl-in-a-version-control-system/
I'm curious about using established tools such as pip and PyPI for that
purpose. Is it sensible? What would be missing?
|
-
Fwiw, I don't address namespacing within a schema, but I did write a schema provider system that can handle multiple versions of a schema, including providing them from a git repo: https://github.com/p2p-ld/nwb-linkml/blob/main/nwb_linkml/src/nwb_linkml/providers/git.py

That was specific to a particular format (NWB), but I did find that when linking together multiple versions of a schema (that in turn link to multiple versions of another schema) I needed something like an NPM-like structure, except instead of recursive directories with symlinks I handled that in the provider class, mostly because I didn't want to rewrite npm lol.

I think one reason this is confusing is that similar semantics are used for LinkML imports as for LD prefixes - with LinkML imports we can assume a like-kinded schema that schemaview knows about, but doing something similar, like being able to e.g. subclass from a FOAF model, would be awesome but impossible most of the time. LD made sort of a mess of actually using ontologies, unfortunately.

As far as import syntax goes I think it would be nice to be able to do something like this:

    imports:
      - mySchema:
          from:
            git:
              repo: https://git.example.com
              path: dir/subdir/mySchema.yaml
              ref: v0.1.0  # (or any tag or commit hash)
          include:
            - myClass  # regular kind
            - class: myOtherClass
              as: myOtherClassWSuffix

(pretend I formatted that right, I'm on mobile)

I think that would get sorta bonkers to implement with the current flat schemaview, but if we made it recursive so each schema was resolved in a contained way then it would be v possible. See: #1839 |
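The `include` list in the proposal above mixes two shapes: a bare name and a mapping with `class`/`as`. Below is a minimal sketch of normalizing both into (source name, local name) pairs, assuming the import entry has already been parsed from YAML into Python lists and dicts. The helper name `normalize_include` is hypothetical and is not part of LinkML.

```python
from typing import Union

IncludeItem = Union[str, dict]


def normalize_include(items: list[IncludeItem]) -> list[tuple[str, str]]:
    """Turn include entries into (source_name, local_name) pairs.

    Hypothetical helper for the proposed syntax; not a LinkML API.
    """
    pairs = []
    for item in items:
        if isinstance(item, str):
            # Bare name: imported under its own name.
            pairs.append((item, item))
        elif isinstance(item, dict) and "class" in item:
            # Mapping form {"class": <name>, "as": <alias>}; the alias is optional.
            pairs.append((item["class"], item.get("as", item["class"])))
        else:
            raise ValueError(f"unsupported include entry: {item!r}")
    return pairs


# The include list from the example, as it would look after YAML parsing.
include = ["myClass", {"class": "myOtherClass", "as": "myOtherClassWSuffix"}]
pairs = normalize_include(include)
print(pairs)
```

Normalizing early means the rest of a resolver only ever deals with one shape, whichever spelling the schema author used.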
-
Hello, what would be the steps necessary to introduce proper namespacing?
I guess this is a gross oversimplification. I don't know what a schema aware API would even look like. But trying to get the discussion started. greetings, |
-
Here's a proposal for a syntax, what u think @cmungall:

Requirements
Nice to have
Implementation
Choices
Syntax

Imports

Extending the example above ( https://github.com/orgs/linkml/discussions/1739#discussioncomment-8226761 ):

Regular - import all objects into current schema:

    imports:
      - mySchema

Namespaced - import all objects as

    imports:
      - mySchema:
          as: alias

Include - only import the listed objects (implicit exclude all)

    imports:
      - mySchema:
          include:
            - myClass
            - my_slot

Exclude - import all objects except excluded

    imports:
      - mySchema:
          exclude:
            - myClass
            - ...

Include & Exclude: only import listed objects except excluded objects (apply include and then exclude) (not sure why you would want to do this, but for the sake of completeness...)

    imports:
      - mySchema:
          include:
            - ClassA
            - ClassB
            - slotA
            - slotB
          exclude:
            - slotB
            - ClassA
    # imports ClassB and slotA

Aliasing and Include/Exclude:
    imports:
      - mySchema:
          as: schemaAlias
          include:
            - ClassA:
                as: ClassAlias
            - slotB
          # alternatively...
          include:
            - name: ClassA
              as: ClassAlias
            - slotB
    # imports schemaAlias.ClassAlias and schemaAlias.slotB

Providers

Currently import

Support ability to specify where a schema comes from using a
First pass implementation:
Import from local path: either use current syntax, or

    imports:
      - mySchema:
          from:
            path: ../../mySchema.yaml

Import from URL

    imports:
      - mySchema:
          from:
            url: https://example.com/v1.0.1/mySchema.yaml

Import from URL with hash validation

    imports:
      - mySchema:
          from:
            url: https://example.com/v1.0.1/mySchema.yaml
            hash:
              sha256: (long sha string)

Pass other parameters to the provider plugin (eg. setting request headers)

    imports:
      - mySchema:
          from:
            url: https://example.com/v1.0.1/mySchema.yaml
            headers:
              "Accept": "text/yaml"

Import from HEAD in default branch of git repository (dictionary form, plugin does not allow for positional argument)

    imports:
      - mySchema:
          from:
            git:
              repo: https://git.example.com
              path: dir/subdir/mySchema.yaml

Import from specific ref of git repository (including branches, tags, hashes. we can make aliases for

    imports:
      - mySchema:
          from:
            git:
              repo: https://git.example.com
              path: dir/subdir/mySchema.yaml
              ref: v1.0.1

Multiple import locations - try in order:

    imports:
      - mySchema:
          from:
            - url: https://example.com/v1.0.1/mySchema.yaml
            - git:
                repo: https://git.example.com
                path: dir/subdir/mySchema.yaml
                ref: v1.0.1

Caveats

Let's not do a whole version and dependency resolution system rn. This can be made totally orthogonal, here are some examples showing how we might do that in the future:

    - mySchema:
        version: v1.0.1
        from:
          - url: https://example.com/{version}/mySchema.yaml

    - mySchema:
        version: v1.0.1
        from:
          git:
            repo: https://git.example.com
            path: dir/subdir/mySchema.yaml
            ref: "{version}"

We might also want to be able to override how another schema is sourcing a given schema. Say we have a vendored copy and don't want a schema we import to source it from HTTP. We could also do something like:

    provides:
      - localSchema:
          from: ./schema/localSchema.yaml

|
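Two rules in the proposal above are easy to state but worth pinning down: "apply include and then exclude", and "multiple import locations, try in order" with optional hash validation. Here is a rough sketch of both, assuming each provider is just a zero-argument callable returning bytes. The names `select_names` and `fetch_first` are hypothetical and do not exist in LinkML; the stand-in providers are made up for the demo.

```python
import hashlib
from typing import Callable, Optional


def select_names(available, include=None, exclude=None, alias=None):
    """Apply include first, then exclude, then prefix with the schema alias."""
    names = [n for n in available if include is None or n in include]
    names = [n for n in names if not exclude or n not in exclude]
    return [f"{alias}.{n}" if alias else n for n in names]


def fetch_first(providers: list[Callable[[], bytes]],
                sha256: Optional[str] = None) -> bytes:
    """Return the payload of the first provider that loads and, when a
    digest is given, matches it. Failing sources fall through to the next."""
    errors = []
    for provider in providers:
        try:
            data = provider()
        except Exception as exc:
            errors.append(exc)
            continue
        if sha256 is not None and hashlib.sha256(data).hexdigest() != sha256:
            errors.append(ValueError("sha256 mismatch"))
            continue
        return data
    raise LookupError(f"no provider succeeded: {errors}")


# The "Include & Exclude" example from the proposal.
available = ["ClassA", "ClassB", "slotA", "slotB", "slotC"]
selected = select_names(
    available,
    include=["ClassA", "ClassB", "slotA", "slotB"],
    exclude=["slotB", "ClassA"],
)
print(selected)  # ['ClassB', 'slotA']


# Stand-in providers (assumptions): an unreachable URL and a vendored copy.
def broken_url() -> bytes:
    raise ConnectionError("unreachable")


def vendored_copy() -> bytes:
    return b"id: https://example.org/mySchema\n"


expected = hashlib.sha256(b"id: https://example.org/mySchema\n").hexdigest()
payload = fetch_first([broken_url, vendored_copy], sha256=expected)
print(payload)
```

Treating a hash mismatch like any other provider failure keeps the fallback order meaningful; a stricter design could abort on mismatch instead.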
-
I am a bit new here, but I think I understand the metadata model enough to suggest that a SchemaView might want to support a namespace declaration that in turn specifies the resolution mechanism for the namespace's imported schema (i.e. use the namespace as an indirection). Using this approach, it would be the responsibility of the importing schema to resolve any namespace collisions that occur. Actually, as I read the above description, I think it's very similar. However, I do think it is worth considering NOT making the schemaview recursive (e.g. schema1.schema2.schema3.MyClass), at least not directly recursive. I think allowing the namespace to exist as a first-class construct, and even allowing more than one "imported schema" to map to the same namespace name, would be very powerful and align more closely with code generation (e.g. Java or Python package names, etc.). I will admit I am very "ignorant" of the implementation costs, but it seems like having a "relationship slot" like "binds_to_namespace" and Namespace as a formal metamodel element would smooth over a lot of modeling complexity (but maybe not implementation complexity). |
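To make the "namespace as a first-class construct" idea a bit more concrete, here is a small sketch in which several imported schemas bind to one namespace and the importing side asks which names collide. Everything here (the `Namespace` class, `bind`, `collisions`, the schema ids) is hypothetical and only illustrates the suggestion; it is not an existing LinkML metamodel element.

```python
from collections import defaultdict


class Namespace:
    """Hypothetical first-class namespace that several schemas can bind to."""

    def __init__(self, name: str):
        self.name = name
        # element name -> ids of the schemas that define it
        self._owners = defaultdict(list)

    def bind(self, schema_id: str, element_names: list[str]) -> None:
        """Record that `schema_id` contributes these elements to the namespace."""
        for element in element_names:
            self._owners[element].append(schema_id)

    def collisions(self) -> dict[str, list[str]]:
        """Names defined by more than one bound schema."""
        return {e: s for e, s in self._owners.items() if len(s) > 1}


people = Namespace("people")
people.bind("schema1", ["Person", "Address"])
people.bind("schema2", ["Person", "Organization"])
clashes = people.collisions()
print(clashes)
```

The importing schema would then decide what to do with each reported clash, which matches the suggestion that collision handling is the importer's responsibility.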
-
This topic is very important to our company for adopting LinkML, so I'd like to weigh in. Before getting more acquainted with the technical side of things, I'd also like to discuss a potential different direction than those proposed here.

LinkML already supports LD URI mappings and CURIEs, and promotes itself as an LD modeling language, so why can't we leverage that for native/local namespacing? This way we rely on only one namespacing mechanism in the entire schema (and a battle-tested, well-documented, famous one), as opposed to introducing an entire custom construction with its own syntax (e.g.

For example, this might look like:

    prefixes:
      ex: http://example.com#
      schema: https://schema.org/
    imports:
      - ./other_schema.yaml  # Contains `other:occupation`
    default_prefix: ex
    classes:
      ex:Person:
        class_uri: schema:Person
        name: Bart  # ex:name
        other:occupation: Data Architect

I think this greatly simplifies everything, but this may be wishful thinking without carefully considering the technical implications/difficulties. There may be YAML parsers that don't like the colon, for example. Also, it's very well possible that my familiarity with Semantic Web tech makes me blind towards how this might make LinkML less accessible again. For example, it may trip people up that the class has the URI

Anyways, I think this is a worthwhile way forward to look at. Please let me know if this has any merit to it or simply won't work in your opinion. I'll read up on the other proposals in more depth soon as well. Thanks!

PS I'm aware this doesn't solve everything about the inflexibility of the import mechanism, but it solves the identification and namespacing issue neatly. |
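The CURIE-based approach boils down to one rule: a prefixed name expands through the prefix map, and an unprefixed name falls back to `default_prefix`. A minimal sketch of that rule follows, using the prefixes from the example. The function `expand_curie` is a hypothetical helper, and the URI given for the `other` prefix is an assumption, since the imported schema is not shown.

```python
def expand_curie(name: str, prefixes: dict[str, str], default_prefix: str) -> str:
    """Expand a CURIE or bare local name to a full URI.

    Hypothetical helper illustrating the idea; not a LinkML API.
    """
    prefix, sep, local = name.partition(":")
    if not sep:
        # Bare name: resolve against the default prefix.
        return prefixes[default_prefix] + name
    if local.startswith("//"):
        # Already a full URI such as https://...; leave untouched.
        return name
    if prefix not in prefixes:
        raise KeyError(f"undeclared prefix: {prefix!r}")
    return prefixes[prefix] + local


prefixes = {
    "ex": "http://example.com#",
    "schema": "https://schema.org/",
    # Assumed: the prefix the imported schema would declare for `other`.
    "other": "https://example.org/other/",
}

for name in ["ex:Person", "name", "schema:Person", "other:occupation"]:
    print(name, "->", expand_curie(name, prefixes, default_prefix="ex"))
```

With this rule, `ex:Person` and a `Person` from another prefix never collide, because identity is decided on the expanded URI rather than on the local name.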
-
I had a closer look at the metamodel. In the LinkML metamodel, slots and classes are each an Element (for instance, the "range" slot refers to an "element"). An Element is defined here and has a name, which currently is the identifier. The Element also has a definitionURI, which is not the identifier but is filled in by the schemaloader/schemaview. It can be a CURIE as well as a URI, and would be the right identifier to support namespacing. If, internally, all CURIEs are expanded to URIs, then aliasing (as suggested above) would be easy to support. I think the most difficult question remains how to make those changes without breaking the whole schemaview/schemaloader API. |
-
@pkalita-lbl are you monitoring this discussion? Do you think namespacing would make it easier to manage the relationships between the nmdc-schema and the submission-schema? |
-
Sorry to answer out of thread, a lot of great discussion here.

Regarding schemaloader/schemaview: the former will gradually be eclipsed by the latter, and the latter can always be extended in a backwards compatible way, e.g. new options on calls.

LinkML already has a mechanism for mapping local names in an individual schema file to global names. It so happens that these global names are IRIs but this can be hidden from the user. A convenient way to see the mapping table for any schema file is to run

There are ways to alter the mappings using elements such as:

See also the compliance tests introduced in #1987 (still incomplete) that test for various combinations of these in combination with import. (Unfortunately the outputs of these tests are not yet visible unless you run them, but the goal is to publish these on a separate compliance suite site.)

This gives the schema author fine-grained control over the names in the schema but less over imported names. But this could be addressed by

As far as I can tell the only impedance mismatch with package namespaces in programming languages is the case where in LinkML you can have two schema files with the same

Note that the average developer doesn't need to know much about URIs. They of course need to provide a stable URI for their own namespace but that is already the case.

This would be the canonical way of handling a local name clash:

schema_a.yaml:

    id: https://example.org/a
    prefixes:
      linkml: https://w3id.org/linkml/
      schema_a: https://example.org/a/
    default_prefix: schema_a
    classes:
      Person:

schema_b.yaml:

the pydantic would look like:

schema_a.py

    from pydantic import BaseModel

    class Person(BaseModel):
        ...

schema_b.py

    import schema_a

    class Person(schema_a.Person):
        ...

the json-ld-context would be like

schema_b.context.yaml

    "@context":
      schema_a: https://example.org/a/
      schema_b: https://example.org/b/
      Person: https://example.org/b/Person

This has always been the intent, though not well documented. I will review all the discussion here before Thursday's community call |
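The point of the example above is that two classes can share the local name `Person` while keeping distinct global names, and that the generated Python can lean on module namespaces for this. The sketch below simulates the two modules in a single file with `types.SimpleNamespace` and plain classes instead of pydantic, so it stays dependency-free. `make_module` and the `class_uri` attribute are illustrative assumptions, not what the LinkML generators emit.

```python
from types import SimpleNamespace


def make_module(prefix_uri: str, base: type = object) -> SimpleNamespace:
    """Build a stand-in 'module' holding a class with local name Person.

    Illustrative only: real generated code would live in separate files.
    """
    person = type("Person", (base,), {"class_uri": prefix_uri + "Person"})
    return SimpleNamespace(Person=person)


# schema_a defines Person; schema_b defines its own Person extending it.
schema_a = make_module("https://example.org/a/")
schema_b = make_module("https://example.org/b/", base=schema_a.Person)

# Same local name in both namespaces...
print(schema_a.Person.__name__, schema_b.Person.__name__)
# ...but different global names, and an ordinary subclass relationship.
print(schema_a.Person.class_uri)
print(schema_b.Person.class_uri)
print(issubclass(schema_b.Person, schema_a.Person))
```

In real code the qualifier would be the module name (`schema_a.Person` vs `schema_b.Person`), which is exactly the spelling used in the pydantic snippet.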
-
Just to add - namespace support would also be useful for adoption of LinkML as the schema language in NWB (neurophysiology) because the current NWB system allows users to create their own schema/namespaces, and there could be occasional name clashes when using multiple extensions. |
-
@sneakers-the-rat, I just wanted to say that as a fellow night owl, I feel your pain. We had to choose a time that works for people in North America but also in other parts of the world, and 8am PT is a good compromise. Thanks for all your contributions to LinkML! |
-
Just jotting this down while I remember: linkml already does use the dot notation for selecting props/slots within models - https://linkml.io/linkml-model/latest/docs/specification/02instances/#instance-accessor-syntax - so we already are in a case where we might be mixing delimiters ( |
-
I'm evaluating LinkML at my company for building an internal federated ontology as the base for a Knowledge Graph. We see proper namespace support as a must. Would love to contribute, but I'm quite new to LinkML |
-
Just subscribed to this issue as this is still a feature we would greatly appreciate for our Gaia-X Ontology, I previously opened an issue about this which I managed to circumvent but another similar case appeared recently. We are willing to contribute to this if you need some help 😉 |
-
I made an incrementally implementable, backwards compatible proposal a few months ago that, as far as I can tell, addresses the needs articulated here with a bunch of room for feature-add benefits, but I'm not really sure where this decisionmaking process stands since there was no comment on it |
-
LinkML uses namespacing/prefix-aware identifiers when referencing anything outside the model (e.g. class_uri: schema:Person). It also optionally uses namespacing/prefixes to refer to other schemas that can be imported.

However, for any particular schema, the namespace is flat. Furthermore, imports work like import * and everything is imported into the same namespace.

This has a lot of advantages in terms of simplicity. However, it can confound users.
- import Person from schema_org
- import schema.Person as SchemaPerson
- import schema_org, and refer to elements using namespaced identifiers (e.g. schema_org.Person)

Having one or more of these mechanisms would make it easier to reuse schemas without worrying about name clashes.
We should discuss ways of enabling one or more of these. We should also reason through the implications. For example, slot aliasing or namespacing could be problematic when working with json, which doesn't support namespacing.
Note also that some of the renaming use cases may be better served by profiling / linkml-transformer.
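To illustrate the difference between the flat `import *` style and namespaced identifiers described above, here is a toy sketch that merges class dictionaries both ways. It is a model of the idea only: the functions `merge_flat` and `merge_namespaced` are hypothetical, and raising on a clash is an assumption for illustration, not a description of what LinkML does today.

```python
def merge_flat(*schemas):
    """Merge every class into one flat namespace, like `import *`.

    Raising on a clash is an assumption made for this illustration.
    """
    merged = {}
    for schema_name, classes in schemas:
        for class_name, definition in classes.items():
            if class_name in merged:
                raise ValueError(
                    f"name clash on {class_name!r} while importing {schema_name}"
                )
            merged[class_name] = definition
    return merged


def merge_namespaced(*schemas):
    """Keep the schema name as a qualifier, e.g. schema_org.Person."""
    return {
        f"{schema_name}.{class_name}": definition
        for schema_name, classes in schemas
        for class_name, definition in classes.items()
    }


schema_org = ("schema_org", {"Person": {"description": "imported Person"}})
my_schema = ("my_schema", {"Person": {"description": "local Person"}})

try:
    merge_flat(schema_org, my_schema)
    flat_clashed = False
except ValueError as exc:
    flat_clashed = True
    print("flat merge failed:", exc)

namespaced = merge_namespaced(schema_org, my_schema)
print(sorted(namespaced))
```

The qualified keys are also where the JSON concern shows up: a serializer would need a convention for spelling `schema_org.Person` in plain JSON keys.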
Some existing issues: