Using mappings to replace obsolete terms #30
Comments
Very timely. My thoughts on this right now are as follows, but I am very easily convinced that I am wrong:
That said, I would be open to arguments against the above. The main ones I can see are:
From your suggestions above, I would prefer we go with |
On principle I agree, but still it would be nice if a tool could consume a mapping set to perform replacement without requiring the users to first filter out the mappings that do not represent a term replacement (in case they somehow obtain a mapping set that contains more than what they need). Practically I envision a tool (or a pluggable ROBOT command) that can take a mapping set as input and would automatically perform term replacements, using only the mappings that have been explicitly marked somehow as being intended for such a purpose. Of course the behaviour would be user-configurable (users would be able to say, “perform replacements for all the mappings, regardless of which predicate or mapping justification they use”, or on the contrary, “perform replacements only for the mappings that are using the predicate |
Replacements due to term deprecation are only a small sub-area of the problem space. Often, we want to replace ids in raw data described with 1 vocab (say MA) with another (say Uberon). These cases are at least as frequent, if not more frequent, than replacements due to deprecation. I am not saying "yes" or "no" or anything at all; just that the "term replacement" use case is divided into two categories:
And since (2) is certainly going to use regular mapping relations, I am not 100% sure about the marginal gain of having (1) using a different "system". My personal sense is still that the client / user should know what they are doing when passing a mapping set as input to a replacement problem. |
I am not fully convinced that “regular mapping relations” (I am assuming you’re referring to things such as No special system of any kind to mark mappings as being used for replacement purposes, then. Up to the users to decide which relations they want to use depending on what they are actually doing (replacing obsolete terms or integrating data from different vocabularies), and to pass that to whatever command or script they use to actually perform the replacement. Fine with me. I’ll still make my command use |
For KG integration, having exact matches between, say, the Orphanet Ontology (ORDO) and Mondo is the main use case of mapping sets. A mapping set between ORDO and Mondo would be used to "replace" all mentions of, say, ORDO with Mondo identifiers in a dataset imported during ETL. As always I am certainly not trying to impose my will, just collecting arguments and hopefully homing in on the right path together. |
I have two (admittedly small) concerns with using A philosophical one (aka “the unimportant one”): saying that “A must be replaced by B” (again, regardless of the reason: “because A is obsolete”, “because my application only accepts entities from the vocabulary of B”, “because I prefer B”, etc.) is not the same thing as saying “A and B refer to the same thing and can be used interchangeably”, which is the definition of an exact match according to the SKOS vocabulary. Using A practical one: Now you would tell me that it is up to the users to know what they are doing, and to be careful when inverting mapping sets if they know that the mapping set will ultimately be used for replacement purposes – and you would clearly have a point! But I still think it’d be safer (and semantically more precise) to have a dedicated relation to represent mappings intended for entity replacement. Or several dedicated relations, if we want to distinguish between “replacements due to deprecation” (where
Neither do I, and sorry if I gave that impression. |
This is a 100% valid reason and a very good argument to use |
So you don’t consider my “philosophical” argument to be 100% valid? 😢 ( :D ) |
Hmm, I can see where you are coming from, but I find the view impractical personally, even if it is conceptually justifiable. "Replacement" is a key technique in data integration, and "data integration" is such a fundamental use case for "mapping sets" that I don't really think it's "overloading" to think of the match relation as a "permission to replace". For the xref argument: It's not that overloading skos:exactMatch this way will cause the same problem. I agree that skos:exactMatch should only be used in the sense "they refer to the same real world concept", and any other use, no matter for what use case, I would consider invalid. So if you use it to mean: you can replace X with Y, but X and Y do not "refer to the same real world concept", I would say "it's wrong". You can use a "formally correctly defined mapping set" for the purpose of replacement, this is all I am trying to say. I am not trying to say that "the purpose of replacement" can redefine the semantics of the predicate. |
Fair enough. I am not convinced enough to make my tool automatically and silently replace any entity that is the subject of a |
Yep, fair! |
One possible use of a SSSOM mapping set is to perform mass renaming in a given database, ontology or other data vault. For example, given an ontology and a mapping set, if the IRI of an entity in the ontology matches the subject ID of a mapping in the set, then replace that IRI with the corresponding object ID.
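To make the replacement rule above concrete, here is a minimal Python sketch. It assumes the mapping set has already been loaded into a list of dictionaries keyed by the SSSOM column names subject_id and object_id, and that the ontology is reduced to a list of (subject, predicate, object) tuples. The function names and the EX: identifiers are hypothetical and exist only for illustration; this is not how any existing SSSOM tool is implemented.

```python
Triple = tuple[str, str, str]


def build_replacement_table(mappings: list[dict[str, str]]) -> dict[str, str]:
    """Index the mappings by subject ID so each lookup is a dictionary access."""
    table: dict[str, str] = {}
    for mapping in mappings:
        subject, obj = mapping["subject_id"], mapping["object_id"]
        # Two different targets for one subject would make the replacement ambiguous.
        if subject in table and table[subject] != obj:
            raise ValueError(f"conflicting replacements for {subject}")
        table[subject] = obj
    return table


def replace_ids(triples: list[Triple], table: dict[str, str]) -> list[Triple]:
    """Replace every term that matches a subject ID; leave all other terms untouched."""
    return [
        (table.get(s, s), table.get(p, p), table.get(o, o))
        for s, p, o in triples
    ]


# Hypothetical data: EX:0001 is to be replaced by EX:0002.
example_mappings = [
    {"subject_id": "EX:0001", "predicate_id": "skos:exactMatch", "object_id": "EX:0002"},
]
example_triples = [
    ("EX:0001", "rdfs:subClassOf", "EX:0009"),
    ("EX:0005", "rdfs:subClassOf", "EX:0001"),
]
replaced = replace_ids(example_triples, build_replacement_table(example_mappings))
```

Note that this sketch uses every mapping it is given, whatever its predicate. Deciding which mappings should count as replacements is exactly the question raised next.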
Should we have a way to explicitly indicate that a mapping is intended to be used for this kind of replacement? That is, instead of a mapping that merely indicates that the subject and the object are an “exact match”, we would have a mapping that explicitly indicates that the subject is to be replaced by the object – which is slightly different from saying that the subject and the object can be used interchangeably (the normal meaning of an exact match).
I can see three ways of making such a statement explicit:
a) Using IAO:0100001 (“term replaced by”) as the mapping predicate. That’s the easiest way as it does not require anything that does not already exist.
b) Having a new dedicated mapping relation in SEMAPV, such as semapv:ReplacementTerm or similar. It would probably be a subproperty of skos:exactMatch.
c) Instead of using the mapping predicate, we use another field, probably sssom:MappingJustification. That is, the mapping predicate would remain skos:exactMatch, but the mapping justification would be a new value like semapv:TermReplacement or similar.
I have no strong opinion on which way would be better (though I slightly dislike c as I feel this is overloading the meaning of sssom:MappingJustification somehow). But I think it would be nice to have one recommended way of doing replacements with SSSOM, otherwise I am concerned that all three methods (and possibly other methods I have not thought of!) will end up being used in the wild.
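For comparison, the following Python sketch shows how a consumer might recognise a replacement mapping under each of the three options. It assumes mappings are dictionaries keyed by the SSSOM column names predicate_id and mapping_justification. IAO:0100001 already exists, but semapv:ReplacementTerm and semapv:TermReplacement are only the names floated in this issue, not existing SEMAPV terms. The is_replacement function and the EX: identifiers are hypothetical.

```python
# IAO:0100001 exists today; the two semapv values below are only proposals
# from this issue and are not (yet) part of SEMAPV.
TERM_REPLACED_BY = "IAO:0100001"
PROPOSED_PREDICATE = "semapv:ReplacementTerm"
PROPOSED_JUSTIFICATION = "semapv:TermReplacement"


def is_replacement(mapping: dict[str, str], option: str) -> bool:
    """Return True if the mapping is marked for replacement under option a, b or c."""
    predicate = mapping.get("predicate_id", "")
    justification = mapping.get("mapping_justification", "")
    if option == "a":
        return predicate == TERM_REPLACED_BY
    if option == "b":
        return predicate == PROPOSED_PREDICATE
    if option == "c":
        return predicate == "skos:exactMatch" and justification == PROPOSED_JUSTIFICATION
    raise ValueError(f"unknown option: {option!r}")


# One hypothetical row per option, plus an ordinary exact match that no option selects.
rows = [
    {"subject_id": "EX:0001", "predicate_id": "IAO:0100001",
     "object_id": "EX:0101", "mapping_justification": "semapv:ManualMappingCuration"},
    {"subject_id": "EX:0002", "predicate_id": "semapv:ReplacementTerm",
     "object_id": "EX:0102", "mapping_justification": "semapv:ManualMappingCuration"},
    {"subject_id": "EX:0003", "predicate_id": "skos:exactMatch",
     "object_id": "EX:0103", "mapping_justification": "semapv:TermReplacement"},
    {"subject_id": "EX:0004", "predicate_id": "skos:exactMatch",
     "object_id": "EX:0104", "mapping_justification": "semapv:ManualMappingCuration"},
]

# Subjects selected for replacement under each option.
selected = {
    option: [m["subject_id"] for m in rows if is_replacement(m, option)]
    for option in ("a", "b", "c")
}
```

With options a and b the check looks at a single column, while option c needs two columns to agree. That may be one practical reason to prefer a predicate-based marker, though the sketch does not settle the question.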