Implement bind_namespaces strategy for Prefix.cc #2239
base: 8.x
def _get_prefix_cc():
    """Get the context from Prefix.cc."""
    response = urlopen("https://prefix.cc/context.jsonld")
    context = json.loads(response.read())
    return context["@context"]
Yes, there are no special values in this context dictionary, given the way the context is constructed (https://github.com/cygri/prefix.cc/blob/cbc85c00e59e00cf4fee697374109fdd9027231a/templates/format/jsonld.php) and the strict requirements on prefixes (though right now I am having a hard time finding where it's documented that these have to be lowercase alphanumeric strings of length <= 10).
You will need to add cc to rdflib/rdflib/_type_checking.py (line 29 in 4aaaa6f):
_NamespaceSetString = PyLiteral["core", "rdflib", "none"]
And also update rdflib/rdflib/namespace/__init__.py (lines 373 to 375 in 4aaaa6f):
* cc:
    * using prefix bindings from prefix.cc which is a online prefixes database
    * not implemented yet - this is aspirational
Will try to get this into the next release, I will make any changes needed directly to your branch.
@aucampia thank you for the feedback! Feel free to make any edits you think are appropriate or request I make some updates.
@cthoyt I will likely make another patch release soon, so will hold back a bit on merging this.
This looks good to me, though I may add a warning indicating that the "cc" selector is offered as a best-effort feature and may stop working if prefix.cc stops working or changes its interface.
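A minimal sketch of what such a best-effort warning could look like, not the code that was merged. The function name get_prefix_cc_best_effort, the timeout default, and the injectable opener parameter are all hypothetical and only serve to make the fallback behaviour easy to see and to test offline.

```python
import json
import warnings
from urllib.error import URLError
from urllib.request import urlopen

PREFIX_CC_CONTEXT_URL = "https://prefix.cc/context.jsonld"


def get_prefix_cc_best_effort(url=PREFIX_CC_CONTEXT_URL, timeout=10.0, opener=urlopen):
    """Return the prefix.cc prefix map, or an empty dict if it cannot be loaded.

    Hypothetical helper: `opener` defaults to urllib's urlopen and can be
    replaced with a stub so the function can be exercised without a network.
    """
    try:
        with opener(url, timeout=timeout) as response:
            document = json.loads(response.read())
        return dict(document["@context"])
    except (URLError, OSError, ValueError, KeyError, TypeError) as exc:
        # Network failure, invalid JSON, or a changed document layout:
        # warn and fall back to binding nothing instead of raising.
        warnings.warn(
            f"Could not load prefixes from {url} (best-effort feature): {exc}",
            UserWarning,
            stacklevel=2,
        )
        return {}
```

With this shape, a graph constructed while offline would simply end up with no prefix.cc bindings and a visible warning, rather than failing in its constructor.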
@@ -418,11 +418,14 @@ def __init__(self, graph: "Graph", bind_namespaces: "_NamespaceSetString" = "cor
            for prefix, ns in _NAMESPACE_PREFIXES_CORE.items():
                self.bind(prefix, ns)
        elif bind_namespaces == "cc":
Why use the TLD name for prefix.cc? cc is also the conventional prefix for creativecommons, so this might be confusing.
(I would also prefer having this as a utility (e.g. graph.bind_namespaces(util.get_prefix_cc())) rather than a flag to Graph. That would be more explicit and lets the user control network access and caching.)
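The utility-style alternative suggested here might be sketched roughly as follows. The names parse_context, get_prefix_cc, and bind_all are hypothetical, as are the graph.bind_namespaces and util.get_prefix_cc names in the comment above, which are a proposal rather than existing RDFLib API. The only call assumed on the graph object is bind(prefix, namespace), which RDFLib's Graph provides.

```python
import json
from functools import lru_cache
from urllib.request import urlopen


def parse_context(raw):
    """Extract plain prefix -> namespace entries from a JSON-LD context document."""
    context = json.loads(raw)["@context"]
    # Keep only simple string definitions; skip keywords such as "@vocab"
    # and expanded (dict-valued) term definitions.
    return {
        key: value
        for key, value in context.items()
        if isinstance(value, str) and not key.startswith("@")
    }


@lru_cache(maxsize=1)
def _fetch_prefix_items(url):
    """Download and parse the context once; cached as an immutable tuple."""
    with urlopen(url, timeout=10) as response:
        return tuple(sorted(parse_context(response.read()).items()))


def get_prefix_cc(url="https://prefix.cc/context.jsonld"):
    """Return a fresh dict of prefix.cc bindings (network access on first call only)."""
    return dict(_fetch_prefix_items(url))


def bind_all(graph, prefix_map):
    """Bind every prefix in prefix_map on any object exposing bind(prefix, namespace)."""
    for prefix, namespace in prefix_map.items():
        graph.bind(prefix, namespace)
```

Usage would then be an explicit bind_all(graph, get_prefix_cc()), so the caller decides when the network is touched and can substitute a locally stored prefix map instead.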
@niklasl I am happy to do whatever the RDFLib team decides on, but this interface and nomenclature were already predefined, so I just filled it in with an implementation as suggested.
I agree that since prefix.cc relies on a network connection, this is a bit of a tricky situation.
Actually, you can achieve the same effect using RDFLib as is, using just:
from rdflib import Graph

graph = Graph()
graph.parse('https://prefix.cc/context.jsonld')
for pfx, ns in graph.namespaces():
    print(pfx, ns)
This yields one very interesting difference though: the go prefix won't work, as its IRI ends in a _, so it is not treated as a namespace prefix in JSON-LD 1.1, since it does not end with a URI gen-delim (it has to be explicitly declared using "@prefix": true in the context to be treated as a prefix anyway).
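The prefix rule described here can be illustrated with a small check. This is a sketch of the rule as stated in the comment, not RDFLib's JSON-LD parser code, and the function name usable_as_prefix is hypothetical. The gen-delims set is the one defined in RFC 3986.

```python
# RFC 3986 gen-delims: ":" / "/" / "?" / "#" / "[" / "]" / "@"
GEN_DELIMS = frozenset(":/?#[]@")


def usable_as_prefix(definition):
    """Say whether a JSON-LD 1.1 term definition can act as a compact-IRI prefix.

    A simple (string) definition qualifies only if the IRI ends in a
    gen-delim character; an expanded (dict) definition qualifies only if it
    sets "@prefix" to true explicitly.
    """
    if isinstance(definition, str):
        return definition[-1:] in GEN_DELIMS
    if isinstance(definition, dict):
        return definition.get("@prefix") is True
    return False


if __name__ == "__main__":
    print(usable_as_prefix("http://xmlns.com/foaf/0.1/"))          # True
    print(usable_as_prefix("http://purl.obolibrary.org/obo/GO_"))  # False
    print(usable_as_prefix(
        {"@id": "http://purl.obolibrary.org/obo/GO_", "@prefix": True}
    ))                                                             # True
```

This is why reading the bindings straight from the raw JSON dictionary and parsing the same document as JSON-LD can give different sets of prefixes.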
Since we had this in the docstring, I'm going to merge it (possibly with some minor changes), though possibly it would have been better in
Actually, I think it would be better to just move this into
Maybe something better can also be done, but I would rather have it in
PRs to V6 are closed until further notice. See this for more details:
We will be open for PRs again once this is resolved:
Summary of changes
This PR implements a web-dependent loader for prefix-namespace definitions from Prefix.cc by parsing its context document (https://prefix.cc/context.jsonld)
Demo
The Gene Ontology is a biomedical ontology describing biological processes, cellular locations, and cellular components.
It is typically abbreviated with the prefix "go" and uses PURLs issued by the Open Biological and Biomedical Ontologies (OBO) Foundry. Prefix.cc has an entry for GO that uses its preferred OBO PURL.
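As a stand-in illustration of what the go binding provides, the sketch below expands a GO CURIE using a hard-coded prefix map. It makes no network calls and does not use RDFLib. The expand_curie helper is hypothetical, and the namespace value is an assumption based on the OBO PURL for GO rather than something read from prefix.cc at run time.

```python
# Assumed value of the prefix.cc entry for "go" (the OBO PURL namespace).
PREFIX_MAP = {"go": "http://purl.obolibrary.org/obo/GO_"}


def expand_curie(curie, prefix_map):
    """Expand a prefix:local CURIE into a full IRI using prefix_map."""
    prefix, sep, local = curie.partition(":")
    if not sep:
        raise ValueError(f"not a CURIE: {curie!r}")
    if prefix not in prefix_map:
        raise KeyError(f"unknown prefix: {prefix!r}")
    return prefix_map[prefix] + local


if __name__ == "__main__":
    # GO:0008150 is the root "biological_process" term.
    print(expand_curie("go:0008150", PREFIX_MAP))
    # prints http://purl.obolibrary.org/obo/GO_0008150
```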