Implement bind_namespaces strategy for Prefix.cc #2239
base: 8.x
Changes from 10 commits
7e67a09
f3a1ece
a66538b
699b980
f872720
9c3a5ec
693f786
7db61ac
c27f7e7
2e1e607
881a5f6
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,18 @@ | ||
""" | ||
Prefix.cc is a community-curated prefix map. By using `bind_namespaces="cc"`, | ||
you can set a namespace manager or graph to dynamically load prefixes from | ||
this resource. | ||
""" | ||
|
||
import rdflib | ||
|
||
graph = rdflib.Graph(bind_namespaces="cc") | ||
|
||
# The Gene Ontology is a biomedical ontology describing | ||
# biological processes, cellular locations, and cellular components. | ||
# It is typically abbreviated with the prefix "go" and uses PURLs | ||
# issued by the Open Biological and Biomedical Ontologies Foundry. | ||
prefix_map = {prefix: str(ns) for prefix, ns in graph.namespaces()} | ||
assert "go" in prefix_map | ||
assert prefix_map["go"] == "http://purl.obolibrary.org/obo/GO_" | ||
assert graph.qname("http://purl.obolibrary.org/obo/GO_0032571") == "go:0032571" |
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -7,6 +7,7 @@ | |
from typing import TYPE_CHECKING, Any, Dict, Iterable, List, Optional, Set, Tuple, Union | ||
from unicodedata import category | ||
from urllib.parse import urldefrag, urljoin | ||
from urllib.request import urlopen | ||
|
||
from rdflib.term import URIRef, Variable, _is_valid_uri | ||
|
||
|
@@ -372,7 +373,6 @@ class NamespaceManager(object): | |
* note this is NOT default behaviour | ||
* cc: | ||
* using prefix bindings from prefix.cc, which is an online prefix database | ||
* not implemented yet - this is aspirational | ||
|
||
See the | ||
Sample usage | ||
|
@@ -418,11 +418,14 @@ def __init__(self, graph: "Graph", bind_namespaces: "_NamespaceSetString" = "cor | |
for prefix, ns in _NAMESPACE_PREFIXES_CORE.items(): | ||
self.bind(prefix, ns) | ||
elif bind_namespaces == "cc": | ||
for prefix, ns in _NAMESPACE_PREFIXES_RDFLIB.items(): | ||
self.bind(prefix, ns) | ||
for prefix, ns in _NAMESPACE_PREFIXES_CORE.items(): | ||
self.bind(prefix, ns) | ||
# bind any prefix that can be found with lookups to prefix.cc | ||
# first bind core and rdflib ones | ||
# work out remainder - namespaces without prefixes | ||
# only look those ones up | ||
raise NotImplementedError("Haven't got to this option yet") | ||
for prefix, ns in _get_prefix_cc().items(): | ||
# note that prefixes are lowercase-only in prefix.cc | ||
self.bind(prefix, ns) | ||
elif bind_namespaces == "core": | ||
# bind a few core RDF namespaces - default | ||
for prefix, ns in _NAMESPACE_PREFIXES_CORE.items(): | ||
|
@@ -719,6 +722,13 @@ def absolutize(self, uri: str, defrag: int = 1) -> URIRef: | |
return URIRef(result) | ||
|
||
|
||
def _get_prefix_cc(): | ||
"""Get the context from Prefix.cc.""" | ||
response = urlopen("https://prefix.cc/context.jsonld") | ||
context = json.loads(response.read()) | ||
return context["@context"] | ||
Comment on lines +734 to +738
Yes, it's the case that there are no special values in this context dictionary, based on the way that the context is constructed (https://github.com/cygri/prefix.cc/blob/cbc85c00e59e00cf4fee697374109fdd9027231a/templates/format/jsonld.php) and the strict requirements on prefixes (though right now I am having a hard time finding where it's documented that these have to be lowercase strings of alphanumeric characters of length <= 10). |
||
|
||
|
||
# From: http://www.w3.org/TR/REC-xml#NT-CombiningChar | ||
# | ||
# * Name start characters must have one of the categories Ll, Lu, Lo, | ||
|
Why use the TLD name for `prefix.cc`? `cc` is also the conventional prefix for creativecommons, so this might be confusing.
(I would also prefer having this as a utility (e.g. `graph.bind_namespaces(util.get_prefix_cc())`) rather than a flag to Graph. That would be more explicit and lets the user control network access and caching.)

@niklasl I am happy to do whatever the RDFLib team decides on, but this interface and nomenclature were already predefined, so I just filled it in with an implementation as suggested.
I agree that, since prefix.cc relies on a network connection, this is a bit of a tricky situation.
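
A minimal sketch of the utility-style alternative suggested above, under stated assumptions: `get_prefix_cc`, `parse_prefix_cc_context`, and the `fetch` parameter are hypothetical names for illustration, not RDFLib API. The download is isolated in one injectable function so the caller decides when the network is hit and how the result is cached. The URL is the one used in the diff.

```python
import json
from typing import Callable, Dict
from urllib.request import urlopen

# URL taken from the diff under review.
PREFIX_CC_CONTEXT_URL = "https://prefix.cc/context.jsonld"


def _fetch(url: str, timeout: float = 10.0) -> bytes:
    """Download the raw bytes at ``url``; the only function that touches the network."""
    with urlopen(url, timeout=timeout) as response:
        return response.read()


def parse_prefix_cc_context(raw: bytes) -> Dict[str, str]:
    """Extract a prefix -> namespace mapping from a JSON-LD context document."""
    context = json.loads(raw)["@context"]
    # Keep only simple string-valued entries; anything else is skipped defensively.
    return {key: value for key, value in context.items() if isinstance(value, str)}


def get_prefix_cc(fetch: Callable[[str], bytes] = _fetch) -> Dict[str, str]:
    """Return the Prefix.cc prefix map (hypothetical helper).

    ``fetch`` can be swapped for a function that reads a local copy, which
    keeps tests and offline use free of network access.
    """
    return parse_prefix_cc_context(fetch(PREFIX_CC_CONTEXT_URL))
```

With a mapping like this in hand, the caller could bind each pair through the existing `Graph.bind(prefix, namespace)` method, and could wrap the call in their own cache or load a saved copy of the context from disk instead.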
Actually, you can achieve the same effect using RDFLib as is, using just:
This yields one very interesting difference though: the `go` prefix won't work, as it ends in a `_`, which is not treated as a namespace prefix in JSON-LD 1.1, since it does not end with a URI gen-delim (it has to be explicitly declared using `"@prefix": true` in the context to be treated as a prefix anyway).