conversion:uses_predicate

timrdf edited this page Sep 22, 2012 · 31 revisions

What is first

  • Most of the terms in the conversion: vocabulary are conversion:Enhancements, but some terms are annotations that are created during the conversion. conversion:uses_predicate is one of those annotations.
  • conversion:uses_predicate complements the VoID Vocabulary.

What we will cover

This page will cover what the conversion:uses_predicate property describes, and a bit of background on how it is computed.

Let's get to it!

If two datasets use the same vocabulary, then there is a good chance that it will be worthwhile to combine them to get more interesting results. The conversion:uses_predicate property annotates void:Datasets with the RDF predicates that appear in the dataset's triples. For example, if the dataset http://purl.org/twc/health/source/hub-healthdata-gov/dataset/hospital-compare/version/2012-Jul-17 contains the triples:

@prefix vcard: <http://www.w3.org/2006/vcard/ns#> .
@prefix prov:  <http://www.w3.org/ns/prov#> .
@prefix dbpedia: <http://dbpedia.org/resource/> .

<http://purl.org/twc/health/source/hub-healthdata-gov/dataset/hospital-compare/version/2012-Jul-17/provider/010001> 
   prov:specializationOf  <http://logd.tw.rpi.edu/id/medicare-gov/provider/010001> ;
   vcard:organization-name "Southeast Alabama Medical Center" ;
   vcard:adr <http://localhost/source/hub-healthdata-gov/provider/010001/address> ;
   prov:atLocation dbpedia:Houston_County .

then the following four triples state which predicates the dataset uses:

<http://purl.org/twc/health/source/hub-healthdata-gov/dataset/hospital-compare/version/2012-Jul-17>
   conversion:uses_predicate prov:specializationOf, vcard:organization-name, vcard:adr, prov:atLocation .

Annotating many void:Datasets with conversion:uses_predicate allows us to quickly find datasets that share the same vocabulary. Using it lets us avoid generic queries that can take a long time for large datasets, such as:

select distinct ?p
where {
  graph <http://purl.org/twc/health/source/hub-healthdata-gov/dataset/hospital-compare/version/2012-Jul-17> {
     [] ?p []
  }
}
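Once the annotations are available, finding datasets that share vocabulary reduces to intersecting small sets of predicates. The following is a minimal Python sketch of that idea, not part of csv2rdf4lod itself. It assumes the conversion:uses_predicate annotations have already been loaded into a dictionary; the dataset names and the `shared_predicates` function are hypothetical and exist only for illustration.

```python
from itertools import combinations


def shared_predicates(uses_predicate):
    """Map each pair of datasets to the predicates that both of them use.

    `uses_predicate` maps a dataset identifier to the set of predicates
    listed in its conversion:uses_predicate annotations.
    """
    overlaps = {}
    for a, b in combinations(sorted(uses_predicate), 2):
        common = uses_predicate[a] & uses_predicate[b]
        if common:  # only report pairs that actually overlap
            overlaps[(a, b)] = common
    return overlaps


# Hypothetical annotations for three datasets (illustrative names only).
annotations = {
    "dataset-a": {"prov:specializationOf", "vcard:adr"},
    "dataset-b": {"vcard:adr", "vcard:organization-name"},
    "dataset-c": {"dcterms:title"},
}

for pair, common in shared_predicates(annotations).items():
    print(pair, sorted(common))
```

Because only the annotations are read, no dataset's triples need to be scanned to discover the overlap.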

Finally, conversion:uses_predicate provides finer granularity than the void:vocabulary annotation, which references only the vocabulary (e.g. vcard, prov) and not the actual terms used within that vocabulary (e.g. vcard:adr, prov:specializationOf). In fact, one could derive the generic void:vocabulary assertions by processing the detailed conversion:uses_predicate annotations.
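The derivation of void:vocabulary values from predicate URIs can be sketched in Python as follows. This is a heuristic and not the converter's actual method: it assumes that a vocabulary's namespace is the predicate URI up to and including its last `#` or `/`, which is a common convention but not guaranteed for every vocabulary. The function names `namespace_of` and `vocabularies` are hypothetical.

```python
def namespace_of(predicate_uri):
    """Return the URI up to and including its last '#' or '/'.

    Assumption: the namespace ends at the last '#' or '/'. URIs that
    contain neither character are returned unchanged.
    """
    cut = max(predicate_uri.rfind("#"), predicate_uri.rfind("/"))
    return predicate_uri[:cut + 1] if cut >= 0 else predicate_uri


def vocabularies(predicate_uris):
    """Collapse a collection of predicate URIs into their distinct namespaces."""
    return {namespace_of(p) for p in predicate_uris}


# The four predicates from the example on this page, written as full URIs.
used = [
    "http://www.w3.org/ns/prov#specializationOf",
    "http://www.w3.org/2006/vcard/ns#organization-name",
    "http://www.w3.org/2006/vcard/ns#adr",
    "http://www.w3.org/ns/prov#atLocation",
]

for ns in sorted(vocabularies(used)):
    print(ns)
```

Four conversion:uses_predicate values collapse into two candidate void:vocabulary values, one for vcard and one for prov.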

Example

https://github.com/timrdf/csv2rdf4lod-automation/issues/300

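The following commands check the dataset hub-healthdata-gov/hospital-compare by comparing the predicates that actually appear in its converted data against the predicates listed by conversion:uses_predicate in its VoID file: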

rapper -g -o ntriples publish/hub-healthdata-gov-hospital-compare-2012-Jul-17.e1.ttl | awk '{print $2}' | sort -u > manual/e1-predicates.csv
rapper -g -o ntriples publish/hub-healthdata-gov-hospital-compare-2012-Jul-17.void.ttl | awk '$2 == "<http://purl.org/twc/vocab/conversion/uses_predicate>"{print $3}' | sort -u > manual/uses-predicate.csv
diff -y -W 250 manual/uses-predicate.csv manual/e1-predicates.csv
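The same check can be sketched in Python for readers who prefer set differences to a side-by-side diff. This is an illustrative equivalent of the commands above, not part of csv2rdf4lod. It assumes the input is N-Triples, one triple per line, such as the output of `rapper -o ntriples`, and it splits each line on whitespace just as the awk commands do. The function names and the sample lines are hypothetical.

```python
USES_PREDICATE = "<http://purl.org/twc/vocab/conversion/uses_predicate>"


def predicates_in(ntriples_lines):
    """Collect the predicate (second whitespace-separated field) of each triple."""
    found = set()
    for line in ntriples_lines:
        fields = line.split()
        if len(fields) >= 3 and not fields[0].startswith("#"):
            found.add(fields[1])
    return found


def annotated_predicates(void_lines):
    """Collect the objects of conversion:uses_predicate triples."""
    found = set()
    for line in void_lines:
        fields = line.split()
        if len(fields) >= 3 and fields[1] == USES_PREDICATE:
            found.add(fields[2])
    return found


def compare(data_lines, void_lines):
    """Return (predicates used but not annotated, predicates annotated but not used)."""
    actual = predicates_in(data_lines)
    annotated = annotated_predicates(void_lines)
    return actual - annotated, annotated - actual


# Hypothetical sample input; real input would be read from rapper's output.
data_lines = [
    "<http://example.org/s> <http://www.w3.org/2006/vcard/ns#adr> <http://example.org/o> .",
    "<http://example.org/s> <http://www.w3.org/ns/prov#atLocation> <http://example.org/c> .",
]
void_lines = [
    "<http://example.org/d> " + USES_PREDICATE + " <http://www.w3.org/2006/vcard/ns#adr> .",
]

missing, extra = compare(data_lines, void_lines)
print("used but not annotated:", sorted(missing))
print("annotated but not used:", sorted(extra))
```

Two empty sets mean the annotations match the data exactly, which corresponds to `diff` reporting no differences.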

What is next
