Skip to content

Module of the HeFQUIN query federation engine to connect to Property Graph data sources.

License

Notifications You must be signed in to change notification settings

LiUSemWeb/HeFQUIN-PGConnector

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 

History

14 Commits
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 

Repository files navigation

HeFQUIN-PGConnector

Module of the HeFQUIN query federation engine to connect to Property Graph data sources. In particular, this module provides the functionality to run SPARQL-star queries over user-configurable RDF-star views of Property Graphs by translating these queries into openCypher queries which, then, are sent directly to a Property Graph store (we only support Neo4j at the moment).

This module also comes with two command-line programs:

  • RunBGPOverNeo4j provides the SPARQL-star query functionality for Property Graphs as a standalone tool, and
  • MaterializeRDFViewOfLPG can be used to convert the Property Graphs into an RDF-star graphs by materializing the RDF-star views that are otherwise considered only as virtual views by the query functionality.

While documenting the query-translation functionality implemented by this module is still work in progress, the (user-configurable) Property Graph to RDF-star mapping that defines the RDF-star views is described in the next section, followed by a description of the two command-line programs.

User-Configurable Mapping of Property Graphs to RDF-star Graphs

The SPARQL-star based query functionality provided by this module considers specific types of RDF-star views of an underlying Property Graph. These views are defined based on a user-configurable mapping from the Property Graph model to RDF-star.

The idea of this mapping is simple: Given a Property Graph, every edge of this graph (including the label of the edge) is represented as an ordinary RDF triple in the resulting RDF-star data. The same holds for each node with its label, as well as for every node property. Edge properties are represented as nested triples that contain, as their subject, the triple representing the corresponding edge (more details and examples below).

While the structure of the resulting RDF-star graphs cannot be influenced, the mapping is configurable in terms of the particular elements (blank nodes and IRIs) of the triples that it generates. Formally, this form of configuration is captured by a notion that we call LPG-to-RDF-star configuration. The concrete LPG-to-RDF-star configuration to be used when applying the mapping may, in principle, be specified in various forms. The form that the HeFQUIN-PGConnector expects is an RDF-based description using a specific RDF vocabulary that we have developed for this purpose and that provides a lot of different options for the different components of LPG-to-RDF-star configurations.

The following sections describe the mapping approach in more detail and, at the same time, introduce the components of LPG-to-RDF-star configurations, together with some of the possible options for each of these components. For a complete definition of all the options, refer to the file that defines the vocabulary, and a formal definition of the whole mapping approach can be found in the following research paper:

Node Mapping

The first component of every LPG-to-RDF-star configuration is a so-called node mapping which specifies whether the nodes of the Property Graph are mapped to blank nodes or to IRIs and, in the case of IRIs, what these IRIs look like.

If you want to use blank nodes to capture the nodes of your Property Graph in the resulting RDF-star view, then you have to use a node mapping of type lr:BNodeBasedNodeMapping in the RDF-based description of your LPG-to-RDF-star configuration. Hence, the relevant part of this description may look as follows (presented in RDF Turtle format, prefix declarations omitted).

_:c1  rdf:type  lr:LPGtoRDFConfiguration ;
      lr:nodeMapping [ rdf:type lr:BNodeBasedNodeMapping ] .

As an alternative to blank nodes, the nodes of the Property Graph may be mapped to IRIs. The specific type of node mapping that HeFQUIN-PGConnector supports for this case is lr:IRIPrefixBasedNodeMapping which, for each node of the Property Graph, creates an IRI by appending the ID of the node to a common IRI prefix. The IRI prefix to be used can be specified via the lr:prefixOfIRIs property.

Example: Assume a Property Graph with two nodes which have the IDs 153 and 295, respectively. By using an LPG-to-RDF-star configuration with a node mapping as specified in the following description, these two Property Graph nodes are mapped to the IRIs http://example.org/node/153 and http://example.org/node/295, respectively.

_:c2  rdf:type  lr:LPGtoRDFConfiguration ;
      lr:nodeMapping [ rdf:type lr:IRIPrefixBasedNodeMapping ;
                       lr:prefixOfIRIs "http://example.org/node/"^^xsd:anyURI ] .

Node Label Mapping and Label Predicate

The mapping approach captures the label of every Property Graph node by an RDF triple whose subject is the blank node or the IRI that the node mapping (see above) associated with that node. The object of such a triple is determined by a so-called node label mapping, which is the second component of an LPG-to-RDF-star configuration, and the predicate is given as the third component of LPG-to-RDF-star configurations. In the RDF-based descriptions of LPG-to-RDF-star configurations, this predicate is specified via the lr:labelPredicate property; whereas, for specifying the node label mapping, there are different options (see all the sub-classes of lr:NodeLabelMapping in our RDF vocabulary). One of these options, as illustrated in the following example, is to create IRIs by appending the node label to a common IRI prefix again (i.e., similar to the IRI prefix option for node mappings that we mentioned in the previous section).

Example: Consider again the two Property Graph nodes of the previous example and assume that the node with ID 153 has the label "Person" whereas the node with ID 295 is labeled "Book". By using an LPG-to-RDF-star configuration with a node label mapping (and label predicate) as specified in the following description, we obtain the two label-related RDF triples (http://example.org/node/153, rdf:type, http://example.org/type/Person) and (http://example.org/node/295, rdf:type, http://example.org/type/Book).

_:c2  rdf:type  lr:LPGtoRDFConfiguration ;
      lr:nodeMapping [
               rdf:type lr:IRIPrefixBasedNodeMapping ;
               lr:prefixOfIRIs "http://example.org/node/"^^xsd:anyURI ] ;
      lr:nodeLabelMapping [
               rdf:type lr:IRIPrefixBasedNodeLabelMapping ;
               lr:prefixOfIRIs "http://example.org/type/"^^xsd:anyURI ] ;
      lr:labelPredicate "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"^^xsd:anyURI .

While the IRIs for the node labels in the previous example are created in a generic manner, it is also possible to map (some of the) node labels to IRIs defined by existing RDF vocabularies.

Example: Consider the following LPG-to-RDF-star configuration.

_:c3  rdf:type  lr:LPGtoRDFConfiguration ;
      lr:nodeMapping [
               rdf:type lr:IRIPrefixBasedNodeMapping ;
               lr:prefixOfIRIs "http://example.org/node/"^^xsd:anyURI ] ;
      lr:nodeLabelMapping [
               rdf:type lr:CompositeNodeLabelMapping ;
               lr:componentMappings ( _:nlm1  _:nlm2  _:nlm3 ) ] ;
      lr:labelPredicate "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"^^xsd:anyURI .

_:nlm1  rdf:type  lr:SingletonIRINodeLabelMapping ;
        lr:label  "Person" ;
        lr:iri "http://xmlns.com/foaf/0.1/Person"^^xsd:anyURI .

_:nlm2  rdf:type  lr:SingletonIRINodeLabelMapping ;
        lr:label  "Book" ;
        lr:iri "https://schema.org/Book"^^xsd:anyURI .

_:nlm3  rdf:type  lr:IRIPrefixBasedNodeLabelMapping ;
        lr:prefixOfIRIs "http://example.org/type/"^^xsd:anyURI .

By using this LPG-to-RDF-star configuration to map the two Property Graph nodes of the previous example, the two label-related RDF triples that we obtain are different: (http://example.org/node/153, rdf:type, foaf:Person) and (http://example.org/node/295, rdf:type, schema:Book). Notice that the node label mapping of the LPG-to-RDF-star configuration is of type lr:CompositeNodeLabelMapping, which means that it is composed of multiple other node label mappings (which are listed via the lr:componentMappings property). In this particular case, it is composed of three other node label mappings, where the first one focuses only on the node label "Person" and maps this particular node label to the Person class of the FOAF vocabulary. Similarly, the second one maps (only) the node label "Book" to the Book class of the Schema.org vocabulary (and the third one covers all other node labels in the same generic way as in the previous example).

It is also possible to extend an lr:IRIPrefixBasedNodeLabelMapping with a regular expression (see lr:RegexBasedNodeLabelMapping) such that only the node labels that match this regular expression are mapped according to the node label mapping. This way, different groups of node labels may be mapped to IRIs with different prefixes.

As a last example, we illustrate that node labels may also be mapped to string literals (instead of IRIs), which can be achieved by using a node label mapping of type lr:LiteralBasedNodeLabelMapping (or lr:SingletonLiteralNodeLabelMapping, but that is not covered in our example).

Example: By using the following LPG-to-RDF-star configuration, we obtain the two label-related RDF triples (http://example.org/node/153, rdfs:label, "Person") and (http://example.org/node/295, rdfs:label, "Book") for our running example (notice that we also use a different label predicate here).

_:c4  rdf:type  lr:LPGtoRDFConfiguration ;
      lr:nodeMapping [
               rdf:type lr:IRIPrefixBasedNodeMapping ;
               lr:prefixOfIRIs "http://example.org/node/"^^xsd:anyURI ] ;
      lr:nodeLabelMapping [
               rdf:type lr:LiteralBasedNodeLabelMapping `] ;
      lr:labelPredicate "http://www.w3.org/2000/01/rdf-schema#label"^^xsd:anyURI .

Edge Label Mapping

Every edge of the given Property Graph is mapped to an RDF triple whose subject (resp. object) is the blank node or the IRI that the node mapping assigns to the source node (resp. target node) of the edge, and the predicate is determined based on the label of the edge. As for the latter, the fourth component of LPG-to-RDF-star configurations is a so-called edge label mapping which maps any edge label to an IRI. The types of edge label mappings supported by HeFQUIN-PGConnector are similar to the types of node label mappings and are captured as sub-classes of lr:EdgeLabelMapping in our RDF vocabulary.

Example: Assume the Property Graph considered in the previous examples contains an edge from the book node (i.e., the node with ID 295) to the person node (with ID 153) and this edge has the label "fundedBy" to indicate that the book was funded by the person. Given the following LPG-to-RDF-star configuration, this edge is mapped to the triple (http://example.org/node/295, schema:funder, http://example.org/node/153).

_:c3  rdf:type  lr:LPGtoRDFConfiguration ;
      lr:nodeMapping [
               rdf:type lr:IRIPrefixBasedNodeMapping ;
               lr:prefixOfIRIs "http://example.org/node/"^^xsd:anyURI ] ;
     #lr:nodeLabelMapping ...
     #lr:labelPredicate ...
      lr:edgeLabelMapping [
               rdf:type lr:CompositeEdgeLabelMapping ;
               lr:componentMappings ( _:elm1  _:elm2  ) ] .

_:elm1  rdf:type  lr:SingletonIRIEdgeLabelMapping ;
        lr:label  "fundedBy" ;
        lr:iri "https://schema.org/funder"^^xsd:anyURI .

_:elm2  rdf:type  lr:IRIPrefixBasedEdgeLabelMapping ;
        lr:prefixOfIRIs "http://example.org/relationship/"^^xsd:anyURI .

We emphasize that the mapping approach is not information preserving for Property Graphs that contain multiple edges between the same nodes with the same label. That is, by the mapping approach, all edges with the same source node, the same target node, and the same label, collapse into a single triple in the RDF-star view.

Property Name Mapping

The fifth component of LPG-to-RDF-star configurations is a so-called property name mapping which is used to determine the predicate of every triple that captures a property of a node or an edge of the mapped Property Graph.

Example: Assume that the person node in our example Property Graph has a property named "name" with the value "Bob", and the "fundedBy" edge has a property named "amount" with the value 3000. By using the following LPG-to-RDF-star configuration, for which we now specify a property name mapping, the property of the person node is mapped to the triple (http://example.org/node/153, foaf:name, "Bob") and the property of the "fundedBy" edge is mapped to the (nested!) triple ( (http://example.org/node/295, schema:funder, http://example.org/node/153), http://example.org/p/amount, 3000 ).

_:c3  rdf:type  lr:LPGtoRDFConfiguration ;
      lr:nodeMapping [
               rdf:type lr:IRIPrefixBasedNodeMapping ;
               lr:prefixOfIRIs "http://example.org/node/"^^xsd:anyURI ] ;
     #lr:nodeLabelMapping ...
     #lr:labelPredicate ...
      lr:edgeLabelMapping [
               rdf:type lr:CompositeEdgeLabelMapping ;
               lr:componentMappings ( _:elm1  _:elm2  ) ] ;
      lr:propertyNameMapping [
               rdf:type lr:CompositePropertyNameMapping ;
               lr:componentMappings ( _:pm1  _:pm2  ) ] .

_:elm1  rdf:type  lr:SingletonIRIEdgeLabelMapping ;
        lr:label  "fundedBy" ;
        lr:iri "https://schema.org/funder"^^xsd:anyURI .

_:elm2  rdf:type  lr:IRIPrefixBasedEdgeLabelMapping ;
        lr:prefixOfIRIs "http://example.org/relationship/"^^xsd:anyURI .

_:pm1  rdf:type  lr:SingletonIRIPropertyNameMapping ;
       lr:label  "name" ;
       lr:iri "http://xmlns.com/foaf/0.1/name"^^xsd:anyURI .

_:pm2  rdf:type  lr:IRIPrefixBasedPropertyNameMapping ;
       lr:prefixOfIRIs "http://example.org/p/"^^xsd:anyURI .

Command-Line Programs

Compile the Programs

To use the command-line programs you first need to compile them using the Java source code in this repository.

To this end, first create a local clone of this repository on your computer, which you can do by opening a terminal, navigate to the directory into which you want to clone the repository, and then either clone the repository with its SSH URL by running the following command

git clone [email protected]:LiUSemWeb/HeFQUIN-PGConnector.git

or clone the repository with its HTTPS URL by running the following command

git clone https://github.com/LiUSemWeb/HeFQUIN-PGConnector.git

Both of these two commands should give you a sub-directory called HeFQUIN-PGConnector. Enter this directory and use Maven to compile the Java source code by running the following command

mvn package

Assuming the Maven process completes successfully, you can use the command-line programs provided in this repository as described in the following sections.

Run the RunBGPOverNeo4j Program

RunBGPOverNeo4j provides the query functionality of HeFQUIN-PGConnector as a standalone tool. Since RunBGPOverNeo4j is a Java program, it needs to be run using the Java interpreter (after compiling it as described above). The default way to do so is by running the following command,

java -cp target/HeFQUIN-PGConnector-0.0.1-SNAPSHOT.jar se.liu.ida.hefquin.cli.RunBGPOverNeo4j --query=query.rq --lpg2rdfconf=LPG2RDF.ttl --endpoint=http://localhost:7474/db/neo4j/tx/commit

where the --query argument is used to refer to a file that contains the SPARQL-star query you want execute, the --lpg2rdfconf argument refers to an RDF document that provides an RDF-based description of the LPG-to-RDF-star configuration that you want to use for the RDF-star view of your Property Graph (an example of an RDF Turtle file with such a description is provided as part of this repository and, thus, available in your local clone of the repository---see: ExampleLPG2RDF.ttl), and the --endpoint argument specifies the HTTP address at which your Neo4j instance responds to Cypher queries over your Property Graph.

Further arguments can be used. To see a list of all arguments supported by the program, run the program with the --help argument:

java -cp target/HeFQUIN-PGConnector-0.0.1-SNAPSHOT.jar se.liu.ida.hefquin.cli.RunBGPOverNeo4j --help

Run the MaterializeRDFViewOfLPG Program

MaterializeRDFViewOfLPG can be used to convert any Property Graph into an RDF-star graph by applying our user-configurable Property Graph to RDF-star mapping. As a Java program, it needs to be run using the Java interpreter (after compiling it as described above). The default way to do so is by running the following command,

java -cp target/HeFQUIN-PGConnector-0.0.1-SNAPSHOT.jar se.liu.ida.hefquin.cli.MaterializeRDFViewOfLPG --lpg2rdfconf=LPG2RDF.ttl --endpoint=http://localhost:7474/db/neo4j/tx/commit

where the --lpg2rdfconf argument refers to an RDF document that provides an RDF-based description of the LPG-to-RDF-star configuration that you want to use for the RDF-star view of your Property Graph (an example of an RDF Turtle file with such a description is provided as part of this repository and, thus, available in your local clone of the repository---see: ExampleLPG2RDF.ttl) and the --endpoint argument specifies the HTTP address at which your Neo4j instance responds to Cypher queries over your Property Graph.

Further arguments can be used. To see a list of all arguments supported by the program, run the program with the --help argument:

java -cp target/HeFQUIN-PGConnector-0.0.1-SNAPSHOT.jar se.liu.ida.hefquin.cli.MaterializeRDFViewOfLPG --help

About

Module of the HeFQUIN query federation engine to connect to Property Graph data sources.

Resources

License

Stars

Watchers

Forks

Releases

No releases published

Packages

No packages published