
Using template variables to construct new values

Tim L edited this page May 29, 2014 · 86 revisions
csv2rdf4lod-automation is licensed under the [Apache License, Version 2.0](https://github.com/timrdf/csv2rdf4lod-automation/wiki/License)

See conversion:Enhancement.

What variables can be used in templates?

Global variables:

  • [/] - base_uri/
  • [/s] - base_uri/source/source_identifier
  • [/sd] - base_uri/source/source_identifier/dataset/dataset_identifier
  • [/sdv] - base_uri/source/source_identifier/dataset/dataset_identifier/version/version_identifier
  • [v] - the dataset's version_identifier
  • [e] - the dataset's enhancement_identifier
  • [D] - the dataset's subject_discriminator
  • [/sD] - base_uri/source/source_identifier/dataset/dataset_identifier[/subjectDiscriminator]
  • [/sDv] - base_uri/source/source_identifier/dataset/dataset_identifier[/subjectDiscriminator]/version/dataset_version
  • [uuid] - provide a UUID. (Note: use this only in extreme cases; prefer constructing your URIs from the data itself so that reconversion produces the same URIs.)
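
The namespace-style variables above all expand to prefixes built from the same few identifiers. The following is a minimal sketch in Python, not the converter's own code. The function and parameter names (global_variables, fill_globals, base_uri, source_id, dataset_id, version_id) are hypothetical, base_uri is assumed to have no trailing slash, and the expansions follow the list literally (only [/] ends in a slash). [e], [D], [/sD], [/sDv] and [uuid] are left out.

```python
def global_variables(base_uri, source_id, dataset_id, version_id):
    """Return the listed expansions of [/], [/s], [/sd], [/sdv] and [v].

    All names here are hypothetical; base_uri is assumed to have no
    trailing slash.
    """
    s = f"{base_uri}/source/{source_id}"
    sd = f"{s}/dataset/{dataset_id}"
    sdv = f"{sd}/version/{version_id}"
    return {
        "[/]": f"{base_uri}/",
        "[/s]": s,
        "[/sd]": sd,
        "[/sdv]": sdv,
        "[v]": version_id,
    }


def fill_globals(template, variables):
    """Replace every global variable that occurs in the template."""
    for name, value in variables.items():
        template = template.replace(name, value)
    return template
```

For example, with a made-up base URI of http://example.org, fill_globals("[/]id/example", ...) would yield http://example.org/id/example.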

Contextual:

  • [@] - the local name of this property
  • [r] - the row of this value
  • [c] - the column of this value
  • [.] - the value of the cell; empty value proceeds without special processing or omission. (implemented as [+])
  • [+] - the value of the cell; if empty, provide unique non-empty value. (only partially implemented)
  • [!] - the value of the cell; if empty, omit any triple using this template variable.
  • [#N] - the value of this row at column N.
    • [#N] - if empty, behaves like [+] by providing a unique non-empty value.
    • [#N/] - if empty, behaves like [.] by proceeding without special processing or omission.
  • [@PROPERTY_NAME] - the value of this row at the column whose output property will be PROPERTY_NAME (undefined when multiple columns are consolidated to a single predicate).
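
To illustrate how the contextual variables relate to a single cell, here is a rough Python sketch that covers only [r], [c], [.], [!] and [#N]. It is an illustration under assumptions, not the Java implementation: fill_contextual and its parameters are hypothetical names, column numbers are assumed to be 1-based, and the "unique non-empty value" behavior of [+] and of an empty [#N] is not modeled (an empty [#N] is simply substituted as an empty string here).

```python
import re

# Matches [r], [c], [.], [!] and [#N]; group 2 captures N for [#N].
VARIABLE = re.compile(r"\[(r|c|\.|!|#(\d+))\]")


def fill_contextual(template, row, row_number, column_number):
    """Fill the contextual variables for one cell of one row.

    row is a list of cell strings. Returns None when the template uses
    [!] and the current cell is empty, meaning "omit this triple".
    """
    cell = row[column_number - 1]
    if "[!]" in template and cell == "":
        return None

    def expand(match):
        name = match.group(1)
        if name == "r":
            return str(row_number)
        if name == "c":
            return str(column_number)
        if name in (".", "!"):
            return cell
        # [#N]: the value of this row at column N (1-based by assumption).
        return row[int(match.group(2)) - 1]

    # A single pass, so text substituted from the data is never re-expanded.
    return VARIABLE.sub(expand, template)
```

With the row ["Hartford", "", "CT"], the template "[.] in [#3]" applied to column 1 would give "Hartford in CT", while "[!]" applied to the empty column 2 would return None.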

Operators:

  • ^ - upper case a value; e.g. [^.^] and [^#1^]
  • _ - lower case a value; e.g. [_._] and [_#1_]
  • [^.-] - capitalize first letter of value; leave rest the same.
  • [^._] - capitalize the first letter of each word and lower case the rest; e.g. "HARTFORD HOSPITAL" into "Hartford Hospital"
  • >< - apply replaceAll("[^a-zA-Z_0-9\\-]","_") and while(gsub(/__/,"_")). (Note: this is [done by default](On Identity) when constructing URIs, and will be implemented if we need to relax that assumption.) This was added for literals to help conversion:object_search, but it only trims spaces.
  • xsd:decimal([#4])
  • e.g. "[/]id/url/md5/md5([#1])"
  • e.g. "http://lod.hackerceo.org/VIVO2DOI/bundle/increment([#1])"
  • e.g. "domain([#1])"
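
The case and cleanup operators above can be pictured as plain string functions. The Python sketch below is an approximation written from the descriptions in this list, not a port of the converter: the function names are hypothetical, title_case assumes words are separated by single spaces (inferred from the "HARTFORD HOSPITAL" example), and uri_safe collapses runs of underscores with one regular expression, which has the same effect as repeating gsub(/__/,"_") until nothing changes.

```python
import re


def upper(value):
    """[^.^]: upper case the whole value."""
    return value.upper()


def lower(value):
    """[_._]: lower case the whole value."""
    return value.lower()


def capitalize_first(value):
    """[^.-]: capitalize the first letter; leave the rest the same."""
    return value[:1].upper() + value[1:]


def title_case(value):
    """[^._]: capitalize the first letter of each word, lower case the rest."""
    return " ".join(w[:1].upper() + w[1:].lower() for w in value.split(" "))


def uri_safe(value):
    """><: replace disallowed characters with "_" and collapse repeats."""
    replaced = re.sub(r"[^a-zA-Z_0-9\-]", "_", value)
    return re.sub(r"_+", "_", replaced)
```

For instance, title_case("HARTFORD HOSPITAL") gives "Hartford Hospital", and uri_safe("Hartford  Hospital (CT)") gives "Hartford_Hospital_CT_".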

Regex Contextual:

Partially implemented:

  • [H] - the original header
  • [L] - the conversion:label of this property

Contextual (not implemented yet):

  • [D] - domain of this property
  • [R] - range of this property

Experimental:

  • [#H+1] - the value of the cell one below the header of the current column (i.e., [c])
  • [#H+2] - the value of the cell two below the header of the current column (i.e., [c])

Default templates

(informative)

  • [/sdv]thing_[r] - the URI for the row
  • [/sd]value-of/[@]/[.] - the URI for a predicate-scoped promoted cell
  • [/sd]typed/[R]/[.] - the URI for a type-promoted cell

What datasets use Templates?

Datasets in http://logd.tw.rpi.edu/sparql that use templates (results):

prefix conversion: <http://purl.org/twc/vocab/conversion/>
prefix ov:         <http://open.vocab.org/terms/>
    
select ?p ?o (count(?o) as ?count)
where {
  graph <http://purl.org/twc/vocab/conversion/ConversionProcess>  {
    ?s ov:csvCol ?col; ?p ?o .
    filter (?p != (conversion:label))           # templates to name predicate are not recognized.
    filter (?p != (conversion:comment))         # templates in predicate comments are not recognized.
    filter (?p != (conversion:delimits_object)) # delimits_object specifies a pattern, not template.
    filter (?p != (conversion:key_template))    # key_template is DEPRECATED; replaced by domain_template.
    filter regex(str(?p), "^http://purl.org/twc/vocab/conversion/.*")
    filter regex(str(?o), ".*\\[.*\\]") # NOTE: This string is not correctly rendered.
  }
} group by ?p ?o order by ?p ?o desc(?count) 

conversion: predicates that accept templates

See also

See Patterns versus Templates.

Java implementation

The comments on edu.rpi.tw.data.csv.impl.CSVRecordTemplateFiller list the template variables and their behavior.

The following methods implement the template filling:

  • edu.rpi.tw.data.csv.impl.DefaultEnrichmentParameters#fillTemplate fills the namespace-type variables [/] etc.
  • edu.rpi.tw.data.csv.impl.CSVRecordTemplateFiller#fillTemplate(String) fills the row-contextual variables [@p], [#3], [r], [c], etc.
  • edu.rpi.tw.data.csv.valuehandlers.LiteralCodebookValueHandler fills the regex capture groups on its own.

Issues

https://github.com/timrdf/csv2rdf4lod-automation/issues/issue/13
