Using template variables to construct new values
Tim L edited this page May 29, 2014 · 86 revisions
csv2rdf4lod-automation is licensed under the [Apache License, Version 2.0](https://github.com/timrdf/csv2rdf4lod-automation/wiki/License)
Global variables:
- `[/]` - base_uri/
- `[/s]` - base_uri/source/source_identifier
- `[/sd]` - base_uri/source/source_identifier/dataset/dataset_identifier
- `[/sdv]` - base_uri/source/source_identifier/dataset/dataset_identifier/version/version_identifier
- `[v]` - the dataset's version_identifier
- `[e]` - the dataset's enhancement_identifier
- `[D]` - the dataset's subject_discriminator
- `[/sD]` - base_uri/source/source_identifier/dataset/dataset_identifier[/subjectDiscriminator]
- `[/sDv]` - base_uri/source/source_identifier/dataset/dataset_identifier[/subjectDiscriminator]/version/dataset_version
- `[uuid]` - provide a UUID (Note: this should only be used in extreme cases; try to construct your URIs from the data itself so that reconversion will produce the same URIs)
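To make the expansions above concrete, here is a minimal Python sketch of how the namespace-style variables could be filled. It is not the converter's code (that lives in the Java classes named at the end of this page). The function name and parameters are hypothetical, and it assumes each expansion ends with a trailing slash, which is inferred from templates such as `[/sdv]thing_[r]` rather than stated in the list.

```python
def fill_global_variables(template, base_uri, source_id, dataset_id, version_id):
    """Expand [/], [/s], [/sd], [/sdv] and [v] in a template (illustrative only)."""
    base = base_uri.rstrip("/")
    # Assumption: every namespace expansion ends with "/".
    s = f"{base}/source/{source_id}/"
    sd = f"{s}dataset/{dataset_id}/"
    sdv = f"{sd}version/{version_id}/"
    expansions = {
        "[/]": f"{base}/",
        "[/s]": s,
        "[/sd]": sd,
        "[/sdv]": sdv,
        "[v]": version_id,
    }
    # Each variable is a distinct bracketed token, so replacement order is irrelevant.
    for variable, value in expansions.items():
        template = template.replace(variable, value)
    return template


row_uri_template = fill_global_variables(
    "[/sdv]thing_[r]", "http://example.org", "my-source", "my-dataset", "2014-May-29"
)
# row_uri_template ==
# "http://example.org/source/my-source/dataset/my-dataset/version/2014-May-29/thing_[r]"
```

The row-contextual variable `[r]` is deliberately left untouched here, since it can only be filled once a specific row is being processed.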
Contextual:
- `[@]` - the local name of this property
- `[r]` - the row of this value
- `[c]` - the column of this value
- `[.]` - the value of the cell; an empty value proceeds without special processing or omission. (implemented as `[+]`)
- `[+]` - the value of the cell; if empty, provide a unique non-empty value. (only partially implemented)
- `[!]` - the value of the cell; if empty, omit any triple using this template variable.
- `[#N]` - the value of this row at column N.
  - `[#N]` - if empty, behaves like `[+]` by providing a unique non-empty value.
  - `[#N/]` - if empty, behaves like `[.]` by proceeding without special processing or omission.
- `[@PROPERTY_NAME]` - the value of this row at the column whose output property will be PROPERTY_NAME (undefined when multiple columns are consolidated to a single predicate).
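The empty-cell rules above are easy to mix up, so here is a small Python sketch of one way they could behave. It is an illustration under assumptions, not the converter's implementation: the function name is hypothetical, columns are assumed to be numbered from 1, `[r]` is assumed to be the row number, and the `r<row>c<column>` placeholder standing in for the "unique non-empty value" is invented for this example. `[@]`, `[+]` and `[@PROPERTY_NAME]` are left out.

```python
import re

# Matches [r], [c], [.], [!], [#N] and [#N/] in a single pass.
TOKEN = re.compile(r"\[(r|c|\.|!|#(\d+)(/?))\]")


def fill_cell_template(template, row, row_number, column_number):
    """Fill row-contextual variables for one cell.

    Returns None when the triple should be omitted ([!] on an empty cell).
    """
    cell = row[column_number - 1]  # assumption: 1-based column numbers
    if "[!]" in template and cell == "":
        return None

    def expand(match):
        name = match.group(1)
        if name == "r":
            return str(row_number)
        if name == "c":
            return str(column_number)
        if name in (".", "!"):
            return cell
        n = int(match.group(2))
        other = row[n - 1]
        if other == "" and match.group(3) == "":
            # [#N] on an empty cell: hypothetical stand-in for a unique value.
            return f"r{row_number}c{n}"
        return other  # [#N/] passes an empty value through unchanged

    return TOKEN.sub(expand, template)


row = ["CT", "", "Hartford"]
print(fill_cell_template("city/[#3]/[.]", row, 7, 1))  # city/Hartford/CT
print(fill_cell_template("x/[!]", row, 7, 2))          # None
print(fill_cell_template("x/[#2/]y", row, 7, 1))       # x/y
```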
Operators:
- `^` - upper case a value; e.g. `[^.^]` and `[^#1^]`
- `_` - lower case a value; e.g. `[_._]` and `[_#1_]`
- `[^.-]` - capitalize first letter of value; leave rest the same.
- `[^._]` - e.g. "HARTFORD HOSPITAL" into "Hartford Hospital"
- `><` - apply `replaceAll("[^a-zA-Z_0-9\\-]","_")` and `while(gsub(/__/,"_"))` (Note: this is [done by default](On Identity) when constructing URIs and will be implemented if we need to relax that assumption). This was added for literals to help conversion:object_search, but only trims spaces.
- `xsd:decimal([#4])`
- e.g. `"[/]id/url/md5/md5([#1])"`
- e.g. `"http://lod.hackerceo.org/VIVO2DOI/bundle/increment([#1])"`
- e.g. `"domain([#1])"`
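As a rough illustration of what the case operators and `><` do to a value, the following Python sketch mirrors the descriptions above. The function names are hypothetical, and the `[^._]` behavior (capitalize each word, lower-case the rest) is inferred from the single "HARTFORD HOSPITAL" example, so treat it as an assumption. The function-style operators (`md5(...)`, `increment(...)`, `domain(...)`) are not covered.

```python
import re


def upper(value):
    """^ ... ^ : upper case the whole value."""
    return value.upper()


def lower(value):
    """_ ... _ : lower case the whole value."""
    return value.lower()


def capitalize_first(value):
    """[^.-] : capitalize the first letter, leave the rest unchanged."""
    return value[:1].upper() + value[1:]


def title_case(value):
    """[^._] : assumed to capitalize each space-separated word."""
    return " ".join(word[:1].upper() + word[1:].lower() for word in value.split(" "))


def sanitize(value):
    """>< : replace disallowed characters with "_" and collapse repeated "_"."""
    cleaned = re.sub(r"[^a-zA-Z_0-9\-]", "_", value)
    while "__" in cleaned:  # mirrors while(gsub(/__/,"_"))
        cleaned = cleaned.replace("__", "_")
    return cleaned


print(title_case("HARTFORD HOSPITAL"))      # Hartford Hospital
print(sanitize("Hartford  Hospital (CT)"))  # Hartford_Hospital_CT_
```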
Regex Contextual:
- `[\\1]` - first capture group of the conversion:regex on a conversion:object_search. (Note: it is actually `[\1]`, but backslashes need to be escaped in Turtle.)
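A minimal Python sketch of the capture-group substitution follows. The function name is hypothetical, and two behaviors are assumptions rather than documented facts: that the regex is searched anywhere in the cell value, and that a non-matching cell yields nothing. Python string literals need the same backslash escaping as Turtle, so the template is written `"[\\1]"` in source and holds `[\1]` at run time.

```python
import re


def fill_capture_group(template, pattern, cell_value):
    """Replace [\\1] in template with the first capture group of pattern."""
    match = re.search(pattern, cell_value)  # assumption: search, not full match
    if match is None:
        return None  # assumption: nothing is produced when the regex does not match
    first_group = match.group(1) or ""
    # "[\\1]" is the four characters [ \ 1 ] once the string escape is processed.
    return template.replace("[\\1]", first_group)


print(fill_capture_group("[/sd]zip/[\\1]", r"(\d{5})", "Hartford, CT 06102"))
# [/sd]zip/06102
```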
Partially implemented:
- `[H]` - the original header
- `[L]` - the conversion:label of this property
Contextual (not implemented yet):
- `[D]` - domain of this property
- `[R]` - range of this property
Experimental:
- `[#H+1]` - the value of the cell one below the header of the current column (i.e., `[c]`)
- `[#H+2]` - the value of the cell two below the header of the current column (i.e., `[c]`)
(informative)
- `[/sdv]thing_[r]` - the URI for the row
- `[/sd]value-of/[@]/[.]` - the URI for a predicate-scoped promoted cell
- `[/sd]typed/[R]/[.]` - the URI for a type-promoted cell
Datasets in http://logd.tw.rpi.edu/sparql that use templates (results):
prefix conversion: <http://purl.org/twc/vocab/conversion/>
prefix ov: <http://open.vocab.org/terms/>
select ?p ?o (count(?o) as ?count)
where {
  graph <http://purl.org/twc/vocab/conversion/ConversionProcess> {
    ?s ov:csvCol ?col; ?p ?o .
    filter (?p != (conversion:label)) # templates to name predicate are not recognized.
    filter (?p != (conversion:comment)) # templates in predicate comments are not recognized.
    filter (?p != (conversion:delimits_object)) # delimits_object specifies a pattern, not template.
    filter (?p != (conversion:key_template)) # key_template is DEPRECATED; replaced by domain_template.
    filter regex(str(?p), "^http://purl.org/twc/vocab/conversion/.*")
    filter regex(?o, ".*\\[.*\\]") # NOTE: This string is not correctly rendered.
  }
} group by ?p ?o order by ?p ?o desc(?count)
- conversion:domain_template (will become conversion:subject_template)
- conversion:range_template (will become conversion:object_template)
See also Patterns versus Templates.
The comments on `edu.rpi.tw.data.csv.impl.CSVRecordTemplateFiller` list the template variables and their behavior.
The following methods implement the template filling:
- `edu.rpi.tw.data.csv.impl.DefaultEnrichmentParameters#fillTemplate` fills the namespace-type variables `[/]` etc.
- `edu.rpi.tw.data.csv.impl.CSVRecordTemplateFiller#fillTemplate(String)` fills the row-contextual variables `[@p]`, `[#3]`, `[r]`, `[c]`, etc.
- `edu.rpi.tw.data.csv.valuehandlers.LiteralCodebookValueHandler` fills the regex capture groups on its own.
https://github.com/timrdf/csv2rdf4lod-automation/issues/issue/13