conversion:Repeat_previous_if_empty_column
timrdf edited this page Jul 6, 2012
·
21 revisions
csv2rdf4lod-automation is licensed under the [Apache License, Version 2.0](https://github.com/timrdf/csv2rdf4lod-automation/wiki/License)
Structural conversion:Enhancements:
- conversion:charset - to specify the character encoding of the input file.
- conversion:HeaderRow - to specify the row that contains header data (or [dimensional values](Converting with cell based subjects)).
- conversion:DataStartRow - to specify the first (inclusive) row that contains data.
- conversion:delimits_cell - to specify the character that terminates a cell.
- conversion:Only_if_column - to omit processing a row if a certain column's value is missing.
- conversion:Repeat_previous_if_empty_column - to "downfill" an empty cell with the value from above.
- conversion:repeat_previous - to specify a value that indicates repetition (instead of just an empty value).
- conversion:Omitted - to specify a column to omit.
- conversion:DataEndRow - to specify the last (inclusive) row that contains data.
CSVs that people author by hand in Excel often contain abbreviations. One of them is to leave a cell empty when its value repeats the one above. These implicit values can be filled in with the RepeatPreviousIfEmptyEnhancement: if a row passes the onlyIfCol test (see conversion:Only_if_column) but has no value in the column, the value from the previous row is used.
For example, the enhancement parameters for Dataset 1623:
@prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#> .
@prefix xsd: <http://www.w3.org/2001/XMLSchema#> .
@prefix void: <http://rdfs.org/ns/void#> .
@prefix ov: <http://open.vocab.org/terms/> .
@prefix conversion: <http://purl.org/twc/vocab/conversion/> .
@prefix : <http://logd.tw.rpi.edu/source/data-gov/dataset/1623/version/2009-May-18/params/enhancement/1/> .

:dataset a void:Dataset;
   conversion:base_uri "http://logd.tw.rpi.edu"^^xsd:anyURI;
   conversion:source_identifier "data-gov";
   conversion:dataset_identifier "1623";
   conversion:dataset_version "2009-May-18";
   conversion:conversion_process [
      a conversion:RawConversionProcess;
      conversion:enhancement_identifier "1";
      conversion:enhance [
         ov:csvRow 7;
         a conversion:HeaderRow;
      ];
      conversion:enhance [
         ov:csvCol 1;
         conversion:label "Region";
         conversion:range rdfs:Resource;
         a conversion:Repeat_previous_if_empty_column;   # downfill empty cells in this column
         a conversion:TypedResourcePromotion;
         conversion:range_name "Region";
      ];
   ];
.
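The downfill behaviour described above can be sketched in plain Python. This is only an illustration of the idea, not the converter's actual implementation: the function name `downfill` and its parameters `fill_cols`, `only_if_col`, and `repeat_marker` are hypothetical, the sample rows are made up, and the sketch assumes that a row skipped by the only-if test does not update the remembered value. Column indices here are ordinary zero-based Python list indices.

```python
import csv
import io


def downfill(rows, fill_cols, only_if_col=None, repeat_marker=""):
    """Yield rows whose repeat-marked cells in fill_cols are replaced by the
    value from the previous kept row (hypothetical helper, not csv2rdf4lod code)."""
    previous = {}  # column index -> last explicit value seen
    for row in rows:
        row = list(row)
        # Assumption: rows with an empty only_if_col cell are dropped entirely
        # and do not affect the remembered values.
        if only_if_col is not None and row[only_if_col].strip() == "":
            continue
        for col in fill_cols:
            if row[col].strip() == repeat_marker and col in previous:
                row[col] = previous[col]   # reuse the value from above
            else:
                previous[col] = row[col]   # remember the explicit value
        yield row


# Made-up sample data: the Region cell is left empty when it repeats.
sample = (
    "Region,State,Count\n"
    "Northeast,Maine,3\n"
    ",Vermont,5\n"
    "South,Texas,9\n"
    ",,\n"
    ",Florida,4\n"
)

reader = csv.reader(io.StringIO(sample))
header = next(reader)
filled = list(downfill(reader, fill_cols=[0], only_if_col=1))

for row in filled:
    print(row)
```

Passing a non-empty `repeat_marker` (for example a ditto string) mimics the case where a specific value, rather than an empty cell, signals repetition, which is what conversion:repeat_previous is described as configuring.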
Other datasets that benefit from this structural parameter include Dataset 311 and Dataset 10030. The following query lists every dataset that uses it:
PREFIX void: <http://rdfs.org/ns/void#>
PREFIX conversion: <http://purl.org/twc/vocab/conversion/>
SELECT DISTINCT ?dataset
WHERE {
   GRAPH <http://logd.tw.rpi.edu/vocab/Dataset> {
      ?dataset a void:Dataset;
         conversion:conversion_process [
            conversion:enhance [
               a conversion:Repeat_previous_if_empty_column;
            ];
         ]
      .
   }
}