Option for fatal error when links_via can't locate a match (for validation) #369

ewpatton · 2013-09-04T13:19:10Z

As we collect bloodwork data and convert it to RDF, we aggregate labels, because each lab facility uses different ones. For example, one facility provides "neu#", another provides "Neu # (ANC)", and yet another provides "ne #r". The mappings are obvious to a physician but not to a machine. It would be a nice feature to have an option that makes CSV2RDF4LOD stop the conversion when links_via fails to find a match, because we want coverage of the underlying data to be as complete as possible. Currently, we work around this by scanning the output for lines where the column's property appears with only a single value, i.e. lines without a comma:

$ find * -name '*.e1.ttl' -exec grep -H ofCharacteristic {} \; | grep -v ,
2013-08-20/automatic/cbc_ruby.csv.e1.ttl:   health:ofCharacteristic value_of_characteristic:Neu_ANC ;
2013-08-27/automatic/250_comprehensive_panel.csv.e1.ttl:    health:ofCharacteristic value_of_characteristic:Bilirubin_Total ;
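
For anyone who wants to script this check, here is a minimal Python sketch of the same scan. It is not part of CSV2RDF4LOD; the function name `find_unmatched` and its parameters are made up for illustration, and it relies on the same heuristic as the grep pipeline above (a line mentioning the property with no comma is treated as unmatched).

```python
from pathlib import Path


def find_unmatched(root, prop="ofCharacteristic", suffix=".e1.ttl"):
    """Return (path, line) pairs where `prop` appears on a line without a comma.

    Hypothetical helper: mirrors `grep -H ofCharacteristic | grep -v ,`
    over every file under `root` whose name ends in `suffix`.
    """
    hits = []
    for path in sorted(Path(root).rglob("*" + suffix)):
        with path.open(encoding="utf-8") as handle:
            for line in handle:
                # Heuristic from the workaround: a matched label shows up as a
                # second, comma-separated value on the same line.
                if prop in line and "," not in line:
                    hits.append((str(path), line.strip()))
    return hits
```

Calling `find_unmatched(".")` from the dataset's version directory and exiting non-zero when the list is non-empty would turn the scan into a post-conversion validation step.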

Once the failures have been identified, we add the missing labels to the ontology, wipe the version, and reconvert. On large datasets, however, this rinse-and-repeat procedure would be cumbersome: the conversion might take significant time, and we'd like to know about failures early in the process.
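
To illustrate the fail-fast behavior being requested, here is a rough Python sketch of a pre-conversion check. It is only an approximation under stated assumptions: `check_labels` and `UnmatchedLabelError` are hypothetical names, the known labels are assumed to have been extracted from the ontology already, and matching is assumed to be exact after trimming whitespace, which may differ from how links_via actually matches.

```python
import csv


class UnmatchedLabelError(Exception):
    """Raised on the first cell value with no known label (hypothetical name)."""


def check_labels(csv_path, column, known_labels):
    """Stop at the first value in `column` that is not in `known_labels`.

    Assumption: exact string match after stripping surrounding whitespace.
    """
    known = {label.strip() for label in known_labels}
    with open(csv_path, newline="", encoding="utf-8") as handle:
        # Row numbers assume a one-line header and no multi-line cells.
        for row_number, row in enumerate(csv.DictReader(handle), start=2):
            value = (row.get(column) or "").strip()
            if value and value not in known:
                raise UnmatchedLabelError(
                    f"{csv_path}:{row_number}: no match for {value!r} "
                    f"in column {column!r}"
                )
```

Running something like this before the conversion would surface a missing label such as "ne #r" immediately, rather than after a long conversion run.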
