rdfjs/rdfxml-streaming-parser.js

RDF/XML Streaming Parser

A fast, streaming RDF/XML parser that outputs RDFJS-compliant quads.

Installation

$ yarn add rdfxml-streaming-parser

This package also works out-of-the-box in browsers via tools such as webpack and browserify.

Require

import {RdfXmlParser} from "rdfxml-streaming-parser";

or

const RdfXmlParser = require("rdfxml-streaming-parser").RdfXmlParser;

Usage

RdfXmlParser is a Node Transform stream that takes in chunks of RDF/XML data, and outputs RDFJS-compliant quads.

You can pipe a stream into the parser, or write strings into it directly.

Print all parsed triples from a file to the console

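const fs = require('fs');

const myParser = new RdfXmlParser();

// Stream the file through the parser and log each quad as it is emitted.
fs.createReadStream('myfile.rdf')
  .pipe(myParser)
  .on('data', console.log)
  .on('error', console.error)
  .on('end', () => console.log('All triples were parsed!'));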
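When the quads are needed as an array rather than printed one by one, the stream events can be wrapped in a promise. The following is a minimal sketch that relies only on the parser being a standard Node stream, as described above. `collectQuads` is a hypothetical helper name and is not part of this package.

```javascript
// Hypothetical helper (not part of rdfxml-streaming-parser): gathers every
// object emitted by an object-mode stream into an array.
function collectQuads(quadStream) {
  return new Promise((resolve, reject) => {
    const quads = [];
    quadStream
      .on('data', (quad) => quads.push(quad))
      .on('error', reject)                 // reject on the first stream error
      .on('end', () => resolve(quads));    // resolve once the stream is exhausted
  });
}

// Usage sketch, assuming `fs` and `RdfXmlParser` are loaded as shown earlier:
// collectQuads(fs.createReadStream('myfile.rdf').pipe(new RdfXmlParser()))
//   .then((quads) => console.log(`Parsed ${quads.length} quads`))
//   .catch(console.error);
```

Note that this buffers all quads in memory, so for large files the event-based form above is likely the better fit.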

Manually write strings to the parser

const myParser = new RdfXmlParser();

myParser
  .on('data', console.log)
  .on('error', console.error)
  .on('end', () => console.log('All triples were parsed!'));

myParser.write('<?xml version="1.0"?>');
myParser.write(`<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
         xmlns:ex="http://example.org/stuff/1.0/"
         xml:base="http://example.org/triples/">`);
myParser.write(`<rdf:Description rdf:about="http://www.w3.org/TR/rdf-syntax-grammar">`);
myParser.write(`<ex:prop />`);
myParser.write(`</rdf:Description>`);
myParser.write(`</rdf:RDF>`);
myParser.end();
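Because the parser consumes its input as a stream of chunks, the writes should not need to line up with XML element boundaries. The sketch below feeds a string to any writable stream in fixed-size pieces. `writeInChunks` is a hypothetical helper introduced for illustration, and it ignores backpressure for brevity.

```javascript
// Hypothetical helper (not part of rdfxml-streaming-parser): writes `text`
// to `writable` in pieces of `chunkSize` characters, then ends the stream.
function writeInChunks(writable, text, chunkSize) {
  if (!Number.isInteger(chunkSize) || chunkSize < 1) {
    throw new RangeError('chunkSize must be a positive integer');
  }
  for (let i = 0; i < text.length; i += chunkSize) {
    // The return value of write() is ignored here, so no backpressure handling.
    writable.write(text.slice(i, i + chunkSize));
  }
  writable.end();
}

// Usage sketch, assuming `myParser` is set up with listeners as shown above
// and `rdfXmlString` holds a complete RDF/XML document:
// writeInChunks(myParser, rdfXmlString, 64);
```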

Import streams

This parser implements the RDFJS Sink interface, so streams can alternatively be parsed with the import method.

const fs = require('fs');

const myParser = new RdfXmlParser();

const myTextStream = fs.createReadStream('myfile.rdf');

// import() returns the stream of parsed quads.
myParser.import(myTextStream)
  .on('data', console.log)
  .on('error', console.error)
  .on('end', () => console.log('All triples were parsed!'));
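The text stream passed to import does not have to come from a file. As a sketch, RDF/XML held in memory can be wrapped with Node's built-in `Readable.from`. `textToStream` is a hypothetical helper name, and passing its result to import assumes the parser accepts any stream that emits text chunks.

```javascript
const { Readable } = require('stream');

// Hypothetical helper (not part of rdfxml-streaming-parser): turns one or
// more in-memory strings into a readable stream that emits them in order.
function textToStream(...chunks) {
  // Passing an array makes each string one chunk of the resulting stream.
  return Readable.from(chunks);
}

// Usage sketch, assuming `RdfXmlParser` is loaded as shown earlier and
// `rdfXmlString` holds a complete RDF/XML document:
// new RdfXmlParser().import(textToStream(rdfXmlString))
//   .on('data', console.log)
//   .on('error', console.error);
```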

Configuration

Optionally, the following parameters can be set in the RdfXmlParser constructor:

  • dataFactory: A custom RDFJS DataFactory to construct terms and triples. (Default: require('@rdfjs/data-model'))
  • baseIRI: An initial default base IRI. (Default: '')
  • defaultGraph: The default graph for constructing quads. (Default: defaultGraph())
  • strict: If the internal SAX parser should parse XML in strict mode, and error if it is invalid. (Default: false)
  • trackPosition: If the internal position (line, column) should be tracked and emitted in error messages. (Default: false)
  • allowDuplicateRdfIds: By default multiple occurrences of the same rdf:ID value are not allowed. By setting this option to true, this uniqueness check can be disabled. (Default: false)
  • validateUri: If the parser should validate each URI. Set this to false to disable validation. (Default: true)
  • iriValidationStrategy: The IRI validation strategy to use, chosen from the IriValidationStrategy enumeration. IRI validation is handled by validate-iri.js. (Default: IriValidationStrategy.Pragmatic)
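
An example that sets several of these options:

const dataFactory = require('@rdfjs/data-model');

new RdfXmlParser({
  dataFactory: dataFactory,
  baseIRI: 'http://example.org/',
  // namedNode comes from the data factory loaded above.
  defaultGraph: dataFactory.namedNode('http://example.org/graph'),
  strict: true,
  trackPosition: true,
  allowDuplicateRdfIds: true,
  validateUri: true,
});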

License

This software is written by Ruben Taelman.

This code is released under the MIT license.