Skip to content

๐Ÿ‹ Executes a query set against a given SPARQL endpoint

License

Notifications You must be signed in to change notification settings

comunica/sparql-benchmark-runner.js

Folders and files

NameName
Last commit message
Last commit date

Latest commit

ย 

History

79 Commits
ย 
ย 
ย 
ย 
ย 
ย 
ย 
ย 
ย 
ย 
ย 
ย 
ย 
ย 
ย 
ย 
ย 
ย 
ย 
ย 
ย 
ย 
ย 
ย 
ย 
ย 
ย 
ย 
ย 
ย 
ย 
ย 
ย 
ย 
ย 
ย 

Repository files navigation

SPARQL Benchmark Runner

Build Coverage NPM Docker

This is a simple tool to run a query set against a given SPARQL endpoint, and measure its execution time.

Concretely, the query set is a directory containing any number of files, where each file contains a number of SPARQL queries seperated by empty lines.

Example directory of a query set:

watdiv-10M/
  C1.txt
  C2.txt
  C3.txt
  F1.txt
  ...

Example contents of C1.txt:

SELECT * WHERE {
  ?v0 <http://schema.org/caption> ?v1 .
  ?v0 <http://schema.org/text> ?v2 .
}

SELECT * WHERE {
  ?v0 <http://schema.org/caption> ?v1 .
  ?v0 <http://schema.org/text> ?v2 .
}

SELECT * WHERE {
  ?v0 <http://schema.org/caption> ?v1 .
  ?v0 <http://schema.org/text> ?v2 .
}

By default, it generates CSV output in a form similar to:

name;id;error;errorDescription;failures;hash;replication;results;resultsMax;resultsMin;time;timeMax;timeMin;times;timestamps;timestampsMax;timestampsMin;timestampsStd;timeStd
C1;0;false;;0;6e0f167d2eb0e61af0673275ee8f935f;5;5;5;5;25.8;33;20;28 33 26 22 20;25.4 25.4 25.4 25.4 25.4;32 32 32 32 32;20 20 20 20 20;4.176122603564219 4.176122603564219 4.176122603564219 4.176122603564219 4.176122603564219;4.578209256903839
C1;1;false;;0;3e279701df97583c2f296ac0c2e5b877;5;5;5;5;38.6;90;20;27 28 28 20 90;38.4 38.4 38.6 38.6 38.6;89 89 90 90 90;20 20 20 20 20;25.476263462289754 25.476263462289754 25.873538606073968 25.873538606073968 25.873538606073968;25.873538606073968
C1;2;false;;0;4783aeaa4ce9950eafd3a623e1a537f6;5;5;5;5;35.8;80;20;28 26 80 20 25;35.8 35.8 35.8 35.8 35.8;80 80 80 80 80;20 20 20 20 20;22.25668438918969 22.25668438918969 22.25668438918969 22.25668438918969 22.25668438918969;22.25668438918969

Installation

npm install sparql-benchmark-runner

Usage

sparql-benchmark-runner can be used from the CLI with the the following options.

Options:
  --version      Show version number                                   [boolean]
  --endpoint     URL of the SPARQL endpoint to send queries to
                                                             [string] [required]
  --queries      Directory of the queries                    [string] [required]
  --replication  Number of replication runs                [number] [default: 5]
  --warmup       Number of warmup runs                     [number] [default: 1]
  --output       Destination for the output CSV file
                                              [string] [default: "./output.csv"]
  --timeout      Timeout value in seconds to use for individual queries [number]
  ----help

An example input is the following.

sparql-benchmark-runner \
  --endpoint http://example.org/sparql \
  --queries ./watdiv-10M/ \
  --output ./output.csv \
  --replication 5 \
  --warmup 1

When used as a JavaScript library, the runner can be configured with different query loaders, result aggregators and result serializers to accommodate special use cases. By default, when no specific result aggregator is provided, the runner uses ResultAggregatorComunica that handles basic aggregation, as well as the httpRequests metadata field from a Comunica SPARQL endpoint, if such metadata is provided.

import {
  SparqlBenchmarkRunner,
  ResultSerializerCsv,
  ResultAggregatorComunica,
  QueryLoaderFile,
} from 'sparql-benchmark-runner';

async function executeQueries(pathToQueries, pathToOutputCsv) {
  const queryLoader = new QueryLoaderFile(pathToQueries);
  const resultSerializer = new ResultSerializerCsv();
  const resultAggregator = new ResultAggregatorComunica();

  const querySets = await queryLoader.loadQueries();

  const runner = new SparqlBenchmarkRunner({
    endpoint: 'https://localhost:8080/sparql',
    querySets,
    replication: 4,
    warmup: 1,
    timeout: 60_000,
    availabilityCheckTimeout: 1_000,
    logger: (message) => console.log(message),
    resultAggregator,
  });

  const results = await runner.run();

  await resultSerializer.serialize(path, results);
}

Docker

This tool is also available as a Docker image:

touch output.csv
docker run \
  --rm \
  --interactive \
  --tty \
  --volume $(pwd)/output.csv:/output.csv \
  --volume $(pwd)/queries:/queries \
  comunica/sparql-benchmark-runner \
  --endpoint https://dbpedia.org/sparql \
  --queries /queries \
  --output /output.csv \
  --replication 5 \
  --warmup 1

License

This code is copyrighted by Ghent University โ€“ imec and released under the MIT license.