Skip to content

Latest commit

 

History

History
531 lines (368 loc) · 18.3 KB

README.md

File metadata and controls

531 lines (368 loc) · 18.3 KB

Wikidata JSKOS

GitHub release API Status License Docker Test npm version standard-readme compliant

Access Wikidata in JSKOS format

This node module provides a web service, a command line client, and a library to access Wikidata in JSKOS format. The data includes Wikidata items as concepts and concept schemes (read) and mappings between Wikidata and other authority files (read and write).

Table of Contents

Background

Wikidata is a large knowledge base with detailed information about all kinds of entities. Mapping its data model to JSKOS data format allows simplified reuse of Wikidata as authority file. This implementation is used in the Cocoda web application but it can also be used independently.

The mapping between Wikidata and JSKOS format includes:

In addition a search service is provided for selecting a Wikidata item with typeahead.

Editing Wikidata mapping statements to other authority files requires authentification via OAuth. The following authority files have been tested succesfully:

  • Basisklassifikation (BK)
  • Regensburg Classification (RVK)
  • Integrated Authority File (GND)
  • Nomisma
  • Iconclass

Other systems (not including DDC) may also work but they have not been converted to JSKOS yet, so they are not provided for browsing in Cocoda.

Install

Docker

The easiest way to run wikidata-jskos is via Docker. Please refer to the Docker documentation.

Node.js

Node.js 18 is required (Node.js 20 recommended).

git clone https://github.com/gbv/wikidata-jskos.git
cd wikidata-jskos
npm ci

Optionally make the command line tool wdjskos available:

npm link

Configuration

You can customize the application settings via a configuration file, e.g. by providing a generic config.json file and/or a more specific config.{env}.json file (where {env} is the environment like development or production). The latter will have precendent over the former, and all missing keys will be defaulted with values from config.default.json.

All configuration options can also be set via environment variables (.env when running via Node.js or using environment or env_file in Docker Compose).

Some notes:

  • To use a custom Wikibase instance, you can set the subkeys of the wikibase property. Both instance and sparqlEnpoint are necessary. By default, Wikidata is used.
  • wikidata-jskos supports saving, editing, and deleting mappings in Wikidata. To enable this, you will need to provide auth.algorithm and auth.key (algorithm and key to decode the JWT, usually coming from login-server), as well as oauth.consumer_key and oauth.consumer_secret (for your registered OAuth consumer).
  • auth.key/AUTH_KEY contain line breaks. In JSON, these can simply be set as \n. When using .env or env_file, the whole key needs to be double-quoted ("-----BEGIN PUBLIC KEY-----\n..."). To set AUTH_KEY directly in docker-compose.yml via environment, please look at the included docker-compose.yml file or refer to this StackOverflow answer.
  • Please provide a baseUrl when used in production. If no baseUrl is provided, http://localhost:${port}/ will be used.
  • List of all available configuration options:
config.json key environment variable default value
title TITLE Wikidata JSKOS Service
wikibase.instance WIKIBASE_INSTANCE https://www.wikidata.org
wikibase.sparqlEndpoint WIKIBASE_SPARQL https://query.wikidata.org/sparql
wikibase.api WIKIBASE_API ${wikibase.instance}/w/api.php
verbosity VERBOSITY false
baseUrl BASE_URL http://localhost:${port}/
port PORT 2013
auth.algorithm AUTH_ALGORITHM HS256
auth.key AUTH_KEY null
oauth.consumer_key OAUTH_KEY null
oauth.consumer_secret OAUTH_SECRET null

Usage

Run Server

For development serve with hot reload and auto reconnect at http://localhost:2013/:

npm run start

Deployment

For deployment there is a config file to use with pm2:

cp ecosystem.example.json ecosystem.config.json
pm2 start ecosystem.config.json

To update concept schemes, regularly run:

npm run update

Web Service

An instance is available at https://coli-conc.gbv.de/services/wikidata/. The service provides selected endpoints of JSKOS API.

Authentication

The following endpoints require an authenticated user:

Authentication works via a JWT (JSON Web Token). The JWT has to be provided as a Bearer token in the authentication header, e.g. Authentication: Bearer <token>. It is integrated with login-server and the JWT is required to have the same format as the one login-server provides. Specifically, the OAuth token and secret for the user need to be provided as follows:

{
  "user": {
    "identities": {
      "wikidata": {
        "oauth": {
          "token": "..",
          "token_secret": "..."
        }
      }
    }
  }
}

There are more properties in the JWT, but those are not used by wikidata-jskos. Note that the JWT needs to be signed with the respective private key for the public key provided in the configuration. Also, the OAuth user token and secret need to come from the same OAuth consumer provided in the config.

GET /status

Returns a JSKOS API status object. See JSKOS Server for details.

GET /concepts

Look up Wikidata items as JSKOS Concepts by their entity URI or QID.

  • URL Params

    uri=[uri] URIs for concepts separated by |.

    language or languages: comma separated list of language codes.

  • Success Response

    JSON array of JSKOS Concepts

Only some Wikidata properties are mapped to JSKOS fields. The result also contains broader links determined by an additional SPARQL request.

Deprected alias at /concept is going to be removed.

GET /concepts/suggest

OpenSearch Suggest endpoint for typeahead search.

Deprected aliases at /concept/suggest and /suggest are going to be removed.

GET /mappings

Look up Wikidata mapping statements as JSKOS Concept Mappings between Wikidata items (query parameter from) and external identifiers (query parameter to). At least one of both parameters must be given.

  • URL Params

    from=[uriOrNotation1|uriOrNotation2|...] specify the source URI or notation (multiple URIs/notations separated by |)

    to=[uriOrNotation1|uriOrNotation2|...] specify the target URI or notation (multiple URIs/notations separated by |)

    fromScheme=[uri|notation] only show mappings from this concept scheme (URI or notation)

    toScheme=[uri|notation] only show mappings to this concept scheme (URI or notation)

    language or languages enables inclusion of entity labels. A comma separated list of language codes is used as preference list.

    mode=[mode] specify the mode for from, to, one of and (default) and or

    direction=forward|backward|both searches mappings from from to to (default), reverse, or in both directions

    limit=[number] maximum number of mappings to return (not fully implemented)

    offset=[number] start number of mappings to return (not fully implemented)

Concept Schemes are identified by BARTOC IDs (e.g. http://bartoc.org/en/node/430`).

  • Success Response

    JSON array of JSKOS Concept Mappings

  • Examples

    ?from=http://www.wikidata.org/entity/Q42

    ?to=http://d-nb.info/gnd/119033364

Mapping relation types (P4390) are respected, if given, see for example mapping from Wikidata to http://d-nb.info/gnd/7527800-5.

GET /mappings/voc

Look up Wikidata items with Wikidata properties for authority control as JSKOS Concept Schemes with used for mappings. These schemes need to have a BARTOC-ID (P2689), and be subject item (P1629) of an external identifier property with statements P1921 (URI template) and P1793 (regular expression).

GET /mappings/:_id

Returns a specific mapping for a Wikidata claim/statement.

  • Success Response

    JSKOS object for mapping.

  • Error Response

    If no claim with _id could be found, it will return a 404 not found error.

  • Sample Call

    curl https://coli-conc.gbv.de/services/wikidata/mappings/Q11351-9968E140-6CA7-448D-BF0C-D8ED5A9F4598
    {
      "uri": "http://localhost:2013/mappings/Q11351-9968E140-6CA7-448D-BF0C-D8ED5A9F4598",
      "identifier": [
        "http://www.wikidata.org/entity/statement/Q11351-9968E140-6CA7-448D-BF0C-D8ED5A9F4598",
        "urn:jskos:mapping:content:2807c55eac85ed8c0c9254ff04b457f89b801ac9",
        "urn:jskos:mapping:members:daafcd8580e6f0304f0b1cee024f65f04da98a3c"
      ],
      "to": {
        "memberSet": [
          {
            "uri": "http://rvk.uni-regensburg.de/nt/VK",
            "notation": [
              "VK"
            ]
          }
        ]
      },
      "type": [
        "http://www.w3.org/2004/02/skos/core#exactMatch"
      ],
      "fromScheme": {
        "uri": "http://bartoc.org/en/node/1940",
        "notation": [
          "WD"
        ]
      },
      "toScheme": {
        "uri": "http://bartoc.org/en/node/533",
        "notation": [
          "RVK"
        ]
      },
      "from": {
        "memberSet": [
          {
            "uri": "http://www.wikidata.org/entity/Q11351",
            "notation": [
              "Q11351"
            ]
          }
        ]
      },
      "@context": "https://gbv.github.io/jskos/context.json"
    }

POST /mappings

Saves a mapping in Wikidata. Requires authentication.

Note that if an existing mapping in Wikidata is found with the exact same members, that mapping will be overwritten by this request.

  • Success Reponse

    JSKOS Mapping object as it was saved in Wikidata.

PUT /mappings/:_id

Overwrites a mapping in Wikidata. Requires authentication.

  • Success Reponse

    JSKOS Mapping object as it was saved in Wikidata.

DELETE /mappings/:_id

Deletes a mapping from Wikidata. Requires authentication.

  • Success Reponse

    Status 204, no content.

Command line tool

The command line client wdjskos provides roughly the same commands as accessible via the web service.

Mapping schemes are cached in the subfolder ./cache. To update the cache include option --force or run command update.

wdjskos concept

Look up Wikidata items as JSKOS Concepts.

wdjskos concept Q42

wdjskos mappings

Look up JSKOS Concept Mappings.

wdjskos mappings Q42 | jq .to.memberSet[].uri
wdjskos mappings - http://viaf.org/viaf/113230702

A single hyphen (-) can be used to nullify argument from or to, respectively. Mappings can be limited to a target scheme. These are equivalent:

wdjskos --scheme P227 mappings Q42
wdjskos --scheme 430 mappings Q42
wdjskos --scheme http://bartoc.org/en/node/430 mappings Q42

wdjskos schemes

Return up JSKOS Concept Schemes with Wikidata properties for authority control.

wdjskos update

Look up concept schemes from Wikidata and update the cache.

wdjskos find

Search a Wikidata item by its names and return OpenSearch Suggestions response.

wdjskos mapping-item

Convert a JSKOS mapping to a Wikidata item.

wdjskos mapping-item mapping.json
wdjskos --simplfiy mapping-item mapping.json

API

The node library can be used to convert Wikidata JSON format to JSKOS (mapEntity) and to convert JSKOS mappings to Wikidata JSON format (mapMapping).

mapEntity

jskos = wds.mapEntity(entity)

Entity data can be retrieved via Wikidata API method wbgetentities and from Wikidata database dumps. See JavaScript libraries wikidata-sdk and wikidata-filter for easy access and processing.

Map selected parts of a Wikidata entity

All methods return a JSKOS item.

jskos = wds.mapIdentifier(entity.id)
// { uri: "http://www.wikidata.org/entity/...", notation: [ "..." ] }

jskos = wds.mapLabels(entity.labels)
// { prefLabel: { ... } }

jskos = wds.mapAliases(entity.aliases)
// { altLabel: { ... } }

jskos = wds.mapDescriptions(entity.descriptions)
// { scopeNote: { ... } }

jskos = wds.mapSitelinks(entity.sitelinks)
// { occurrences: [ { ... } ], subjectOf: [ { url: ... }, ... ] }

jskos = wds.mapClaims(entity.claims)
// ...

// convert claims with mapping properties
jskos = wds.mapMappingClaims(claims)

jskos = wds.mapInfo(entity)
// ...

Map simplified Wikidata entities

Each method has a counterpart to map simplified Wikidata entities.

jskos = wds.mapSimpleEntity(entity)
jskos = wds.mapSimpleIdentifier(entity.id)
jskos = wds.mapSimpleLabels(entity.labels)
...

mapMapping

Convert a JSKOS mapping into a Wikidata claim. Only respects JSKOS fields from, to, uri, and type (if given) and only supports 1-to-1 mappings from a single Wikidata item to a concept in another concept scheme.

this is work in progress!

Maintainers

Publish

For maintainers only

Please work on the dev branch during development (or better yet, develop in a feature branch and merge into dev when ready).

When a new release is ready (i.e. the features are finished, merged into dev, and all tests succeed), run the included release script (replace "patch" with "minor" or "major" if necessary):

npm run release:patch

This will:

  • Check that we are on dev
  • Run tests and build to make sure everything works
  • Make sure dev is up-to-date
  • Run npm version patch (or "minor"/"major")
  • Ask you to confirm the version
  • Push changes to dev
  • Switch to main
  • Merge changes from dev
  • Push main with tags
  • Switch back to dev

After running this, GitHub Actions will automatically create a new GitHub Release draft. Please edit and publish the release manually.

Contributing

PRs accepted against the dev branch.

Small note: If editing the README, please conform to the standard-readme specification.

License

MIT © 2024 Verbundzentrale des GBV (VZG)