
Working with IIIF

Tanmay Singal edited this page Aug 18, 2021 · 8 revisions

Working with IIIF resources

Cineast can extract features from IIIF resources hosted on compatible media servers. During the feature extraction process, Cineast automatically downloads all the resources specified in the extraction config file to a folder on the local filesystem and performs extraction on the downloaded files.

IIIF Extraction Configuration

To extract features from IIIF resources for use with Cineast, an extraction job config JSON file must be used. This file must list the IIIF-specific job parameters along with the general parameters required by all types of Cineast extraction jobs (Extractors, Exporters, Database etc.). For more information on configuring an extraction job in general, please refer to Extraction Configuration. A shortened version of an IIIF-specific example configuration file is used below to explain what the different parameters mean. The complete config file can be found here.

{
  "input": {
    "path": "path/to/download/directory",
    ...
    "iiif": {
      "imageApiUrl": "...",
      "imageApiVersion": "2.1.1",
      "keepImagesPostExtraction": true,
      "items": [...],
      "manifestUrl": "...",
      "orderedCollectionUrl": "..."
    }
  },
  "metadata": [
    {
      "name": "IIIFMetaDataExtractor"
    }
  ],
  ...
}

"input" > "path"

The path parameter controls where the IIIF resources are saved when they are downloaded before extraction begins. This value is only used if keepImagesPostExtraction is set to true. If no path has been specified, Cineast stores the resources in a directory named iiif-media-${System.currentTimeMillis()}.
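The directory selection described above can be sketched as follows. This is a minimal illustration, not Cineast's actual (Java) code: the helper name resolve_download_dir is hypothetical, and it assumes that the timestamped default directory is also used whenever the configured path is ignored.

```python
import time
from typing import Optional


def resolve_download_dir(path: Optional[str], keep_images: bool) -> str:
    """Pick the directory for downloaded IIIF resources (hypothetical helper)."""
    # The configured path only matters when images are kept after extraction.
    if keep_images and path:
        return path
    # Assumption: in every other case the documented default name is used,
    # i.e. iiif-media-<current time in epoch milliseconds>.
    return f"iiif-media-{int(time.time() * 1000)}"
```

For example, resolve_download_dir("path/to/download/directory", True) returns the configured path unchanged, while resolve_download_dir(None, True) returns a name that starts with iiif-media-.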

"iiif" > "keepImagesPostExtraction"

By default, Cineast will delete the IIIF resources post extraction. If this parameter's value is set to true then instead of deleting these resources, Cineast will save them to the directory specified at "input" > "path".

"iiif" > "region" | "size" | "rotation" | "quality" | "format"

See Configuring an Image API Extraction job

"iiif" > "manifestUrl"

See Configuring a Presentation API Extraction job

"iiif" > "orderedCollectionUrl"

See Configuring a Change Discovery API Extraction job

"metadata"

The metadata array is used to configure the MetaDataExtractors that read the metadata files corresponding to every media resource downloaded during an IIIF extraction job. This array should contain the following object:

{
  "name": "IIIFMetaDataExtractor"
}

IIIFMetaDataExtractor reads the .iiif metadata files that Cineast generates while downloading your IIIF resources. This metadata is stored in the database you configured and allows vitrivr-ng to fetch IIIF resources directly from their parent IIIF servers without using the local copy that Cineast may or may not have.

Image API

Cineast can download individual image resources in any region, size, rotation, format and quality, provided that these parameters are supported by the server.

Image API versions supported by Cineast:

  • 2.1.1
  • 3.0

Configuring an Image API Extraction job

An Image API job can be configured in the iiif object block of the configuration file. Here is an extract from the example configuration file.

"iiif": {
  "imageApiUrl": "https://libimages.princeton.edu/loris/pudl0001/5138415",
  "imageApiVersion": "2.1.1",
  "keepImagesPostExtraction": true,
  "region": "full",
  "size": "full",
  "rotation": 10,
  "quality": "default",
  "format": "jpg",
  "items": [
    {
      "identifier": "00000010.jp2",
      "region": "square"
    },
    {
      "identifier": "00000011.jp2",
      "rotation": 180,
      "quality": "bitonal",
      "format": "png"
    }
  ],
  "manifestUrl": "https://dms-data.stanford.edu/data/manifests/Parker/bg021sq9590/manifest.json",
  "orderedCollectionUrl": "https://haab-digital.klassik-stiftung.de/viewer/api/v1/records/changes/"
}

The Image API URLs generated by Cineast based on this config are:
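As a rough sketch, the URLs can be derived by hand from the standard IIIF Image API URI template, {baseUrl}/{identifier}/{region}/{size}/{rotation}/{quality}.{format}. The function build_image_urls below is hypothetical and only assumes that Cineast follows this template and that per-item values override the block-level ones; it is not taken from the Cineast source.

```python
from typing import Any, Dict, List

# Parameters that an entry in "items" may override, per the descriptions below.
OVERRIDABLE = ("region", "size", "rotation", "quality", "format")

# Documented defaults; "full" is the size default for Image API versions below 3.0.
DEFAULTS = {"region": "full", "size": "full", "rotation": 0,
            "quality": "default", "format": "jpg"}


def build_image_urls(iiif: Dict[str, Any]) -> List[str]:
    """Build one IIIF Image API URL per entry in iiif["items"]."""
    base = iiif["imageApiUrl"].rstrip("/")
    # Block-level values, falling back to the documented defaults.
    block = {key: iiif.get(key, DEFAULTS[key]) for key in OVERRIDABLE}
    urls = []
    for item in iiif.get("items", []):
        # Per-item values take precedence over block-level values.
        p = {**block, **{key: item[key] for key in OVERRIDABLE if key in item}}
        rotation = format(p["rotation"], "g")  # 10 -> "10", 22.5 -> "22.5"
        urls.append(
            f"{base}/{item['identifier']}/{p['region']}/{p['size']}/"
            f"{rotation}/{p['quality']}.{p['format']}"
        )
    return urls
```

Applied to the iiif block above, this sketch yields https://libimages.princeton.edu/loris/pudl0001/5138415/00000010.jp2/square/full/10/default.jpg and https://libimages.princeton.edu/loris/pudl0001/5138415/00000011.jp2/full/full/180/bitonal.png.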

"iiif" > "imageApiUrl"

Optional parameter used to set the path of the Image API resource. If a single Image API resource is required, this can contain the complete path pointing to the resource (baseUrl and identifier). If multiple resources have to be downloaded from the same server, this should contain the common baseUrl shared by all the resources.

"iiif" > "imageApiVersion"

Optional parameter used to specify the version of the Image API supported by the server. If imageApiVersion is not specified, Cineast will automatically try to determine the highest version of the API supported by the server. It is therefore fine to omit this parameter if the version is not known or if the resources are available at varying API levels.
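How Cineast performs this detection is not described here. One plausible approach, shown purely as an illustration, is to inspect the @context value of a resource's info.json document, which identifies the major Image API version. The function name and the mapping from major version to the version strings listed above are assumptions of this sketch.

```python
from typing import Any, Dict, Optional

# IIIF Image API context URIs, mapped to the version strings this page lists
# as supported. The mapping to "2.1.1" is an assumption: the same context URI
# is shared by all 2.x releases.
CONTEXT_TO_VERSION = {
    "http://iiif.io/api/image/2/context.json": "2.1.1",
    "http://iiif.io/api/image/3/context.json": "3.0",
}


def detect_image_api_version(info: Dict[str, Any]) -> Optional[str]:
    """Guess the Image API version from an already parsed info.json document."""
    context = info.get("@context")
    # "@context" may be a single URI or a list of URIs.
    candidates = context if isinstance(context, list) else [context]
    for candidate in candidates:
        if candidate in CONTEXT_TO_VERSION:
            return CONTEXT_TO_VERSION[candidate]
    return None  # Unknown or missing context.
```

The parsed info.json would typically come from a GET request to {baseUrl}/{identifier}/info.json; fetching is left out so the sketch stays independent of any HTTP library.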

"iiif" > "region"

Optional parameter to specify the region of the Image API resources to be downloaded. This value can be overridden on a per-item basis in the iiif > items block. Values accepted as a region parameter are as follows:

| Value | Description |
| --- | --- |
| full (default) | The full image is returned, without any cropping. |
| square | The region is defined as an area where the width and height are both equal to the length of the shorter dimension of the full image. The region may be positioned anywhere in the longer dimension of the full image at the server’s discretion, and centered is often a reasonable default. |

"iiif" > "size"

Optional parameter to specify the size of the Image API resources to be downloaded. This value can be overridden on a per-item basis in the iiif > items block. Values accepted as a size parameter are as follows:

| Value | Description |
| --- | --- |
| full (default for Image API < 3.0) | The image or region is not scaled, and is returned at its full size. |
| max (default for Image API 3.0 and later) | The extracted region is returned at the maximum size available, but will not be upscaled. |

"iiif" > "rotation"

Optional parameter that specifies rotation to be applied to the Image API resources before downloading. The numerical value represents the number of degrees of clockwise rotation, and may be any floating point number from 0 to 360. The default rotation value is 0 degrees. This value can be overridden on a per-item basis in the iiif > items block.

"iiif" > "quality"

Optional parameter to specify the quality of the Image API resources to be downloaded. This value can be overridden on a per-item basis in the iiif > items block. Values accepted as a quality parameter are as follows:

| Value | Description |
| --- | --- |
| color | The image is returned with all of its color information. |
| gray | The image is returned in grayscale, where each pixel is black, white or any shade of gray in between. |
| bitonal | The image returned is bitonal, where each pixel is either black or white. |
| default (default) | The image is returned using the server’s default quality (e.g. color, gray or bitonal) for the image. |

"iiif" > "format"

Optional parameter to specify the format of the Image API resources to be downloaded. This value can be overridden on a per-item basis in the iiif > items block. Values accepted as a format parameter are as follows:

| Format | MIME Type |
| --- | --- |
| jpg (default) | image/jpeg |
| tif | image/tiff |
| png | image/png |
| gif | image/gif |
| jp2 | image/jp2 |
| pdf | application/pdf |
| webp | image/webp |
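Before starting a job, it can be useful to check a set of parameters against the accepted values listed in the tables above. The following is a minimal sketch of such a check. The function validate_image_params is hypothetical and not part of Cineast, and it only knows the values documented on this page.

```python
from typing import Any, Dict, List

# Accepted values, copied from the tables above.
ACCEPTED = {
    "region": {"full", "square"},
    "size": {"full", "max"},
    "quality": {"color", "gray", "bitonal", "default"},
    "format": {"jpg", "tif", "png", "gif", "jp2", "pdf", "webp"},
}


def validate_image_params(params: Dict[str, Any]) -> List[str]:
    """Return a list of human-readable problems; an empty list means valid."""
    problems = []
    for key, allowed in ACCEPTED.items():
        if key in params and params[key] not in allowed:
            problems.append(f"{key}: unsupported value {params[key]!r}")
    if "rotation" in params:
        rotation = params["rotation"]
        # bool is a subclass of int in Python, so reject it explicitly.
        is_number = isinstance(rotation, (int, float)) and not isinstance(rotation, bool)
        if not is_number or not 0 <= rotation <= 360:
            problems.append(f"rotation: expected a number from 0 to 360, got {rotation!r}")
    return problems
```

Parameters that are absent are treated as valid, since every one of them is optional and falls back to its default.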

Presentation API

Cineast can download all image resources in all the canvases of a manifest file. All downloaded images will be saved to a folder named after the manifest inside the path specified in the config file.

Presentation API versions supported by Cineast:

  • 2.1.1

Configuring a Presentation API Extraction job

A Presentation API job can be configured by setting the manifestUrl parameter in the iiif object block of the configuration file. Here is an extract from the example configuration file.

Important: Any region, size, rotation, quality or format parameters specified in the iiif block are not applied to Presentation API items.

"manifestUrl": "https://dms-data.stanford.edu/data/manifests/Parker/bg021sq9590/manifest.json"

Cineast will parse this manifest file and start downloading all images in the various canvases of the sequences object.
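The traversal can be illustrated with a short sketch over an already parsed Presentation API 2.x manifest, where images sit under sequences, then canvases, then images, with each image's resource carrying an @id URL. The function manifest_image_urls is hypothetical and only illustrates the structure; Cineast's own parser may read additional fields.

```python
from typing import Any, Dict, List


def manifest_image_urls(manifest: Dict[str, Any]) -> List[str]:
    """Collect the image resource URLs from a Presentation API 2.x manifest."""
    urls = []
    for sequence in manifest.get("sequences", []):
        for canvas in sequence.get("canvases", []):
            for image in canvas.get("images", []):
                # Each image annotation points at its content via resource["@id"].
                resource = image.get("resource", {})
                if "@id" in resource:
                    urls.append(resource["@id"])
    return urls
```

URLs are returned in document order, so the result follows the order of the canvases in each sequence.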

Change Discovery API

Cineast can process “Create” type changes and download all the images from the manifests specified in that change.

Change Discovery API versions supported by Cineast:

  • 1.0

Configuring a Change Discovery API Extraction job

A Change Discovery API job can be configured by setting the orderedCollectionUrl parameter in the iiif object block of the configuration file. Here is an extract from the example configuration file.

Important: Any region, size, rotation, quality or format parameters specified in the iiif block are not applied to Change Discovery API items.

"orderedCollectionUrl": "https://haab-digital.klassik-stiftung.de/viewer/api/v1/records/changes/"

Cineast will parse each page of the Ordered Collection and will use the Presentation API to download all the images in the manifests referenced there.
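To illustrate the filtering step, the sketch below picks the manifest URLs out of one parsed Ordered Collection page. In the Change Discovery API 1.0 format, a page lists activities under orderedItems, and each activity has a type and an object with an id and a type. The function name is hypothetical, and restricting the result to objects of type Manifest is an assumption of this sketch.

```python
from typing import Any, Dict, List


def created_manifest_urls(page: Dict[str, Any]) -> List[str]:
    """Return the manifest URLs of all "Create" activities on one page."""
    urls = []
    for activity in page.get("orderedItems", []):
        if activity.get("type") != "Create":
            continue  # Only "Create" changes are processed.
        target = activity.get("object", {})
        # Assumption: only objects of type "Manifest" are of interest.
        if target.get("type") == "Manifest" and "id" in target:
            urls.append(target["id"])
    return urls
```

Each returned URL could then be handled like a manifestUrl. Following the page's prev or next links to cover the whole collection is left out of this sketch.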
