Working with IIIF
Cineast supports performing extraction on IIIF resources hosted on compatible media servers. During the feature extraction process, Cineast automatically downloads all the resources specified in the extraction config file to a folder on the local filesystem and performs extraction on them.
To extract features from IIIF resources for use with Cineast, an extraction job config JSON file must be used. This file lists the IIIF-specific job parameters along with the general parameters required by all types of Cineast extraction jobs (Extractors, Exporters, Database, etc.). For more information on configuring an extraction job in general, please refer to Extraction Configuration. We will use a shortened version of an IIIF-specific example configuration file to explain what the different parameters mean. The complete config file can be found here.
{
"input": {
"path": "path/to/download/directory",
...
"iiif": {
"imageApiUrl": "...",
"imageApiVersion": "2.1.1",
"keepImagesPostExtraction": true,
"items": [...],
"manifestUrl": "...",
"orderedCollectionUrl": "..."
}
},
"metadata": [
{
"name": "IIIFMetaDataExtractor"
}
],
...
}
The `path` parameter controls where the IIIF resources are saved when they are downloaded before extraction begins. This value is only used if `keepImagesPostExtraction` is set to `true`. If no `path` has been specified, Cineast stores the resources in a directory named `iiif-media-${System.currentTimeMillis()}`.
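The fallback directory naming described above can be sketched as follows. This is a minimal Python illustration, not Cineast's own (Java) code: the function name `resolve_download_dir` and its arguments are hypothetical, and treating an empty string like a missing `path` is an assumption.

```python
import time
from typing import Optional


def resolve_download_dir(path: Optional[str], now_ms: Optional[int] = None) -> str:
    """Return the configured path, or a timestamped fallback directory name.

    Hypothetical helper: mirrors the documented naming scheme only.
    """
    if path:  # assumption: None and "" both count as "not specified"
        return path
    if now_ms is None:
        # Milliseconds since the epoch, analogous to System.currentTimeMillis()
        now_ms = int(time.time() * 1000)
    return f"iiif-media-{now_ms}"
```

Passing `now_ms` explicitly keeps the helper deterministic, which is convenient when testing.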
By default, Cineast deletes the IIIF resources after extraction. If `keepImagesPostExtraction` is set to `true`, Cineast instead keeps these resources in the directory specified at `"input"` > `"path"`.
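For `imageApiUrl`, `imageApiVersion` and `items`, see Configuring an Image API Extraction Job.
For `manifestUrl`, see Configuring a Presentation API Extraction Job.
For `orderedCollectionUrl`, see Configuring a Change Discovery API Extraction Job.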
The `metadata` array configures the MetaDataExtractors that read the metadata files corresponding to every media resource downloaded during an IIIF extraction job. This array should contain the following object:
{
"name": "IIIFMetaDataExtractor"
}
`IIIFMetaDataExtractor` reads the `.iiif` metadata files that Cineast generates while downloading your IIIF resources. This metadata is stored in the database you configured and allows vitrivr-ng to fetch IIIF resources directly from their parent IIIF servers without using the local copy that Cineast may or may not have.
Cineast can download individual image resources in any region, size, rotation, format and quality, provided that these parameters are supported by the server.
Image API versions supported by Cineast:
- 2.1.1
- 3.0
An Image API job can be configured in the `iiif` object block of the configuration file. Here is an extract from the example configuration file.
"iiif": {
"imageApiUrl": "https://libimages.princeton.edu/loris/pudl0001/5138415",
"imageApiVersion": "2.1.1",
"keepImagesPostExtraction": true,
"region": "full",
"size": "full",
"rotation": 10,
"quality": "default",
"format": "jpg",
"items": [
{
"identifier": "00000010.jp2",
"region": "square"
},
{
"identifier": "00000011.jp2",
"rotation": 180,
"quality": "bitonal",
"format": "png"
}
],
"manifestUrl": "https://dms-data.stanford.edu/data/manifests/Parker/bg021sq9590/manifest.json",
"orderedCollectionUrl": "https://haab-digital.klassik-stiftung.de/viewer/api/v1/records/changes/"
}
The Image API URLs generated by Cineast based on this config are:
- https://libimages.princeton.edu/loris/pudl0001/5138415/00000010.jp2/square/full/10/default.jpg
- https://libimages.princeton.edu/loris/pudl0001/5138415/00000011.jp2/full/full/180/bitonal.png
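The way job-level parameters and per-item overrides combine into these URLs can be sketched in Python. This is an illustration of the merging rule as documented here, not Cineast's actual implementation: `build_image_urls` and `PARAM_DEFAULTS` are hypothetical names, and the fallback defaults assume an Image API 2.x server.

```python
# Documented defaults, assuming Image API 2.x (size defaults to "full").
PARAM_DEFAULTS = {
    "region": "full",
    "size": "full",
    "rotation": 0,
    "quality": "default",
    "format": "jpg",
}


def build_image_urls(iiif: dict) -> list:
    """Build one Image API URL per entry in iiif["items"].

    Precedence (lowest to highest): defaults, job-level values, per-item values.
    """
    base_url = iiif["imageApiUrl"].rstrip("/")
    job_level = {key: iiif[key] for key in PARAM_DEFAULTS if key in iiif}
    urls = []
    for item in iiif.get("items", []):
        p = {**PARAM_DEFAULTS, **job_level, **item}
        urls.append(
            f"{base_url}/{p['identifier']}/{p['region']}/{p['size']}"
            f"/{p['rotation']}/{p['quality']}.{p['format']}"
        )
    return urls


IIIF_CONFIG = {
    "imageApiUrl": "https://libimages.princeton.edu/loris/pudl0001/5138415",
    "region": "full",
    "size": "full",
    "rotation": 10,
    "quality": "default",
    "format": "jpg",
    "items": [
        {"identifier": "00000010.jp2", "region": "square"},
        {"identifier": "00000011.jp2", "rotation": 180,
         "quality": "bitonal", "format": "png"},
    ],
}

for url in build_image_urls(IIIF_CONFIG):
    print(url)
```

The first item inherits the job-level rotation of 10, while the second overrides rotation, quality and format and inherits only region and size.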
`imageApiUrl` is an optional parameter that sets the URL of the Image API resource. If a single Image API resource is required, it can contain the complete URL pointing to the resource (baseUrl and identifier). If multiple resources have to be downloaded from the same server, it should contain the common baseUrl shared by all the resources. The IIIF Image API specification describes the URI syntax in more detail.
`imageApiVersion` is an optional parameter that specifies the version of the Image API supported by the server. If `imageApiVersion` is not specified, Cineast automatically tries to determine the highest version of the API supported by the server. It is therefore fine to omit this parameter if the version is not known or if the resources are available at varying API levels.
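How Cineast performs this detection is not described here. As a rough illustration of one approach a client could take, the sketch below reads the `@context` of an already-fetched `info.json` document and maps the standard IIIF Image API context URIs to a major version. The function name `detect_image_api_major` is hypothetical, and the sketch only distinguishes major versions 2 and 3.

```python
from typing import Optional

# Context URIs defined by the IIIF Image API 2.x and 3.x specifications.
CONTEXT_TO_MAJOR = {
    "http://iiif.io/api/image/2/context.json": 2,
    "http://iiif.io/api/image/3/context.json": 3,
}


def detect_image_api_major(info: dict) -> Optional[int]:
    """Return the highest Image API major version named in info["@context"].

    Returns None if no known Image API context is present.
    """
    context = info.get("@context")
    # "@context" may be a single string or a list of context URIs.
    contexts = [context] if isinstance(context, str) else list(context or [])
    majors = [CONTEXT_TO_MAJOR[c] for c in contexts if c in CONTEXT_TO_MAJOR]
    return max(majors) if majors else None
```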
`region` is an optional parameter that specifies the region of the Image API resources to be downloaded. This value can be overridden on a per-item basis in the `iiif` > `items` block. The following values are accepted for `region`:
Value | Description |
---|---|
`full` (default) | The full image is returned, without any cropping. |
`square` | The region is defined as an area where the width and height are both equal to the length of the shorter dimension of the full image. The region may be positioned anywhere in the longer dimension of the full image at the server’s discretion, and centered is often a reasonable default. |
`size` is an optional parameter that specifies the size of the Image API resources to be downloaded. This value can be overridden on a per-item basis in the `iiif` > `items` block. The following values are accepted for `size`:
Value | Description |
---|---|
`full` (default for Image API < 3.0) | The image or region is not scaled, and is returned at its full size. |
`max` (default for Image API >= 3.0) | The extracted region is returned at the maximum size available, but will not be upscaled. |
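The version-dependent default can be expressed as a small helper. This is a sketch only: `default_size` is a hypothetical name, and it assumes that `max` applies from version 3.0 onward, since 3.0 is the only 3.x version listed as supported.

```python
def default_size(image_api_version: str) -> str:
    """Return the default size keyword for a version string such as "2.1.1"."""
    major = int(image_api_version.split(".")[0])
    # Assumption: "full" for 2.x and earlier, "max" from 3.0 onward.
    return "full" if major < 3 else "max"
```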
`rotation` is an optional parameter that specifies the rotation to be applied to the Image API resources before downloading. The numerical value is the number of degrees of clockwise rotation and may be any floating point number from 0 to 360. The default rotation is `0` degrees. This value can be overridden on a per-item basis in the `iiif` > `items` block.
`quality` is an optional parameter that specifies the quality of the Image API resources to be downloaded. This value can be overridden on a per-item basis in the `iiif` > `items` block. The following values are accepted for `quality`:
Value | Description |
---|---|
`color` | The image is returned with all of its color information. |
`gray` | The image is returned in grayscale, where each pixel is black, white or any shade of gray in between. |
`bitonal` | The image returned is bitonal, where each pixel is either black or white. |
`default` (default) | The image is returned using the server’s default quality (e.g. `color`, `gray` or `bitonal`) for the image. |
`format` is an optional parameter that specifies the format of the Image API resources to be downloaded. This value can be overridden on a per-item basis in the `iiif` > `items` block. The following values are accepted for `format`:
Format | MIME Type |
---|---|
`jpg` (default) | image/jpeg |
`tif` | image/tiff |
`png` | image/png |
`gif` | image/gif |
`jp2` | image/jp2 |
`pdf` | application/pdf |
`webp` | image/webp |
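The accepted values listed for the Image API parameters lend themselves to a simple pre-flight check of a config before a job is started. The sketch below is illustrative only and not part of Cineast: `validate_image_params` and `ACCEPTED` are hypothetical names, and the value sets are copied from the tables in this section.

```python
# Accepted values as listed in this section's tables.
ACCEPTED = {
    "region": {"full", "square"},
    "size": {"full", "max"},
    "quality": {"color", "gray", "bitonal", "default"},
    "format": {"jpg", "tif", "png", "gif", "jp2", "pdf", "webp"},
}


def validate_image_params(params: dict) -> list:
    """Return a list of error messages; an empty list means no problems found."""
    errors = []
    for key, allowed in ACCEPTED.items():
        if key in params and params[key] not in allowed:
            errors.append(f"{key}: unsupported value {params[key]!r}")
    if "rotation" in params:
        rotation = params["rotation"]
        is_number = isinstance(rotation, (int, float)) and not isinstance(rotation, bool)
        if not is_number or not 0 <= rotation <= 360:
            errors.append(f"rotation: expected a number from 0 to 360, got {rotation!r}")
    return errors
```

The same function can be applied to the job-level block and to each entry in `items`, since both accept the same parameter names.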
Cineast can download all image resources in all the canvases of a manifest file. All downloaded images are saved to a folder named after the manifest inside the `path` specified in the config file.
Presentation API versions supported by Cineast:
- 2.1.1
A Presentation API job is configured by setting the `manifestUrl` parameter in the `iiif` object block of the configuration file. Here is an extract from the example configuration file.
Important: Any `region`, `size`, `rotation`, `quality` or `format` parameters specified in the `iiif` block are not applied to Presentation API items.
"manifestUrl": "https://dms-data.stanford.edu/data/manifests/Parker/bg021sq9590/manifest.json"
Cineast will parse this manifest file and start downloading all images in the various `canvases` of the `sequences` object.
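To make the traversal concrete, here is a minimal Python sketch that walks a Presentation API 2.x manifest (sequences, then canvases, then images) and collects the image resource URLs. It is not Cineast's implementation: `image_urls_from_manifest` is a hypothetical name, the sample manifest and its `example.org` URLs are invented for illustration, and reading the URL from `resource["@id"]` is an assumption about which field a client would use.

```python
def image_urls_from_manifest(manifest: dict) -> list:
    """Collect the "@id" of every image resource on every canvas."""
    urls = []
    for sequence in manifest.get("sequences", []):
        for canvas in sequence.get("canvases", []):
            for image in canvas.get("images", []):
                url = image.get("resource", {}).get("@id")
                if url:
                    urls.append(url)
    return urls


# Invented two-page manifest, trimmed to the fields the traversal needs.
SAMPLE_MANIFEST = {
    "@id": "https://example.org/iiif/book1/manifest",
    "@type": "sc:Manifest",
    "sequences": [{
        "@type": "sc:Sequence",
        "canvases": [
            {"@type": "sc:Canvas", "images": [{"@type": "oa:Annotation", "resource": {
                "@id": "https://example.org/iiif/book1-p1/full/full/0/default.jpg",
                "@type": "dctypes:Image"}}]},
            {"@type": "sc:Canvas", "images": [{"@type": "oa:Annotation", "resource": {
                "@id": "https://example.org/iiif/book1-p2/full/full/0/default.jpg",
                "@type": "dctypes:Image"}}]},
        ],
    }],
}

for url in image_urls_from_manifest(SAMPLE_MANIFEST):
    print(url)
```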
Cineast can process “Create” type changes and download all the images from the manifests specified in that change.
Change Discovery API versions supported by Cineast:
- 1.0
A Change Discovery API job is configured by setting the `orderedCollectionUrl` parameter in the `iiif` object block of the configuration file. Here is an extract from the example configuration file.
Important: Any `region`, `size`, `rotation`, `quality` or `format` parameters specified in the `iiif` block are not applied to Change Discovery API items.
"orderedCollectionUrl": "https://haab-digital.klassik-stiftung.de/viewer/api/v1/records/changes/"
Cineast will parse each page of the Ordered Collection and use the Presentation API to download all the images in the referenced manifests.
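The page-by-page processing can be illustrated with a short Python sketch that starts at the collection's `first` page, follows each page's `next` link, and keeps only the manifests referenced by "Create" activities. This is a sketch under stated assumptions, not Cineast's code: `manifests_from_create_activities` is a hypothetical name, `fetch` stands in for whatever HTTP/JSON client is used, and the sample documents with `example.org` URLs are invented.

```python
from typing import Callable


def manifests_from_create_activities(collection_url: str,
                                     fetch: Callable[[str], dict]) -> list:
    """Return manifest URLs from all "Create" activities, in page order."""
    collection = fetch(collection_url)
    page_ref = collection.get("first")
    visited = set()
    manifest_urls = []
    while page_ref and page_ref["id"] not in visited:
        visited.add(page_ref["id"])  # guard against cyclic "next" links
        page = fetch(page_ref["id"])
        for activity in page.get("orderedItems", []):
            target = activity.get("object", {})
            if activity.get("type") == "Create" and target.get("type") == "Manifest":
                manifest_urls.append(target["id"])
        page_ref = page.get("next")
    return manifest_urls


# Invented in-memory documents standing in for server responses.
SAMPLE_DOCS = {
    "https://example.org/changes": {
        "type": "OrderedCollection",
        "first": {"id": "https://example.org/changes/1", "type": "OrderedCollectionPage"},
    },
    "https://example.org/changes/1": {
        "type": "OrderedCollectionPage",
        "orderedItems": [
            {"type": "Create", "object": {"id": "https://example.org/m1", "type": "Manifest"}},
            {"type": "Update", "object": {"id": "https://example.org/m2", "type": "Manifest"}},
        ],
        "next": {"id": "https://example.org/changes/2", "type": "OrderedCollectionPage"},
    },
    "https://example.org/changes/2": {
        "type": "OrderedCollectionPage",
        "orderedItems": [
            {"type": "Create", "object": {"id": "https://example.org/m3", "type": "Manifest"}},
        ],
    },
}

print(manifests_from_create_activities("https://example.org/changes", SAMPLE_DOCS.__getitem__))
```

Each returned manifest URL could then be handled like a `manifestUrl` in a Presentation API job.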