DD3 R7. Where possible, add Schema.org “corresponding element” entries to GEMINI elements #41

Open
PeterParslow opened this issue Apr 7, 2021 · 17 comments
Labels
DD3 Recommendation from the Geospatial Commission Data Discoverability 3 project Elements Issue that primarily affects the GEMINI elements enhancement New feature or request


@PeterParslow
Contributor

We have used W3C’s recommendations for mapping from ISO 19115 to Schema.org. This table summarises the Schema.org equivalence statements given for each element below.
Whilst there is no specific DD2 recommendation concerning DCAT, we believe a DCAT2 “equivalent element” for each GEMINI element would be useful, by supporting those whose web publication of GEMINI records uses DCAT as opposed to Schema.org. Where this is easily available from the same W3C source, we have included this below. You will see that the two vocabularies are very similar, but note that:
• some of the DCAT elements sit in the DCAT “distribution” section, not their “dataset”;
• many DCAT properties have structured content, so this is not a complete list of how to implement it; and
• there are many other DCAT properties that should also be used, beyond those that exist in Schema.org (e.g. conformsTo, creator, spatialResolutionInMeters, format).

| GEMINI element | Condition | Schema.org | DCAT/DCAT2[1] | Notes |
|---|---|---|---|---|
| Title | | name | dct:title | |
| Dataset language | | inLanguage | dct:language | |
| Abstract | | description | dct:description | |
| Topic category | | keywords | dct:subject | |
| Keyword | INSPIRE theme | keywords | dcat:theme / dct:subject | |
| Keyword | free text | keywords | dcat:keyword | Schema.org puts all the ‘free text’ keywords in one value |
| Keyword | Controlled list, URL | keywords.DefinedTerm.name; use .description for the textual content of the Anchor or CodeList; use .url for the target of the Anchor | dcat:keyword.DefinedTerm | |
| Temporal extent | | temporalCoverage[2] | dct:temporal | |
| Dataset reference date | 19115 dateType = publication | datePublished | dct:issued | release date / issued |
| Dataset reference date | 19115 dateType = revised | dateModified | update date / dct:modified | |
| Lineage | | | dct:provenance | |
| Extent | | spatialCoverage.Place.name | dct:spatial | |
| Resource locator.linkage | 19115 function = download | contentUrl (inside “distribution”) | dcat:downloadURL | |
| Resource locator.linkage | 19115 function = “information”, where the page links on to download | | dcat:accessURL | |
| Resource locator.linkage | 19115 function = “information” | url | dcat:landingPage | |
| Data format | | encodingFormat | dct:format; possibly also dcat:mediaType | |
| Responsible organisation | 19115 role = publisher | publisher.Organization (with at least name, email, url) | dct:publisher | |
| Responsible organisation | 19115 role = pointOfContact | contactPoint (probably Organization, with at least name, email, url) | dcat:contactPoint | |
| Use constraints | Use constraints is being used to indicate a licence | license | dct:license | |
| Use constraints | Where GEMINI has an Anchor URL to the licence | license.CreativeWork, with .abstract (the free text) and .url (the Anchor target URL) | | |
| Use constraints | Other circumstances | | dct:accessRights | |
| Bounding box | | spatialCoverage.geo.GeoShape.box | dct:spatial | Note: needs translating from four edges to two corners |
| Resource identifier | | identifier | dct:identifier | |
| Resource type | | | rdf:type | Note: DCAT-AP does not distinguish between datasets and dataset series |
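To make the Schema.org column concrete, here is a minimal Python sketch that turns a simplified GEMINI record into a Schema.org `Dataset` in JSON-LD. The input dictionary layout (`title`, `abstract`, `licence_url` and so on) and the function name are hypothetical, invented purely for illustration; only the Schema.org property names come from the table. It is not a complete or authoritative conversion.

```python
import json


def gemini_to_schema_org(record: dict) -> dict:
    """Map a simplified GEMINI record (hypothetical dict layout) to Schema.org."""
    dataset = {
        "@context": "https://schema.org/",
        "@type": "Dataset",
        "name": record["title"],            # GEMINI Title
        "description": record["abstract"],  # GEMINI Abstract
    }

    # Simple one-to-one mappings; a property is emitted only if a value is present.
    optional = {
        "inLanguage": record.get("language"),
        "identifier": record.get("resource_identifier"),
        "datePublished": record.get("publication_date"),
        "dateModified": record.get("revision_date"),
        "temporalCoverage": record.get("temporal_extent"),
        "license": record.get("licence_url"),
    }
    dataset.update({key: value for key, value in optional.items() if value})

    # Topic categories and free-text keywords all end up in one "keywords" value.
    keywords = record.get("topic_categories", []) + record.get("keywords", [])
    if keywords:
        dataset["keywords"] = ", ".join(keywords)

    # Responsible organisation with role = publisher.
    if record.get("publisher"):
        dataset["publisher"] = {"@type": "Organization", **record["publisher"]}

    # Resource locator with function = download sits inside "distribution".
    if record.get("download_url"):
        distribution = {"@type": "DataDownload", "contentUrl": record["download_url"]}
        if record.get("data_format"):
            distribution["encodingFormat"] = record["data_format"]
        dataset["distribution"] = [distribution]

    return dataset


if __name__ == "__main__":
    example = {
        "title": "Example dataset",
        "abstract": "An illustrative record.",
        "language": "eng",
        "keywords": ["trees", "planning"],
        "publisher": {"name": "Example Org", "email": "data@example.org"},
        "download_url": "https://example.org/data.zip",
        "data_format": "application/zip",
    }
    print(json.dumps(gemini_to_schema_org(example), indent=2))
```

Conditional rows, such as the choice between `url` and `contentUrl` depending on the 19115 function code, would need the function code carried in the input; this sketch assumes that decision has already been made.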
@PeterParslow PeterParslow added enhancement New feature or request Elements Issue that primarily affects the GEMINI elements DD3 Recommendation from the Geospatial Commission Data Discoverability 3 project labels Apr 7, 2021
@PeterParslow
Contributor Author

The W3C mapping, on which this is largely based, is at https://www.w3.org/2015/spatial/wiki/ISO_19115_-_DCAT_-_Schema.org_mapping

@PeterParslow
Contributor Author

Andrea Perego’s ISO 19139 - DCAT mapping in GitHub (James’ link) provides more detail, e.g. the range of each element, and also maps some things outside the DCAT namespace(s).

https://github.com/GeoCat/iso-19139-to-dcat-ap/blob/master/documentation/Mappings.md

(Thanks to James Reid)

@PeterParslow
Contributor Author

PeterParslow commented Aug 30, 2023

Just been contacted by the CDDO data standards team looking to state how to describe "where" in DCAT metadata to be used in the UK government data marketplace. This will include updating the mapping above for DCAT v3.

See co-cddo/ukgov-metadata-exchange-model#1

@nmtoken
Contributor

nmtoken commented Aug 30, 2023

Should we also adapt the GEMINI mapping to DCAT 3 as this now includes better description of dataset series?

@PeterParslow
Contributor Author

That will be a necessary part of the CDDO work; I'll make sure it is available as an update to this GEMINI change request. It's also being discussed (& likely to happen) in the OGC GeoDCAT SWG.

@PeterParslow
Contributor Author

Need to annotate this to show how it aligns (or not!) with the UK Cross-Government Metadata Exchange Model, which may be re-branded as a UK Application Profile of DCAT.

@archaeogeek
Member

archaeogeek commented May 1, 2024

@PeterParslow to update table, then @archaeogeek to update elements with equivalent mappings, also publish this table as guidance

@PeterParslow
Contributor Author

We'll also need to include guidance, or at least comment, on converting GEMINI to DCAT, covering how many DCAT distributions to create (depending on, e.g., GEMINI Use constraints & Resource locators).

Revised table, with extra columns for DCAT v3 & UK government metadata exchange model. Note, the UK Gov work is supposed to consider adding spatial & some other things; they also plan to convert it to a full AP of DCAT v3.

| GEMINI element | Condition | Schema.org | DCAT/DCAT2[1] | Notes | DCAT3 | UK Gov MXM |
|---|---|---|---|---|---|---|
| Title | | name | dct:title | | Y | Y |
| Dataset language | | inLanguage | dct:language | | Y | N |
| Abstract | | description | dct:description | | Y | Y |
| Topic category | | keywords | dct:subject | | Y | N |
| Keyword | INSPIRE theme | keywords | dcat:theme / dct:subject | DCAT3 expects theme to be used when the target is a SKOS concept; subject in the more general case, whether or not the term is from a controlled vocab | Y | dcat:theme |
| Keyword | free text | keywords | dcat:keyword | Schema.org puts all the ‘free text’ keywords in one value; DCAT / MXM keywords are 'uncontrolled' literals | Y | Y |
| Keyword | Controlled list, URL | keywords.DefinedTerm.name; use .description for the textual content of the Anchor or CodeList; use .url for the target of the Anchor | dcat:keyword.DefinedTerm | | dcat:theme? | dcat:theme |
| Temporal extent | | temporalCoverage[2] | dct:temporal | | Y | N proposed |
| Dataset reference date | 19115 dateType = publication | datePublished | dct:issued | release date / issued | Y | Y |
| Dataset reference date | 19115 dateType = revised | dateModified | update date / dct:modified | | Y | Y |
| Lineage | | | dct:provenance | | Uses PROV | N |
| Extent | | spatialCoverage.Place.name | dct:spatial, if available as a link | | Y | N proposed |
| Resource locator.linkage | 19115 function = download | contentUrl (inside “distribution”) | dcat:downloadURL | | Y | Y |
| Resource locator.linkage | 19115 function = “information”, where the page links on to download | | dcat:accessURL of a dcat:Distribution? | | Y | N |
| Resource locator.linkage | 19115 function = “information” | url | dcat:landingPage | | Y | N |
| Data format | | encodingFormat | dct:format of a dcat:Distribution | Possibly also dcat:mediaType | Y | N |
| Responsible organisation | 19115 role = publisher | publisher.Organization (with at least name, email, url) | dct:publisher | | Y | Y |
| Responsible organisation | 19115 role = pointOfContact | contactPoint (probably Organization, with at least name, email, url) | dcat:contactPoint | dcat:contactPoint is a vCard | Y | must contain email & contactName (organisation) |
| Use constraints | Use constraints is being used to indicate a licence | license | dct:license | license is a property of a distribution | Y | Y licence |
| Use constraints | Where GEMINI has an Anchor URL to the licence | license.CreativeWork, with .abstract (the free text) and .url (the Anchor target URL) | | | Y | Y |
| Use constraints | Other circumstances | | dct:accessRights | accessRights is a property of the dataset | Y | Y |
| Bounding box | | spatialCoverage.geo.GeoShape.box | dct:spatial.dct:Location.dcat:bbox | Note: needs translating from four edges to two corners | Y | N |
| Resource identifier | | identifier | dct:identifier | | Y | Y |
| Resource type | | | rdf:type | cataloguedResource is either Dataset or DataService; Note: DCAT-AP does not distinguish between datasets and dataset series; DCATv3 does | The CataloguedResource can be either Dataset, DatasetSeries, or DataService | Y |
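The Bounding box note in the table ("needs translating from four edges to two corners") can be sketched in Python as follows. This assumes the GEMINI bounding box arrives as four decimal-degree numbers and that the Schema.org `box` value is written as two space-separated points, lower corner first, each as latitude then longitude. The WKT helper is one possible way to build a geometry literal for a bounding box on the DCAT side, and it assumes longitude/latitude axis order. Both function names are hypothetical.

```python
def bbox_to_geoshape_box(west: float, east: float, south: float, north: float) -> str:
    """Four edges -> Schema.org GeoShape.box text: 'south west north east'.

    Assumes lower corner first, then upper corner, each as 'lat lon'.
    """
    if south > north:
        raise ValueError("south bounding latitude must not exceed north")
    # west > east is deliberately allowed: it can occur for boxes that
    # cross the antimeridian.
    return f"{south} {west} {north} {east}"


def bbox_to_wkt_polygon(west: float, east: float, south: float, north: float) -> str:
    """Four edges -> closed WKT polygon ring, assuming 'lon lat' axis order."""
    if south > north:
        raise ValueError("south bounding latitude must not exceed north")
    corners = [
        (west, south),
        (east, south),
        (east, north),
        (west, north),
        (west, south),  # repeat the first corner to close the ring
    ]
    ring = ", ".join(f"{lon} {lat}" for lon, lat in corners)
    return f"POLYGON(({ring}))"


if __name__ == "__main__":
    print(bbox_to_geoshape_box(west=-8.65, east=1.77, south=49.86, north=60.86))
    print(bbox_to_wkt_polygon(west=-8.65, east=1.77, south=49.86, north=60.86))
```

Keyword arguments are used on purpose, because the order in which the four edges are listed differs between metadata formats and is easy to get wrong positionally.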

@archaeogeek
Member

@PeterParslow what do I need to do next? I can't remember...

@PeterParslow
Contributor Author

> @PeterParslow what do I need to do next? I can't remember...

See if what I've come up with in a desk exercise matches what you'd expect from the GeoNetwork implementation of DCAT?

@nmtoken
Contributor

nmtoken commented Jun 18, 2024

@archaeogeek do you have a link to where this transformation is mapped in GeoNetwork 4? It is available (in theory) through the OGC API - Records interface, though links aren't working for us.

@archaeogeek
Member

@nmtoken it's not the mapping. We have it working here: https://spatialdata.gov.scot/geonetwork/api/collections/main/items/fa510351-8e30-4147-b984-862be84a6f90. You need to check the log files; I suspect you're missing the relevant xsl files in https://github.com/geonetwork/geonetwork-microservices/tree/main/modules/services/ogc-api-records/src/main/resources/xslt/ogcapir/formats/copy (which is completely undocumented). Basically, you need a gemini one that matches the iso19139 one.

@nmtoken
Contributor

nmtoken commented Jun 19, 2024

Not the headers then (geonetwork/geonetwork-microservices#114) ?

@archaeogeek
Member

@nmtoken the above is all I had to do to get it working, YMMV.

@nmtoken
Contributor

nmtoken commented Jul 2, 2024

@archaeogeek Just checking we are not talking at cross purposes: you seem to be saying that, in your Tree Preservation Orders - Argyll and Bute example, the schema.org, dcat, dcat_turtle, and geojson tabs link to content because you have a gemini XSL file and we don't.

For us (for example https://metadata.bgs.ac.uk/geonetwork/api/collections/main/items/a2b1143b-5c5d-23d6-e054-002128a47908) and the EEA geospatial data catalogue (for example https://sdi.eea.europa.eu/catalogue/api/collections/main/items/71c47f78-27b6-4080-acd5-47b306b273d8) these tabs don't give any content (only errors).

@PeterParslow PeterParslow removed their assignment Jul 8, 2024
@archaeogeek
Member

archaeogeek commented Jul 12, 2024

@nmtoken to be precise, what I'm saying is that the only change we ever needed to make to get a full set of working links in the ogc-api service is to add the gemini xsl record, e.g. adding an iso19139.gemini23 equivalent of the files here: https://github.com/geonetwork/geonetwork-microservices/tree/main/modules/services/ogc-api-records/src/main/resources/xslt/ogcapir/formats/copy, which is identical to the ones already included. This does trigger an error in the ogc-api service logs if you dig deep enough.

@PeterParslow
Contributor Author

I no longer know what I meant by dcat:keyword.DefinedTerm! It's not something that seems to exist in DCAT (v2 or v3). Conveniently, that makes it clearer (to me) that dcat:keyword is for uncontrolled keywords and dcat:theme is for controlled lists (on the assumption they are published as SKOS).
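As a rough illustration of that split, the Python sketch below routes GEMINI keywords to `dcat:theme` when they carry an Anchor URL and to `dcat:keyword` otherwise. The input layout (dictionaries with `label` and an optional `url`) and the function name are hypothetical, and the sketch simply assumes that any keyword with a URL points at a SKOS concept, which would need checking in practice.

```python
def split_keywords(keywords: list[dict]) -> dict[str, list[str]]:
    """Route GEMINI keywords to dcat:theme (controlled) or dcat:keyword (free text).

    Each keyword is a dict with a 'label' and, for controlled terms,
    a 'url' holding the Anchor target (assumed to identify a SKOS concept).
    """
    result: dict[str, list[str]] = {"dcat:keyword": [], "dcat:theme": []}
    for keyword in keywords:
        url = keyword.get("url")
        if url:
            # Controlled term: reference the concept by its URI.
            result["dcat:theme"].append(url)
        else:
            # Uncontrolled term: keep the literal text.
            result["dcat:keyword"].append(keyword["label"])
    return result


if __name__ == "__main__":
    example = [
        {"label": "Land use", "url": "http://inspire.ec.europa.eu/theme/lu"},
        {"label": "tree preservation orders"},
    ]
    print(split_keywords(example))
```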

Projects
Status: In progress

3 participants