Add additional fields to Project Page metadata #83

Open
wgardiner opened this issue Sep 11, 2024 · 7 comments

@wgardiner

wgardiner commented Sep 11, 2024

To enable implementation of third-party marketplaces we need to support additional metadata. Below is an example of the fields I propose adding to enable the implementation of the design here

    "regen:hasComplianceCredits": true,
    "regen:bioregion": ["amazon"], // "amazon", "andes", "caribbean", "orinoco", "pacific"
    "regen:biomeType": [],
    "regen:watershed": ["cauqua"],
    "regen:subWatershed": [],
    "regen:ecosystemType": ["cloud forest"], // "cloud forest", "tropical savannah", "tropical dry forest", "tropical forest"
    "regen:environmentalAuthority": {
        "@type": "schema:Organization",
        "schema:name": "Corantioquia"
    },
    "regen:administrativeArea": { // edit: added
        "@type": "schema:AdministrativeArea",
        "schema:name": "Antioquia"
    },
    "regen:offchainCreditsInfo": {
        "regen:creditsRegistered": {
            "qudt:unit": "unit:HA",
            "qudt:numericValue": 1000
        },
        "regen:creditsAvailable": {
            "qudt:unit": "unit:HA",
            "qudt:numericValue": 900
        },
        "regen:creditsRetired": {
            "qudt:unit": "unit:HA",
            "qudt:numericValue": 100
        }
    }
  • regen:hasComplianceCredits is a boolean that indicates whether a project is part of a compliance marketplace. We can assume that projects with this field unset or false are voluntary. To start, all compliance credits will belong to offchain projects, but in the future we may wish to support on chain credits that are part of a compliance market.
  • regen:bioregion is a list of strings, since a project may belong to multiple bioregions. @clevinson and I discussed using the region from the context field on schema:location (populated from Mapbox Places API geocoding) in the Project Page metadata to look up a bioregion, but this doesn't seem practical.
  • regen:biomeType is a list of strings representing the biome types in the project. We may be able to find or create an enum of allowed values for this field.
  • regen:watershed is a list of strings representing the names of the watersheds included in the project. A project may span multiple watersheds.
  • regen:subWatershed is a list of strings representing the names of the sub-watersheds included in the project.
  • regen:environmentalAuthority is a Schema.org Organization associated with the project.
  • regen:offchainCreditsInfo contains the information for credits to be displayed on the project page without needing to have inventory managed on chain. This is necessary for our initial compliance projects. Each value is stored as a qudt unit and value for flexibility.
  • regen:administrativeArea (edit: added) is a Schema.org AdministrativeArea that indicates the associated geographic region, for example a municipality, state, or county. This could also be represented as a DBpedia AdministrativeRegion.
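
To make the intended reading of these fields concrete, here is a minimal Python sketch of how a client might interpret them. The function names are hypothetical, and the check that registered credits equal available plus retired credits is an assumption: the proposal does not state that invariant, although the example values above happen to satisfy it.

```python
from typing import Any, Dict


def is_compliance_project(metadata: Dict[str, Any]) -> bool:
    """Treat a missing or false regen:hasComplianceCredits as voluntary."""
    return metadata.get("regen:hasComplianceCredits") is True


def credits_summary(metadata: Dict[str, Any]) -> Dict[str, float]:
    """Collect the off-chain credit quantities, keyed by short field name."""
    info = metadata.get("regen:offchainCreditsInfo", {})
    summary: Dict[str, float] = {}
    for key in ("creditsRegistered", "creditsAvailable", "creditsRetired"):
        quantity = info.get(f"regen:{key}")
        if quantity is not None:
            summary[key] = float(quantity["qudt:numericValue"])
    return summary


def credits_are_consistent(summary: Dict[str, float]) -> bool:
    """Assumed invariant (not from the proposal): registered = available + retired."""
    required = ("creditsRegistered", "creditsAvailable", "creditsRetired")
    if not all(key in summary for key in required):
        return False
    return summary["creditsRegistered"] == (
        summary["creditsAvailable"] + summary["creditsRetired"]
    )


# Values copied from the example in this comment.
example = {
    "regen:hasComplianceCredits": True,
    "regen:offchainCreditsInfo": {
        "regen:creditsRegistered": {"qudt:unit": "unit:HA", "qudt:numericValue": 1000},
        "regen:creditsAvailable": {"qudt:unit": "unit:HA", "qudt:numericValue": 900},
        "regen:creditsRetired": {"qudt:unit": "unit:HA", "qudt:numericValue": 100},
    },
}

print(is_compliance_project(example))
print(credits_summary(example))
print(credits_are_consistent(credits_summary(example)))
```

The sketch ignores `qudt:unit` entirely; a real consumer would probably want to confirm that all three quantities share the same unit before comparing them.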

Additional thoughts:

  • I did some brief searching to find existing ontologies for watersheds, bioregions, and ecosystem types. I can dive in deeper if we feel there are advantages to making use of an existing ontology. Currently it's not clear what the benefit would be.
  • My hope is that all these fields are generic enough that they are broadly applicable and don't need to be specific to a given third party marketplace.
  • bioregion, biome, watershed, subwatershed, and ecosystem type feel closely related and could be logically grouped, but we already support ecosystem type at the root of Project metadata, so I think keeping them at the root feels a little more consistent.
@blushi
Member

blushi commented Sep 11, 2024

Looks good.

One question regarding the regen:offchainCreditsInfo naming: do we expect those credits to always be compliance credits, or could they be something else, which would explain why we use the more generic term offchain rather than compliance?

What about the data related to the TEBU factors, as shown on https://www.figma.com/design/brkxGV5qNOkZUp0YQl1cxO/Terrasos-Phase-1?node-id=717-90878&node-type=frame&t=KdlrWtmX6iOUw9fr-0 ? I guess descriptions/labels could be stored in Sanity (like "Ecosystem with declining areas, severe degradation, and disrupted processes facing a very high collapse risk.") but the actual values should probably be in the project metadata as well.

Also, looking at the compliance-related info, I think we're missing "department" (https://www.figma.com/design/brkxGV5qNOkZUp0YQl1cxO/Terrasos-Phase-1?node-id=661-81286&node-type=frame&t=KdlrWtmX6iOUw9fr-0).

Lastly, I know we use schema.org quite extensively in our existing project metadata, but we wanted to move toward DCMI instead (as we did for data posts), so I'm not entirely sure we should keep using it here.

@wgardiner
Author

wgardiner commented Sep 11, 2024

@blushi Yes, I named it regen:offchainCreditsInfo to generalize, because in the future there's no reason we couldn't have on chain compliance credits, and Cory mentioned that Terrasos may want this at some point. This way the field is decoupled from the type of credit and only indicates that inventory is not managed on chain.

Thanks for catching the missing Department field, I updated my original example to include it. I'll follow up with a revised example that includes the TEBU fields, I missed those on my first pass.

Regarding Schema.org versus alternatives I think it depends on the data that we're representing. For original content or data I see value in using ontologies and vocabularies that provide flexibility and facilitate collaboration with others, which seems to be the argument made in the discussion of Data Posts. I don't see a reason to eliminate use of Schema.org in appropriate situations, however.

I'm proposing to use Schema.org ontologies in two places:

  1. regen:environmentalAuthority: here I think Schema.org Organization offers a convenient structure that's well defined and widely used. We could alternatively use FOAF Organization, DBpedia Organization, or W3C Organization. Dublin Core doesn't seem to have a concept of an Organization, but we could also choose to use the more abstract dcterms:Agent.
  2. regen:administrativeArea: here I proposed Schema.org AdministrativeArea, but DBpedia AdministrativeRegion also feels appropriate.

@wgardiner
Author

wgardiner commented Sep 12, 2024

Here's a revised version of the additional Project Page metadata

{
    "regen:hasComplianceCredits": true,
    "regen:bioregion": ["regen:Amazon"], // as terms? "regen:Amazon", "regen:Andes", "regen:Caribbean", "regen:Orinoco", "regen:Pacific"
    "regen:biomeType": [], // define enum of type terms
    "regen:watershed": ["Cauqua"],
    "regen:subWatershed": [],
    "regen:ecosystemType": ["regen:CloudForest"], // "regen:CloudForest", "regen:TropicalSavannah", "regen:TropicalDryForest", "regen:TropicalForest", etc
    "regen:environmentalAuthority": {
        "@type": "schema:Organization",
        "schema:name": "Corantioquia"
    },
    "regen:offchainCreditsInfo": {
        "regen:creditsRegistered": {
            "qudt:unit": "unit:HA",
            "qudt:numericValue": 1000
        },
        "regen:creditsAvailable": {
            "qudt:unit": "unit:HA",
            "qudt:numericValue": 900
        },
        "regen:creditsRetired": {
            "qudt:unit": "unit:HA",
            "qudt:numericValue": 100
        }
    },

    // Additional data for populating the TEBU Factors section

    "regen:projectDuration": {
        "xsd:duration": "P10Y"
    },
    "regen:conservationStatus": "iucn:Endangered", // iucn:Collapsed, iucn:Endangered, iucn:Vulnerable, iucn:NearThreatened, iucn:LeastConcern, iucn:DataDeficient, iucn:NotEvaluated
    "regen:ecologicalConnectivityIndex": {
        "qudt:unit": "unit:Dimensionless",
        "qudt:numericValue": 88
    },
    "regen:socialCulturalIndex": {
        "qudt:unit": "unit:Dimensionless",
        "qudt:numericValue": 0.2
    },
    "regen:managementAreas": [
        {
            "regen:activity": ["regen:Conservation"], // https://daf2e860.regen-data-standards.pages.dev/activity/conservation/
            "dcterms:extent": {
                "qudt:unit": "unit:HA",
                "qudt:numericValue": 50
            }
        },
        {
            "regen:activity": ["regen:EcosystemRestoration"], // https://daf2e860.regen-data-standards.pages.dev/activity/ecosystemrestoration/
            "dcterms:extent": {
                "qudt:unit": "unit:HA",
                "qudt:numericValue": 50
            }
        }
    ]
}

Changes to previous fields:

  • regen:ecosystemType values are now represented as Regen Data Standards terms from the Environment Type Taxonomy (we may need to add new terms). Ideally we'll use enumerated terms for any data that's relevant to the application (like filters), as opposed to data used for display only.

New fields for TEBU Factors section:

  • regen:projectDuration is an xsd:duration field giving the duration of the project in the ISO 8601 duration format. It will be used to populate the Terrasos duration widget.
  • regen:conservationStatus is one of the IUCN Red List Ecosystem Risk Categories. I could not find an official RDF vocabulary of these terms to use so we'll need to define it. Possible values are iucn:Collapsed, iucn:Endangered, iucn:Vulnerable, iucn:NearThreatened, iucn:LeastConcern, iucn:DataDeficient, or iucn:NotEvaluated. This value will be used to populate the "Threat Category of Ecosystem" section.
  • regen:ecologicalConnectivityIndex is a unitless value indicating a project's potential contribution to landscape connectivity. For reference and a definition of ecological connectivity, see page 24 of Terrasos' methodology. I could imagine other projects using a similar metric, though their calculation will likely differ, so defining a custom unit like regen:TerrasosEcologicalConnectivityIndex could improve flexibility. This value will be used to populate the "Ecological Connectivity Level" section.
  • regen:socialCulturalIndex is a unitless indicator of social and cultural impact of the project. This field is not present in the Terrasos Methodology. I could use more information here. I'm curious how other projects could use a similar field.
  • regen:managementAreas is a list of areas that can be logically grouped in the project. These items will be used to populate the "Project Area Actions" section widget if they contain a dcterms:extent with the area's size and a regen:activity, which references a Regen Data Standards Activity. Values for the widget are either regen:Conservation or regen:EcosystemRestoration, which map to Terrasos' preservation and restoration activities, respectively (see page 15 of their methodology). Other future projects might find it useful to append additional information, such as geometry collections, to these list items.
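
As a rough illustration of how the duration widget and the "Project Area Actions" widget could consume these fields, here is a small Python sketch. The helper names `parse_duration` and `area_by_activity` are hypothetical, the duration parser is deliberately simplified (it handles only the year, month, and day parts of an ISO 8601 duration, not weeks or time components), and the area totals assume every extent is expressed in `unit:HA`.

```python
import re
from typing import Any, Dict

# Simplified ISO 8601 duration pattern: date part only (e.g. "P10Y", "P6M", "P1Y2M3D").
_DURATION = re.compile(r"^P(?:(\d+)Y)?(?:(\d+)M)?(?:(\d+)D)?$")


def parse_duration(value: str) -> Dict[str, int]:
    """Parse a date-only ISO 8601 duration into years, months, and days."""
    match = _DURATION.match(value)
    if match is None or all(group is None for group in match.groups()):
        raise ValueError(f"unsupported duration: {value!r}")
    years, months, days = (int(group) if group else 0 for group in match.groups())
    return {"years": years, "months": months, "days": days}


def area_by_activity(metadata: Dict[str, Any]) -> Dict[str, float]:
    """Sum the extent of management areas per activity term.

    Entries lacking either dcterms:extent or regen:activity are skipped,
    mirroring the rule that the widget only uses items with both fields.
    An area listing several activities is counted once per activity.
    """
    totals: Dict[str, float] = {}
    for area in metadata.get("regen:managementAreas", []):
        extent = area.get("dcterms:extent")
        activities = area.get("regen:activity", [])
        if extent is None or not activities:
            continue
        size = float(extent["qudt:numericValue"])  # assumed to be hectares
        for activity in activities:
            totals[activity] = totals.get(activity, 0.0) + size
    return totals


# Values copied from the revised example above.
example = {
    "regen:projectDuration": {"xsd:duration": "P10Y"},
    "regen:managementAreas": [
        {
            "regen:activity": ["regen:Conservation"],
            "dcterms:extent": {"qudt:unit": "unit:HA", "qudt:numericValue": 50},
        },
        {
            "regen:activity": ["regen:EcosystemRestoration"],
            "dcterms:extent": {"qudt:unit": "unit:HA", "qudt:numericValue": 50},
        },
    ],
}

print(parse_duration(example["regen:projectDuration"]["xsd:duration"]))
print(area_by_activity(example))
```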

@blushi
Member

blushi commented Sep 16, 2024

Nice work!

Just a few comments:

I'm not sure it makes sense to have regen:bioregion as a term from the regen schema, although I agree it would be relevant for regen:ecosystemType. The only catch is that our existing metadata uses simple strings for regen:ecosystemType, but that could work as long as both types of values are well documented.
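
One way to live with both kinds of regen:ecosystemType values is to normalize them when reading metadata. The Python sketch below is only an illustration: the `LEGACY_TO_TERM` table and the `normalize_ecosystem_type` helper are hypothetical, and the mapping covers just the four example values mentioned in this thread.

```python
# Hypothetical mapping from legacy plain strings to the proposed terms.
LEGACY_TO_TERM = {
    "cloud forest": "regen:CloudForest",
    "tropical savannah": "regen:TropicalSavannah",
    "tropical dry forest": "regen:TropicalDryForest",
    "tropical forest": "regen:TropicalForest",
}


def normalize_ecosystem_type(value: str) -> str:
    """Return the term form of an ecosystem type when one is known.

    Values already written as regen: terms pass through untouched, and
    unknown plain strings are returned unchanged so no data is lost.
    """
    if value.startswith("regen:"):
        return value
    key = " ".join(value.lower().split())  # case- and whitespace-insensitive
    return LEGACY_TO_TERM.get(key, value)


print(normalize_ecosystem_type("cloud forest"))
print(normalize_ecosystem_type("regen:CloudForest"))
print(normalize_ecosystem_type("mangrove"))
```

With something like this in place, filters could operate on the term form while older metadata keeps its plain strings.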

For durations, we've historically used the schema:Duration type (https://schema.org/Duration). I don't have a preference for one over the other, but I think we should be as consistent as possible, or make sure such differences are well justified and documented.

Does unit:Dimensionless refer to https://qudt.org/vocab/quantitykind/Dimensionless? Now that we have a good start on these additional metadata fields, I think it would make sense to provide the @context as well.

@wgardiner
Author

wgardiner commented Sep 20, 2024

I've written up a LinkML schema yaml and example data for the proposed Terrasos fields and used them to generate this JSON-LD example output with context. Here's my PR

I changed bioregion to be a string, switched duration to schema:Duration, and changed organization to regen:Organization.

{
  "id": "http://dev.app.regen.com/projects/38",
  "environmentalAuthority": {
    "name": "Corantioquia",
    "url": "http://corantioquia.gov.co/"
  },
  "marketType": "rfs:ComplianceMarket",
  "bioregion": [
    "Amazon Basin"
  ],
  "biomeType": [
    "TropicalForest"
  ],
  "watershed": [
    "Amazon River"
  ],
  "subWatershed": [
    "Upper Amazon"
  ],
  "ecosystemType": [
    "rfs:CloudForest"
  ],
  "offchainCreditsInfo": {
    "creditsRegistered": {
      "numericValue": 1000.0,
      "unit": "unit:HA"
    },
    "creditsAvailable": {
      "numericValue": 800.0,
      "unit": "unit:HA"
    },
    "creditsRetired": {
      "numericValue": 200.0,
      "unit": "unit:HA"
    }
  },
  "projectDuration": "P1Y",
  "projectDurationMinimum": "P6M",
  "projectDurationMaximum": "P2Y",
  "managementAreas": [
    {
      "activity": [
        "rfs:Conservation"
      ],
      "extent": {
        "numericValue": 120.0,
        "unit": "unit:HA"
      }
    },
    {
      "activity": [
        "rfs:EcosystemRestoration"
      ],
      "extent": {
        "numericValue": 220.0,
        "unit": "unit:HA"
      }
    }
  ],
  "ecologicalConnectivityIndex": {
    "numericValue": 75.5,
    "unit": "unit:UNITLESS"
  },
  "socialCulturalIndex": {
    "numericValue": 85.0,
    "unit": "unit:UNITLESS"
  },
  "administrativeArea": {
    "name": "Antioquia"
  },
  "@type": "ProjectPage",
  "@context": {
    "xsd": "http://www.w3.org/2001/XMLSchema#",
    "dcterms": "http://purl.org/dc/terms/",
    "linkml": "https://w3id.org/linkml/",
    "qudt": "http://qudt.org/schema/qudt/",
    "rfs": "https://framework.regen.network/schema/",
    "schema": "http://schema.org/",
    "skos": "http://www.w3.org/2004/02/skos/core#",
    "unit": {
      "@id": "qudt:unit"
    },
    "@vocab": "https://framework.regen.network/schema/",
    "activity": {
      "@context": {
        "text": "skos:notation",
        "description": "skos:prefLabel",
        "meaning": "@id"
      },
      "@id": "activity"
    },
    "administrativeArea": {
      "@type": "@id",
      "@id": "schema:AdministrativeArea"
    },
    "biomeType": {
      "@id": "biomeType"
    },
    "bioregion": {
      "@id": "bioregion"
    },
    "ecologicalConnectivityIndex": {
      "@type": "@id",
      "@id": "ecologicalConnectivityIndex"
    },
    "ecosystemType": {
      "@context": {
        "text": "skos:notation",
        "description": "skos:prefLabel",
        "meaning": "@id"
      },
      "@id": "ecosystemType"
    },
    "environmentalAuthority": {
      "@type": "@id",
      "@id": "environmentalAuthority"
    },
    "extent": {
      "@type": "@id",
      "@id": "dcterms:extent"
    },
    "id": "@id",
    "managementAreas": {
      "@type": "@id",
      "@id": "managementAreas"
    },
    "marketType": {
      "@context": {
        "text": "skos:notation",
        "description": "skos:prefLabel",
        "meaning": "@id"
      },
      "@id": "marketType"
    },
    "name": {
      "@id": "schema:name"
    },
    "offchainCreditsInfo": {
      "@type": "@id",
      "@id": "offchainCreditsInfo"
    },
    "creditsAvailable": {
      "@type": "@id",
      "@id": "creditsAvailable"
    },
    "creditsRegistered": {
      "@type": "@id",
      "@id": "creditsRegistered"
    },
    "creditsRetired": {
      "@type": "@id",
      "@id": "creditsRetired"
    },
    "projectDuration": {
      "@id": "projectDuration"
    },
    "projectDurationMaximum": {
      "@id": "projectDurationMaximum"
    },
    "projectDurationMinimum": {
      "@id": "projectDurationMinimum"
    },
    "numericValue": {
      "@type": "xsd:float",
      "@id": "qudt:numericValue"
    },
    "showOnProjectPage": {
      "@type": "xsd:boolean",
      "@id": "showOnProjectPage"
    },
    "socialCulturalIndex": {
      "@type": "@id",
      "@id": "socialCulturalIndex"
    },
    "subWatershed": {
      "@id": "subWatershed"
    },
    "url": {
      "@id": "schema:url"
    },
    "watershed": {
      "@id": "watershed"
    },
    "AdministrativeArea": {
      "@id": "schema:AdministrativeArea"
    },
    "ManagementArea": {
      "@id": "ManagementArea"
    },
    "OffchainCreditsInfo": {
      "@id": "OffchainCreditsInfo"
    },
    "Organization": {
      "@id": "Organization"
    },
    "ProjectPage": {
      "@id": "ProjectPage"
    },
    "QuantityValue": {
      "@id": "qudt:QuantityValue"
    }
  }
}
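
To see how the prefixed values in this output relate to the @context, here is a minimal Python sketch of prefix expansion. It is not a JSON-LD processor: the hypothetical `expand_curie` helper only handles context entries whose value is a plain string, and the context below is a small excerpt of the one above.

```python
from typing import Any, Dict


def expand_curie(value: str, context: Dict[str, Any]) -> str:
    """Expand "prefix:local" using string-valued prefixes from the context.

    Anything else (absolute IRIs, unknown prefixes, or prefixes defined
    as objects rather than strings) is returned unchanged.
    """
    prefix, separator, local = value.partition(":")
    namespace = context.get(prefix)
    if separator and isinstance(namespace, str) and not local.startswith("//"):
        return namespace + local
    return value


# Excerpt of the @context from the generated example.
context = {
    "qudt": "http://qudt.org/schema/qudt/",
    "rfs": "https://framework.regen.network/schema/",
    "schema": "http://schema.org/",
    "unit": {"@id": "qudt:unit"},
}

print(expand_curie("rfs:CloudForest", context))
print(expand_curie("qudt:numericValue", context))
print(expand_curie("unit:HA", context))
print(expand_curie("http://corantioquia.gov.co/", context))
```

Under this simplified rule `unit:HA` comes back unchanged, because `unit` is defined in the context as a term object rather than a string prefix. It may be worth checking how a full JSON-LD processor treats that value and whether a dedicated prefix for the QUDT unit vocabulary is needed.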

@blushi
Member

blushi commented Sep 24, 2024

Looks good. Could you recreate your PR now that @paul121's PR has been merged?

@wgardiner
Author

Sure thing, I just recreated the PR here
