Add "IsDerivedFrom" to DOI metadata #2119

Merged 1 commit on Oct 24, 2023

@bemoody (Collaborator) commented Oct 23, 2023:

If a published project is derived from other published projects on the site, include that information in the DOI metadata.

Formally, "IsDerivedFrom indicates B [the related resource] is a source upon which A [the resource being registered] is based. IsDerivedFrom should be used for a resource that is a derivative of an original resource."

https://schema.datacite.org/meta/kernel-4.4/doc/DataCite-MetadataKernel_v4.4.pdf (page 63)

This changes the metadata for newly created DOIs. We should update the metadata for existing projects too; I don't know if we have a process for doing that.
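
As a rough illustration of the idea (not the code from this pull request), the sketch below builds a DataCite `relatedIdentifiers` element that lists `IsDerivedFrom` entries alongside the existing `IsVersionOf` entry. The helper names `add_related_identifier` and `build_related_identifiers` and their parameters are hypothetical; the DOIs are the ones from the test reported later in this thread.

```python
import xml.etree.ElementTree as ET


def add_related_identifier(parent, relation_type, doi):
    """Append one <relatedIdentifier> of type DOI to `parent` (hypothetical helper)."""
    element = ET.SubElement(
        parent,
        "relatedIdentifier",
        {"relatedIdentifierType": "DOI", "relationType": relation_type},
    )
    element.text = doi
    return element


def build_related_identifiers(version_of_doi, derived_from_dois):
    """Build a <relatedIdentifiers> element (hypothetical helper).

    version_of_doi: DOI that this version belongs to, or None.
    derived_from_dois: DOIs of the published projects this one is derived from.
    """
    parent = ET.Element("relatedIdentifiers")
    if version_of_doi:
        add_related_identifier(parent, "IsVersionOf", version_of_doi)
    for doi in derived_from_dois:
        add_related_identifier(parent, "IsDerivedFrom", doi)
    return parent


element = build_related_identifiers(
    "10.13026/5mbq-9k80",
    ["10.13026/8360-t248", "10.13026/C2JT1Q"],
)
print(ET.tostring(element, encoding="unicode"))
```

One `relatedIdentifier` is emitted per source project, so a project derived from several published projects carries several `IsDerivedFrom` entries.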

@bemoody force-pushed the doi-is-derived-from branch from e6a9536 to a302aa1 on October 23, 2023 18:43
@tompollard (Member) commented:

> We should update the metadata for existing projects too; I don't know if we have a process for doing that.

Currently just a manual process (clicking the "update metadata" buttons on the project management page in the admin console):

[Screenshot 2023-10-24 at 10 12 30 AM]

@tompollard (Member) commented:

Looks good to me, thanks!

@tompollard merged commit 2f80538 into dev on Oct 24, 2023
8 checks passed
@tompollard deleted the doi-is-derived-from branch on October 24, 2023 14:14
@tompollard (Member) commented:

I tested a manual push to DataCite for the following project:
https://physionet.org/content/cxr-lt-iccv-workshop-cvamd/1.1.0/

Before:

<?xml version="1.0" encoding="UTF-8"?>
<resource xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xmlns="http://datacite.org/schema/kernel-4" xsi:schemaLocation="http://datacite.org/schema/kernel-4 http://schema.datacite.org/meta/kernel-4/metadata.xsd">
  <identifier identifierType="DOI">10.13026/C4TR-KR83</identifier>
  <creators>
    <creator>
      <creatorName>Holste, Gregory</creatorName>
      <givenName>Gregory</givenName>
      <familyName>Holste</familyName>
      <nameIdentifier nameIdentifierScheme="ORCID" schemeURI="https://orcid.org/">https://orcid.org/0000-0002-5657-3081</nameIdentifier>
    </creator>
    <creator>
      <creatorName>Wang, Song</creatorName>
      <givenName>Song</givenName>
      <familyName>Wang</familyName>
    </creator>
    <creator>
      <creatorName>Jaiswal, Ajay</creatorName>
      <givenName>Ajay</givenName>
      <familyName>Jaiswal</familyName>
    </creator>
    <creator>
      <creatorName>Yang, Yuzhe</creatorName>
      <givenName>Yuzhe</givenName>
      <familyName>Yang</familyName>
    </creator>
    <creator>
      <creatorName>Lin, Mingquan</creatorName>
      <givenName>Mingquan</givenName>
      <familyName>Lin</familyName>
      <nameIdentifier nameIdentifierScheme="ORCID" schemeURI="https://orcid.org/">https://orcid.org/0000-0003-0862-6588</nameIdentifier>
    </creator>
    <creator>
      <creatorName>Peng, Yifan</creatorName>
      <givenName>Yifan</givenName>
      <familyName>Peng</familyName>
      <nameIdentifier nameIdentifierScheme="ORCID" schemeURI="https://orcid.org/">https://orcid.org/0000-0001-9309-8331</nameIdentifier>
    </creator>
    <creator>
      <creatorName>Wang, Atlas</creatorName>
      <givenName>Atlas</givenName>
      <familyName>Wang</familyName>
    </creator>
  </creators>
  <titles>
    <title>CXR-LT: Multi-Label Long-Tailed Classification on Chest X-Rays</title>
  </titles>
  <publisher>PhysioNet</publisher>
  <publicationYear>2023</publicationYear>
  <resourceType resourceTypeGeneral="Dataset"/>
  <relatedIdentifiers>
    <relatedIdentifier relatedIdentifierType="DOI" relationType="IsVersionOf">10.13026/5mbq-9k80</relatedIdentifier>
  </relatedIdentifiers>
  <sizes/>
  <formats/>
  <version>1.1.0</version>
  <descriptions>
    <description descriptionType="Abstract">Many real-world problems, including diagnostic medical imaging exams, are
"long-tailed" - there are a few common findings followed by more relatively
rare conditions. In chest radiography, diagnosis is both a **long-tailed** and
**multi-label** problem, as patients often present with multiple disease
findings simultaneously. This is distinct from most large-scale image
classification benchmarks, where each image only belongs to one label and the
distribution of labels is relatively balanced. While researchers have begun to
study the problem of long-tailed learning in medical image recognition, few
have studied its interplay with label co-occurrence. This competition will
provide a challenging large-scale multi-label long-tailed learning task on
chest X-rays (CXRs), encouraging community engagement with this emerging
interdisciplinary topic. This project contains labels for the CXR-LT 2023
competition dataset, containing 377,110 CXRs from 26 classes, and a related
subset used in the MICCAI 2023 paper,  "How Does Pruning Impact Multi-Label
Long-Tailed Learning?" containing 257,018 frontal CXRs from 19 classes.

</description>
  </descriptions>
</resource>

After, with successful addition of the IsDerivedFrom fields:

<?xml version="1.0" encoding="UTF-8"?>
<resource xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xmlns="http://datacite.org/schema/kernel-4" xsi:schemaLocation="http://datacite.org/schema/kernel-4 http://schema.datacite.org/meta/kernel-4/metadata.xsd">
  <identifier identifierType="DOI">10.13026/C4TR-KR83</identifier>
  <creators>
    <creator>
      <creatorName>Holste, Gregory</creatorName>
      <givenName>Gregory</givenName>
      <familyName>Holste</familyName>
      <nameIdentifier nameIdentifierScheme="ORCID" schemeURI="https://orcid.org/">https://orcid.org/0000-0002-5657-3081</nameIdentifier>
    </creator>
    <creator>
      <creatorName>Wang, Song</creatorName>
      <givenName>Song</givenName>
      <familyName>Wang</familyName>
    </creator>
    <creator>
      <creatorName>Jaiswal, Ajay</creatorName>
      <givenName>Ajay</givenName>
      <familyName>Jaiswal</familyName>
    </creator>
    <creator>
      <creatorName>Yang, Yuzhe</creatorName>
      <givenName>Yuzhe</givenName>
      <familyName>Yang</familyName>
    </creator>
    <creator>
      <creatorName>Lin, Mingquan</creatorName>
      <givenName>Mingquan</givenName>
      <familyName>Lin</familyName>
      <nameIdentifier nameIdentifierScheme="ORCID" schemeURI="https://orcid.org/">https://orcid.org/0000-0003-0862-6588</nameIdentifier>
    </creator>
    <creator>
      <creatorName>Peng, Yifan</creatorName>
      <givenName>Yifan</givenName>
      <familyName>Peng</familyName>
      <nameIdentifier nameIdentifierScheme="ORCID" schemeURI="https://orcid.org/">https://orcid.org/0000-0001-9309-8331</nameIdentifier>
    </creator>
    <creator>
      <creatorName>Wang, Atlas</creatorName>
      <givenName>Atlas</givenName>
      <familyName>Wang</familyName>
    </creator>
  </creators>
  <titles>
    <title>CXR-LT: Multi-Label Long-Tailed Classification on Chest X-Rays</title>
  </titles>
  <publisher>PhysioNet</publisher>
  <publicationYear>2023</publicationYear>
  <resourceType resourceTypeGeneral="Dataset"/>
  <relatedIdentifiers>
    <relatedIdentifier relatedIdentifierType="DOI" relationType="IsVersionOf">10.13026/5mbq-9k80</relatedIdentifier>
    <relatedIdentifier relatedIdentifierType="DOI" relationType="IsDerivedFrom">10.13026/8360-t248</relatedIdentifier>
    <relatedIdentifier relatedIdentifierType="DOI" relationType="IsDerivedFrom">10.13026/C2JT1Q</relatedIdentifier>
  </relatedIdentifiers>
  <sizes/>
  <formats/>
  <version>1.1.0</version>
  <descriptions>
    <description descriptionType="Abstract">Many real-world problems, including diagnostic medical imaging exams, are
"long-tailed" - there are a few common findings followed by more relatively
rare conditions. In chest radiography, diagnosis is both a **long-tailed** and
**multi-label** problem, as patients often present with multiple disease
findings simultaneously. This is distinct from most large-scale image
classification benchmarks, where each image only belongs to one label and the
distribution of labels is relatively balanced. While researchers have begun to
study the problem of long-tailed learning in medical image recognition, few
have studied its interplay with label co-occurrence. This competition will
provide a challenging large-scale multi-label long-tailed learning task on
chest X-rays (CXRs), encouraging community engagement with this emerging
interdisciplinary topic. This project contains labels for the CXR-LT 2023
competition dataset, containing 377,110 CXRs from 26 classes, and a related
subset used in the MICCAI 2023 paper,  "How Does Pruning Impact Multi-Label
Long-Tailed Learning?" containing 257,018 frontal CXRs from 19 classes.

</description>
  </descriptions>
</resource>
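
A before/after comparison like this one can also be checked programmatically. The following is a minimal sketch, assuming the two records are available as strings; `related_identifiers` is a hypothetical helper, and `BEFORE` and `AFTER` are trimmed copies of the records above that keep only the `relatedIdentifiers` section.

```python
import xml.etree.ElementTree as ET

# Default namespace used by the DataCite records above.
NS = {"dc": "http://datacite.org/schema/kernel-4"}


def related_identifiers(xml_text):
    """Return the set of (relationType, identifier) pairs in a DataCite record."""
    root = ET.fromstring(xml_text)
    return {
        (el.get("relationType"), el.text)
        for el in root.findall("dc:relatedIdentifiers/dc:relatedIdentifier", NS)
    }


# Trimmed copies of the "before" and "after" records (illustration only).
BEFORE = """<resource xmlns="http://datacite.org/schema/kernel-4">
  <relatedIdentifiers>
    <relatedIdentifier relatedIdentifierType="DOI" relationType="IsVersionOf">10.13026/5mbq-9k80</relatedIdentifier>
  </relatedIdentifiers>
</resource>"""

AFTER = """<resource xmlns="http://datacite.org/schema/kernel-4">
  <relatedIdentifiers>
    <relatedIdentifier relatedIdentifierType="DOI" relationType="IsVersionOf">10.13026/5mbq-9k80</relatedIdentifier>
    <relatedIdentifier relatedIdentifierType="DOI" relationType="IsDerivedFrom">10.13026/8360-t248</relatedIdentifier>
    <relatedIdentifier relatedIdentifierType="DOI" relationType="IsDerivedFrom">10.13026/C2JT1Q</relatedIdentifier>
  </relatedIdentifiers>
</resource>"""

# Entries present after the update but not before it.
added = sorted(related_identifiers(AFTER) - related_identifiers(BEFORE))
for relation_type, doi in added:
    print(relation_type, doi)
```

Comparing sets rather than raw text ignores whitespace and element order, so only genuine additions or removals of related identifiers show up.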
