[June] [Pubs] Curation Workflow Tracking #104

Open
5 of 20 tasks
aditigopalan opened this issue Jun 3, 2024 · 3 comments

@aditigopalan
Contributor

aditigopalan commented Jun 3, 2024

This ticket tracks curation workflow progression.

Note: Work in the three sections below can proceed in parallel, so curation workflows for different months may overlap.

1. Curation and Annotation

  • Run PubMed crawler to generate PublicationView manifest [205 publications generated, long sprint anticipated]
  • Send Amber and Jineta a copy of the PublicationView manifest from latest crawl to review for MC2 Center Newsletter publication highlights
  • Send Amber "News from CCKP" for MC2 Center Newsletter
  • Annotate publications in PublicationView manifest [In progress]
  • Generate ToolView and DatasetView manifests based on PublicationView manifest
  • Run the automated curation workflow to upload publications, datasets, and tools [this includes splitting manifests, processing and validating manifests, generating target Synapse IDs for upload, schema updates, upload to Synapse, and (in progress) a validation check for uploads]
  • Generate UNION tables
  • QC of staging tables
  • Perform automated portal sync to CCKP
  • Validate data on the CCKP
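
For reference, the "splitting manifests" and "validating manifests" steps above could look roughly like the sketch below. This is a minimal illustration and not the actual workflow code: the column names, the choice to split by grant number, and the helper names (`read_manifest`, `split_manifest`, `validate_rows`) are all assumptions made for the example.

```python
import csv
import io
from collections import defaultdict

# Hypothetical column names; the real PublicationView schema may differ.
REQUIRED_COLUMNS = ["Pubmed Id", "Publication Grant Number", "Publication Title"]
# Assumption: manifests are split per grant. The real workflow may split differently.
SPLIT_COLUMN = "Publication Grant Number"


def read_manifest(text):
    """Parse CSV manifest text into a list of row dictionaries."""
    return list(csv.DictReader(io.StringIO(text)))


def split_manifest(rows, split_column=SPLIT_COLUMN):
    """Group manifest rows by the value found in split_column."""
    groups = defaultdict(list)
    for row in rows:
        groups[row[split_column]].append(row)
    return dict(groups)


def validate_rows(rows, required=REQUIRED_COLUMNS):
    """Return (row_index, column) pairs where a required value is missing or blank."""
    problems = []
    for i, row in enumerate(rows):
        for col in required:
            if not (row.get(col) or "").strip():
                problems.append((i, col))
    return problems
```

Running `validate_rows` before `split_manifest` would surface rows with blank required fields before any per-grant manifest is written. Whether the real workflow orders the steps this way is not stated in this ticket.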

Status check [plan to report numbers for each category following PubMed crawl]:

  • Publication upload [ ]
  • Tool upload [ ]
  • Data set upload [ ]

2. Data model

  • Update valid values in the data model and build
  • Generate templates from new model
  • Release new model version [no changes, no release]
  • Update DCA config [no changes, no update]

3. Contributor Engagement

  • Emails to contributors (by grant) with info on newly added manifests, link to project, DCA, and instructions on review, annotation, validation, and submission, as applicable
  • Jira Help Desk ticket tracking, review, triage, with data model updates, as needed (TBD)
  • Annotation gap-filling of DatasetView and ToolView manifests
@aclayton555

24-6 close out: aiming to upload April, May, and June by the middle of the 24-7/8 sprint. Will likely include a data model release mid-sprint.

@aclayton555

mid-sprint:

  • Pending Annotations still growing? @jaybee84 to check which resources have this. Note that "Pending Annotations" was historically used for publications that were not yet open access, but these are now also pushed to the portal as an incentive for adding annotations to these resources.
  • Resume emails to contributors with existing documentation. Maybe targeted engagement with certain contributors.

@jaybee84
Collaborator

Publications: pending annotations for tumor type + assay (as discussed in the meeting, mostly due to restricted access, but some are open access)
Datasets: pending annotations for tissue

@aditigopalan changed the title from "[June] Curation Workflow Tracking" to "[June] [Pubs] Curation Workflow Tracking" on Sep 4, 2024