[Synapse Datasets + Collections] Define and incorporate schema for Synapse Datasets #136
Comments
24-9: @aditya-nath-sage will pick this up and chat with @jaybee84 about aligning approaches with NF. Target output for this sprint: design doc for how to implement datasets across MC2 and NF (much of this captured in linked ticket above), including tentative annotation process. (will this leverage schematic or the Synapse API?) Additional info on Synapse Datasets: https://help.synapse.org/docs/Datasets.2611281979.html |
Goal is to create a design document for how MC2 and NF will want to handle this issue. Example design doc: https://docs.google.com/document/d/1dF1-FjGSdO3nkKArEsrnjnWFLeOV78MlvGZvM8smJVk/edit?pli=1#heading=h.47emx3tcx2wj |
24-9 Close-Out: Currently working with the DM group (at Sage) to create an org-wide Dataset schema. Reaching consensus may take some time, but we can prioritize incorporating a placeholder model now and add the finalized schema to it later. Check on this mid-sprint in 24-10. |
@aditya-nath-sage let's touch base on this during our check-in tomorrow! |
Aditya and Orion to meet to align on this. In the meantime, @aditya-nath-sage to review the ongoing design doc. Establish an end-of-year goal for this effort. |
24-10: Orion has a rough script on how to bind entities in Synapse. Need to understand how the schematic outputs will work here and what the schema looks like. |
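For reference on the binding step, here is a minimal sketch of the request that binds a JSON schema to a Synapse entity. It is not taken from Orion's script. It assumes the Synapse REST route `PUT /entity/{id}/schema/binding` and the body fields `entityId`, `schema$id`, and `enableDerivedAnnotations`; please verify these against the Synapse REST docs before relying on them. The helper names and the example schema `$id` are hypothetical.

```python
import json


def build_bind_request(entity_id: str, schema_id: str, derived: bool = False) -> dict:
    """Build the request body for binding a JSON schema to one entity.

    Field names are an assumption based on the Synapse REST docs.
    """
    if not entity_id.startswith("syn"):
        raise ValueError(f"Expected a Synapse ID like 'syn123', got {entity_id!r}")
    return {
        "entityId": entity_id,
        "schema$id": schema_id,
        "enableDerivedAnnotations": derived,
    }


def bind_path(entity_id: str) -> str:
    """Return the (assumed) REST path for the binding request."""
    return f"/entity/{entity_id}/schema/binding"


# Hypothetical schema $id, for illustration only.
body = build_bind_request("syn123", "org.example-dataset-0.1.0")
print(bind_path("syn123"), json.dumps(body))

# With an authenticated synapseclient session `syn`, the call would look like:
#   syn.restPUT(bind_path("syn123"), body=json.dumps(body))
```

Keeping the payload construction separate from the network call makes it possible to check the request shape without a Synapse login.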
24-11/12 Scope: Start working on this. This is about how we surface datasets and collections that are on Synapse, and how these connect to publications via queryable metadata. Good to take stock of how many Datasets exist currently. Goal for end of sprint is a prelim design doc. Another thought: for the record-based datasets we have, how might we generate and surface a collection of related datasets? Some limitations here, as Synapse Collections currently only consolidate Dataset entities. One possibility is to generate entities from records, create a Dataset from these, then create a Collection. |
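The records-to-entities-to-Dataset-to-Collection idea above could be planned offline before anything is stored in Synapse. The sketch below is only an illustration under stated assumptions: the record fields (`entity_id`, `group`, `version`), the function names, and the plan layout are all hypothetical, and the item shape (`entity_id` plus `version_number`) mirrors what I understand the Python client expects for Dataset items.

```python
from collections import defaultdict


def plan_datasets(records, group_key):
    """Group records (already stored as entities) into per-Dataset item lists."""
    grouped = defaultdict(list)
    for rec in records:
        grouped[rec[group_key]].append(
            {
                "entity_id": rec["entity_id"],
                # Default to version 1 when the record carries no version.
                "version_number": rec.get("version", 1),
            }
        )
    return dict(grouped)


def plan_collection(name, dataset_ids):
    """Describe a Collection; it can only reference Dataset entities."""
    return {"name": name, "dataset_ids": sorted(dataset_ids)}


# Hypothetical records, each already backed by a Synapse entity.
records = [
    {"entity_id": "syn1", "group": "A"},
    {"entity_id": "syn2", "group": "A", "version": 3},
    {"entity_id": "syn3", "group": "B"},
]
plan = plan_datasets(records, "group")
print(plan)
```

Because a Collection only consolidates Dataset entities, `plan_collection` would be called after the Datasets have been stored and have Synapse IDs of their own.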
Rough draft of a schema bind script: https://github.com/mc2-center/mc2-center-dcc/blob/add-utils-11-24/utils/synapse_json_schema_bind.py
Rough draft of a script to convert Synapse table info to annotations: https://github.com/mc2-center/mc2-center-dcc/blob/add-utils-11-24/utils/table_to_annotations.py
Script for creating a Synapse Dataset and adding entities from a folder: https://github.com/mc2-center/mc2-center-dcc/blob/add-utils-11-24/utils/build_datasets.py
|
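To illustrate the table-to-annotations step, here is a minimal sketch of turning one table row into an annotations dictionary. It is not a copy of the linked `table_to_annotations.py`; the function name, the column names, and the comma-splitting rule are assumptions made for the example.

```python
def row_to_annotations(row, list_columns=(), skip=("entityId",)):
    """Convert one table row (column -> value) into an annotations dict.

    Empty values are dropped, and columns named in `list_columns` are split
    on commas into lists.
    """
    annotations = {}
    for column, value in row.items():
        if column in skip or value is None or value == "":
            continue
        if column in list_columns and isinstance(value, str):
            value = [part.strip() for part in value.split(",") if part.strip()]
        annotations[column] = value
    return annotations


# Hypothetical row; real column names depend on the table schema.
row = {
    "entityId": "syn123",
    "DatasetName": "Example dataset",
    "DatasetAssay": "RNA-seq, ATAC-seq",
    "DatasetDoi": "",
}
print(row_to_annotations(row, list_columns=("DatasetAssay",)))
```

The `entityId` column is skipped because it identifies the target entity and is not itself an annotation value.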
Emerges from exploratory and feasibility analysis in: mc2-center/mc2-center-dcc#71
This ticket should track efforts to develop and implement a schema for annotating Synapse Datasets curated as part of the proposed MC2 Center workflow (note that this is different from the existing 'Datasets' component in the MC2 Center data model). This is the first of several steps, which may be tracked in separate tickets as this work progresses: