specify formats for data adhering to CRDC-H model #28

Closed
balhoff opened this issue Apr 6, 2021 · 5 comments

Comments

balhoff commented Apr 6, 2021

This issue can be closed after creating some documentation on how to create structured formats which can be validated with CRDC-H.

  • JSON (definitely)
  • TSV - would be nice to define a standard way to use TSV with CRDC-H
  • PFB - can there be a defined way to use PFB with CRDC-H?
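
As a rough illustration of what a JSON/TSV convention could look like, the sketch below serializes one record both ways, flattening nested objects into dotted column names. Everything here is an assumption for discussion: the field names (`id`, `source`, `tags`), the dotted-column convention, and the pipe separator for multivalued fields are hypothetical and not taken from CRDC-H.

```python
import csv
import io
import json


def flatten(record, prefix=""):
    """Flatten nested dicts into one level, joining keys with dots."""
    flat = {}
    for key, value in record.items():
        name = f"{prefix}{key}"
        if isinstance(value, dict):
            flat.update(flatten(value, prefix=f"{name}."))
        elif isinstance(value, list):
            # Assumed convention: multivalued fields are pipe-separated.
            flat[name] = "|".join(str(v) for v in value)
        else:
            flat[name] = value
    return flat


def to_tsv(records):
    """Write flattened records as TSV with a sorted header row."""
    rows = [flatten(r) for r in records]
    columns = sorted({column for row in rows for column in row})
    out = io.StringIO()
    writer = csv.DictWriter(
        out, fieldnames=columns, delimiter="\t", lineterminator="\n"
    )
    writer.writeheader()
    writer.writerows(rows)
    return out.getvalue()


# Hypothetical record; field names are illustrative, not from CRDC-H.
record = {"id": "specimen-1", "source": {"id": "subject-1"}, "tags": ["a", "b"]}
json_text = json.dumps(record, sort_keys=True)
tsv_text = to_tsv([record])
```

A flattening rule like this loses information for lists of nested objects, which is one reason a documented standard would be needed rather than an ad hoc mapping.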

Examples should go in the repository created for #27.

This issue is covered by two issues:

balhoff commented Apr 6, 2021

See also #15. It may turn out these issues are the same, but for now this one is specifically about CRDC-H and linkml.

@balhoff balhoff added this to the Phase 2 pilot milestone Apr 6, 2021

gaurav commented Apr 22, 2021

GDC supports exporting query results as XML and TSV. I think the JSON representation is easier to read, but it's useful to know that these options exist.

PDC does not appear to support this, but it does allow lists of results to be exported as TSV/CSV on its website, so there's probably an API endpoint I haven't found yet to do this.

balhoff commented Jun 7, 2021

A JSON format is produced in https://github.com/cancerDHC/example-data. We have some more work to do formalizing the format possibilities in Phase 3.

gaurav commented Aug 16, 2021

There is a LinkML-CSV package in development for converting instance data from a LinkML model in and out of CSV format, which might be really useful here.
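
To make the "in and out of CSV" idea concrete, here is a minimal standard-library sketch of reading CSV rows back into nested records. It does not use or reproduce the LinkML-CSV package's API. The dotted column names and the example columns (`source.id`, `source.species`) are hypothetical, and all values stay strings because no schema is consulted.

```python
import csv
import io


def unflatten(row):
    """Rebuild a nested dict from dotted column names, skipping empty cells."""
    record = {}
    for column, value in row.items():
        if value == "":
            continue  # treat an empty cell as "field not set"
        target = record
        *parents, leaf = column.split(".")
        for part in parents:
            target = target.setdefault(part, {})
        target[leaf] = value
    return record


def from_csv(text):
    """Parse CSV text into a list of nested records."""
    return [unflatten(row) for row in csv.DictReader(io.StringIO(text))]


# Hypothetical columns; not taken from any LinkML or CRDC-H model.
csv_text = "id,source.id,source.species\nspecimen-1,subject-1,\n"
records = from_csv(csv_text)
```

A schema-aware converter could additionally restore types and multivalued fields, which this sketch cannot do from the CSV alone.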

gaurav commented Oct 1, 2021

For our immediate needs, the LinkML instance format (in YAML) seems to be a good representation, and has been developed into some exemplars by the CCDH Data Model Harmonization team as part of the CCDH Pilot. Future formats will probably be supported by adding generators to LinkML, so that data from any LinkML model can be converted into that format (e.g. see the issue tracking an Avro/PFB generator for LinkML). Given that, I think we can close this issue until specific use-cases emerge from CDA and the CRDC nodes.

@gaurav gaurav closed this as completed Oct 1, 2021