Set up infrastructure for upload from issue template #1299

Closed
18 of 44 tasks
wd15 opened this issue Oct 7, 2021 · 2 comments

Comments

@wd15
Collaborator

wd15 commented Oct 7, 2021

Rough work plan:

  • Make python-pfhub read from a list of meta.yaml files stored at any location
    • ensure all the tests pass
    • check that docstrings are adequate for new functions
    • make sure simulation_list.yaml lists local files where possible
    • start a PR for these changes to make sure the tests all pass
  • set up issue template
  • make the upload issue open a github action that reads the data
  • clean up bug issue template to be appropriate (in docs: clean up issue templates #1331 awaiting approval)
  • make the issue submission github action open a PR with the issue data
  • ensure that only upload issues result in PRs
  • allow resubmission of issues
  • place message back into issue with link to PR (make sure that the PR has the link to the issue)
  • update list of simulations
  • clean up the upload issue to gather all the appropriate data, including the data locations (including some of the boxes in codemeta)
    • use envsubst
    • make implementation complete
    • in the envsubst step check that the envs are not required to simplify
    • timestamp from issue
    • name and email
    • ask about hardware
    • See Trevor's comment below for what should be included
    • order all dropdowns alphanumerically
    • make the data section work
      • 2D, 3D, media boxes (key, value pairs as well, both options)
      • different template for each benchmark
    • add codemeta box
    • github actions for url data type checking (and other linting of data)
  • Set up a simple example repository and ensure uploads work correctly
  • Use codemeta.json from upload repository to get more data (see below)
  • FAIR check (Trevor can guide me or help)
    • use codemeta generator? Is this feasible in some way?
    • can we force codemeta on people as a requirement?
  • allow resubmission of issues
  • allow populating of an issue from an existing meta.yaml
  • an issue template for each upload and corresponding meta.yaml.template
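
A rough sketch of the envsubst step in the plan above, written in Python rather than with the `envsubst` CLI: `string.Template` accepts the same `${VAR}` placeholder syntax. The issue-body layout (`### Label` headings followed by the answer), the field labels, and the template contents are all assumptions made for illustration and are not the actual PFHub template.

```python
import re
from string import Template

# Hypothetical meta.yaml.template; the real field names may differ.
META_TEMPLATE = Template(
    "name: ${NAME}\n"
    "email: ${EMAIL}\n"
    "timestamp: ${TIMESTAMP}\n"
    "benchmark: ${BENCHMARK}\n"
)


def parse_issue_body(body):
    """Split an issue body of '### Label' sections into {LABEL: answer}."""
    fields = {}
    # With a capture group, re.split returns
    # [preamble, label, text, label, text, ...].
    parts = re.split(r"^###[ \t]+(.+?)[ \t]*$", body, flags=re.MULTILINE)
    for label, text in zip(parts[1::2], parts[2::2]):
        key = re.sub(r"\W+", "_", label.strip()).upper()
        fields[key] = text.strip()
    return fields


def render_meta(body, timestamp):
    """Fill the template; raises KeyError if a placeholder has no value."""
    values = parse_issue_body(body)
    # The timestamp comes from the issue itself, not from the submitter.
    values["TIMESTAMP"] = timestamp
    return META_TEMPLATE.substitute(values)


example_body = """### Name

Jane Doe

### Email

jane@example.com

### Benchmark

1a
"""

print(render_meta(example_body, "2021-10-07T00:00:00Z"))
```

Using `substitute` rather than `safe_substitute` means a missing answer fails loudly instead of leaving a literal `${VAR}` in the generated file, which seems the safer default for an automated PR.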

Old ideas:

  • Lint schema maybe, see this
  • ensure an action is initiated when pinged and the notebooks are rebuilt
  • harvest meta-data
  • generate an individual notebook for the simulation
  • run a FAIR check tool for each repository and include badges in generated notebook (or table)
  • run the FAIR checker on the example repository
  • work on a new standard for the meta data
  • git lfs prototype

Questions

  • Could the yaml file go anywhere, not just in a github repository? No, decided that it should go in pfhub for now.
  • Should subsequent submissions for the same issue generate a new PR?

Schemas

These could either be used in the pfhub store or in the upload repo. Still getting that straight.

Goal

Long term we want the upload form to ask as little as possible and infer as much as possible.

@tkphd
Collaborator

tkphd commented May 6, 2022

Additional thoughts: CodeMeta is great! Here's its schema.

Each implementation repository should have a codemeta.json: generate one easily by filling in as much of the CodeMeta Generator as possible. By default, this populates author, license, and URL/DOI information. It does not capture framework information, though it could (using the schema, add a section to your JSON for runtimePlatform --- JVM or .NET is the example, but MOOSE or PRISMS-PF or FiPy would fit just as well; then add a downloadUrl and version field in the dict).
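
As a rough illustration of the framework idea, the sketch below adds a `runtimePlatform` section to a codemeta dict. The repository values are placeholders, the helper `add_framework` is hypothetical, and `runtimePlatform` is normally a plain string, so the nested dict with `name`, `downloadUrl`, and `version` is only one reading of the suggestion above.

```python
import json

# Minimal codemeta.json content, roughly what the CodeMeta Generator
# might produce. All values are placeholders, not a real repository.
codemeta = {
    "@context": "https://doi.org/10.5063/schema/codemeta-2.0",
    "@type": "SoftwareSourceCode",
    "name": "example-benchmark-implementation",
    "license": "https://spdx.org/licenses/MIT",
    "codeRepository": "https://example.com/user/repo",
}


def add_framework(meta, name, download_url, version):
    """Return a copy of meta with framework details under runtimePlatform."""
    updated = dict(meta)  # shallow copy so the input is left untouched
    updated["runtimePlatform"] = {
        "name": name,
        "downloadUrl": download_url,
        "version": version,
    }
    return updated


# Placeholder framework details for illustration only.
with_framework = add_framework(
    codemeta, "FiPy", "https://example.com/fipy", "3.4"
)
print(json.dumps(with_framework, indent=2))
```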

Is there an equivalent for data? ... or is that what we're going to discuss? Is there an implementation of Andrea Medina-Smith's Controlled Vocabulary and Metadata Schema for Materials Science Data Discovery?

Things we might want to collect:

  • URL of the implementation repository (GitHub or any web address)
    • Sub-directory containing codemeta.json, if not at the top level
    • Sub-directory for the implementation source code, if not at the top level
  • Name and version of the software framework, if not present in codemeta.json
  • URL for the data repository (any web address)
    • Sub-directory for data, if applicable
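
One way the list above might map onto a record that an upload action could validate before opening a PR. The class name, field names, and validation rules are assumptions for illustration; nothing here is taken from python-pfhub.

```python
from dataclasses import dataclass
from typing import Optional
from urllib.parse import urlparse


def is_web_url(value):
    """True if value looks like an http(s) URL with a host."""
    parsed = urlparse(value)
    return parsed.scheme in ("http", "https") and bool(parsed.netloc)


@dataclass
class UploadRecord:
    """Hypothetical container for the fields listed above."""

    implementation_url: str
    data_url: str
    codemeta_subdir: Optional[str] = None  # only if not at the top level
    source_subdir: Optional[str] = None  # only if not at the top level
    framework_name: Optional[str] = None  # only if missing from codemeta.json
    framework_version: Optional[str] = None
    data_subdir: Optional[str] = None

    def problems(self):
        """Return a list of human-readable validation problems."""
        found = []
        for field_name in ("implementation_url", "data_url"):
            if not is_web_url(getattr(self, field_name)):
                found.append(f"{field_name} is not a valid web address")
        # A framework name without a version (or vice versa) is incomplete.
        if bool(self.framework_name) != bool(self.framework_version):
            found.append("framework name and version must be given together")
        return found


record = UploadRecord("https://example.com/user/repo", "not-a-url")
print(record.problems())  # ['data_url is not a valid web address']
```

Returning a list of problems rather than raising on the first one would let the action post every issue back to the submitter in a single comment.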

@wd15 wd15 changed the title from "Set up infrastructure for forking model repository for adding data" to "Set up infrastructure for upload from issue template" May 9, 2022
@wd15
Collaborator Author

wd15 commented Jun 2, 2022

Closing this issue as we now have a project for this.

@wd15 wd15 closed this as completed Jun 2, 2022