[WIP] Add meca content provider #1335

Draft
wants to merge 4 commits into
base: main
Conversation


@stevejpurves stevejpurves commented Feb 19, 2024

This PR adds a content provider for MECA Bundles.

MECA stands for Manuscript Exchange Common Approach, a NISO standard for transfer of manuscripts between manuscript systems.

The provider was written in 2023 as part of the AGU's NotebooksNow project. The aim of that project is to make notebook-based computational research papers a formal part of the scientific record, and it has drawn on input from a number of working groups and stakeholder organisations.

MECA was identified as a way to consistently package files from a git repository, paired with JATS XML, so that a set of notebooks, code, and a REES environment specification can be combined into a single file that existing publishing systems can accept and archive.

Support for producing MECA files has been included in the MyST Markdown and Quarto authoring toolchains, enabling researchers to export to this format, and the first AGU journal supporting the format is going live in 2024.

By adding this as a standard provider in repo2docker/binderhub, we'll enable authors and researchers to reproduce research articles that include this bundle.

This provider is intended to be paired with the BinderHub Repo Provider being added in this PR: jupyterhub/binderhub#1824

Operation

The provider receives a parsed URL to a MECA bundle, which is a zip file. The zip file is fetched to a local tmp folder and unpacked. If manifest.xml contains the expected tags pointing to an article-source-directory, that identifies the path to use for the Docker build (we expect a REES-compatible environment to be present there). If this information is missing, we try to build from the root folder.
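The steps above can be sketched roughly as follows. This is a minimal illustration and not the code in this PR: the function names, the `item` / `instance` element names, and the `item-type` and `href` attributes are assumptions about how the manifest is laid out, and the namespace wildcard needs Python 3.8 or later.

```python
import os
import urllib.request
import xml.etree.ElementTree as ET
import zipfile


def fetch_bundle(url, dest_dir):
    """Download the zip at `url` into `dest_dir` and return the local path."""
    zip_path = os.path.join(dest_dir, "bundle.meca.zip")  # hypothetical file name
    urllib.request.urlretrieve(url, zip_path)
    return zip_path


def find_source_dir(manifest_path):
    """Return the href of the article-source-directory item, or None."""
    root = ET.parse(manifest_path).getroot()
    # "{*}" matches any (or no) XML namespace. The element and attribute
    # names used here are assumptions, not taken from the PR.
    for item in root.findall(".//{*}item"):
        if item.get("item-type") != "article-source-directory":
            continue
        instance = item.find("{*}instance")
        if instance is None:
            continue
        # Accept a plain or a namespaced (e.g. xlink) href attribute.
        for key, value in instance.attrib.items():
            if key.rsplit("}", 1)[-1] == "href":
                return value
    return None


def resolve_build_dir(zip_path, unpack_dir):
    """Unpack the bundle and pick the directory to build from."""
    with zipfile.ZipFile(zip_path) as zf:
        zf.extractall(unpack_dir)
    manifest = os.path.join(unpack_dir, "manifest.xml")
    if os.path.exists(manifest):
        href = find_source_dir(manifest)
        if href:
            candidate = os.path.join(unpack_dir, href)
            if os.path.isdir(candidate):
                return candidate
    # Fall back to the bundle root when the manifest gives no usable path.
    return unpack_dir
```

In this sketch a caller would run `fetch_bundle` and then `resolve_build_dir`, and hand the returned directory to the normal build step.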


welcome bot commented Feb 19, 2024

Thanks for submitting your first pull request! You are awesome! 🤗

If you haven't done so already, check out Jupyter's Code of Conduct.
You can meet the other Jovyans by joining our Discourse forum. There is also an intro thread there where you can stop by and say Hi! 👋

Welcome to the Jupyter community! 🎉

Contributor

@sgaist sgaist left a comment


Functionally, the implementation looks fine.

There are some minor suggestions.

I would also recommend running pre-commit; it will clean up the imports as well as some whitespace-related things.

repo2docker/contentproviders/meca.py: 5 review comments (outdated, resolved)
@stevejpurves
Author

stevejpurves commented Mar 4, 2024

thanks @sgaist
I'm continuing to test locally this week and hope to pop this PR out of "draft" by the end of the week. I'm hitting some edge cases that I'm looking into:

  • URL specs can contain multiple query params, and these can get stripped after the first parameter
  • the provider works when used in conjunction with binderhub but errors out when used from the command line
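One way this kind of query-param stripping could happen (an assumption for illustration; the cause is not diagnosed in this thread) is when the bundle URL is embedded unencoded in another URL's query string. A minimal standard-library sketch, using hypothetical example.org URLs and a hypothetical `spec` parameter:

```python
from urllib.parse import parse_qs, quote, urlsplit

# Hypothetical bundle URL with two query params.
bundle_url = "https://example.org/download?id=123&format=meca"


def spec_seen_by_receiver(outer_url):
    """Return the value of the `spec` query param as a receiver would parse it."""
    return parse_qs(urlsplit(outer_url).query)["spec"][0]


# Embedded as-is, the "&" is read as a separator in the outer query string,
# so everything after the first inner parameter is lost.
raw = f"https://hub.example.org/build?spec={bundle_url}"
stripped = spec_seen_by_receiver(raw)

# Percent-encoding the whole bundle URL keeps it intact.
encoded = f"https://hub.example.org/build?spec={quote(bundle_url, safe='')}"
intact = spec_seen_by_receiver(encoded)
```

Here `stripped` ends at `id=123`, while `intact` equals the original `bundle_url`.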

@yuvipanda
Collaborator

Hi @stevejpurves :) I just merged another new provider (CKAN), so wanted to check in to see if there's anything I can do to help this move forward!

@stevejpurves stevejpurves marked this pull request as ready for review July 7, 2024 21:52
@stevejpurves stevejpurves marked this pull request as draft July 7, 2024 21:58
@stevejpurves
Author

I've brought this up to date and also made some changes to allow a server to be launched from a local something.meca.zip file. Basically, this uses a naming convention as a way to initially detect a MECA bundle.

I guess that could be changed to check any local file for the expected contents, before returning true, but maybe this is fine for now.
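The two detection options mentioned here could look something like the sketch below. The function names are hypothetical, and treating a top-level manifest.xml as the "expected contents" is an assumption rather than what the PR checks.

```python
import os
import zipfile


def looks_like_meca_by_name(path):
    """Naming-convention check: a local file ending in .meca.zip."""
    return os.path.isfile(path) and path.lower().endswith(".meca.zip")


def looks_like_meca_by_content(path):
    """Content check: any local zip with a manifest.xml at its root."""
    if not (os.path.isfile(path) and zipfile.is_zipfile(path)):
        return False
    with zipfile.ZipFile(path) as zf:
        return "manifest.xml" in zf.namelist()
```

The name check is cheap and never opens the file; the content check would also accept bundles that do not follow the naming convention, at the cost of reading the zip directory for every local file it is asked about.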

I still intend to resolve:

  • URL specs can contain multiple query params, and these can get stripped after the first parameter
  • the provider works when used in conjunction with binderhub but errors out when used from the command line
