Convert Notebooks to MyST standard #1439

Open
Zeitsperre opened this issue Jul 26, 2023 · 7 comments
Labels: docs (Improvements to documentation), help wanted (Extra attention is needed)

Comments

@Zeitsperre
Collaborator

Problem

The tooling ecosystem around our Jupyter Notebooks is a bit much:

- Our current examples are written as traditional Jupyter Notebooks (*.ipynb).
- Editing these notebooks requires access to a Jupyter* instance, and reviewing them requires the NB Viewer GitHub hook.
- Running these notebooks (e.g. performing updates) generates lots of metadata that must subsequently be stripped out via a pre-commit hook (nbstripout).
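For readers unfamiliar with what the stripping step does, here is a minimal sketch of the idea using only the standard library. It is not nbstripout's actual implementation; the function name `strip_outputs` and the sample notebook are made up for illustration, and the only assumption is the nbformat 4 JSON layout.

```python
import copy
import json


def strip_outputs(notebook: dict) -> dict:
    """Return a copy of an nbformat 4 notebook without outputs or execution counts."""
    stripped = copy.deepcopy(notebook)
    for cell in stripped.get("cells", []):
        if cell.get("cell_type") == "code":
            cell["outputs"] = []            # drop stored results
            cell["execution_count"] = None  # drop the run counter
    return stripped


# Hypothetical notebook content, for illustration only.
executed_notebook = {
    "nbformat": 4,
    "nbformat_minor": 5,
    "metadata": {},
    "cells": [
        {"cell_type": "markdown", "metadata": {}, "source": ["# Title"]},
        {
            "cell_type": "code",
            "metadata": {},
            "execution_count": 3,
            "source": ["print('hi')"],
            "outputs": [
                {"output_type": "stream", "name": "stdout", "text": ["hi\n"]}
            ],
        },
    ],
}

clean_notebook = strip_outputs(executed_notebook)
print(json.dumps(clean_notebook["cells"][1], indent=2))
```

Everything removed here is regenerated on each run, which is why it creates noisy diffs when committed.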

It would be nice to simplify this, at least for the notebooks that support the documentation and almost never change.

Potential Solution

MyST Notebooks are written using Markdown (much easier to edit via text editor), compatible with Jupyter* instances (https://ebp.jupyterbook.org/en/latest/blog/2023-06-27-jupyterlab-myst/), and support caching of cell outputs (https://myst-nb.readthedocs.io/en/v0.13.2/use/execute.html#execute-and-cache-your-content).
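To give a feel for why the text format is easier to edit and review, the sketch below renders notebook cells into the general shape of a MyST Markdown notebook: YAML front matter followed by prose and `{code-cell}` blocks. In practice a tool such as Jupytext would do the conversion; `to_myst_markdown`, `cell_source`, and the sample notebook are hypothetical, and the front matter is a simplified assumption rather than a complete header.

```python
FENCE = "`" * 3  # built programmatically so the example nests safely in Markdown


def cell_source(cell: dict) -> str:
    """nbformat stores source either as a string or as a list of lines."""
    source = cell.get("source", "")
    return "".join(source) if isinstance(source, list) else source


def to_myst_markdown(notebook: dict, kernel_name: str = "python3") -> str:
    """Render cells as MyST-style Markdown text (illustrative only)."""
    front_matter = "\n".join(
        [
            "---",
            "jupytext:",
            "  text_representation:",
            "    extension: .md",
            "    format_name: myst",
            "kernelspec:",
            "  display_name: Python 3",
            "  language: python",
            f"  name: {kernel_name}",
            "---",
        ]
    )
    parts = [front_matter]
    for cell in notebook.get("cells", []):
        text = cell_source(cell).rstrip("\n")
        if cell.get("cell_type") == "code":
            # Code cells become fenced {code-cell} directives.
            parts.append(f"{FENCE}{{code-cell}} ipython3\n{text}\n{FENCE}")
        else:
            # Markdown cells are written out as plain prose.
            parts.append(text)
    return "\n\n".join(parts) + "\n"


# Hypothetical notebook content, for illustration only.
sample_notebook = {
    "cells": [
        {"cell_type": "markdown", "source": ["# Example\n", "Some prose."]},
        {"cell_type": "code", "source": "total = 1 + 1\nprint(total)"},
    ]
}

myst_text = to_myst_markdown(sample_notebook)
print(myst_text)
```

The result is plain text with no stored outputs or execution counts, so diffs show only the prose and code that actually changed.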

We don't even have to do this with all notebooks immediately, as we can convert them progressively and support a mix of notebook types in the interim (https://docs.readthedocs.io/en/stable/guides/migrate-rest-myst.html#how-to-migrate-from-restructuredtext-to-myst-markdown).
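A rough sketch of what the Sphinx side of a mixed setup could look like, assuming the docs are built with the MyST-NB extension. The option names below are taken from recent MyST-NB releases and should be checked against the installed version (the v0.13 documentation linked above used different names); nothing here reflects the project's actual `conf.py`.

```python
# Fragment of a hypothetical docs/conf.py

extensions = [
    # MyST-NB reads both classic .ipynb files and text-based MyST Markdown
    # notebooks, so the two kinds can coexist during a gradual migration.
    "myst_nb",
]

# Execute notebooks at build time and reuse cached outputs when the
# notebook source has not changed.
nb_execution_mode = "cache"

# Assumed choice: make the docs build fail if any cell raises an exception.
nb_execution_raise_on_error = True
```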

@Zeitsperre added the docs (Improvements to documentation) label Jul 26, 2023
@Zeitsperre
Collaborator Author

@huard

You've had some experience with MyST and have talked about it previously. Are there reasons why we should/shouldn't go down this road?

@Zeitsperre added the help wanted (Extra attention is needed) label Oct 10, 2023
@huard
Collaborator

huard commented Nov 14, 2023

We may need to think on how to maintain the "integration test" aspect of our notebooks if we don't store the cell outputs anymore.

@coxipi
Contributor

coxipi commented Nov 16, 2023

As a temporary solution for git conflicts, maybe nbdev is worth exploring: https://nbdev.fast.ai/tutorials/git_friendly_jupyter.html

I think it's nice that when writing a notebook, we don't have to develop tests, they're automatically included.

> We may need to think on how to maintain the "integration test" aspect of our notebooks if we don't store the cell outputs anymore.

I'm confused: wouldn't the "support caching of cell outputs" feature that @Zeitsperre mentioned keep the integration test? I don't see why we would cache the cell outputs if not for testing, so I'm not sure I understand.

Also, quarto might be interesting to consider (https://quarto.org/docs/tools/jupyter-lab.html) if we run into too many issues with MyST (not sure it could solve anything more than MyST, just throwing the idea out there).

@Zeitsperre
Collaborator Author

> We may need to think on how to maintain the "integration test" aspect of our notebooks if we don't store the cell outputs anymore.

> I'm confused, would "support caching of cell outputs" as @Zeitsperre not keep the integration test? I don't see why we would cache the cell outputs if it's not for testing, that's why I'm not sure I understand.

I'm also a bit confused here. We don't currently store any notebook outputs (nbstripout removes everything), so the CI tests run on the notebooks only check that cells execute without failing or writing messages to stderr. I don't think this will impact us here (in other projects, it absolutely will).
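As an illustration of that kind of check, the sketch below scans an already-executed notebook (nbformat 4 JSON) for error outputs and stderr streams without comparing any output content. It is not the project's actual CI code; `find_failures` and the sample notebook are hypothetical.

```python
def find_failures(notebook: dict) -> list[str]:
    """List cells that raised an error or wrote to stderr."""
    problems = []
    for index, cell in enumerate(notebook.get("cells", [])):
        if cell.get("cell_type") != "code":
            continue
        for output in cell.get("outputs", []):
            kind = output.get("output_type")
            if kind == "error":
                # Raised exceptions are stored with their name and message.
                problems.append(
                    f"cell {index}: {output.get('ename')}: {output.get('evalue')}"
                )
            elif kind == "stream" and output.get("name") == "stderr":
                problems.append(f"cell {index}: wrote to stderr")
    return problems


# Hypothetical executed notebook, for illustration only.
run_notebook = {
    "cells": [
        {
            "cell_type": "code",
            "outputs": [{"output_type": "stream", "name": "stdout", "text": "ok\n"}],
        },
        {
            "cell_type": "code",
            "outputs": [{"output_type": "stream", "name": "stderr", "text": "warning\n"}],
        },
        {
            "cell_type": "code",
            "outputs": [
                {
                    "output_type": "error",
                    "ename": "ValueError",
                    "evalue": "bad value",
                    "traceback": [],
                }
            ],
        },
    ]
}

failures = find_failures(run_notebook)
print(failures)
```

A check like this passes regardless of what the cells print to stdout, which is why switching formats should not change what the tests cover.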

@coxipi
Contributor

coxipi commented Nov 16, 2023

Oh, I thought that the content of the notebooks was tested too. Thinking about it, it would be a mess if the expected outputs weren't hard-coded somewhere, like we do with the test suite in xclim. Are there projects (not xclim) where this is done?

@Zeitsperre
Collaborator Author

The fact that we hardcode outputs in several projects' notebooks is in fact a major issue when running notebook checks, especially on PAVICS. In many cases, the dependencies in a given PAVICS service will be different from those used to develop the notebooks, so we often get into situations where the outputs are virtually identical in both environments but the order in which they are presented changes. This causes build-breaking messages.

It's something we're trying to address now.
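To make the ordering problem concrete, here is a small sketch contrasting a strict text comparison with one that ignores line order. This is only one possible mitigation, not a description of what is being done on PAVICS; the function name and the sample strings are made up.

```python
from collections import Counter


def same_lines_any_order(expected: str, actual: str) -> bool:
    """Compare two text outputs line by line, ignoring order and blank lines."""

    def normalise(text: str) -> Counter:
        # A Counter keeps duplicates, so repeated lines must still match in number.
        return Counter(line.strip() for line in text.splitlines() if line.strip())

    return normalise(expected) == normalise(actual)


# Hypothetical stored and freshly computed outputs.
stored_output = "a: 1\nb: 2\nc: 3\n"
fresh_output = "c: 3\na: 1\nb: 2\n"

strict_match = stored_output == fresh_output
relaxed_match = same_lines_any_order(stored_output, fresh_output)
print(strict_match, relaxed_match)
```

The trade-off is that a relaxed comparison also hides genuine ordering regressions, so it only suits outputs where order carries no meaning.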
