Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Notebook header #4

Conversation

gabalafou
Copy link

@gabalafou gabalafou commented Dec 20, 2024

Note

This is a PR on top of a PR, PyWavelets#741.

This PR includes the following changes to the "regression" docs (doc/source/regression):

  • Header: Puts JupyterLite and download links above each doc
  • Notebook cell styling: Restyles notebook cells that have been rendered to HTML with MyST-NB
  • Handcoded output: Introduces workaround to control how notebook cell outputs look in the rendered doc page (see doc/source/regression/README.md for detailed explanation)
  • Corrections and edits: Make some changes to the text and reverts some changes that were made (see inline code comments)

Header

  • Shared header file (regression/header.md)
  • Each relevant doc file includes the shared header file via the include directive
  • The header file contains substitutions for the parent document name (the document that includes the header) that look like {{ parent_docname }}
  • These substitutions are replaced with the name of the parent document by a callback for the Sphinx "include-read" event

screenshot

Notebook cell styling

I started with Melissa's suggested style changes, as shown in the following screenshot:

This design is good for accessibility because it labels the input cells with "In" and the output cells with "Out".

I made a few small tweaks to Melissa's design - changed the left border color, decreased the font weight, left-aligned the labels with the code, made the labels look more like the code block captions in the PyData theme. Here's how it looks in light mode:

Here's how it looks in dark mode:

@gabalafou gabalafou force-pushed the notebook-header-directive branch from b5fc17e to c70dece Compare December 27, 2024 22:29
@gabalafou gabalafou changed the title Notebook header directive Notebook header Dec 27, 2024
@gabalafou gabalafou force-pushed the notebook-header-directive branch 5 times, most recently from b1ffb93 to 874090c Compare December 27, 2024 23:17
@agriyakhetarpal agriyakhetarpal force-pushed the make-usage-examples-section-interactive branch from fd3e01c to b3cdd4c Compare December 30, 2024 16:23
@gabalafou gabalafou force-pushed the notebook-header-directive branch from 874090c to 07f381f Compare December 30, 2024 17:31
Copy link
Owner

@agriyakhetarpal agriyakhetarpal left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for this, @gabalafou! Here is an initial review – I haven't taken a look at the CSS changes yet, also, I trust your judgement with them.

doc/source/_static/myst-nb.css Outdated Show resolved Hide resolved
Comment on lines 45 to 51

# In .md to .ipynd conversion, do not include any cells that have the
# jupyterlite_sphinx_strip tag
nb.cells = [
cell for cell in nb.cells if "jupyterlite_sphinx_strip" not in cell.metadata.get("tags", [])
]

Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Here, we can drop the entire functionality and use jupyterlite-sphinx to do this natively; should I add those changes here for you to rebase your branch over them, or do you wish to do so yourself?

Copy link
Author

@gabalafou gabalafou Dec 31, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmm, at what point does the JupyterLite Sphinx extension strip the tagged cells? Does it happen at render time in the Lite app or does it happen when converting .md to .ipynb?

I think that the handcoded outputs should be stripped in either case, whether loading the notebook in Lite or downloading it as a .ipynb file.

It is perhaps a mistake, though, to use the jupyterlite_sphinx_strip tag here to remove the cell during markdown to ipynb conversion. Perhaps I should do a repo-wide search of "jupyterlite_sphinx_strip" and replace it with something like "pywt-remove-from-ipynb"... what do you think?

Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmm, at what point does the JupyterLite Sphinx extension strip the tagged cells? Does it happen at render time in the Lite app or does it happen when converting .md to .ipynb?

It happens on neither of those – the stripping does happen at build time, but after we have converted .md to .ipynb. Essentially, we convert a notebook first (if we need to), strip it, and then pass it along for JupyterLite to render.

It is perhaps a mistake, though, to use the jupyterlite_sphinx_strip tag here to remove the cell during markdown to ipynb conversion. Perhaps I should do a repo-wide search of "jupyterlite_sphinx_strip" and replace it with something like "pywt-remove-from-ipynb"... what do you think?

I think it should be fine to stay with jupyterlite_sphinx_strip, since we indeed designed it for this highly-specific use case in mind. Is there something different you had in mind, or maybe I misunderstood something?

That said, if we were to use jupyterlite-sphinx here, we would need to think about how to get the IPyNB file for downloads. The reason why it works as of now is because we convert the .md file to .ipynb in place, and store it in the same folder, but jupyterlite-sphinx does not do so and it directly stores the converted notebook to the _contents/ directory, which means that getting its location will be slightly tricky. We would need to get the "Download" button in the JupyterLite interface itself and it won't be available in Sphinx, which I implemented via the overrides.json file and the "Download button" JupyterLab extension (please see scipy/scipy#22161 for an example). How should we go ahead with this?

@gabalafou
Copy link
Author

Thanks for the review comments @agriyakhetarpal. I have responded to them.

I am taking this out of draft mode because I think it is now ready for review.

I decided not to split these changes up into separate PRs because I did not want to deal with merge conflicts.

Copy link
Author

@gabalafou gabalafou left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Self review, see inline comments added.

One thing I want to raise for review is that while comparing the docs changed in this PR with their published versions, I noticed some differences in some of the outputs. And I'm not just talking about the tracebacks, which this PR addresses.

For example if you scroll to the bottom of "2D Wavelet Packets" (wp2d), the numbers in the output four-by-four array are all very small, close to zero, printed in scientific notation with 9 significant digits:

screenshot

whereas at https://pywavelets.readthedocs.io/en/latest/regression/wp2d.html there are all printed as 0 with a dot:

screenshot

I noticed this also in a few other places.

doc/source/_static/myst-nb.css Outdated Show resolved Hide resolved
border-top-right-radius: 0;
}

div.pywt-handcoded-cell-output pre {
Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am not 100% sure if all of these selectors need to have div. In some cases, it was needed for the selector to achieve higher specificity in order to override other stylesheets (e.g., css files imported by MyST-NB, Sphinx, PyData Sphinx Theme), and I didn't want to resort to !important unless I absolutely had to. So then, just for simplicity and consistency, I used div plus the classname in all places.

Do you think that's worth commenting in the css file?

Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, I think we can leave a short comment (TIL that the !important property is not recommended this way), that would be helpful when we port the same changes to MyST-NB/PST (or SciPy, before that) later.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Something that the all of these various Sphinx extensions should look into is using @layer but that's a much larger project and maybe doesn't provide sufficient value.

I have gone through and completely restructured the CSS, only using the element selector where necessary and commenting those places.

nb_execution_mode = 'auto'
nb_execution_timeout = 60
nb_execution_allow_errors = False

nb_execution_raise_on_error = True
Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is important if we want the regression docs to also serve as some kind of regression test. It won't compare with expected outputs, but it will fail the docs build if any of the inputs raise exceptions that are not tagged with raise-exception.

This is something that we should flag upwards to make sure that everyone understands that we are reducing the power of these regression tests. We need to make sure that everyone is okay with that.

doc/source/regression/README.md Show resolved Hide resolved
Comment on lines -13 to -14
execution_allow_errors: true
execution_show_tb: true
Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

execution_allow_errors is already set globally in conf.py so it does not need to be set here.

excution_show_tb seems like something that should be set globally, so if we want it, we should put it in conf.py.

Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sure, yes, we should check what the behaviour of execution_show_tb is, based on the difference in the tracebacks when it is enabled.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can I leave that to you? I really have no idea what the execution_show_tb setting does.

Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, sorry, I meant that I'll handle it, no worries!

# DWT and IDWT

## Discrete Wavelet Transform

Let's do a Discrete Wavelet Transform of some sample data `x`
Let's do a [Discrete Wavelet Transform](ref-dwt) of some sample data `x`
Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In some places I restored the cross-references even though they won't work in the notebook file because I think they were valuable enough to keep in the docs page even if they don't work in the notebook file. I tried to do this sparingly.

Honestly, I was a bit surprised to see that this doesn't work. It really seems that one of the tools that converts to ipynb should support this.

Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It really seems that one of the tools that converts to ipynb should support this.

I also had the exact same thought over the holidays. I think supporting these kinds of references might be possible, since they are widespread in SciPy's notebooks – where we would remove the references that are caught with a regex pattern and drop HTML links to them if they are found. I shall take a look at it.

doc/source/regression/wavelet.md Show resolved Hide resolved
@@ -44,7 +24,7 @@ kernelspec:
import pywt
```

This helper function that can format arrays in a consistent manner across
This helper function can format arrays in a consistent manner across
Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This helper function, format_array(), is strange because in both documents where it is defined I never actually see it called. Do you know what that's about?

Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I notice It is also defined in a different way in a different notebook:

def format_array(arr):
    return "[%s]" % ", ".join(["%.14f" % x for x in arr])

and it is called in places like this: >>> print(format_array(wavelet.rec_lo), format_array(wavelet.rec_hi)) or this:

for n in wp.get_leaf_nodes():
    print(n.path, format_array(n.data))

but it is legacy code after all, so we should be able to drop it. I had kept it solely because it was not directly related to the rendering of the notebooks, but cleanups are always good to do.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's just odd that two of the regression docs start with defining this function but then neither of them call it. It made me wonder if there was some side effect happening, which is why I didn't edit it out.

doc/source/regression/wp.md Show resolved Hide resolved
doc/source/regression/wp.md Show resolved Hide resolved
@gabalafou gabalafou marked this pull request as ready for review January 1, 2025 00:43
Copy link
Owner

@agriyakhetarpal agriyakhetarpal left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks, @gabalafou! The notebook styling looks great! I do have a few comments about those changes:

The image shows a Python code snippet and its output. The code uses PyWavelets (pywt) to perform a discrete wavelet transform with the 'db2' wavelet on input data 'x'. The output is a numpy array containing 5 floating-point coefficients: [1.76776695, 1.73309178, 3.40612438, 6.32928585, 7.77817459].

I think the "In:" and "Out:" might be a bit too close to the border of the enclosed box, and they are not aligned with the code blocks inside it. Would moving them slightly to the right (and the code blocks text somewhat to the left) make sense, i.e., something like this:

The image shows a code cell with 'In' positioned at the top with equal left margin spacing as the code below it, showing the PyWavelets idwt function call with cA, cD, and 'db2' parameters.

There is enough breathing space here for both the "In" text (I added 0.25em left padding) and the contents of the cell, and both are aligned vertically (minus the box that now overlaps, my styling is a bit shoddy here)

Also, I feel that the use of the colon feels slightly out of place, since I've seen that it is used with long sentences and not with headings, so just "In" and "Out" might look cleaner:

The image shows the detail coefficients (cD) output from the previous wavelet transform - a numpy array containing 5 negative and positive floating-point values: [-0.61237244, -2.15599552, -5.95034847, -1.21545369, 1.22474487]. The image has simplified headers showing just "In" and "Out" without colons.

The above picture is in dark mode, where it is slightly difficult for me to interpret which cell is which, i.e., it could be nice if we could have a separate background colour for the output cell in comparison to the code cell? Here's an example from the scikit-learn documentation: https://scikit-learn.org/stable/auto_examples/applications/plot_face_recognition.html#sphx-glr-auto-examples-applications-plot-face-recognition-py, where the output has a pale yellow shade with styles provided by Sphinx-Gallery.

I would note that the Sphinx-Gallery implementation does not have an "In" text, and it also has a colon symbol (in "Out:") – but for some reason that I find hard to describe, it feels less "intrusive" to me (maybe because it's in slightly gray and is not in bold?)

Happy to hear your thoughts on this! I'll approve this PR once we are on agreement with everything, the rest of the changes look nice already.

doc/source/regression/dwt-idwt.md Show resolved Hide resolved
This page can also be run or downloaded as a Jupyter notebook.

- Run in a new tab using JupyterLite[^what-is-lite]:
```{jupyterlite} {{ parent_docname }}.ipynb
Copy link
Owner

@agriyakhetarpal agriyakhetarpal Jan 1, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
```{jupyterlite} {{ parent_docname }}.ipynb
```{notebooklite} {{ parent_docname }}.ipynb

With jupyterlite-sphinx 0.17.1, we can now use the Notebook interface instead of the Lab interface, which is much nicer for displaying an individual notebook (as noted in scipy/scipy#22161).

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh, nice! That was released two weeks ago, about when I started working on this. I remember trying the notebooklite directive first but it wasn't working locally, probably because I had an older version of jupyterlite-sphinx installed.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'll update all of the directives

@agriyakhetarpal
Copy link
Owner

For example if you scroll to the bottom of "2D Wavelet Packets" (wp2d), the numbers in the output four-by-four array are all very small, close to zero, printed in scientific notation with 9 significant digits:

whereas at pywavelets.readthedocs.io/en/latest/regression/wp2d.html there are all printed as 0 with a dot:

I noticed this also in a few other places

Ah, I just saw this comment, I missed it over the rest of them (sorry!)

I think we should stick with the new results, since they depict what the current state of the computation is. The page outside this PR is very old at this point...

@gabalafou gabalafou force-pushed the notebook-header-directive branch from 5d57ec6 to 3402f88 Compare January 2, 2025 22:23
@gabalafou
Copy link
Author

gabalafou commented Jan 2, 2025

Thanks, @agriyakhetarpal for the feedback! I pushed a few commits and left some replies to address your feedback.

Let's go over your styling suggestions one by one:

  1. ✅ Aligning the labels ("In", "Out") with the subsequent text
    • Yeah... I think you're right. Every visual design how-to I've ever read stresses the importance of alignment.
  2. ✅ Use of colon with "In", "Out"
    • I was on the fence about this when I was making it, so I say let's follow your suggestion for now and find out if anyone has a strong opposite opinion.
  3. ✅ Bold font for "In", "Out"
    • I also felt it was too bold. Since you felt the same way, I have removed the bold font rule.
  4. ❌ Remove the "In" label
    • In my first iteration, I set up the CSS so that it only applied the "In" label to notebook code cells if they had a corresponding output: no output, no label. But then I wondered if it might be useful to distinguish notebook code cells from markdown code fences. In that case the only distinguishing features (in this design) for a notebook code cell would be the left border and the "In" label, so I decided to leave it the label in place, even for inputs without corresponding outputs. Furthermore, nobody objected to the input label in the mockup that Melissa prepared, so I say let's leave it for now and see what the others think.
  5. ❌ Change the background color of the output box
    • My intent with the styles in this PR was to make the pages better match the PyData theme—for example, Monospace Blocks. The PyData theme does not yet have a full opinion on how to style and render notebook cells, and I am hesitant to create anything in this PR that doesn't re-use existing building blocks.
    • It was a little unclear to me from your feedback whether you found it difficult to distinguish output from input in either dark or light mode, or if there was something particular about dark mode that made distinguishing the two harder. If you meant that it's actually harder for you in dark mode, I'm very curious to figure out why.

To summarize, I would say that I share your feeling that the visual presentation doesn't feel quite right yet. But at this point, I would like to get something in front of more people and then iterate from there rather than to continue to iterate here in this PR. How does that sound?

@gabalafou
Copy link
Author

I think we should stick with the new results, since they depict what the current state of the computation is. The page outside this PR is very old at this point...

Hmm, but when I run that example in JupyterLite, I get the same result as the currently published docs:

print(w.d) outputs a four by four matrix whose elements are all floating point 0 (that is, shown as zero with a dot)

So it seems like something is going wrong when MyST-NB runs the cell or formats the output of the cell. (I think MyST-NB uses nbconvert internally.)

@gabalafou
Copy link
Author

gabalafou commented Jan 2, 2025

Updated the screenshots in the PR description

Copy link
Owner

@agriyakhetarpal agriyakhetarpal left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This looks good to me now. Thanks for all the changes here, @gabalafou!

Re: points 4 and 5, yes, it makes sense to iterate on this when it is standardised in PST and remove these styles here, as it doesn't make sense to repeat work in two places. I'll merge this now to get my upstream PR going.

There is just this comment about the notebook conversion that I have concerns about: #4 (comment), but maybe it makes sense to devise a solution for it at a later point in time, as it is just PyWavelets and SciPy that will use this functionality.

@agriyakhetarpal agriyakhetarpal merged commit a8f287c into agriyakhetarpal:make-usage-examples-section-interactive Jan 15, 2025
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants