Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Correctly handle case where "See Also" section follows "Examples" in global_enable_try_examples #251

Conversation

agriyakhetarpal
Copy link
Member

Description

This PR is a bug fix for a problem I noticed in sympy/sympy#27419. SymPy uses a vendored version of numpydoc that was added twelve years back based on https://github.com/numpy/numpydoc; both have diverged in implementation and complexity since then, and SymPy's numpydoc doesn't enforce a particular order for each section, unlike the current numpydoc, which can automatically order/reorder sections. This causes the following behaviour:

Tap to show "Before"
The image shows the documentation page for SymPy 1.14.dev's number theory module, specifically documenting the sympy.ntheory.factor_.multiplicity(p, n) function. The page features a green snake-wrapped-S logo and a dark green sidebar with navigation sections. The main content includes function documentation and an interactive JupyterLite notebook example showing function usage. Notably, the notebook contains an extra erroneous or rather misplaced 'seealso' section with :obj:~.trailing'` appended at the bottom, which appears to be a documentation artifact that shouldn't be part of the interactive example. The page also includes proper 'See also' references at the top linking to 'factorint' and 'smoothness' functions, and a right sidebar showing the 'Ntheory Class Reference' and related functions.

where the See Also section percolates into the notebook. With this fix in place applied, we have

Tap to show "After"
The image shows the documentation page for SymPy 1.14.dev's number theory module, specifically documenting the sympy.ntheory.factor_.multiplicity(p, n) function. The page features a green snake-wrapped-S logo and a dark green sidebar with navigation sections. The main content includes function documentation and an interactive JupyterLite notebook example showing function usage. Unlike the previous version noted above, this notebook correctly does not contain the erroneous 'seealso' section at the bottom, showing the proper behaviour. The page maintains its standard 'See also' references at the top linking to 'factorint' and 'smoothness' functions, and includes a right sidebar showing the 'Ntheory Class Reference' and related functions.

where the section no longer appears in the notebook, and we stop at the end of the Examples section, as intended.

Additional context

It would be nice to update the SymPy documentation infrastructure to use the latest numpydoc so that the documentation is consistent with that for other packages. However, considering that its custom numpydoc has had multiple stylistic deviations by now, I've left it to be a follow-up task for me after the PR linked above gets merged. So, supporting SymPy in a special case in this manner doesn't sound too bad to me, considering that this change is trivial.

@agriyakhetarpal agriyakhetarpal added the bug Something isn't working label Jan 10, 2025
@agriyakhetarpal agriyakhetarpal mentioned this pull request Jan 10, 2025
3 tasks
@agriyakhetarpal agriyakhetarpal added this to the 0.18.0 milestone Jan 10, 2025
Comment on lines 318 to 320
# The See Also section shows up for old numpydoc versions where ordering
# is neither guaranteed nor enforced; such as SymPy, therefore we check
# for it as a special case.
Copy link
Collaborator

@steppi steppi Jan 11, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

"See Also" is included in _non_example_docstring_section_headers, so I don't think the issue can hinge just on the ability of "See Also" sections to appear after an "Examples" sections in older numpydocs versions. It looks like in SymPy, after numpydoc preprocessing, the line for the See Also section is ".. seealso::", but the existing logic expects it to look like ".. rubric:: See Also". I think the real issue is that the older numpydoc version uses a different format for the directives that get inserted during processing. It just so happens that SymPy has a "See Also" section after each "Examples" section, but I think any section appearing afterwards would fail to be detected.

If we want to support older numpydoc versions, then I think the principled solution would then be to either:

  1. Add the format used by older numpydoc versions to _next_section_pattern
    or
  2. Detect the numpydoc version at runtime, and have _next_section_start_pattern depend on that version

update: or it could even be that the directives even for newer numpydoc versions are different from what I expected, but I had it right for every section that can appear after "Examples" with newer numpydoc versions.

Copy link
Member Author

@agriyakhetarpal agriyakhetarpal Jan 11, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It just so happens that SymPy has a "See Also" section after each "Examples" section, but I think any section appearing afterwards would fail to be detected.

Ah, you're right – thanks for pointing that out! I think it would be nice to support older numpydoc versions in whatever manner we can, but I think option 2) is not possible with SymPy's numpydoc at the moment, as it is present as a custom Sphinx extension in docs/sphinxext/ across two files, and not as an installable package fetched from PyPI or a separate location that we can compare or detect versions for. So we are left with 1), which I've done here.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actually, the See Also section seems to be a case that has been handled specially: https://github.com/sympy/sympy/blob/3c33a8283cd02e2e012b57435bd6212ed915a0bc/doc/ext/docscrape_sphinx.py#L155-L161

And it doesn't seem to be the case for the "Notes", "Examples", or "References" sections: https://github.com/sympy/sympy/blob/3c33a8283cd02e2e012b57435bd6212ed915a0bc/doc/ext/docscrape.py#L424, which are not included in Try Ex

I haven't dived into the code in detail, but I infer that the difference is because the "See Also" section is an admonition in SymPy's numpydoc, and not a rubric. I'd suggest that we stick with the current approach for now, and add any other cases here if someone notices them, until I can migrate SymPy to the upstream numpydoc and remove their implementation (that is, I'm short of ideas, unless you have some).

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If this is working, it's fine to do this hack to get SymPy working specifically, but the comment is incorrect and should be updated. It's not about the ordering but because of the form the directive takes in sympy's version of numpydocs.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, makes sense to me. I hope 5f187f3 addresses it!

Comment on lines 318 to 320
# The See Also section shows up for old numpydoc versions such as that
# of SymPy where it is accounted for as an admonition and not a rubric,
# therefore we check for it as a special case for now.
Copy link
Collaborator

@steppi steppi Jan 12, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is again not quite right. If you check line 311, it also checks for admonitions. The regex rf"\.\. (rubric|admonition)::\s*{header} will match .. followed by one of rubric or admonition, followed by any number of whitespace, followed by one of the headers in _non_example_docstring_headers.

But in at least SymPy, the See Also section doesn't get a directive like .. admonition:: See Also but just .. seealso:. What I'm curious about is, for current versions of numpydocs, does a See Also section get the directive .. admonition:: See Also like I assumed, or was I incorrect about this?

Since the above logic does check for See Also sections, it can't be relevant that the See Also section shows up for old numpydoc versions.

Suggested change
# The See Also section shows up for old numpydoc versions such as that
# of SymPy where it is accounted for as an admonition and not a rubric,
# therefore we check for it as a special case for now.
# SymPy uses a custom numpydoc version which has a different pattern for
# the generated directives. Rather than investigating further, take advantage
# of See Also being the only section appearing after Examples in SymPy and
# hardcode the directive that appears in SymPy. This can be removed after
# SymPy is updated to use regular numpydoc.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

At some point someone should sit down and figure out what the generated directives can actually look like for each section header in mainline numpydoc.

Copy link
Member Author

@agriyakhetarpal agriyakhetarpal Jan 12, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I spent some time debugging this and while I couldn't get a definite answer about what happens where in the code, here's what I have realised –

There are two things at play here:

  1. There is a a .. seealso:: directive that's native to Sphinx, and a "See Also" section that upstream numpydoc can interpret and convert to a rubric/admonition, which we correctly parse and escape with TryExamples' _next_section_pattern if we retain compatibility with upstream numpydoc.
  2. However, SymPy's numpydoc is dated and can't do this for SymPy's documentation, and thus we don't get to catch it.

Therefore, a "See Also" section in the documentation for SciPy for an example, will caught in numpydoc, and this problem isn't specific to just .. seealso::, all native Sphinx-based directives won't get caught. It's just that this hasn't been reported as a bug before because of the following reasons:

A. most projects don't put anything after "Examples", and only SymPy that we know of (so far) has done so
B. Upstream numpydoc does indeed reorder the sections to some extent (which was my original comment a few commits ago, so I was right). Even if a project does add a section after "Examples", it gets caught by the pattern or re-ordered before it's rendered in the HTML.

As a confirmation for point A, I tried another Sphinx directive from the page I linked above, such as .. versionadded::, and here we go:

SymPy documentation screenshot with erroneous versionadded directive
image

The same experiment with NumPy confirms point B. For example, the docstring below:

NumPy's exceptions.AxisError docstring (modified)


"""
    Examples
    --------
    >>> import numpy as np
    >>> array_1d = np.arange(10)
    >>> np.cumsum(array_1d, axis=1)
    Traceback (most recent call last):
      ...
    numpy.exceptions.AxisError: axis 1 is out of bounds for array of dimension 1

    Negative axes are preserved:

    >>> np.cumsum(array_1d, axis=-2)
    Traceback (most recent call last):
      ...
    numpy.exceptions.AxisError: axis -2 is out of bounds for array of dimension 1

    The class constructor generally takes the axis and arrays'
    dimensionality as arguments:

    >>> print(np.exceptions.AxisError(2, 1, msg_prefix='error'))
    error: axis 2 is out of bounds for array of dimension 1

    Alternatively, a custom exception message can be passed:

    >>> print(np.exceptions.AxisError('Custom error message'))
    Custom error message

    See Also
    --------

    numpy.core._exceptions._AxisError : The internal version of this exception.

    Attributes
    ----------
    axis : int, optional
        The out of bounds axis or ``None`` if a custom exception
        message was provided. This should be the axis as passed by
        the user, before any normalization to resolve negative indices.

        .. versionadded:: 1.22
    ndim : int, optional
        The number of array dimensions or ``None`` if a custom exception
        message was provided.

        .. versionadded:: 1.22
"""

where the "See Also" section is after the "Examples" section – where we could assume that it would be ordered in the same manner. However, generating the NumPy docs, it shows up before the examples:

Screenshot 1 from NumPy documentation

image

But, if you were to add the .. versionadded:: directive or similar ones for NumPy, they are not reordered. I added a .. versionchanged:: Agriya debugging directive after the Examples section, and this is what we get:

NumPy documentation screenshots 2 and 3 with a native Sphinx-based directive after the Examples section


The HTML

image

and the interactive notebook

image

Copy link
Member Author

@agriyakhetarpal agriyakhetarpal Jan 12, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We'll need to add almost all of the directives listed under https://www.sphinx-doc.org/en/master/usage/restructuredtext/directives.html#paragraph-level-markup to be able to restrict them from percolating into the notebooks; I don't think it makes sense to leave it with just .. seealso especially if others like ..versionadded are also included with both SymPy's numpydoc and the upstream numpydoc.

I used this command

from docutils.parsers.rst import directives
all_directives = list(directives._directive_registry.keys())

to get a list:

['attention',
 'caution',
 'code',
 'danger',
 'error',
 'important',
 'note',
 'tip',
 'hint',
 'warning',
 'admonition',
 'sidebar',
 'topic',
 'line-block',
 'parsed-literal',
 'math',
 'rubric',
 'epigraph',
 'highlights',
 'pull-quote',
 'compound',
 'container',
 'table',
 'csv-table',
 'list-table',
 'image',
 'figure',
 'contents',
 'sectnum',
 'header',
 'footer',
 'target-notes',
 'meta',
 'raw',
 'include',
 'replace',
 'unicode',
 'class',
 'role',
 'default-role',
 'title',
 'date',
 'restructuredtext-test-directive']

and we know some Sphinx-specific directives (outside docutils) manually that aren't present in this list:

['seealso',
 'versionadded',
 'versionchanged',
 'deprecated',
 'versionremoved',
 'index'
]

Only a subset of them will need to be added to the regex pattern, I assume?

(Edit: added a subset, please keep reading below)

Copy link
Member Author

@agriyakhetarpal agriyakhetarpal Jan 12, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I remembered this numpydoc PR from last year that was xref-ed from the NumPy PR where I had reported some incorrect ordering of the Attributes section, as it was showing up after the Examples section. It sheds some light on how the sections are ordered/reordered: numpy/numpydoc#571.

The order preservation is mandated here: https://github.com/numpy/numpydoc/blob/6d5eb428b8acea6ced4d86d086e36daf3f03ffbf/numpydoc/validate.py#L28-L41, and via an error message in https://github.com/numpy/numpydoc/blob/6d5eb428b8acea6ced4d86d086e36daf3f03ffbf/numpydoc/validate.py#L61. We know that dictionaries are ordered in Python post 3.7.

Comparing the relevant code with SymPy's numpydoc: https://github.com/sympy/sympy/blob/3c33a8283cd02e2e012b57435bd6212ed915a0bc/doc/ext/docscrape_sphinx.py#L208-L240, not only are the headings and sections different, but also that upstream numpydoc has a structural templating mechanism thingy where an order is enforced, and SymPy's version explicitly calls _str_see_also at the point where it wants the section to appear.

So, to answer your question (quoting it below) with all I've mentioned:

But in at least SymPy, the See Also section doesn't get a directive like .. admonition:: See Also but just .. seealso:. What I'm curious about is, for current versions of numpydocs, does a See Also section get the directive .. admonition:: See Also like I assumed, or was I incorrect about this?

Since the above logic does check for See Also sections, it can't be relevant that the See Also section shows up for old numpydoc versions.

If upstream numpydoc had not enforced the order, we would have also faced this SymPy-specific problem in NumPy. It is about the ordering of the sections rather than what the sections are themselves (whether Sphinx directives or named sections in numpydoc).

Upstream numpydoc also has https://github.com/numpy/numpydoc/blob/6d5eb428b8acea6ced4d86d086e36daf3f03ffbf/numpydoc/validate.py#L24-L27 to run checks version-changed directives like the above, but it does not enforce an order for them as they are not a part of standard sections but are directives.

So, I think we should indeed add at least a few of these directives to the patterns we need to ignore – maybe just the most commonly used ones from the above list will work. This would help in both getting SymPy to play nicely, and in the scenario that some docstrings are incorrectly added to any other projects that use the modern numpydoc, we don't let that mistake in with the interactive examples.

Copy link
Member Author

@agriyakhetarpal agriyakhetarpal Jan 12, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

At the same time, if someone were to put a directive in the middle of a doctest, that could cause a bug report for us because the doctest would end midway... but I feel we're technically correct and we could say "works as intended" and close it.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've added a minimal set of directives to halt at in commit 79e864e and this shouldn't cause any problematic behaviour with most websites, while also getting us working with SymPy.

Copy link
Collaborator

@steppi steppi Jan 13, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm still confused. The documentation for TryExamples says this

image

Can you end up with a raw Sphinx heading directive inside of the processed docstring when using numpydoc or sphinx.ext.napoleon when using them properly?

I still think if the current implementation only works because of the enforced ordering, then it's because it was only working incidentally because of the enforced ordering, because the code is trying to catch all possible next section directive patterns for numpydoc.

@Carreau
Copy link
Collaborator

Carreau commented Jan 13, 2025

Trying to catch up with a lot of tings, but one question I have is: is it worth doing just for sympy ? If there are different way of doing the same things, one of which is the large minority, isn't it better in the long run to update all places that use the least use way ?

I got that last year with many project using .. directive[space]:: (while usually there is not space); it was easier to update the few dozen numpy/scipy issues than support the presence of spaces.

Maybe spending the time now to use a newer version of numpydoc would be worth it ?

Comment on lines 316 to 326
"class",
"contents",
"epigraph",
"footer",
"header",
"highlights",
"parsed-literal",
"pull-quote",
"seealso",
"sidebar",
"topic",
Copy link
Collaborator

@steppi steppi Jan 13, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Some of these like "pull-quote" and "parsed-literal" seem like they would be reasonable to put in an examples section. Others would be very strange to put in the middle of an examples section when using numpydoc. I made it so that the next section had to be a numpydoc section header because I kept running into new directives that were used in the middle of examples sections. Rather than using other directives as a delimiter, what I did was map them to markdown so they appear in the notebooks (math, literal blocks, reference), or ignore them so they don't appear in notebooks (plot, only).

The intention was only to support numpydoc (and sphinx.ext.napoleon because it is compatible) and this is clearly documented. Are there reasonable uses of numpydoc that are not currently supported? I think it's pretty reasonable to say, "if you're using global_enable_try_examples, then an example section is until the next numpydoc section header or the end of the docstring, if you want to delimit the section differently, manually insert the try_examples directive". If there are directives which can be used examples section which aren't currently being handled, by either mapping them into the notebooks or ignoring them in the notebooks, then that is something that should be fixed, but perhaps effort should only be spent on any particular case if we see it in the wild.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

OK, here's what I propose. ignore_directives should be changed to explicit_ignore_directives. If any directive appears in an examples section which is not in explicit_ignore_directives or not mapped to markdown in some way, it should be implicity ignored, but there should be an informative warning.

@agriyakhetarpal
Copy link
Member Author

Thanks @Carreau and @steppi for the reviews. I think we have two options:

A. I can fix this on SymPy's side by porting some rudimentary reordering to SymPy's numpydoc. It would be only for this case, so that the See Also section appears before the Examples and we don't change anything else – based on @Carreau's review in #251 (comment).
B. proceed with @steppi's idea of adding explicit_ignore_directives and implicit_ignore_directives in #251 (comment)

The first option has the benefit that we don't need to customise our code to support some other project that we have no control over, and the second option's benefit is that it might lead to fewer issues in the long run and we have the requisite structure for doing it already.

What should we opt for?

@steppi
Copy link
Collaborator

steppi commented Jan 13, 2025

Thanks @Carreau and @steppi for the reviews. I think we have two options:

A. I can fix this on SymPy's side by porting some rudimentary reordering to SymPy's numpydoc. It would be only for this case, so that the See Also section appears before the Examples and we don't change anything else – based on @Carreau's review in #251 (comment). B. proceed with @steppi's idea of adding explicit_ignore_directives and implicit_ignore_directives in #251 (comment)

The first option has the benefit that we don't need to customise our code to support some other project that we have no control over, and the second option's benefit is that it might lead to fewer issues in the long run and we have the requisite structure for doing it already.

What should we opt for?

I think both A. and B. should be done, but with different priority levels. A is the fix for the immediate issue with SymPy, but there being additional directives which could appear in notebooks shouldn't just be swept aside. It makes more sense to ignore them by default than just keep in the notebook, but in any case, I think this PR could be closed, and B. could be left for later.

@agriyakhetarpal
Copy link
Member Author

Albert and I met just some moments ago, and this is our resolution:

  • "See Also" heading is also a section that needs to be included – we weren't doing this before, and it is a valid bug as per our current documentation for TryExamples. NumPy, SciPy, and co. were saved only because modern numpydoc can reorder sections during processing. So, this isn't specific to SymPy at all, but was just caught in an extraordinary case because SymPy did not use modern numpydoc and can't reorder the sections.
  • More Sphinx directives, such as those for tables and for embedding images/figures can be added, similar to how we convert URLs from reST syntax to Markdown syntax.
  • The list of directives to halt/ignore will also be revised, and a warning will be displayed with an invitation to open a new issue if they feel a particular directive is appropriate to use within an "Examples" section, and we can decide ad-hoc on if we want to support that.

Items 2 and 3 are low-priority for now, and I'll open separate issues to track them.

@steppi
Copy link
Collaborator

steppi commented Jan 13, 2025

  • "See Also" heading is also a section that needs to be included – we weren't doing this before, and it is a valid bug as per our current documentation for TryExamples. NumPy, SciPy, and co. were saved only because modern numpydoc can reorder sections during processing. So, this isn't specific to SymPy at all, but was just caught in an extraordinary case because SymPy did not use modern numpydoc and can't reorder the sections.

I just want to clarify, that the existing code before this PR also considers "See Also" to be a section that should be handled, but it simply has the wrong pattern for matching a "See Also" section. I didn't realize that numpydoc maps a "See Also" section to a .. seealso: directive, and just didn't notice this because the order enforcement meant "See Also" couldn't appear after "Examples". The documentation says TryExamples should also support sphinx.ext.napoleon though, which I think may not have the same order enforcement.

@agriyakhetarpal
Copy link
Member Author

Yes, from a quick skim of the code, sphinx.ext.napoleon does not enforce a canonical order like numpydoc does, and rather, it preserves the order of a particular docstring – because it processes the sections in the order that they appear: https://github.com/sphinx-doc/sphinx/blob/e7dd42ea9110a2eeb8ebc10f4bf8be31f0700b61/sphinx/ext/napoleon/docstring.py#L373-L408

That does offer less consistency and more flexibility in comparison to numpydoc, though that's a topic for another discussion.

FWIW, Napoleon also uses the .. seealso:: directive and creates a section out of it: https://github.com/sphinx-doc/sphinx/blob/e7dd42ea9110a2eeb8ebc10f4bf8be31f0700b61/sphinx/ext/napoleon/docstring.py#L1035-L1036 via https://github.com/sphinx-doc/sphinx/blob/e7dd42ea9110a2eeb8ebc10f4bf8be31f0700b61/sphinx/ext/napoleon/docstring.py#L615-L623, so we should indeed halt at the .. seealso:: directive that has now been done in this PR; this change covers both sphinx.ext.napoleon and numpydoc.

Copy link
Collaborator

@steppi steppi left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @agriyakhetarpal. This looks good to me now. I can work on the follow up for handling other section headers which aren't mapped to admonition or rubric, and for ignoring other unhandled directives. I feel a sense of responsibility over these things since I was the one who missed them when implementing try_examples

@steppi steppi merged commit a908c14 into jupyterlite:main Jan 13, 2025
5 checks passed
@steppi steppi changed the title Add the "See Also" section to the TryExamples directive's trailing section ignore patterns Correctly handle case where "See Also" section follows "Examples" in `global_enable_try_examples" Jan 13, 2025
@steppi steppi changed the title Correctly handle case where "See Also" section follows "Examples" in `global_enable_try_examples" Correctly handle case where "See Also" section follows "Examples" in global_enable_try_examples Jan 13, 2025
@agriyakhetarpal agriyakhetarpal deleted the fix/don't-add-see-also-trailing-sections branch January 13, 2025 16:50
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something isn't working
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants