Use the doctest module in get_example_data #308

Merged Jan 10, 2024 (40 commits)

asmeurer
Collaborator

@asmeurer asmeurer commented Oct 20, 2023

Fixes #282

Still several todos here:

  • Clean up code (probably don't need to define the class inside of the function)
  • Add tests
  • Check that the generated JSON is as desired (the execution status is not actually included. I'm not clear why)
  • Make it so that any doctest anywhere is run, not just ones in "Examples" (this doesn't necessarily need to be done in this PR)
  • Allow libraries to configure doctest options (like ELLIPSIS)
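
For the last item, one possible shape is sketched below: a list of flag names (as a library might write them in its TOML config) is turned into the `optionflags` bitmask that `doctest.DocTestRunner` accepts. The helpers `flags_from_names` and `run_snippet`, and the idea of a config key holding these names, are assumptions for illustration and not papyri's actual configuration; `doctest.OPTIONFLAGS_BY_NAME` is the standard-library registry of flag names.

```python
import doctest
from functools import reduce
from operator import or_


def flags_from_names(names):
    """Combine doctest option names (e.g. read from a config file) into one bitmask."""
    return reduce(or_, (doctest.OPTIONFLAGS_BY_NAME[name] for name in names), 0)


def run_snippet(text, optionflags=0):
    """Parse and run one docstring-like snippet; return (failed, attempted)."""
    test = doctest.DocTestParser().get_doctest(text, {}, "demo", "<demo>", 0)
    runner = doctest.DocTestRunner(verbose=False, optionflags=optionflags)
    # Discard the textual report; only the counts matter here.
    results = runner.run(test, out=lambda s: None)
    return results.failed, results.attempted


snippet = """
>>> list(range(20))
[0, 1, ..., 19]
"""

# Without ELLIPSIS the literal "..." does not match the real output.
strict = run_snippet(snippet)
# With ELLIPSIS the "..." matches any substring of the actual output.
relaxed = run_snippet(snippet, flags_from_names(["ELLIPSIS"]))
```

Reading names rather than integers from the config keeps the file readable and lets unknown names fail loudly with a `KeyError`.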

Here's an example:

def docstring(x):
    """
    Examples
    ========

    >>> from test_mod import docstring
    >>> a = docstring(1)
    >>> a
    2

    >>> 1 + a
    3

    >>> import matplotlib.pyplot as plt
    >>> plt.plot([0, 1], [0, 1])
    >>> plt.show()

    >>> 1 + 1
    2

    >>> syntax error

    >>> 1/0 # exception
    Traceback (most recent call last):
      File "<stdin>", line 1, in <module>
    ZeroDivisionError: division by zero

    >>> 1/0 # unexpected exception
    """
    return x + 1

__version__ = '0'

[global]
module = 'test_mod'

Generates

  "example_section_data": {
    "children": [
      {
        "type": "code",
        "value": "from test_mod import docstring\n"
      },
      {
        "type": "code",
        "value": "a = docstring(1)\n"
      },
      {
        "type": "code",
        "value": "a\n"
      },
      {
        "type": "code",
        "value": "1 + a\n"
      },
      {
        "type": "code",
        "value": "import matplotlib.pyplot as plt\n"
      },
      {
        "type": "code",
        "value": "plt.plot([0, 1], [0, 1])\n"
      },
      {
        "type": "code",
        "value": "plt.show()\n"
      },
      {
        "type": "Fig",
        "value": {
          "kind": "assets",
          "module": "test_mod",
          "path": "fig-test_mod:docstring-0-c8430bd5.png",
          "type": "RefInfo",
          "version": "0"
        }
      },
      {
        "type": "code",
        "value": "1 + 1\n"
      },
      {
        "type": "code",
        "value": "syntax error\n"
      },
      {
        "type": "code",
        "value": "1/0 # exception\n"
      },
      {
        "type": "code",
        "value": "1/0 # unexpected exception\n"
      }
    ],
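
To illustrate how an execution status could be attached to each entry, here is a minimal sketch of a `doctest.DocTestRunner` subclass that records the outcome of every example instead of printing a report. The class name `RecordingRunner` and the status strings are hypothetical and not what this PR implements; the overridden `report_*` methods are the standard doctest hooks.

```python
import doctest


class RecordingRunner(doctest.DocTestRunner):
    """Hypothetical runner that stores one (source, status) pair per example."""

    def __init__(self, **kwargs):
        super().__init__(**kwargs)
        self.statuses = []

    def report_start(self, out, test, example):
        # Nothing to print before an example runs.
        pass

    def report_success(self, out, test, example, got):
        self.statuses.append((example.source, "success"))

    def report_failure(self, out, test, example, got):
        # Called when the code ran but the output did not match.
        self.statuses.append((example.source, "failure"))

    def report_unexpected_exception(self, out, test, example, exc_info):
        self.statuses.append((example.source, "unexpected_exception"))


docstring = """
>>> 1 + 1
2
>>> 1 + 1
3
>>> raise ValueError('boom')
Traceback (most recent call last):
  ...
ValueError: boom
>>> 1/0 # unexpected exception
"""

test = doctest.DocTestParser().get_doctest(docstring, {}, "demo", "<demo>", 0)
runner = RecordingRunner(verbose=False)
runner.run(test, out=lambda s: None)
```

After the run, `runner.statuses` holds one pair per `>>>` example, in order, which could be zipped with the generated code entries.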

@asmeurer asmeurer marked this pull request as draft October 20, 2023 22:00
They aren't actually that important for doctests because they are only used
for the reporting, which we are bypassing anyways.
@asmeurer
Collaborator Author

I think the main thing that needs to be done here now is to take a look at the generated JSON and see if we like how it looks. It's not hard to change what is there.

Doesn't do anything right now because success/failure isn't saved
@Carreau
Member

Carreau commented Nov 16, 2023

To do for me:

  • Check why the exec_status is in the JSON
  • Check whether the output of doctests is in the JSON.
  • Add the test examples.

@asmeurer asmeurer marked this pull request as ready for review November 16, 2023 17:20
@asmeurer
Collaborator Author

You should also just review the code, and run this against the existing example configurations to make sure nothing funny is happening.

@Carreau
Member

Carreau commented Nov 17, 2023

I pushed a commit that injects a debug function instead of lambda s: None.

It seems that some of the parsing is incorrect, as I get:

$ papyri gen examples/papyri.toml --only papyri.examples:example1
...
Unexpected exception (<class 'SyntaxError'>, SyntaxError('multiple statements found while compiling a single statement', ('<doctest example1[0]>', 1, 32, 'import matplotlib.pyplot as plt\n', 1, 32)), <traceback object at 0x1202d7a40>)

Note that this debug message makes it look like the exec(compile(..., 'single')) in doctest got a single line with \n at the end, but it actually gets multiple lines.

I'm not sure why this is happening or why the code here is wrong. I'll investigate.

@Carreau
Member

Carreau commented Nov 17, 2023

Ha, I think it considers ... as continuation always. So replacing ... with >>> in a couple of places works.

And that makes me realize we should have a custom parser in IPython/testing/plugin/ipdoctest.py
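
A small reproduction of the continuation behaviour described above, using only the standard library. The docstring shape is an assumption about what the failing example looked like (a new statement written with the `...` prompt); the variable names are illustrative.

```python
import doctest

# Assumed shape of the problem: the second line is a new statement,
# but it is written with the continuation prompt.
text = """
>>> x = 1
... x + 1
"""

# The parser joins both lines into the source of a single Example.
(example,) = doctest.DocTestParser().get_examples(text)

# doctest compiles each Example in 'single' mode, which accepts exactly
# one statement, so the joined source is rejected.
try:
    compile(example.source, "<demo>", "single")
    error = None
except SyntaxError as exc:
    error = exc.msg

# Replacing the continuation prompt with >>> yields two separate Examples.
fixed = doctest.DocTestParser().get_examples(text.replace("...", ">>>"))
```

Here `example.source` is the two-line string `"x = 1\nx + 1\n"`, which is why the error message only shows the first line while the compiler actually received both.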

@asmeurer
Collaborator Author

asmeurer commented Dec 1, 2023

In the future, do not force push to other people's branches.

@asmeurer
Collaborator Author

There seems to be a segfault from one of the plots in the np.sinc doctest

Fatal Python error: Segmentation fault

Current thread 0x00000002016d0240 (most recent call first):
  File "/Users/aaronmeurer/anaconda3/envs/papyri/lib/python3.11/site-packages/numpy/core/fromnumeric.py", line 43 in _wrapit
  File "/Users/aaronmeurer/anaconda3/envs/papyri/lib/python3.11/site-packages/numpy/core/fromnumeric.py", line 54 in _wrapfunc
  File "/Users/aaronmeurer/anaconda3/envs/papyri/lib/python3.11/site-packages/numpy/core/fromnumeric.py", line 2597 in cumsum
  File "<__array_function__ internals>", line 200 in cumsum
  File "/Users/aaronmeurer/anaconda3/envs/papyri/lib/python3.11/site-packages/matplotlib/gridspec.py", line 193 in get_grid_positions
  File "/Users/aaronmeurer/anaconda3/envs/papyri/lib/python3.11/site-packages/matplotlib/_api/deprecation.py", line 384 in wrapper
  File "/Users/aaronmeurer/anaconda3/envs/papyri/lib/python3.11/site-packages/matplotlib/gridspec.py", line 665 in get_position
  File "/Users/aaronmeurer/anaconda3/envs/papyri/lib/python3.11/site-packages/matplotlib/axes/_base.py", line 793 in set_subplotspec
  File "/Users/aaronmeurer/anaconda3/envs/papyri/lib/python3.11/site-packages/matplotlib/axes/_base.py", line 661 in __init__
  File "/Users/aaronmeurer/anaconda3/envs/papyri/lib/python3.11/site-packages/matplotlib/figure.py", line 757 in add_subplot
  File "/Users/aaronmeurer/anaconda3/envs/papyri/lib/python3.11/site-packages/matplotlib/figure.py", line 1628 in gca
  File "/Users/aaronmeurer/anaconda3/envs/papyri/lib/python3.11/site-packages/matplotlib/pyplot.py", line 2309 in gca
  File "/Users/aaronmeurer/anaconda3/envs/papyri/lib/python3.11/site-packages/matplotlib/pyplot.py", line 3084 in title
  File "<doctest sinc[1]>", line 1 in <module>
  File "/Users/aaronmeurer/anaconda3/envs/papyri/lib/python3.11/doctest.py", line 1351 in __run
  File "/Users/aaronmeurer/anaconda3/envs/papyri/lib/python3.11/doctest.py", line 1498 in run
  File "/Users/aaronmeurer/Documents/papyri/papyri/gen.py", line 1335 in get_example_data
  File "/Users/aaronmeurer/Documents/papyri/papyri/gen.py", line 1655 in prepare_doc_for_one_object
  File "/Users/aaronmeurer/Documents/papyri/papyri/gen.py", line 2129 in collect_api_docs
  File "/Users/aaronmeurer/Documents/papyri/papyri/gen.py", line 558 in gen_main
  File "/Users/aaronmeurer/Documents/papyri/papyri/__init__.py", line 474 in gen
  File "/Users/aaronmeurer/anaconda3/envs/papyri/lib/python3.11/site-packages/typer/main.py", line 683 in wrapper
  File "/Users/aaronmeurer/anaconda3/envs/papyri/lib/python3.11/site-packages/click/core.py", line 760 in invoke
  File "/Users/aaronmeurer/anaconda3/envs/papyri/lib/python3.11/site-packages/click/core.py", line 1404 in invoke
  File "/Users/aaronmeurer/anaconda3/envs/papyri/lib/python3.11/site-packages/click/core.py", line 1657 in invoke
  File "/Users/aaronmeurer/anaconda3/envs/papyri/lib/python3.11/site-packages/typer/core.py", line 216 in _main
  File "/Users/aaronmeurer/anaconda3/envs/papyri/lib/python3.11/site-packages/typer/core.py", line 778 in main
  File "/Users/aaronmeurer/anaconda3/envs/papyri/lib/python3.11/site-packages/click/core.py", line 1130 in __call__
  File "/Users/aaronmeurer/anaconda3/envs/papyri/lib/python3.11/site-packages/typer/main.py", line 311 in __call__
  File "/Users/aaronmeurer/Documents/papyri/papyri/__main__.py", line 3 in <module>
  File "<frozen runpy>", line 88 in _run_code
  File "<frozen runpy>", line 198 in _run_module_as_main

Although there's a separate question which is why the doctests are being run at all with --no-exec.

@asmeurer
Collaborator Author

I fixed the --no-exec flag. The segfault doesn't happen on main, though. I'm guessing it has something to do with the fig managers.

@Carreau
Member

Carreau commented Dec 16, 2023

I think the culprit is set_numeric_ops (I've pushed a commit that deactivates it), which replaces addition with addition mod 5 globally. It might still be a bug, but as it's deprecated, maybe it's not worth our time tracking it down.

I've pushed a commit that excludes just this function from being executed.

I was also able to reproduce just with

papyri gen examples/numpy.toml --no-narrative --only numpy:set_numeric_ops --only numpy:sinc

papyri/gen.py Outdated
Comment on lines 1337 to 1339
doctests = doctest.DocTestParser().get_doctest(
block, doctest_runner.globs, obj.__name__, filename, lineno
)
Member

get_doctest also completely drops interleaving text, so we'll need to fall back to parse(...) and create the DocTest object ourselves here.
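
A minimal sketch of that fallback, assuming nothing about the final papyri code: `DocTestParser.parse` returns the text pieces and `Example` objects in document order, and a `doctest.DocTest` can be built from the examples by hand. The `children` layout mirrors the JSON shown earlier in this thread, but the variable names and sample text are illustrative.

```python
import doctest

block = """Some intro text.

>>> x = 1
>>> x + 1
2

More text between examples.

>>> x * 3
3
"""

parser = doctest.DocTestParser()
children = []
examples = []
for piece in parser.parse(block, name="demo"):
    if isinstance(piece, doctest.Example):
        examples.append(piece)
        children.append({"type": "code", "value": piece.source})
    elif piece.strip():
        # parse() also yields empty strings between adjacent examples;
        # only non-blank text is kept here.
        children.append({"type": "text", "value": piece})

# Build the DocTest ourselves, which is what get_doctest() does internally
# after discarding the text pieces.
test = doctest.DocTest(examples, {}, "demo", "<demo>", 0, block)
```

Whether whitespace-only pieces should be dropped, as done here, or kept as `"\n"` text nodes is a design choice left open by this sketch.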

@asmeurer
Collaborator Author

asmeurer commented Jan 4, 2024

I've fixed the parser to be more robust for interleaving text. It now properly handles the case where text is right before an example. My main concerns now are if we are actually including everything we want in the output JSON.

In particular, the JSON output doesn't include the prompts (>>> and ...), and it doesn't include the outputs of the doctests. This is the case even when exec=false. We should presumably fix it to include the outputs, but note that this is also the case in main. For example, here's np.select in main:

  "example_section_data": {
    "children": [
      {
        "type": "code",
        "value": "x = np.arange(6)\ncondlist = [x<3, x>3]\nchoicelist = [x, x**2]\nnp.select(condlist, choicelist, 42)"
      },
      {
        "type": "code",
        "value": "condlist = [x<=4, x>3]\nchoicelist = [x, x**2]\nnp.select(condlist, choicelist, 55)"
      }
    ]

and in this branch

  "example_section_data": {
    "children": [
      {
        "type": "code",
        "value": "x = np.arange(6)\n"
      },
      {
        "type": "code",
        "value": "condlist = [x<3, x>3]\n"
      },
      {
        "type": "code",
        "value": "choicelist = [x, x**2]\n"
      },
      {
        "type": "code",
        "value": "np.select(condlist, choicelist, 42)\n"
      },
      {
        "type": "text",
        "value": "\n"
      },
      {
        "type": "code",
        "value": "condlist = [x<=4, x>3]\n"
      },
      {
        "type": "code",
        "value": "choicelist = [x, x**2]\n"
      },
      {
        "type": "code",
        "value": "np.select(condlist, choicelist, 55)\n"
      }
    ],

Compare the actual docstring:

    Examples
    --------
    >>> x = np.arange(6)
    >>> condlist = [x<3, x>3]
    >>> choicelist = [x, x**2]
    >>> np.select(condlist, choicelist, 42)
    array([ 0,  1,  2, 42, 16, 25])

    >>> condlist = [x<=4, x>3]
    >>> choicelist = [x, x**2]
    >>> np.select(condlist, choicelist, 55)
    array([ 0,  1,  2,  3,  4, 25])

Note the array([ 0, 1, 2, 3, 4, 25]) bits aren't included in the JSON anywhere.

papyri/gen.py Outdated
)
)
)
figs.extend(figs)
Member

Suggested change
- figs.extend(figs)
+ self.figs.extend(figs)

I think; otherwise no figures will be saved, I believe.
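
A tiny illustration of why the original line is a no-op for the collected figures, assuming (as the suggestion implies) a local `figs` list next to an attribute `self.figs`. The `FigureCollector` class and file names are made up for the demonstration.

```python
class FigureCollector:
    """Hypothetical stand-in for the object that owns ``self.figs``."""

    def __init__(self):
        self.figs = []

    def add_buggy(self, figs):
        # Extends the local list with a copy of itself; self.figs is untouched.
        figs.extend(figs)

    def add_fixed(self, figs):
        # Appends the new figures to the collector's own list.
        self.figs.extend(figs)


collector = FigureCollector()
new_figs = ["fig-0.png"]
collector.add_buggy(new_figs)
after_buggy = (list(collector.figs), list(new_figs))

collector.add_fixed(["fig-1.png"])
```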

@@ -41,3 +41,4 @@ exclude = [ "dask.utils:Dispatch",

#docs_path = "~/dev/dask/docs/source"
exec_failure = 'fallback'
execute_doctests = false
Member

You introduced this but are not using it; I'm guessing you intended it to replace exec. I pushed a commit that renames all usages of config.exec to config.execute_doctests so that it works as intended. Your naming is better.

@Carreau
Member

Carreau commented Jan 10, 2024

OK, tests are passing, let's merge and move on.

@Carreau Carreau merged commit 1861202 into jupyter:main Jan 10, 2024
12 checks passed
Carreau added a commit to Carreau/papyri that referenced this pull request Jan 11, 2024
This is incomplete but I'm going to try to deal with the following:

1) after jupyter#308, each line in a block example is its own code block, so
   try to collapse subsequent code blocks.

2) It seems that we get a number of report_failure calls, but failure is
   just when the output does not match; when we have a block with multiple
   >>> in sequence and we ignore output on purpose, they are now seen as
   failures.

It's not super great and will need a bunch of workarounds.
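
As a rough sketch of point 1, adjacent `"code"` entries can be merged in a single pass while text and figure entries act as separators. The function name `collapse_code` is hypothetical, and the sample data is the `np.select` output quoted earlier in the conversation; this is not the code from the referenced commit.

```python
def collapse_code(children):
    """Merge runs of consecutive {"type": "code"} entries into one entry each."""
    merged = []
    for child in children:
        if merged and child["type"] == "code" and merged[-1]["type"] == "code":
            # Append this line to the previous code entry.
            merged[-1] = {
                "type": "code",
                "value": merged[-1]["value"] + child["value"],
            }
        else:
            # Copy so the input list is left unmodified.
            merged.append(dict(child))
    return merged


children = [
    {"type": "code", "value": "x = np.arange(6)\n"},
    {"type": "code", "value": "condlist = [x<3, x>3]\n"},
    {"type": "code", "value": "choicelist = [x, x**2]\n"},
    {"type": "code", "value": "np.select(condlist, choicelist, 42)\n"},
    {"type": "text", "value": "\n"},
    {"type": "code", "value": "condlist = [x<=4, x>3]\n"},
    {"type": "code", "value": "choicelist = [x, x**2]\n"},
    {"type": "code", "value": "np.select(condlist, choicelist, 55)\n"},
]

collapsed = collapse_code(children)
```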
Carreau added a commit to Carreau/papyri that referenced this pull request Jan 18, 2024
Successfully merging this pull request may close these issues.

Rework example execution use doctests.
2 participants