-
Notifications
You must be signed in to change notification settings - Fork 866
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
API pages and inventory #1220
Comments
I am not opposed to having an API document. It would likely be helpful to extension developers. However, it should not replace the existing user documentation. I am a firm believer that user documentation should be write as prose, not fragmented code comments. Of course, developers also need the technical specs, which is where API docs can complement the prose (although, personally, I prefer browsing the source code). Therefore, if it is done correctly, the two can compliment each other. My thinking is that a new page ( In theory, any function/method/class which is not marked as private (using Python's convention of beginning the name with an underscore) is public (the one exception being double underscore methods which can be defined in a subclass). However, it is possible that a few "public" objects are not documented because the API for them was never stabilized. Others were never documented because they would only be needed for use cases which we have not documented (such as overriding the Finally, a minor nitpick. In the screenshot to provided, I can see the So, yes, if you are willing to do the work, I am willing to review it and accept it when we get something great. |
Thanks for your detailed answer 🙂
I could then start with a basic, recursive generation of the docs for each module (just as shown in the screenshot above), and you can tell me which member should be excluded.
Indeed, this has been in the backlog for quite some time. I'll have to think more about it. There are multiple ways to achieve that and I'd like to find the most elegant one.
This should already be possible 👍 I'll send a new screenshot.
Sure, I understand 🙂 This is valuable input, thanks! I'll keep you updated! P.S.: do you want to keep using pure Markdown docstrings, or could we consider other styles? I personally like the Google-style, and the underlying library, Griffe, is able to parse it so that mkdocstrings can output tables or lists. It's also what allows automatic cross-references to documented objects of the standard library or other third-party libs. Numpydoc and Sphinx-styles are also supported, though they are more based on rST than Markdown so I generally avoid them. |
That seems like a reasonable approach.
Could you provide links to and/or examples of what these look like and what they are used for? I'm assuming tables or lists would be of parameters accepted by a function/method, but that's not clear to me from your description. |
Of course. Here are some examples of Google-style docstrings in Napoleon's docs: https://sphinxcontrib-napoleon.readthedocs.io/en/latest/example_google.html
You guessed correctly. def say_hello_to(name: str = "world") -> str:
"""Say hello to the given person.
Parameters:
name: The person to say hello to.
Returns:
The hello message.
"""
return f"Hello {name}!" In the above example, the Parameters block will be parsed by the Google-parser as a I think the best way to see it in action it to look at Griffe's own code reference (Griffe is the underlying library used by mkdocstrings' Python handler to extract data from Python source and to parse docstrings), for example: https://mkdocstrings.github.io/griffe/reference/griffe/docstrings/utils/#griffe.docstrings.utils.warning *as long as you load their respective |
Some screenshots with a docstring written according to the Google-style: def __init__(self, **kwargs):
"""
Creates a new Markdown instance.
Other parameters: Keyword arguments:
extensions (list): A list of extensions.
If an item is an instance of a subclass of `markdown.extension.Extension`, the instance will be used
as-is. If an item is of type string, first an entry point will be loaded. If that fails, the string is
assumed to use Python dot notation (`path.to.module:ClassName`) to load a markdown.Extension subclass. If
no class is specified, then a `makeExtension` function is called within the specified module.
extension_configs (dict): Configuration settings for extensions.
output_format (str): Format of output. Supported formats are:
* "xhtml": Outputs XHTML style tags. Default.
* "html": Outputs HTML style tags.
tab_length (int): Length of tabs in the source. Default: 4
""" Table style: Maybe less disruptive - list style: List style with cross-refs to the standard lib (we load stdlib's objects.inv): Note that the type can usually be fetched from the signature, but here it's a keyword args section so it's not possible (we cannot annotate items of a dict distinctively). |
👍 I like it.
Yeah, I agree. Let's stick with that for now.
It could be very valuable for linking to our own internal types. However, I'm not sure we need this for the stdlib. Any stndlib types that we use are all the typical common ones which should be familiar to anyone familiar with Python. Looking at Griffe's documentation, there is no distinction between an internal link and an external link to the stdlib. I find it surprising that some of these links take me off-site while others to not. Perhaps if the stdlib links had a styling hook (such as a class on the Your screenshots are getting cut off, so I can't see this, but in the body of the documentation to the Finally, I am not opposed to using the "Google style" of defining parameters, etc. While it is not strictly Markdown, it is pretty close and appears to integrate well. |
Very valuable input once again, thanks a lot 🙂 I really like the idea. I believe it's easy enough to add a visual indication that a link brings us on an external site. I'll come back with screenshots once I have something 😄
It does not link as it's written,
We can in fact get rid of the semi-manual Docstring-only version: """
Other parameters: Keyword arguments:
extensions (list[Extension]): A list of extensions.
""" Note that It's equally easy with annotations, but I'll have to show another example because here we only have from markdown.extension import Extension
# it would work even when type checking:
from typing import TYPE_CHECKING:
if TYPE_CHECKING: # always false
from markdown.extension import Extension
def some_func(extensions: list[Extension]):
"""
Do something.
Parameters:
extensions: A list of extensions.
""" Note how we don't need to specify the type in the docstring anymore: it's fetched from the signature, and work exactly the same.
Great! It's usually well supported by IDEs, and there are flake8 plugins and other linters available. mkdocstrings itself is able to log warnings (fits well with MkDocs strict option) when discrepancies are found. |
Ah, okay. I guess I assumed we could. But if not, then so long as we can differentiate between internal and external, lets link to them all.
I can live with that. Actually, I think that is probably the simplest solution one could expect.
Note that the As far as whether we use |
Yes, we would typically either use
from typing import List, Union ...and in our docstring: """
extensions (List[Union[str, Extension]]): ...
"""
from __future__ import annotations ...and in our docstring: """
extensions (list[str | Extension]): ...
"""
Currently, annotations are rendered as written in the source, but when the annotation is not a full path, you just have to hover on it to see the full path. You can try it here for example: https://mkdocstrings.github.io/griffe/reference/griffe/agents/extensions/#griffe.agents.extensions.base.Extensions.after_children_inspection (try to hover on |
Cool. I wasn't aware of that. I find this syntax more readable. Let's go with it if we can.
This is why I dislike autogenerated documentation. Code should be written in a way that works best for the code and documentation should be written in the way that works best for documentation. One should never make a compromise in one for the other. Hovering helps, but I can't copy/paste that so it is still a compromise.
That is a valid concern. I suppose one approach would be to use the short version in the annotation ( Given the above, I think is makes sense to do this when documenting a parameter:
and this when documenting the class itself:
|
To be clear, I am speaking about what gets rendered. I understand that the annotation in the docstring may need to match the code so it knows what is being referenced. However, I expect the rendered documentation to consistently stick to a predefined style everywhere. If a tool can't do that, then its not worth my time. |
I agree that we need at least a way to consistently enforce the chosen style in specific parts of the docs (annotations in parameters section vs. headings), name-only or full-path, and not just rely on what was used in the source. I'll work towards that! |
Hello, I took some time to publish a demo of what's possible: https://pawamoy.github.io/markdown/sitemap.html. It's using latest versions of mkdocstrings, mkdocstrings-python (Insiders) and Griffe, as well as the mkdocs-section-index, mkdocs-literate-nav and mkdocs-gen-files plugins. The mkdocs-nature theme does not seem to handle HTML in headings ( My changes can be seen here: master...pawamoy:markdown:api-docs. I only took care of fixing the documentation warnings issued by Griffe (which collects data in the source code and parses docstrings). Most of the docstrings are therefore left untouched. I documented types of parameters and return values using actual annotations (with future annotations enabled) instead of writing them in docstrings. Many things can still be improved in the generated docs 🙂 Just wanted to tease a bit 😊 |
This is getting there. Looks like you have addressed all of my previously mentioned concerns. First thing I noticed when I visited your demo is weird blocks by all external links. It appears that you have an error in your CSS. I'm using the latest release of Chrome. And there is also some invalid HTML which needs to be tracked down. It seems that most of it is related to the Why do we have Speaking of tags, I find it odd that they are all Every module contains the same text at the top which lists authors and copyright notices etc. Is there a way to leave that in the code files but not include it in the generated documentation? Looking at the extensions (see example here), I see that we have a list of I would move the API documentation to a different location in the navigation/sitemap. It should be after I noticed a few places where a method accepts And now for some styling observations, which are always subjective... I don't care for the lines (left border) on nested blocks. To me is appears like blockquotes, which they are not. Perhaps indentation would be fine here??? I'm also not crazy about the class/function/method signatures. Each appears as a code block immediately after the heading. I understand why they are there. It's just that the styling makes them stand out too much IMO. I realize they are just using the default code block styles for the theme, so that is not your fault. I don't have any suggestions for improvement. And it may be this is something that the theme would need to address separately. I just thought I should mention it while I'm at it. |
Thanks for your extensive feedback @waylan, really appreciated!
I only ever use Firefox, so never noticed this issue on Chrome. I will try to see if there's an equivalent to mask-image in Chrome 🤔
Sure, we can remove those from Python-Markdown's docs, and generally turning them into spans is also a good idea. For comparison, the Material for MkDocs theme seems to strip those code tags from page titles (the actual HTML titles), so it doesn't break anything there (like tab titles, breadcrumbs, previous/next links in footer, etc.).
I initially simply copied the design from GitHub, where methods are also listed as "functions". That's also why I called them "symbol types". But I guess having
Somehow not showing docstrings of modules, or stripping away the copyright stuff sounds to me like a very, very specific case. I'd rather suggest to transform these info into Python comments, to keep only what's actually relevant to the module in its docstring. For example in # Python Markdown
# A Python implementation of John Gruber's Markdown.
# Documentation: https://python-markdown.github.io/
# GitHub: https://github.com/Python-Markdown/markdown/
# PyPI: https://pypi.org/project/Markdown/
# Started by Manfred Stienstra (http://www.dwerg.net/).
# Maintained for a few years by Yuri Takhteyev (http://www.freewisdom.org).
# Currently maintained by Waylan Limberg (https://github.com/waylan),
# Dmitry Shachnev (https://github.com/mitya57) and Isaac Muse (https://github.com/facelessuser).
# Copyright 2007-2018 The Python Markdown Project (v. 1.7 and later)
# Copyright 2004, 2005, 2006 Yuri Takhteyev (v. 0.2-1.6b)
# Copyright 2004 Manfred Stienstra (the original version)
# License: BSD (see LICENSE.md for details).
"""
POST-PROCESSORS
=============================================================================
Markdown also allows post-processors, which are similar to preprocessors in
that they need to implement a "run" method. However, they are run after core
processing.
"""
Good catch, that seems like a bug. When rendering objects, we check if the object itself has a docstring, or if any of its members (recursively) has a docstring, to know if we should render it. That's why classes like
Sure, will do.
I went too fast indeed, there are ways in both cases to stay in sync with the signature 👍
Will remove the borders, no problem 🙂
We can render the signature directly into the heading. But headings can only hold inline code, not code blocks, so long signatures in headings will not look so good. It's also not possible yet to add links (cross-refs) to names in inline signatures. That's not a big issue though because we already have those links in the rendered Parameters/Returns section. Will apply the changes and notify you once it's done 🙂 |
|
Looking good! 👍
That is what I expected the answer would be. We need to do that then (transform the copyright info into Python comments).
Yeah, I don't think we want the signature directly in the heading. It can stay-as-is for now. I was simply recording all of my observations. We may address this in the theme with some styling adjustments. Which raises a question: is there a styling hook applied to signatures which we can use to apply a style only to them and not all code blocks?
Thank you, I'm fine with that. One question though: when we go live, I don't want every page to indicate it was modified, only the ones that actually were. How to we address that? I assume that for future releases this will be a non-issue. I'm only asking about the initial release. |
The more I think about this, the more unsure I am about it. Looking back through the old comments in this discussion with screenshots, I like the signature in the header. And we don't need the links in the signature as they are provided in the Parameters/Returns section. There is one concern I do have though. I like the TOC for the page in its current form where only the function/class/method name is present with no arguments and/or punctuation. If we could retain the current TOC and have the signature in the header, I might prefer that. Or maybe it only works well when the entire thing fits on one line. Some of the sigs are pretty long, so that may be the deal breaker. I might need to see a side-by-side. |
No, but we can rely on the fact that signatures always appear after a heading, so this CSS should select only signature code blocks: .doc-heading + div.highlight {} I'm also open to adding a proper CSS class to signature code blocks to make this more robust.
I'll create a second fork and publish the same demo with signatures in headings 🙂 (maybe a bit brutal, please share if you know a lighter way). Also, the TOC label is set through the
If we auto-generate the pages, I don't think it's possible to prevent the latest build from updating dates or random numbers in assets/asset links, if that's what you mean. The only way I see would be to stop auto-generating the pages, and track them in Git. It means that pages would have to be updated each time a module is created, removed or renamed, but I believe that will happen very rarely here. If I got your question wrong, do let me know! |
One other thing, which is admittedly a nit-pick. I see that the various pages/modules are arranged alphabetically in the navigation. However, I would prefer them to be arranged logically. Is there a way to do that? I might suggest this order:
|
That would be preferable.
Or a side-by-side screenshot could suffice. Just grab the ugliest one you can find. 😁 Whichever is easier for you is fine with me.
Hmm. I guess I'm not sure how you determine changes have been made. What if we turn off the feature for the first release and then turn it on for all future releases? I realize there would be no |
See version with signatures in headings here: https://pawamoy.github.io/markdown-sig-head/reference/markdown/ We have different ways to choose how things are ordered. The first two won't work here, but thankfully there's a third one.
::: markdown
options:
summary:
modules: false # implicit if we don't set it to true
attributes: true
functions: true
classes: true and add such a section in the module docstring: """
Modules:
core: Description.
preprocessors: Description.
blockparser: Description.
blockprocessors: Description.
treeprocessors: Description.
inlinepatterns: Description.
postprocessors: Description.
serializers: Description.
util: Description.
htmlparser: Description.
test_tools: Description.
extensions: Description.
""" I'll try that and update the demo sites 🙂 |
Oooh, OK, I think there's a misunderstanding. Previous things you wrote make more sense to me now. I think you read the |
Both sites are updated with the new, correctly ordered modules summary 🙂 |
Ah, yes, I was reading them as "modified" not "module". 😊 You can ignore that concern.
Good to know. I'm half inclined to change the tags to full words, rather than abbreviations. 🤷♂️
Sounds good 👍
Thank you for providing that. It helps but I'm still undecided. If we go with sigs in the header we need to make a few adjustments to the styles.
|
I can get that. I cannot stand "meth" 🤣 On the other hand, "attribute" would be quite long in the TOC... Maybe we can use full symbol names for headings and no symbols at all for the TOC.
https://pawamoy.github.io/markdown-sig-head/reference/markdown/ updated |
I'm thinking we should use
I expected a
I maintain the codehilite extension and there never has been and never will be support for highlighting code spans, only code blocks. Therefore, there is no reason to ever apply the Yes, this means that the rendered HTML for separate sigs is different from sigs included in the headers. But I would consider anything other than that behavior a bug.
This is definitely cleaner. But I think most people would expect type annotations to be included in API documentation. I'm inclined to leave them in. Besides, not every docstring currently lists its parameters. Without the type annotations, we have too much missing info.
Understood. I can always tackle this later. By the way, I haven't seen the reordered modules in the demo but it appears that you did make the necessary change. Am I missing something? |
Good to know, thanks, we will update our highlight filter accordingly 👍
Ah, right, I didn't think of the navigation. The changes you mention updated the Modules bullet list on this page: https://pawamoy.github.io/markdown/reference/markdown/. I'll also update the script that auto-generates pages to order the modules in the navigation the same way 👍 |
Updated https://pawamoy.github.io/markdown-sig-head/sitemap.html
|
Looks good. 👍 Let's go with the sigs in headings. Feel free to merge that in. Another nit-pick: consider this permalink: By the way, there are a few places where I have noticed a collapsible In browsing through the demo, I do see various places where we will need to do some editing of the text in the code comments, but that can come later. |
The issue is that we rely on mkdocs-autorefs for the cross-references feature. autorefs stores a mapping of anchors and the pages they appear in. If we were to shorten Generally speaking, object anchors shouldn't depend on the page they're rendered into. They could very well be rendered into the top-level index page for example, or in a subpage whose URL has nothing to do with the module hierarchy. Since mkdocstrings does all the docs injection from a Markdown extension, it doesn't obviously have access to MkDocs pages data anyway (it's possible but not straight-forward).
The example you mention is clearly a bug. The lines are parsed as an admonition section (which is rendered with an HTML |
I was afraid you would say something like that. I don't like it, but I can deal with it.
I figured. I may have seen one or two others but don't recall where now.
The thought I would mention that I have checked in Chrome, Edge and Safari and they are all working correctly now. I think you can submit a PR with this whenever you are ready. Any additional edits to the documentation text can be done separately. Hopefully the bug can be resolved before we are ready for release (target is October after Python 3.12 is released). Although, I am curious to see what protestations we get from our CI server. For example, I'm guessing the spellchecker for the site will find a number of issues. Don't worry, I'm not expecting a perfect run on the first go-round. We can work through it. |
Updated demo, I've fixed the docstring parser to be more strict. % griffe dump -Ldebug -o/dev/null -fdgoogle markdown 2>&1 | grep reasons
DEBUG markdown/test_tools.py:91: Possible admonition skipped, reasons: Extraneous blank line below admonition title
DEBUG markdown/inlinepatterns.py:26: Possible admonition skipped, reasons: Missing blank line above admonition; Extraneous blank line below admonition title
DEBUG markdown/extensions/smarty.py:16: Possible admonition skipped, reasons: Extraneous blank line below admonition title These logs show that Griffe correctly identified these parts of the docstrings as regular markup and not admonitions. Will send a PR shortly 🙂 |
I was a little confused as to how those blocks of text could be confused as admonitions as they look nothing like Markdown admonitions. So I went looking at your documentation to figure out how your tool works. I see this page, which outlines three different styles for docstrings. But it is no clear to me which you are using. Neither is is clear how admonitions relate to any of that. I guess my point is that I have realized that I have no idea what the requirements are for formatting our docstrings. It would be necessary for this to be documented in our contributing guide. Whether that involves a link pointing to external documentation or internal documentation would depend on what's involved and how clear the external documentation might be. Also, I always expected that we would use our own Markdown parser for parsing Markdown formatted docstrings. Maybe I assumed this is what was happening, but now I am not so sure. |
Fair concerns. I'll try to answer them. Earlier in the conversation, I asked if you would be OK with using the Google-style: this is the one we're using here.
Obviously all this can be documented using plain Markdown. What the Google-style offers is a standardized syntax to document it. The benefit is that IDEs and tools can parse it as structured data to render it nicely for users, or for linting purposes (checking that all parameters of a function are documented for example). The syntax is the one described in the page you linked:
Supported section titles are Attributes, Parameters, Raises or Exceptions, Warnings, etc. That's as simple as that. The content of the section is not parsed again: it's simply picked up as it is, and in the case of mkdocstrings forwarded to Now, most sections support documenting multiple different items, so it adds a bit to the syntax:
A concrete example is the Parameters (or Params, Args, Arguments) section:
The Google-style standard specifies a few more things, like being able to specify the type of a parameter:
...but this particular feature is not super interesting to us as we can rather use type annotation in the signature of the function/method, and they are picked up by the Griffe docstring parser. Thanks to this syntax, we can conveniently structure the data to reuse it later. parameters = [
Parameter(name="foo", type="str", description="Description of `foo` parameter."),
...
] Same goes every other known sections (Attributes, Class, Functions, Methods, Modules, Deprecated, Other parameters, Warns, Raises, Yields, Receives, Returns, Examples). Now, for the sake of consistency, the Griffe parser will still pick up sections (title + indented contents) that it knows nothing about, and consider them to be admonitions. This is useful to keep the same style in a docstring while supporting very common admonitions like Note, Warning, etc.:
And it allows the admonition title to contain spaces, so this is a valid admonition:
...which leads us to your original concern. In the previous versions of Griffe, the parser did not require a blank line before a section or admonition (now it does), and allowed blank lines after the title (now it does not), so a paragraph like this:
...would be parsed as:
...which was obviously wrong. The latest version of Griffe is more strict about this. As stated above, it now requires a blank line before a section/admonition, and no blank line after the title, so the two following paragraphs are correctly parsed as regular markup:
However, the second block in next one would be parsed as an admonition again.
I think it caters to how we're used to write Markdown. I didn't go to such lengths to document this in Griffe, because I'm so used to Google-style docstrings (even before I had to write a parser for them) that I forget it is not immediately intuitive for everyone else. I'm completely open to improving the docs there to explain better how it works 🙂 A final note - the Numpydoc-style is kind of the same thing, just with a different syntax:
|
Thank you. That summary is very helpful. In fact, if that summary (or similar) had been in the documentation, I might have better understood. The issue is the mile-high overview. The details are documented just fine. I just wasn't understanding how the pieces all fit together. One thing I will note is that I find it strange that admonitions are being parsed at that stage (pre-markdown). I would expect an option to disable them and let Markdown handle that. I'm concerned that your current solution is too fragile and we could get more false-positives in the future. I'm fine going forward now, but would suggest that a future update to the tool provide a better solution. Also, where is the admonition feature documented? I never found it mentioned anywhere. |
Oops, that's not documented indeed 😅 The thing is that even if we add an option to disable Google-style admonitions, we still have to parse Google-style sections (which use the same syntax), so the same fragile algorithm will apply anyway. Disabling (with an option) Google-style admonitions would also mean that users have to write their admonitions following the syntax of Markdown extensions (for example markdown.admonition or markdown-callouts. I find it more pleasant to have a consistent style:
...rather than:
...or:
...but that's just my opinion of course, and I'm fine with whatever you decide for Python-Markdown's docs 🙂 Also, an alternative solution would be to write yet another Markdown extension that parses:
...into a note admonition. Why not 🤷 |
Perhaps due to the lack of documentation I missed this detail. The thing is, if I want an admonition, I want it to match the style of admonitions we already use elsewhere in our documentation. As your admonitions do not use the same HTML, they look nothing like the rest. Therefore, it would be preferred that they be disabled. I would expect the special words: Attributes, Parameters, Raises, Exceptions, Warnings, etc. to trigger the necessary sections but any random words to not trigger anything. It would seem that your admonitions can be triggered with any random text. That is the specific feature I would prefer to disable. |
I see, no problem, I can add such an option to Griffe 🙂 There's yet another alternative solution: https://gist.github.com/pawamoy/a12cb4a5f66d913519070b52b2a9d54b 😍 However it's definitely not standard 🤣 And not yet supported anyway 🙂 But I think I'll implement support in a Griffe extension and use it in one of my own projects to see how it goes 🤔 |
Cool. I wasn't aware of PEP 727. That is an interesting idea. Not sure that I like the proposed solution, but I like that they are trying to solve that specific issue. |
This is closed in #1379. |
Hello! Thanks a lot for Python Markdown, we'd do nothing without it 🙂
Would you be interested in providing API pages in your documentation, as well as a consumable objects inventory?
I'm willing to send a PR if that's something you'd like. Personally, I'd love to get automatic cross-refs to your
markdown
objects in my own API documentation pages.The idea is to use the mkdocstrings plugin (disclaimer: I'm the author) to inject some code reference into your pages.
When using mkdocstrings, every object that is rendered is added to an inventory file,
objects.inv
, and that file is added to the final site. Users can then load that inventory when rendering their docs, so that annotations likeMarkdown
will automatically link to theMarkdown
class in your API documentation.I tried it locally, and since it seems you're using pure Markdown docstring, the result is not so bad:
If you find this useful and would like to proceed, I'll need some information like: what is considered public and therefore should be rendered? Is it just the objects that can be imported from
markdown
directly? Or more objects in deeper submodules?If you find no interest in this, or don't want to proceed for other reasons (I see for example that you already provide a manually written reference page), that's totally fine, I won't push for it 🙂
The text was updated successfully, but these errors were encountered: