Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Automated API reference #570

Open
wants to merge 12 commits into
base: main
Choose a base branch
from
Open

Automated API reference #570

wants to merge 12 commits into from

Conversation

Yoshanuikabundi
Copy link
Contributor

This PR replaces the laid-out-by-hand API reference with a fully automatic one. It uses Autosummary templates I included in the new theme to make the entire API render with each object on its own page, including Pydantic models with autodoc_pydantic. This has the major benefit of making the reference docs mirror the layout of the actual API; this allows familiar readers to use the very large spatial reasoning parts of the brain to mentally navigate the API by visualising the website, and generally reduces confusion and aids learning. It also has the major advantage of documenting every public object by default, rather than requiring authors to remember to put it in the right spot in the docs directory!

The PR also adds a lot of module-level docstrings that quickly describe the workings of the contents of the module. This allows readers to get some feeling for where they are in the codebase, and also can be displayed by IDEs, the Python help function, mouseover on Jupyter lab, etc. Much of this information was previously in the hand-written API reference, and moving it into the code makes it more accessible and possibly more likely to be updated.

Finally, this PR adds a few new modules and re-exports to make the layout of the API more intuitive. Instead of having a large number of objects found only in the root module, they are now found in a more appropriate place and re-exported from the root. __all__ is used to specify a single place to access any given object. I haven't removed any API points because (1) having re-exports in an ergonomic place is good and (2) there's no need to break user workflows.

One catch to this approach is that many tab-complete tools will look at __all__ to choose completions; modules with re-exports not included in __all__ (because they are included in __all__ somewhere else) but with __all__ defined will often not be able to provide those auto-completes. I have worked around this in some modules including the root by not including an __all__ and leaning on the difference in behavior between Autosummary and other tools - when __all__ is missing, Autosummary is configured to default to "all objects defined in the module (ie, not imported) and all sub-modules, except those objects and submodules that are named with a leading underscore". By contrast, tab completion tools simply provide all completions. This is appropriate because a very long tab completion list is fine when you know the first couple of characters and can quickly filter it down, but in an API reference you need a more cultivated list.

Another catch of using __all__ is that in modules where it is used, code authors need to remember to amend it when they add new stuff. IDEs often remind you if an object is unused to put it in __all__, and __all__ is at least closer to the change you made than some autosummary declaration, but this still reduces the benefit of automating the API reference. For that reason, I've tried to only include __all__ declarations in modules that only include re-exports, so that modules where people actually write code can fall back on the no-imports behavior which is correct in most cases. In these modules, API points can be removed from documentation by prefixing their names with an underscore. Authors adding submodules to modules where __all__ is defined still need to be careful!

There's a lot of flexibility with this approach, and I haven't finished writing docstrings, but I wanted to get this PR out there so we can decide what to do!

This change surfaces a lot more docstrings, so there are some warnings - I've allowed RTD to build successfully in spite of them to get a demo up.

Developers certificate of origin

@codecov
Copy link

codecov bot commented Oct 4, 2023

Codecov Report

All modified lines are covered by tests ✅

Comparison is base (47cb562) 91.38% compared to head (2a87237) 91.41%.
Report is 1 commits behind head on main.

Additional details and impacted files
@@            Coverage Diff             @@
##             main     #570      +/-   ##
==========================================
+ Coverage   91.38%   91.41%   +0.03%     
==========================================
  Files         117      121       +4     
  Lines        7358     7385      +27     
==========================================
+ Hits         6724     6751      +27     
  Misses        634      634              
Files Coverage Δ
openfe/__init__.py 100.00% <100.00%> (ø)
openfe/analysis/__init__.py 100.00% <100.00%> (ø)
openfe/analysis/plotting.py 100.00% <100.00%> (ø)
openfe/orchestration/__init__.py 100.00% <100.00%> (ø)
openfe/protocols/__init__.py 100.00% <100.00%> (ø)
openfe/protocols/openmm_rfe/__init__.py 100.00% <100.00%> (ø)
openfe/protocols/openmm_rfe/equil_rfe_methods.py 93.04% <ø> (ø)
openfe/protocols/openmm_rfe/equil_rfe_settings.py 100.00% <ø> (ø)
openfe/protocols/settings.py 100.00% <100.00%> (ø)
openfe/setup/__init__.py 100.00% <100.00%> (ø)
... and 9 more

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

@pep8speaks
Copy link

pep8speaks commented Oct 6, 2023

Hello @Yoshanuikabundi! Thanks for updating this PR. We checked the lines you've touched for PEP 8 issues, and found:

There are currently no PEP 8 issues detected in this Pull Request. Cheers! 🍻

Comment last updated at 2023-10-11 00:49:54 UTC

@Yoshanuikabundi Yoshanuikabundi marked this pull request as ready for review October 11, 2023 00:48
Copy link
Contributor

@richardjgowers richardjgowers left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

One problem I can see with this is that now gufe.ChemicalSystem isn't documented anywhere, since it's "foreign". There's a few gufe objects we need to document as if they were actually openfe objects. Similarly there'll be other objects from other packages that we will likely want documented in the openfe doc pages. openfe is meant to be a one-stop package for running stuff, so the docs should equally be complete.

I didn't initially like losing the ability to define the order in which objects are defined in the rst, but is this now possible by documenting this in the __init__.py?

@dwhswenson
Copy link
Member

One problem I can see with this is that now gufe.ChemicalSystem isn't documented anywhere, since it's "foreign".

It is under openfe.setup.system. See also system.py.

There's a fair question as to whether this should be so deeply nested (I'd rather not require nesting below openfe.setup except, perhaps for specific subtools (openfe.setup.lomap maybe? or openfe.setup.lomap_scorers?) Those unlikely-to-be-used scorers make sense to nest more deeply; basics like ChemicalSystem will be hard to find if so deeply nested.

I think the main points @Yoshanuikabundi is raising are:

  1. Where to import things is inconsistent. Some things are nested under openfe.setup, some are top-level openfe. There doesn't seem to be a clear rule as to which goes where. This is a source of confusion in our API which we should definitely clear up; this PR is one suggestion.
  2. Using this pattern (based on __all__ to expose the API, explicitly re-exporting thing imported from elsewhere, combined with module-level docstrings) does a lot to help tools that can automatically work from imports. That includes docs, but also includes autocomplete, etc. Implicitly re-exporting via from gufe import Y (in openfe/__init__.py) works to allow openfe.Y, but it won't be automatically added to docs unless it is explicitly re-exported. It also forces us to more carefully curate our API, which is probably a good idea.
  3. There's generally a strong case for saying that the API reference docs should live in the code as much as possible. (This is exactly what is done in the CPython std lib, and while it makes the source files a lot longer, I've found it useful when looking into CPython implementations.) If the API reference docs have a document structure that matches the import structure, then documentation updates can also be more automated.

As I think about it, ChemicalSystem et al. seem like a good example for my main concern with this approach. This is less about ordering, and more about grouping. Here's my logic:

  • If we recommend that this should be imported from openfe.setup (as opposed to openfe.setup.system), then this should be documented on the page for openfe.setup.
  • The stuff currently in openfe.setup.system should be grouped together in the documentation, and separated from other things that might be importable from openfe.setup.
  • Therefore, the options I see are:
    • Somehow organize objects into groups in openfe/setup/__init__.py (I assume we can use autosummary here, although that will break the automatic add of new functionality)
    • Find a way to do a separate file for the stuff in openfe.setup.system, but document with imports from openfe.setup. I'm not entirely sure how to do this if the API docs live with the code; doing it in the docs/ directory is similar to what we're doing in the existing docs.

@@ -0,0 +1,47 @@
"""
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What happens here if we have settings for different protocols? Do we just add them all here?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

My intention was that the settings module would only be for settings objects that are (intended to be) re-used by multiple protocols, and that a protocol's other settings objects would be in that Protocol's own module - so for instance, RelativeHybridTopologyProtocolSettings is in openfe.protocols.openmm_rfe. So it might be more consistent to move AlchemicalSamplerSettings to that module if that's not really re-usable by other protocols - I had to make a guess.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So right now, anything in openmm_utils is "reusable", but only within the scope of OpenMM protocols. (Anything in a specific protocol is local to that protocol) When we move to gmx I expect most these to not apply.

@Yoshanuikabundi
Copy link
Contributor Author

Yoshanuikabundi commented Oct 17, 2023

One problem I can see with this is that now gufe.ChemicalSystem isn't documented anywhere, since it's "foreign".

As @dwhswenson mentioned, foreign imports can be documented, same as local imports, by including them in __all__ - I implemented this in Sphinx myself! openfe.setup.system.ChemicalSystem

The fact that any given item can only be documented once is a real problem. I would like each item to have a single "canonical" location, and for other places in the API to list it as a re-export and point there. I'm actually preparing a CZI grant to try and implement this (among a bunch of other things). But for now, if an item is in two places in the API it raises a warning, so we have to either accept a lot of warnings and having multiple different pages for the same thing and making links a pain, or only documenting each item in a single place. If you want the reference to mirror the structure of the API and also to organise the API reference so people can find things, you kinda have to pick the deeply nested place. Though we could do stuff like move it to openff.setup, or move openff.setup.system to openff.system, and so on.

I didn't initially like losing the ability to define the order in which objects are defined in the rst, but is this now possible by documenting this in the init.py?

Objects are documented in the order given in __all__, with the caveat that they are first sorted into functions, classes, exceptions etc. If __all__ is not defined, they should be given in the same order as they are defined in the code.

I'd encourage you all to check out how the API reference is rendered if you haven't already. Compare it to the status quo and try to imagine you've never seen the API before and you're trying to figure out what you want to do.

@richardjgowers
Copy link
Contributor

Ok so openfe.setup.system.X is much too far nested for something which is going to be one of the first objects you use. It currenly (also) lives at openfe.ChemicalSystem.

Looking at the live docs compared to this PR, this nests the docs for these objects another level deeper.

So current:
"API Reference" -> "Chemical Systems and Components"

With this PR:
"API Reference" -> "setup" -> "system"

Where the second two links to click on aren't intuitive at all

@richardjgowers richardjgowers self-assigned this Oct 18, 2023
@Yoshanuikabundi
Copy link
Contributor Author

Yeah. I'm hoping to support re-exports in a future update to Sphinx, but for now each item has to have a single location in the reference. One option would be to move openfe.setup.system to openfe.system, and/or to rename system to something more intuitive.

@mikemhenry
Copy link
Contributor

I'd encourage you all to check out how the API reference is rendered if you haven't already. Compare it to the status quo and try to imagine you've never seen the API before and you're trying to figure out what you want to do.

For me, this is the most compelling reason to prefer the new format.

Ok so openfe.setup.system.X is much too far nested for something which is going to be one of the first objects you use.

I don't think that is a fair test, since I don't think the motivation for this is "landing page that highlights the objects you should care about" but "doc layout matches code layout and doc strings for modules aids in looking up what you are trying to find"

Maybe it is the difference between a guide and a reference, if I just want to look something up, I want it to exist where I think it should and have some sign posts that help me find it, which PR helps with.

@mikemhenry
Copy link
Contributor

Also I really love this pirate font

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants