TudatPy structure #159

alfonsoSR · 2024-08-08T11:52:44Z

Issues addressed by this PR

Critical

Import time: Compiling tudatpy into a single library (kernel) results in long import times, regardless of how small the imported part of tudatpy is.
Docstrings: Docstrings are currently defined in YAML files and attached to functions with a package called multidoc. Separating the docstrings from the source code makes it difficult to tell whether a function is documented and whether its documentation is up to date. The approach is also non-standard and complicates development.
Syntax highlighting: The kernel cannot currently be inspected, which prevents proper syntax highlighting and autocompletion and causes linters to throw warnings at correct code.
Autogeneration of __init__.py files: The autogeneration of __init__.py is problematic for modules including both C++ and Python functionality, adds complexity to the build process and makes it difficult for new developers to understand how to contribute code.
Organization of source code: It would be nice to have a more intuitive organization of the source code that exposes the C++ functionality (e.g. having expose_X.cpp under tudatpy/X rather than tudatpy/kernel/X, as the former also contains the __init__.py file and, potentially, Python-native functionality). The presence of _import_all_kernel_modules.py files, and the fact that they are autogenerated, makes it difficult to understand where to contribute code or where to find the source code of the function you are using.

Non-critical

Project layout: Tudatpy currently uses a flat layout (i.e. tudatpy/__init__.py). Migrating to a src layout (i.e. tudatpy/src/tudatpy/__init__.py) would allow for the creation of a stub-only package at no cost and significantly simplify a potential unification of tudat and tudatpy into a standalone repository (e.g. adding tudat, spice, sofa... to tudatpy/src). Both flat and src layouts are standard and accepted, but I consider the latter more suitable for this kind of project.

Solutions

Import time

I modified the build process to generate one shared library per submodule (i.e. per expose_X.cpp file). As a result, import time is proportional to the size of the imported modules: it will still be relatively high for a program loading numerical_simulation, but not for one loading math or astro, since those modules are much smaller. It could be reduced further by splitting big submodules by functionality (e.g. numerical_simulation -> dynamics & estimation, or estimation_setup.observation being split into submodules) or by optimizing the compilation process.
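To compare import times per submodule, something along these lines could be used. This is a minimal sketch, not part of the PR: time_import is a hypothetical helper, and the standard-library modules below are stand-ins for names such as tudatpy.math or tudatpy.numerical_simulation, which would have to be substituted in an environment where tudatpy is installed.

```python
import importlib
import sys
import time


def time_import(module_name: str) -> float:
    """Return the wall-clock seconds needed to import `module_name`."""
    # Drop the cached module so the import machinery runs again.
    # Parent packages and already-loaded shared libraries stay cached,
    # so this is only a rough, in-process estimate.
    sys.modules.pop(module_name, None)
    start = time.perf_counter()
    importlib.import_module(module_name)
    return time.perf_counter() - start


if __name__ == "__main__":
    # Stand-in modules (assumption): replace with tudatpy submodules.
    for name in ("json", "decimal"):
        print(f"{name}: {time_import(name) * 1e3:.2f} ms")
```

For a measurement that is not affected by caching, running the import in a fresh interpreter with python -X importtime gives a per-module breakdown.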

Organization of source code

I moved each expose_X.cpp script to its submodule's directory (as suggested above) and modified the build process so that the associated kernel is saved in the same directory. This allows the autogenerated __init__.py and _import_all_kernel_modules.py files to be replaced with standard __init__.py files maintained by developers. Thus, a hybrid submodule X (one including both exposed and Python-native functionality) would have the following structure:

tudatpy/src/tudatpy/X
    |_ __init__.py
    |_ expose_X.cpp
    |_ expose_X.so  (after compilation)
    |_ foo.py

and the __init__.py file would look as follows:

from .expose_X import *
from .foo import <public-functionality-from-foo.py>    # Manually specified by users

Under the assumption that all the functionality defined in expose_X.cpp should belong to the API, the star import from expose_X spares developers from updating the __init__.py files when they expose a new function, effectively replacing autogeneration, while the manual imports from the Python scripts allow them to keep "private" functions and helpers out of tudatpy's API. Keeping the shared libraries in the source directories is just a personal preference; moving them into a lib or kernel directory would not be a problem.
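The effect of this __init__.py pattern can be tried out without compiling anything. The sketch below is an illustration under stated assumptions: expose_X is replaced by a plain Python module (in tudatpy it would be the compiled extension), and exposed_function, public_helper and _private_helper are hypothetical names. It writes the package to a temporary directory, imports it, and checks which names end up in the package namespace.

```python
import importlib
import sys
import tempfile
from pathlib import Path

# Hypothetical stand-ins: expose_X.py plays the role of the compiled kernel.
FILES = {
    "X/__init__.py": (
        "from .expose_X import *\n"
        "from .foo import public_helper  # manually specified\n"
    ),
    "X/expose_X.py": (
        "def exposed_function():\n"
        "    return 'from the kernel'\n"
    ),
    "X/foo.py": (
        "def public_helper():\n"
        "    return _private_helper() + 1\n"
        "\n"
        "def _private_helper():\n"
        "    return 41\n"
    ),
}


def build_and_import(root: Path):
    """Write the demo package under `root` and import it."""
    for relative_path, content in FILES.items():
        target = root / relative_path
        target.parent.mkdir(parents=True, exist_ok=True)
        target.write_text(content)
    sys.path.insert(0, str(root))
    try:
        importlib.invalidate_caches()
        return importlib.import_module("X")
    finally:
        sys.path.remove(str(root))


with tempfile.TemporaryDirectory() as tmp:
    package = build_and_import(Path(tmp))

# The star import pulls in everything public from expose_X, while only the
# explicitly imported name from foo.py reaches the package namespace.
print(hasattr(package, "exposed_function"))
print(hasattr(package, "public_helper"))
print(hasattr(package, "_private_helper"))
```

Without an __all__ list, a star import brings in every name that does not start with an underscore, which is why newly exposed functions appear in the package without touching __init__.py.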

NOTE: A positive side effect of having split the kernel is that the header files associated with the exposition scripts are no longer needed.

Docstrings

For each exposed function, I read the content of the docstring from the YAML file, gave it an adequate format and pasted it next to the source code that exposes the function. This eliminates the need for multidoc and results in docstrings being included by default in all tudatpy distributions. As an example, this is the code snippet used to expose spice.load_kernel():

m.def("load_kernel", &tudat::spice_interface::loadSpiceKernelInTudat,
      py::arg("kernel_file"),
      R"doc(Loads a Spice kernel into the pool.

      This function loads a Spice kernel into the kernel pool, from which it can be used
      by the various internal spice routines. Matters regarding the manner in which Spice
      handles different kernels containing the same information can be found in the spice
      required reading documentation, kernel section. Wrapper for the
      `furnsh_c <https://naif.jpl.nasa.gov/pub/naif/toolkit_docs/C/cspice/furnsh_c.html>`_ function.

      :param kernel_file: Path to the spice kernel to be loaded.
      )doc");
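With docstrings living next to the exposed signature, a simple consistency check between the two becomes feasible. The following is a minimal sketch, not part of the PR: param_mismatches is a hypothetical helper, and the load_kernel defined here is a pure-Python stand-in. For functions exposed with pybind11, inspect.signature may not be available, in which case the argument names would have to be parsed from the signature line that pybind11 places at the top of the docstring.

```python
import inspect
import re


def param_mismatches(func) -> dict:
    """Compare `:param name:` fields in a docstring with the real signature."""
    documented = set(re.findall(r":param\s+(\w+):", inspect.getdoc(func) or ""))
    actual = set(inspect.signature(func).parameters)
    return {
        "missing": sorted(actual - documented),  # in signature, not documented
        "stale": sorted(documented - actual),    # documented, not in signature
    }


# Pure-Python stand-in for an exposed function (hypothetical, for illustration).
def load_kernel(kernel_file: str) -> None:
    """Loads a Spice kernel into the pool.

    :param kernel_file: Path to the spice kernel to be loaded.
    """


print(param_mismatches(load_kernel))
```

A check like this flags a parameter that is documented under a different name than the one passed to py::arg, which is easy to overlook when docstrings are migrated by hand.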

Syntax highlighting

Taking #154 as reference, I included automatic stub generation as part of the build process. Stubs are generated using pybind11-stubgen following the compilation of tudatpy, and stored in a stub-only package with the same structure as the source directory, as described in Packaging type information.

The package is automatically installed together with tudatpy and provides syntax highlighters with information about the available functionality per submodule, function and class signatures, docstrings... This allows for regular autocompletion and greatly simplifies the access to typing and usage information.

Some notes on stub generation

The reason why I opted for pybind11-stubgen instead of mypy's stubgen is that the latter is not compatible with the generation of stub-only packages, is bad at inferring the signature of exposed functions and includes undesired information in the docstrings.

While significantly better than stubgen, pybind11-stubgen includes undesired from __future__ imports and is not particularly good at generating stubs for __init__.py files. Thus, while the vast majority of the stub generation process relies on pybind11-stubgen, I wrote some post-processing functions and a parser to generate __init__.pyi stubs, remove undesired imports and ensure proper indentation within docstrings.
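One of the post-processing steps mentioned above, removing from __future__ imports, could look roughly like this. This is a sketch under assumptions and not the code included in the PR: strip_future_imports and the example stub text are hypothetical.

```python
import re
import textwrap

# Matches a whole `from __future__ import ...` line, including its newline.
_FUTURE_IMPORT = re.compile(r"^from __future__ import [^\n]*\n?", re.MULTILINE)


def strip_future_imports(stub_source: str) -> str:
    """Remove `from __future__ import ...` lines from the text of a .pyi stub."""
    return _FUTURE_IMPORT.sub("", stub_source)


# Hypothetical stub content, for illustration only.
example_stub = textwrap.dedent(
    '''\
    from __future__ import annotations
    import typing

    def load_kernel(kernel_file: str) -> None:
        """Loads a Spice kernel into the pool."""
    '''
)

print(strip_future_imports(example_stub))
```

Working on the stub text with a regular expression keeps the step independent of the stub generator, at the cost of not understanding Python syntax; a parser-based approach would be more robust for less regular cases.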

Missing

API: This PR is not necessarily compatible with the current way in which the API website is generated. This will be addressed through an additional PR in the near future.
CMake: The current CMake setup relies on deprecated functionality (we require CMake 2.8 when the latest version is 3.30) and non-standard tools such as YACMA. The resulting setup is stiff and difficult to understand, so it should be updated. I attempted this and managed to make it work locally, but not on Azure. More information on [ADD ISSUE]

alfonsoSR added 5 commits August 6, 2024 13:22

Use pybind

1da7740

tmp

0678a84

Split kernel into submodules

e4aa905

- Splitted tudatpy kernel into submodules
- Replaced YACMA with standard method to find python (CMake)
- Introduced standard src structure for future integration of stubs

Remove YACMA

564e9af

Remove old tudatpy src directory

ed82c55

alfonsoSR changed the title ~~Restructure TudatPy~~ TudatPy structure Aug 8, 2024

alfonsoSR added 24 commits August 8, 2024 14:09

Restore spice __init__.py

1cbc98a

WIP: Standalone cmake + stubs

73efa21

WIP: stubs

0d06f41

WIP: Stubs

c71eec4

Integrated docstrings + stub generation. Circular imports in numerical_simulation

e426405

Solve circular imports

8255002

Include dir for pybind11

8b620a9

WIP: Fix issues with python scripts

de11783

Add build commands to cmake

aefffd1

Add tudat include dir

ff6d7bf

Show tudat environment

1565a49

System include pybind11

39f2114

Add include statements to add_extension macro

3726d06

Solar activity + NRLMSISE00

b9d2219

Modified CMake configuration to resemble original

223cca3

Delete old versions of cmake config

3df1ca8

Remove bodies, cli and apps

7826d2a

Stub generation

3b5d6fc

- Moved sbdb and horizons interface from tudatpy/data to numerical simulation to avoid circular imports
- Fixed typing issues in python scripts: data, util
- A lot of functionality still relies on tudatpy/io, which is currently deprecated, so I moved it to tudatpy/data.

Remove dependency on docstrings.h

53742ef

expose_environment

a9aea72

Star imports and fix in expose_environment

ab1ec7b

Back to working expose_environment

6f7dd8e

Fix compatibility issues with latests tudat version

494e8f3

Support for smart stub generation

a5a21ac

alfonsoSR added 27 commits September 7, 2024 19:05

Attempt pip-based azure build

bb21992

Attempt build.py-based build on azure

4bc24ec

Fix 3.9/10 compatibility issues + Add install script

341f01b

Demo cmake-based installation

a245959

Stub installation

06ccf09

Stubs

8131303

Stub generation - 2

9bf6096

More issues with stubs

99cb522

Clean CMakeLists.txt

ec1a5a9

Separate build from stub generation

fc45343

Fix: invalid relative import

df1d468

Attempt pip-based conda build

1d4015a

Test

961397f

Disable automatic stub generation on build and add stubs to git repo

56dc764

Add install configuration to cmake

5b87e7d

CMake install configuration & inplace stubs

4f5cedc

- Defined a CMake-based installation configuration based on the old version of tudatpy as an attempt to fix installation issues on azure.
- Replaced stub-only package with in-tree stubs, which seem to work better with current code editors.

YACMA-based installation

3e9643f

Fix issues with tests and python include dir (attempt) - No stubs

a3aa65b

Update installation and add stubs

6eeb0b4

Also look for Python with pybind11

05ed975

Remove python version check

f9bcfca

Remove python version check

d943004

Use PythonSetup

9a4a04b

Bump cmake version

522eee6

YACMA Python setup

f21ba84

Don't find python with pybind11

8c39884

Submodules & docstrings: v1.0

228601a

alfonsoSR marked this pull request as ready for review September 30, 2024 15:20

alfonsoSR mentioned this pull request Oct 1, 2024

Compatibility with submodules and docstrings tudat-team/tudatpy-feedstock#18

Open

DominicDirkx mentioned this pull request Oct 14, 2024

Feature/lean kernel exposure #154

Closed
