Contributing to Awkward Array

Thank you for your interest in contributing! We're eager to see your ideas and look forward to working with you.

This document describes the technical procedures we follow in this project. It should also be stressed that as members of the Scikit-HEP community, we are all obliged to maintaining a welcoming, harassment-free environment. See the Code of Conduct for details.

Where to start

The front page for the Awkward Array project is its GitHub README. This leads directly to tutorials and reference documentation that you may have already seen. It also includes instructions for compiling for development.

Reporting issues

The first thing you should do if you want to fix something is to submit an issue through GitHub. That way, we can all see it and maybe one of us or a member of the community knows of a solution that could save you the time spent fixing it. If you "assign yourself" to the issue (top of right side-bar), you can signal your intent to fix it in the issue report.

Contributing a pull request

Feel free to open pull requests in GitHub from your forked repo when you start working on the problem. We recommend opening the pull request early so that we can see your progress and communicate about it. (Note that you can git commit --allow-empty to make an empty commit and start a pull request before you even have new code.)

Please make the pull request a draft to indicate that it is in an incomplete state and shouldn't be merged until you click "ready for review."

Getting your pull request reviewed

Currently, we have three regular reviewers of pull requests:

Angus Hollands (agoose77)
Ioana Ifrim (ioanaif)
Jim Pivarski (jpivarski)

You can request a review from one of us or just comment in GitHub that you want a review and we'll see it. Only one review is required to be allowed to merge a pull request. We'll work with you to get it into shape.

If you're waiting for a response and haven't heard in a few days, it's possible that we forgot/got distracted/thought someone else was reviewing it/thought we were waiting on you, rather than you waiting on us—just write another comment to remind us.

Becoming a regular committer

If you want to contribute frequently, we'll grant you write access to the scikit-hep/awkward repo itself. This is more convenient than pull requests from forked repos.

Git practices

Unless you ask us not to, we might commit directly to your pull request as a way of communicating what needs to be changed. That said, most of the commits on a pull request are from a single author: corrections and suggestions are exceptions.

Therefore, we prefer git branches to be named with your GitHub userid, such as jpivarski/write-contributing-md.

The titles of pull requests (and therefore the merge commit messages) should follow these conventions. Mostly, this means prefixing the title with one of these words and a colon:

feat: new feature
fix: bug-fix
perf: code change that improves performance
refactor: code change that neither fixes a bug nor adds a feature
style: changes that do not affect the meaning of the code
test: adding missing tests or correcting existing tests
build: changes that affect the build system or external dependencies
docs: documentation only changes
ci: changes to our CI configuration files and scripts
chore: other changes that don't modify src or test files
revert: reverts a previous commit

Almost all pull requests are merged with the "squash and merge" feature, so details about commit history within a pull request are hidden from the main branch's history. Feel free, therefore, to commit with any frequency you're comfortable with.

It is unnecessary to manually edit (rebase) commit history within a pull request.

Building and testing locally

The installation for developers procedure is described in brief on the front page, and in more detail here.

Awkward Array is shipped as two packages: awkward and awkward-cpp. The awkward-cpp package contains the compiled C++ components required for performance, and awkward is only Python code. If you do not need to modify any C++ (the usual case), then awkward-cpp can simply be installed using pip or conda.

Subsequent steps require the generation of code and datafiles (kernel specification, header-only includes). This can be done with the prepare nox session:

nox -s prepare

The prepare session accepts flags to specify exact generation targets, e.g.

nox -s prepare -- --tests --docs

This can reduce the time taken to perform the preparation step in the event that only the package-building step is needed.

nox also lets us re-use the virtualenvs that it creates for each session with the -R flag, eliminating the dependency reinstall time:

nox -R -s prepare

Installing the `awkward-cpp` package

The C++ components can be installed by building the awkward-cpp package:

python -m pip install ./awkward-cpp

If you are working on the C++ components of Awkward Array, it might be more convenient to skip the build isolation step, which involves creating an isolated build environment. First, you must install the build requirements:

python -m pip install "scikit-build-core[pyproject,color]" pybind11 ninja cmake

Then the installation can be performed without build isolation:

python -m pip install --no-build-isolation --check-build-dependencies ./awkward-cpp

Installing the `awkward` package

With awkward-cpp installed, an editable installation of the pure-python awkward package can be performed with

python -m pip install -e .

Testing the installed packages

Finally, let's run the integration test suite to ensure that everything's working as expected:

python -m pytest -n auto tests

For more fine-grained testing, we also have tests of the low-level kernels, which can be invoked with

python -m pytest -n auto awkward-cpp/tests-spec
python -m pytest -n auto awkward-cpp/tests-cpu-kernels

This assumes that the nox -s prepare session ran the --tests target.

Furthermore, if you have an Nvidia GPU and CuPy installed, you can run the CUDA tests with

python -m pytest tests-cuda-kernels
python -m pytest tests-cuda

Building wheels

Sometimes it's convenient to build a wheel for the awkward-cpp package, so that subsequent re-installs do not require the package to be rebuilt. The build package can be used to do this, though care must be taken to specify the current Python interpreter in pipx:

pipx run --python=$(which python) build --wheel awkward-cpp

The built wheel will then be available in awkward-cpp/dist.

Automatic formatting and linting

The Awkward Array project uses pre-commit to handle formatters and linters. This automatically checks (and may push corrections to) your pull request's git branch.

To respond more quickly to pre-commit's feedback, it can help to install it and run it locally. Once it is installed, run

pre-commit run -a

to test all of your files. If you leave off the -a, it will run only on currently stashed changes.

Automated tests

As stated above, we use pytest to verify the correctness of the code, and GitHub will reject a pull request if either pre-commit or pytest fails (red "X"). All tests must pass for a pull request to be accepted.

Note that if a pull request doesn't modify code, only the documentation tests will run. That's okay: documentation-only pull requests only need the documentation tests to pass.

Testing practices

Unless you're refactoring code, such that your changes are fully tested by the existing test suite, new code should be accompanied by new tests. Our testing suite is organized by GitHub issue or pull request number: that is, test file names are

tests/test_XXXX-yyyy.py

where XXXX is either the number of the issue your pull request fixes or the number of the pull request and yyyy is descriptive text, often the same as the git branch. This makes it easier to run your test in isolation:

python -m pytest tests/test_XXXX-yyyy.py

and it makes it easier to figure out why a particular test was added. The easiest way to make a new testing file is to copy an existing one and replace its test_zzzz functions with your own. The previous tests should also give you a sense of the way we test things and the kinds of things that are constrained in tests.

Building documentation locally

Documentation is automatically built by each pull request. You usually won't need to build the documentation locally, but if you do, this section describes how.

We use Sphinx to generate documentation. You may need to install some additional packages:

Doxygen
pycparser
black
sphinx
sphinx-rtd-theme

To build documentation locally, first prepare the generated data files with

nox -s prepare

Only the --headers and --docs flags are actually required at the time of writing. These can be passed with:

nox -s prepare -- --docs --headers

Then, use nox to run the various documentation build steps

nox -s docs

This command executes multiple custom Python scripts (some require a working internet connection), in addition to using Sphinx and Doxygen to generate the required browser viewable documentation.

To view the built documentation, open

docs/_build/html/index.html

from the root directory of the project in your preferred web browser, e.g.

python -m http.server 8080 --directory docs/_build/html/

Before re-building documentation, you might want to delete the files that were generated to create viewable documentation. A simple command to remove all of them is

rm -rf docs/reference/generated docs/_build docs/_static/doxygen

There is also a cache in the docs/_build/.jupyter_cache directory for Jupyter Book, which can be removed.

The main branch

The Awkward Array main branch must be kept in an unbroken state. There are two reasons for this: so that developers can work independently on known-to-be-working states and so that users can test the latest changes (usually to see if the bug they've discovered is fixed by a potential correction).

The main branch is also never far from the latest released version. We usually deploy patch releases (z in a version number like x.y.z) within days of a bug-fix.

Committing directly to main is not allowed except for

updating the pyproject.toml file to increase the version number, which should be independent of pull requests
updating documentation or non-code files
unprecedented emergencies

and only by the the reviewing team.

The main-v1 branch

The main-v1 branch was split from main just before Awkward 1.x code was removed, so it exists to make 1.10.x bug-fix releases. These commits must be drawn from main-v1, not main, and pull requests must target main-v1 (not the GitHub default). A single commit cannot be applied to both main and main-v1 because they have diverged too much. If a bug-fix needs to be applied to both (unlikely), it will have to be reimplemented on both.

Releases

Currently, only one person can deploy releases:

Jim Pivarski (jpivarski)

There are two kinds of releases: (1) awkward-cpp updates, which only occur when the C++ is updated (rare) and involves compilation on many platforms (takes hours), and (2) awkward updates, which can happen with any bug-fix. The releases listed in GitHub are awkward releases, not awkward-cpp.

If you need your merged pull request to be deployed in a release, just ask!

Awkward-cpp releases

When making an awkward-cpp release (1), the following manual steps must also be taken:

Creating a git tag awkward-cpp-{version} for the new version epoch.

Awkward releases

When making an awkward release (2), the following manual steps must also be taken:

Attaching the headers.zip from the deploy.yml workflow to the release artefacts.
Adding a doc/switcher.json entry for new minor/major versions.

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

CONTRIBUTING.md

CONTRIBUTING.md

Contributing to Awkward Array

Where to start

Reporting issues

Contributing a pull request

Getting your pull request reviewed

Becoming a regular committer

Git practices

Building and testing locally

Installing the `awkward-cpp` package

Installing the `awkward` package

Testing the installed packages

Building wheels

Automatic formatting and linting

Automated tests

Testing practices

Building documentation locally

The main branch

The main-v1 branch

Releases

Awkward-cpp releases

Awkward releases

Files

CONTRIBUTING.md

Latest commit

History

CONTRIBUTING.md

File metadata and controls

Contributing to Awkward Array

Where to start

Reporting issues

Contributing a pull request

Getting your pull request reviewed

Becoming a regular committer

Git practices

Building and testing locally

Installing the awkward-cpp package

Installing the awkward package

Testing the installed packages

Building wheels

Automatic formatting and linting

Automated tests

Testing practices

Building documentation locally

The main branch

The main-v1 branch

Releases

Awkward-cpp releases

Awkward releases

Installing the `awkward-cpp` package

Installing the `awkward` package