Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Rewrite logic in vlen to drop NumPy build dependency #555

Closed
wants to merge 2 commits into from

Conversation

jakirkham
Copy link
Member

As noted in issue ( https://github.com/zarr-developers/numcodecs ), vlen had problems with read-only data

Internally this occurred because we assigned the input array to object[:], which expects mutability. Ideally we would just replace this with const object[:]. However Cython does not support this ( cython/cython#2485 ). Over time various blockers have been resolved on that front, but it is not yet fixed

Fortunately an astute SciPy developer noted that const <fused_type>[:] is allowed ( scipy/scipy#18192 (review) ). We use that same trick here

This in turn should allow us to continue to accept read-only arrays. Though this does so without a NumPy build dependency, which can introduce more complexity to the build process


TODO:

  • Unit tests and/or doctests in docstrings
  • Tests pass locally
  • Docstrings and API docs for any new/modified user-facing classes and functions
  • Changes documented in docs/release.rst
  • Docs build locally
  • GitHub Actions CI passes
  • Test coverage to 100% (Codecov passes)

Add workaround for lack of `const object[:]` in Cython by using a trick
SciPy figured out using `fused` types.

This in turn lets us relax the NumPy build dependency.
As `vlen` now solves the mutability issue without a NumPy dependency,
drop the NumPy dependency to simplify building and deploying Numcodecs.
Comment on lines +30 to +36
# Hacky workaround for lack of `const object[:]`.
# Appears hiding `object` in a fused type works.
# xref: https://github.com/cython/cython/issues/2485
# xref: https://github.com/scipy/scipy/pull/18192#pullrequestreview-1359581611
ctypedef fused object_fused_t:
object
uintptr_t
Copy link
Member Author

@jakirkham jakirkham Jul 26, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is the crux of the workaround

For fusing, used a uintptr_t. As object is effectively PyObject*, fusing with a generic pointer/address type seemed most appropriate

Note this differs from the SciPy approach where they used double. Though it shouldn't matter much what the other type is as it is unused

Whenever Cython fixes this issue, we can drop this and just use const object[:] where needed

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Well as evidenced by CI, this is not enough. Think what we need to do is get the fused types inside some kind of function signature. Then it might work

However looking at the code here. There are multiple places this would need to go

Wonder if the better path would be...

  1. Move most of this logic into pure Python (Cython is only needed for a small portion of this code)
  2. Refactor the code to reuse common bits across cases
  3. Have a small Cython function that deals with the bits that need to be lower-level

@dstansby
Copy link
Contributor

As per comments on conda-forge/numcodecs-feedstock#108, I don't think we need this any more

@dstansby dstansby closed this Sep 16, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants