(feat): support ellipsis indexing #1729

Merged
merged 25 commits into from
Oct 30, 2024

Conversation

Contributor

@ilan-gold ilan-gold commented Oct 22, 2024

Not sure if this is a feature or a fix TBH, but I'm going with feature and putting it in 0.10 because it addresses a pretty big use-case that broke without it (so, in a not-so-strict sense, it is a bug).

@ilan-gold ilan-gold added this to the 0.10.10 milestone Oct 22, 2024

codecov bot commented Oct 22, 2024

Codecov Report

Attention: Patch coverage is 91.30435% with 2 lines in your changes missing coverage. Please review.

Project coverage is 84.51%. Comparing base (437dbc8) to head (37fd1e0).
Report is 1 commit behind head on main.

Files with missing lines      Patch %   Lines
src/anndata/_core/index.py    90.00%    2 Missing ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##             main    #1729      +/-   ##
==========================================
- Coverage   86.93%   84.51%   -2.43%     
==========================================
  Files          40       40              
  Lines        6039     6050      +11     
==========================================
- Hits         5250     5113     -137     
- Misses        789      937     +148     
Files with missing lines          Coverage Δ
src/anndata/_core/anndata.py      83.72% <ø> (ø)
src/anndata/compat/__init__.py    81.54% <100.00%> (-3.31%) ⬇️
src/anndata/_core/index.py        94.96% <90.00%> (+1.63%) ⬆️

... and 7 files with indirect coverage changes

tests/test_views.py (outdated, resolved)
docs/conf.py (resolved)
@@ -130,7 +140,7 @@ def _fix_slice_bounds(s: slice, length: int) -> slice:
     return slice(start, stop, step)
 
 
-def unpack_index(index: Index) -> tuple[Index1D, Index1D]:
+def unpack_index(index: Index) -> tuple[IndexRest, IndexRest]:
Member

Shouldn’t this be (Index without Ellipsis) -> tuple[Index1D, Index1D]?

Contributor Author

No, because this can return EllipsisType on either of the entries

Member

@flying-sheep flying-sheep Oct 25, 2024

my point is that it probably shouldn’t. see here

Contributor Author

In other words, Index can contain an ellipsis

Contributor Author

Ok, but @flying-sheep, if we want to use unpack_index everywhere, then we need to let it accept tuples of length > 2

Co-authored-by: Philipp A. <[email protected]>
Member

@flying-sheep flying-sheep left a comment

Our index handling isn’t very consistent, and while parts of that make sense (BaseCompressedSparseDataset doesn’t need to handle string indexing), we should decide what we want. Currently we use unpack_index in 3 places:

  • in anndata._core.index._normalize_indices, you remove the ellipses first, so unpack_index never gets an ellipsis passed from there.
  • Raw doesn’t use _normalize_indices; it seems to have a hacky, slimmed-down version of it that raises when an index tuple isn’t of length 2. That might be OK since we want to phase out Raw.
  • BaseCompressedSparseDataset has an even simpler version that just passes the index unchanged to unpack_index. This means that indexing it with something it doesn’t support breaks with a hard-to-decipher error. That one should probably support ellipses.

so we need to decide:

  1. should unpack_index handle all cases? or does it only handle preprocessed indices?
  2. how do we reuse code here smartly? We should optimally only remove ellipses in one place of our code base, but still support both classes that allow string indexing and classes that don’t
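
A minimal sketch of what "remove ellipses in one place" could look like for a 2D index. The helper name `expand_ellipsis` and its `ndim` parameter are hypothetical and are not taken from this PR or from anndata; the sketch only normalizes an index into a tuple without `...`, which a function like unpack_index could then consume regardless of whether the caller supports string indexing:

```python
from __future__ import annotations


def expand_ellipsis(index: object, ndim: int = 2) -> tuple:
    """Hypothetical helper: return an `ndim`-tuple that contains no Ellipsis.

    Missing axes and the axes covered by `...` are filled with `slice(None)`.
    """
    if not isinstance(index, tuple):
        index = (index,)

    # Use identity checks (`is Ellipsis`) instead of `in` or `.index()`:
    # entries may be arrays, for which `==` is elementwise and ambiguous.
    positions = [k for k, entry in enumerate(index) if entry is Ellipsis]
    if len(positions) > 1:
        raise IndexError("an index can only have a single ellipsis ('...')")

    n_explicit = len(index) - len(positions)
    if n_explicit > ndim:
        raise IndexError(
            f"too many indices: got {n_explicit}, expected at most {ndim}"
        )

    fill = (slice(None),) * (ndim - n_explicit)
    if not positions:
        # No ellipsis: pad trailing axes, e.g. `0` -> `(0, slice(None))`.
        return index + fill
    pos = positions[0]
    return index[:pos] + fill + index[pos + 1 :]
```

Under these assumptions, `(..., 0)` becomes `(slice(None), 0)`, and a length-3 tuple such as `(0, ..., 1)` collapses to `(0, 1)`, so downstream code only ever sees a plain 2-tuple.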

@ilan-gold
Contributor Author

With the exception of raw: I could fix it there, but we don't test for it and want to get rid of it anyway. Someone would have to do adata.raw[ellipsis_index], and quite frankly I don't think they should. adata[ellipsis_index].raw is fine.

Member

@flying-sheep flying-sheep left a comment

awesome! does this give BaseCompressedSparseDataset the ability to be indexed with ... as well?

src/anndata/_core/index.py (outdated, resolved)
tests/test_backed_sparse.py (outdated, resolved)
src/anndata/compat/__init__.py (resolved)
@ilan-gold
Contributor Author

> awesome! does this give BaseCompressedSparseDataset the ability to be indexed with ... as well?

It should, I added tests for it

tests/test_views.py (outdated, resolved)
tests/conftest.py (outdated, resolved)
Member

@flying-sheep flying-sheep left a comment

looking great! there are a few minor issues with the types, otherwise perfect!

I’ll just quickly fix those issues

tests/conftest.py (outdated, resolved)
@ilan-gold ilan-gold merged commit 0024b82 into main Oct 30, 2024
15 checks passed
@ilan-gold ilan-gold deleted the ig/ellipsis_indexing branch October 30, 2024 14:00
ilan-gold added a commit that referenced this pull request Oct 30, 2024
* Backport PR #1729: (feat): support ellipsis indexing

* (fix): ellipsis type

* (fix): patch versions?
@flying-sheep flying-sheep modified the milestones: 0.10.10, 0.11.0 Nov 5, 2024
@scverse scverse deleted a comment from lumberbot-app bot Nov 5, 2024
@flying-sheep
Member

I changed the milestone since the release note is filed as a feature and we will release 0.11 soon

Development

Successfully merging this pull request may close these issues.

Support ellipsis indexing
2 participants