propose alternative chunk shape algorithm #996

Draft
wants to merge 1 commit into
base: dev

Conversation

bendichter
Contributor

Motivation

Fix #995

How to test the behavior?

Show how to reproduce the new behavior (can be a bug fix or a new feature)

Checklist

  • Did you update CHANGELOG.md with your changes?
  • Have you checked our Contributing document?
  • Have you ensured the PR clearly describes the problem and the solution?
  • Is your contribution compliant with our coding style? This can be checked running ruff from the source directory.
  • Have you checked to ensure that there aren't other open Pull Requests for the same change?
  • Have you included the relevant issue number using "Fix #XXX" notation where XXX is the issue number? By including "Fix #XXX" you allow GitHub to close issue #XXX when the PR is merged.

bendichter changed the title from "propose alternative chunk shaoe algorithm" to "propose alternative chunk shape algorithm" on Nov 8, 2023
return None


def array_with_desired_product(
Contributor

Since the intent is for this to be used to determine chunk shape, I think naming the function in a way that describes what it is used for may be more intuitive for users.

Contributor Author

I actually prefer it this way, since it separates the math from the application. What we want here is a vector whose product is close to a target and whose sum is minimal, subject to constraints on the vector. The function is then called to determine the shape of a chunk.
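
To make the math concrete, here is a minimal sketch of that idea. It is not the PR's implementation: the name `fill_free_axes`, the use of `None` to mark unconstrained entries, and the choice to never exceed the target are all assumptions made for illustration. For a given product, equal values give the smallest sum, so the free entries are all set to the same integer.

```python
import math


def fill_free_axes(constraints, target_product):
    """Fill the None entries of ``constraints`` with equal integers.

    Hypothetical helper (not from the PR). The product of the result stays
    at or below ``target_product`` whenever the fixed entries allow it.
    """
    fixed = math.prod(v for v in constraints if v is not None)
    n_free = sum(v is None for v in constraints)
    if n_free == 0:
        return list(constraints)

    # Element budget left for the free axes after the fixed ones are applied.
    residual = target_product // fixed

    # Start from the floating-point n-th root, then correct with exact
    # integer arithmetic to avoid rounding surprises.
    side = max(1, round(residual ** (1.0 / n_free)))
    while side > 1 and side ** n_free > residual:
        side -= 1
    while (side + 1) ** n_free <= residual:
        side += 1

    return [side if v is None else v for v in constraints]


# Second axis fixed at 64, about one million elements per chunk.
print(fill_free_axes([None, 64, None], 1_000_000))
```

Keeping the helper free of any chunking vocabulary mirrors the separation argued for above: the caller decides which axes are fixed and what the target is.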

@@ -12,6 +12,92 @@
from .utils import docval, getargs, popargs, docval_macro, get_data_shape


def find_nth_none(lst, n):
Contributor

Is this function intended for external use? If we only use it in array_with_desired_product, then I would suggest making the function private.

Suggested change
- def find_nth_none(lst, n):
+ def __find_nth_none(lst, n):

Contributor Author

You're right, both of these should be private, though @CodyCBakerPhD's solution makes this function unnecessary.
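
For readers without the diff at hand, here is a minimal sketch of what a helper with this signature could look like. The behavior shown (counting from 1, and returning `None` when the list holds fewer than `n` `None` entries) is an assumption inferred from the function name and the visible `return None` line. It is not the PR's actual code.

```python
def find_nth_none(lst, n):
    """Return the index of the n-th None in ``lst``, counting from 1.

    Assumed behavior for illustration only. Returns None when ``lst``
    contains fewer than ``n`` None entries.
    """
    count = 0
    for index, value in enumerate(lst):
        if value is None:
            count += 1
            if count == n:
                return index
    return None


# The second None sits at index 2.
print(find_nth_none([None, 3, None, None], 2))
```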

Contributor

oruebel commented Nov 8, 2023

In general, I think this is reasonable. @CodyCBakerPhD, can you also take a look?

@bendichter
Contributor Author

Another approach could be to work with the prime factorization of the residual product of the size. That would get us closer to the desired chunk size, but we could end up with very uneven shapes.
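
A rough sketch of how such a factorization-based approach might work, under stated assumptions: "residual" is taken to mean the number of elements left for the free axes once the fixed axes are accounted for, and the greedy rule (give each prime factor, largest first, to the currently smallest axis) is one possible choice. Both function names are hypothetical.

```python
def prime_factors(n):
    """Return the prime factors of ``n`` in ascending order, with repeats."""
    factors = []
    divisor = 2
    while divisor * divisor <= n:
        while n % divisor == 0:
            factors.append(divisor)
            n //= divisor
        divisor += 1
    if n > 1:
        factors.append(n)
    return factors


def distribute_factors(residual, n_axes):
    """Spread the prime factors of ``residual`` over ``n_axes`` axes.

    Hypothetical greedy rule: hand each factor, largest first, to the axis
    that is currently smallest. The product always equals ``residual``.
    """
    axes = [1] * n_axes
    for factor in sorted(prime_factors(residual), reverse=True):
        smallest = axes.index(min(axes))
        axes[smallest] *= factor
    return axes


print(distribute_factors(1000, 3))  # many small factors: balanced axes
print(distribute_factors(202, 2))   # 202 = 2 * 101: very uneven axes
```

The product matches the residual exactly, which is the upside mentioned above. The second call shows the downside: a large prime factor cannot be split, so the shape comes out lopsided.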

Contributor

oruebel commented Nov 8, 2023

> Another approach could be to work with the prime factorization of the residual product of the size. That would get us closer to the desired chunk size but we could end up with very uneven shapes

An uneven shape is not necessarily bad, but I think you probably want the ratio to be informed by the overall shape of the dataset. For example, for (time, electrodes) data where the time dimension is much longer than the electrodes dimension, you would also want the chunks to be larger in the time dimension.
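
One way to read this suggestion is to scale every axis of the dataset by the same factor, so the chunk keeps the dataset's aspect ratio. The sketch below assumes that reading. The name `proportional_chunk_shape` and its parameters are hypothetical and do not come from the PR.

```python
import math


def proportional_chunk_shape(data_shape, target_elements):
    """Shrink every axis of ``data_shape`` by one common factor.

    Hypothetical helper: the chunk keeps roughly the same aspect ratio as
    the dataset and holds about ``target_elements`` elements.
    """
    total = math.prod(data_shape)
    if total <= target_elements:
        # The whole dataset already fits in one chunk.
        return tuple(data_shape)

    # Common per-axis scale factor so the scaled product hits the target.
    scale = (target_elements / total) ** (1.0 / len(data_shape))

    # Round down, but keep every axis at least 1 and at most its full length.
    return tuple(max(1, min(s, math.floor(s * scale))) for s in data_shape)


# (time, electrodes) with time much longer than electrodes.
print(proportional_chunk_shape((1_000_000, 100), 50_000))
```

Because each axis is clamped to at least 1, very short axes can push the element count above the target. A real implementation would need to decide how to handle that case.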

Development

Successfully merging this pull request may close these issues.

[Feature]: Define partial chunk shape for GenericDataChunkIterator
2 participants