Block-mapped resample with the help of flox #1848
for more information, see https://pre-commit.ci
This reverts commit a7e5bde.
This reverts commit 9778739.
I'm far from an expert here, but the implementation looks good!
Returns
-------
clean_obj :
Any?
try:
    import flox
except ImportError:
    flox = None
Would it be better if we added `flox` to the `tox` configuration, then parametrized the tests to use `flox` if it's there? I believe we have one build that tests the `extras` recipe already.
Yes, but my current "testing" only consists of hijacking an existing test and enabling the new option in a certain case. An `importorskip('flox')` would be much better, but I'll do that in a test specific to `resample_map`. To come in the next working days...
Okay, that sounds fine to me! Just keep it in mind.
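For readers following this thread, here is a minimal sketch of the pattern being discussed: an optional import guard plus tests that are skipped when the optional package is missing. In a pytest suite, `pytest.importorskip("flox")` does this in one call; the version below uses only the standard library so it runs anywhere. The helper `optional_import` and the test class name are hypothetical and are not taken from the PR.

```python
import importlib
import unittest


def optional_import(name):
    """Return the module if it can be imported, otherwise None.

    Same effect as the try/except ImportError guard in the diff above.
    """
    try:
        return importlib.import_module(name)
    except ImportError:
        return None


# flox is None when the optional dependency is not installed.
flox = optional_import("flox")


class TestOptionalFlox(unittest.TestCase):  # hypothetical test class
    @unittest.skipIf(flox is None, "flox is not installed")
    def test_flox_path(self):
        # Only runs when flox is available; a real test would enable
        # the new option here and compare against the default path.
        self.assertEqual(flox.__name__, "flox")

    def test_default_path(self):
        # Always runs: the code must still work without flox.
        self.assertIsNone(optional_import("no_such_module_for_this_sketch"))
```

With this layout, a build without `flox` reports the first test as skipped instead of failing, while a build that installs the `extras` recipe exercises both paths.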
Co-authored-by: Trevor James Smith <[email protected]>
I have an alternative suggestion here (and would save me from opening a PR):
- Add `flox` (with a baseline pin version) to the `extras` recipe.
- Add `flox` and `pot >=0.9.4` to the `environment.yml`.
- Add `fastnanquantile >=0.0.2` to the `environment.yml` section for pip dependencies.
- Remove `flox` from the DEP001 list.
Pull Request Checklist:
[...] `number`) and pull request (:pull:`number`) has been added

What kind of change does this PR introduce?
Implements `resample_map`. This function is meant for all `da.resample(...).map(...)` calls. `flox` cannot improve these automatically, so we use some flox logic to help. The idea is to map the resample-map construct on each block in parallel. This is possible by first rechunking the array so that chunk boundaries align with resampling period boundaries (this is a flox function).

The main improvement should come from the fact that `map_blocks` hides much of the complexity from `dask`, so the resulting graph is much lighter. I still have to better test the performance of this. My goal would be to have some short text in xclim's doc that highlights when the option is useful and when it is not. The option is activated through `set_options`.

The current function works only when the input object is of the same type as the output one, so some functions can't be wrapped with this yet. The most important untouched code for the moment is the missing checks, where I think this could help a lot.
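To illustrate why aligning chunk boundaries with resampling periods makes a per-block map valid, here is a plain-Python sketch of the idea. It is not the PR's implementation: in the PR the rechunking is done by a flox function and the per-block call by `map_blocks`. The names `aligned_chunks`, `group_map` and `blockwise_group_map` are hypothetical, and the input is assumed to be sorted in time.

```python
from datetime import date, timedelta
from itertools import groupby


def aligned_chunks(times, key, target):
    """Chunk sizes near `target` whose boundaries fall on period boundaries.

    `key` maps a timestamp to its period label (for example year and month).
    Assumes `times` is sorted, so each period is one consecutive run.
    """
    period_sizes = [len(list(run)) for _, run in groupby(times, key=key)]
    chunks, current = [], 0
    for size in period_sizes:
        # Close the current chunk before it would exceed the target size.
        if current and current + size > target:
            chunks.append(current)
            current = 0
        current += size
    if current:
        chunks.append(current)
    return chunks


def group_map(times, values, key, func):
    """Apply `func` to the values of each period (a resample-then-map)."""
    pairs = zip(times, values)
    return [
        (label, func([v for _, v in run]))
        for label, run in groupby(pairs, key=lambda tv: key(tv[0]))
    ]


def blockwise_group_map(times, values, key, func, target):
    """Run `group_map` independently on each period-aligned block."""
    out, start = [], 0
    for size in aligned_chunks(times, key, target):
        stop = start + size
        # No period straddles a block edge, so each block is self-contained.
        out.extend(group_map(times[start:stop], values[start:stop], key, func))
        start = stop
    return out


# Usage: 121 daily values covering January to April 2000.
first_day = date(2000, 1, 1)
days = [first_day + timedelta(days=i) for i in range(121)]
data = list(range(121))
by_month = lambda t: (t.year, t.month)
spread = lambda vals: max(vals) - min(vals)

print(aligned_chunks(days, by_month, 70))
print(blockwise_group_map(days, data, by_month, spread, 70)
      == group_map(days, data, by_month, spread))
```

Because no period is split across two blocks, each block can be processed in isolation and the concatenated results match the single global pass. That independence is what lets the whole resample-map run as one task per block.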
Does this PR introduce a breaking change?
It should not. This is completely optional.
Other information:
In progress, I still need to prove the performance boost.
This depends on #1845 because I need all improvements for PC.