-
Notifications
You must be signed in to change notification settings - Fork 285
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[FB] [PI-3478] Lenient cube arithmetic #3774
[FB] [PI-3478] Lenient cube arithmetic #3774
Conversation
ensure rounding is numpy like (maintains type)
590e62f
to
52a39e9
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I am by no means done reviewing (!) I just thought there were some comments I wanted raise sooner rather than later to get the conversations started now, particularly the comment about rhs/lhs and src/tgt as that is really confusing me!
lib/iris/analysis/maths.py
Outdated
import numpy as np | ||
from numpy import ma | ||
|
||
import iris.analysis | ||
from iris.common import SERVICES, Resolve | ||
from iris.common.lenient import _lenient_client as lenient_client |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm unsure what the consequences of this. By importing _lenient client as lenient client,
does this technically expose it as part of the public api? i.e. via iris.analysis.maths.lenient_client
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@lbdreyer Yup, it's probably best to keep this as _lenient_client
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done
lib/iris/analysis/maths.py
Outdated
@@ -838,39 +730,59 @@ def _binary_op_common( | |||
`cube` and `cube.data` | |||
""" | |||
_assert_is_cube(cube) | |||
|
|||
skeleton = False |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could you add a comment here explaining what you mean by skeleton?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@lbdreyer As per the description task list, I've not finished adding comments or doc-strings, but yes, happy to add a comment for this
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done.
@@ -915,26 +827,57 @@ def _broadcast_cube_coord_data(cube, other, operation_name, dim=None): | |||
return points | |||
|
|||
|
|||
def _sanitise_metadata(cube, unit): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This seems like quite a general function that could be useful elsewhere for purposes other than just maths. I'm worried if it was hidden away here, we may forget it's here. Perhaps you could move it to iris.common.metadata?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@lbdreyer Hmmm not sure.
It does seem like a general function, but at the moment it's very specific to the metadata contract of maths, so I'm inclined to keep it as it is, and not expose it to the public API.
from iris.common import LENIENT | ||
|
||
|
||
__all__ = ["Resolve"] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I find this module to be a bit obscure. The name is not descriptive (iris.common.resolve.Resolve could really refer to anything) and there is no module docstring or docstring for the Resolve class. It requires reading a fair amount of the code before you can deduce what the intention behind it is.
My suggestion (in order of preference):
- Move the Resolve class to the iris.common.metada. Afterall it is about resolving metadata, so it makes sense to live there. Does it justify its own module
- Rename this module to iris.common.resolve_metadata and add a module docstring.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@lbdreyer For me, this class doesn't belong in iris.common.metadata
.
It's certainly using iris.common.metadata
but it's primarily dealing with higher order objects, such as cubes, coordinates and factories. So I believe it stands in its own right.
Naming things can be the hardest part, so I'm open to suggestions, but I'd be keen to steer clear of incorporating the word metadata
in the module or class name... as a term I think the use of metadata
is already overloaded within iris
.
Hmmm, I'll have a think...
Also, as I said previously, I've not finished fully commenting the code or adding doc-strings, so that will come.
self(lhs, rhs) | ||
|
||
def __call__(self, lhs, rhs): | ||
self._init(lhs, rhs) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't understand the reason behind the added complexity of including a init, call and _init
when you only ever do instantiate it in iris.analysis.math._binary_op_common:
iris/lib/iris/analysis/maths.py
Lines 739 to 742 in 14a3bfb
elif isinstance(other, iris.cube.Cube): | |
# prepare to resolve the cube operands and associated coordinate | |
# metadata into the resultant cube. | |
resolve = Resolve(cube, other) |
So why not just have a init solely?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@lbdreyer I don't want to restrict a Resolve
instance to being only bound to the cubes that it's provided at instantiation.
I also want to support the pattern of creating a single resolve instance and calling it multiple times e.g.,
result = Resolve(lhs, rhs).cube(data)
and
resolver = Resolve()
for ...:
result = resolver(lhs, rhs).cube(data)
lib/iris/common/resolve.py
Outdated
items_dim=[], items_aux=[], items_scalar=[] | ||
) | ||
self.category_common = _CategoryItems( | ||
items_dim=[], items_aux=[], items_scalar=[] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why not just do this in the self._init
function where you originally set these to None?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@lbdreyer Yup, I could do... but I was adopting the style of declaring all the public and private members in the Resolve._init
and configuring them within the context of where they are populated
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done.
lib/iris/common/resolve.py
Outdated
coordinates common to both cubes. | ||
|
||
This is only applicable to coordinates that are members of the | ||
'aux_coords' or dim_coords' of the participating cubes. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
missing an extra quote
'aux_coords' or dim_coords' of the participating cubes. | |
'aux_coords' or 'dim_coords' of the participating cubes. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@lbdreyer Nice spot 👀
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done
|
||
# Map RHS cube to LHS cube, or smaller to larger cube rank. | ||
if self.map_rhs_to_lhs: | ||
src_cube = self.rhs_cube |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm finding the lhs/rhs and src/tgt thing really hard to follow. It doesn't seem consistent?
Here if self.map_rhs_to_lhs
is True, then src=rhs and tgt=lhs. However, in self._metadata_resolve
, on lines 783-789, if self.map_rhs_to_lhs
is True, you set the args such that src=lhs and tgt=rhs
As a suggestion, why not set lhs/rhs as src/tgt once, in the init, then you only have to worry about src/tgt and no longer have to keep checking self.map_rhs_to_lhs
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@lbdreyer That's understandable. I totally get that...
But these concepts are pretty core to Resolve
, and it took me a while to settle on the best way to do this. But there are two important concepts at play here, that are subtly related, but different.
Essentially, the lhs
and rhs
refer to the cube operands, and these never change. They bind to the order of the cubes that are presented to Resolve.__init__
or Resolve.__call__
. In the context of cube arithmetic, these are the lhs
and the rhs
operands of the binary arithmetic operation. It's important to maintain the order of the cubes that are given to Resolve
, and not swap them around, ever.
During the process of preparing to reconcile or resolve the lhs
and rhs
cubes into a single resultant cube, one cube must map to the other i.e., you could think of it as, one of the cubes will be the target for an in-place operation. Typically, it's the cube with the higher dimensionality, in that case that's the tgt
and the other is the src
, but this may change due to broadcasting and free dimensions (i.e., anonymous dimensions with no coordinate metadata describing them).
Also, this situation gets more involved when the lhs
and rhs
have the same dimensionality... the point is, the src
and tgt
are not akin to the lhs
and rhs
, as the mapping of src
to tgt
can change depending on several factors which are determined at runtime as the lhs
and rhs
are compared and analysed.
Fundamentally, the use of src
and tgt
, and it's ability to change, is paramount to provisioning commutative results.
We should talk more about this, to help clarify... also more doc-strings and comments would help.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@lbdreyer Also, the Resolve.mapping
, is tightly coupled to the concept of src
and tgt
, where the Resolve.mapping.keys()
are the src
cube dimensions, and the Resolve.mapping.values()
are the tgt
cube dimensions, always.
The freedom and flexibility to invert this Resolve.mapping
relationship, which is required at appropriate times, is totally independent of the lhs
and rhs
original operation ordering.
That's a simple, yet important concept, that makes quite complex behaviour simply possible.
from iris.common import LENIENT | ||
|
||
|
||
__all__ = ["Resolve"] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why is this part of the public API? Why not just make it private for now?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@lbdreyer Simply because Resolve
deprecates other functionality within iris
, hence it requires to be part of the public API.
Also, its functionality is quite generic and not necessarily tied to cube arithmetic. It provides a very powerful way to combine two cubes into one resultant cube.
) | ||
|
||
|
||
def apply_ufunc( | ||
ufunc, cube, other_cube=None, new_unit=None, new_name=None, in_place=False | ||
ufunc, cube, other=None, new_unit=None, new_name=None, in_place=False |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is a breaking change to the API for apply_ufunc
, but it is completely warranted, as changing the kwarg from other_cube
to other
aligns this with the rest of the iris.analysis.maths
public API.
Consequently, there will be an appropriate entry in the whatsnew
for the community for iris
3.0.0
f"provided data_func {self._data_func_name!r} only requires " | ||
"1 input" | ||
) | ||
logger.debug(dmsg) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Create a logging
debug message rather than raising a ValueError
, which now aligns the IFunc.__call__
behaviour for a data_func.nin == 1
to that of apply_ufunc
with a ufunc.nin == 1
.
This will have an appropriate whatsnew
entry for the community.
2856eb9
to
4ae36ae
Compare
@ldreyer has kindly carried out the review of the changes in this PR, and so I am merging this PR into the cube-arithmetic feature branch, with the caveat that the outstanding items (list in PR description) are to be addressed prior to the Iris3.0 release. |
* initial cube arithmetic * support in-place cube resolve * fix non in-place broadcasting * remove temporary resolve scenario test * lenient/strict support for attributes dicts with numpy arrays * lenient/strict treatment of scalar coordinates * strict points/bounds matching * lenient/strict prepare local dim/aux/scalar coordinates * support extended broadcasting * always raise exception on points/bounds mismatch * ignore scalar points/bounds mismatches, lenient only * remove todos * tidy logger debugs * qualify src/tgt cube references in debug * Numpy rounding fix (SciTools#3758) ensure rounding is numpy like (maintains type) * avoid unittest.mock.sentinel copy issue * fast load np.int32 * fix cube maths doctest * fix iris.common.resolve logging configuration * fix prepare points/bounds + extra metadata cml * support mapping reversal based on free dims * var_name fix for lenient equality * add support for DimCoordMetadata * fix circular flag + support CoordMetadata and DimCoordMetadata exchange * fix circular issue for concatenate DimCoord->AuxCoord demotion * fix concatenate._CubeSignature sorted * minor tweaks * keep lenient_client private in maths * tidy maths * tidy iris.analysis.maths.IFunc * refactor IFunc test * polish in-place support * tidy metadata_resolve Co-authored-by: stephenworsley <[email protected]>
* PI-3478: Common metadata API (#3583) * common metadata api * rationalise _cube_coord_common into common * move state into metadata * MetadataFactory test coverage * temporarily pin back iris-grib * test coverage for iris.common.metadata._BaseMeta * test coverage for iris.common.mixin.LimitedAttributeDict * remove temporary iris-grib pin * review actions * Update lib/iris/tests/unit/common/metadata/test_BaseMetadata.py Co-Authored-By: lbdreyer <[email protected]> * [FB] [PI-3478] Lenient metadata (#3739) * add lenient infra-structure * add metadata lenient __eq__ support * complete __eq__, combine and difference support * explicit inherited lenient_service + support equal convenience * fix attributes difference + lenient kwargs * make lenient public + minor tidy * rename MetadataManagerFactory to metadata_manager_factory * extend lenient_client decorator to support services registration * add lenient test coverage * purge qualname usage in metadata.py * support global enable for lenient services * support partial mapping metadata assignment * purge Lenient.__setattr__ from api * add BaseMetadata compare test coverage * metadata rationalisation * add BaseMetadata difference test coverage * added context manager ephemeral comment clarification * add BaseMetadata __ne__ test coverage * standardise lenient decorator closure names * add BaseMetadata equal test coverage * half dunder context * add AncillaryVariableMetadata test coverage * add additional AncillaryVariableMetadata test coverage * add CellMeasureMetadata test coverage * Clarify lenient_service operation + simplify code. * add CoordMetadata test coverage * add CubeMetadata test coverage * metadata tests use self.cls * fix typo * fix context manager ephemeral services * add logging * Pin pillow to make graphics tests work again. (#3630) * Fixed tests since Numpy 1.18 deprecation of non-int num arguments for linspace. (#3655) * Switched use of datetime.weekday() to datetime.dayofwk. (#3687) * New image hashes for mpl 3x2 (#3682) * New image hash for iris.test.test_plot.TestSymbols.test_cloud_cover with matplotlib 3.2.0. * Further images changes for mpl3x2. * Yet more updated image results. * fix sentinel uniqueness test failure * remove redundant cdm mapping test * difference returns None for no difference * protect Lenient and LENIENT private * privitise lenient framework and add API veneer * add explicit maths feature default * review actions * review actions * trexfeathers review actions * stephenworsley review actions Co-authored-by: Patrick Peglar <[email protected]> Co-authored-by: Martin Yeo <[email protected]> * [FB] [PI-3478] Lenient cube arithmetic (#3774) * initial cube arithmetic * support in-place cube resolve * fix non in-place broadcasting * remove temporary resolve scenario test * lenient/strict support for attributes dicts with numpy arrays * lenient/strict treatment of scalar coordinates * strict points/bounds matching * lenient/strict prepare local dim/aux/scalar coordinates * support extended broadcasting * always raise exception on points/bounds mismatch * ignore scalar points/bounds mismatches, lenient only * remove todos * tidy logger debugs * qualify src/tgt cube references in debug * Numpy rounding fix (#3758) ensure rounding is numpy like (maintains type) * avoid unittest.mock.sentinel copy issue * fast load np.int32 * fix cube maths doctest * fix iris.common.resolve logging configuration * fix prepare points/bounds + extra metadata cml * support mapping reversal based on free dims * var_name fix for lenient equality * add support for DimCoordMetadata * fix circular flag + support CoordMetadata and DimCoordMetadata exchange * fix circular issue for concatenate DimCoord->AuxCoord demotion * fix concatenate._CubeSignature sorted * minor tweaks * keep lenient_client private in maths * tidy maths * tidy iris.analysis.maths.IFunc * refactor IFunc test * polish in-place support * tidy metadata_resolve Co-authored-by: stephenworsley <[email protected]> * rebase master fix-up for cube arithmetic * add missing new dependency to readthedocs.yml requirements Co-authored-by: lbdreyer <[email protected]> Co-authored-by: Patrick Peglar <[email protected]> Co-authored-by: Martin Yeo <[email protected]> Co-authored-by: stephenworsley <[email protected]>
This PR fixes (pretty much) everything that is very, very wrong with cube arithmetic.
It builds on PR #3583 and PR #3739, and is related to the parent issue #3478.
TODO:
IFunc
apply_ufunc
in_place
mathscircular
issue 😬travis-ci
failuresvar_name
fix for lenient metadata equalityRequired to
git cherry-pick
the following PRs frommaster
to resolve knowntravis-ci
test failure issues:The following tasks are out of scope for this particular PR, but will be addressed immediately afterwards. This approach is being taken in order to expedite this PR and unblock other critical PRs required for the imminent
iris
v3.0.0 release.This suggested tasks to perform, in order are as follows:
iris.analysis.maths.apply_ufunc
kwargother_cube
->other
, see hereiris.analysis.maths.IFunc.__call__
demoteValueError
tologging
message, akin toapply_ufunc
fordata_func.nin == 1
, see heremapping
rationalisation during_metadata_prepare
dtype
promotionDimCoordMetadata
vsCoordMetadata
iris.util.array_equal
to support masked arraysndarray
vsMaskedArray
Tasks pegged as future work. To be raised as separate individual issues and tackled for
iris
3.1: