[FB] [PI-3478] Lenient cube arithmetic #3774

bjlittle · 2020-08-05T13:21:23Z

This PR fixes (pretty much) everything that is very, very wrong with cube arithmetic.

It builds on PR #3583 and PR #3739, and is related to the parent issue #3478.

TODO:

Required to git cherry-pick the following PRs from master to resolve known travis-ci test failure issues:

Numpy rounding fix #3758 (Jul 22)

The following tasks are out of scope for this particular PR, but will be addressed immediately afterwards. This approach is being taken in order to expedite this PR and unblock other critical PRs required for the imminent iris v3.0.0 release.

This suggested tasks to perform, in order are as follows:

Tasks pegged as future work. To be raised as separate individual issues and tackled for iris 3.1:

Raise multiple coordinate points/bounds mismatches as a single exception
- See Cube arithmetic points/bounds mismatch errors #3776

ensure rounding is numpy like (maintains type)

lbdreyer

I am by no means done reviewing (!) I just thought there were some comments I wanted raise sooner rather than later to get the conversations started now, particularly the comment about rhs/lhs and src/tgt as that is really confusing me!

lbdreyer · 2020-08-10T12:15:07Z

lib/iris/analysis/maths.py

 import numpy as np
 from numpy import ma

 import iris.analysis
+from iris.common import SERVICES, Resolve
+from iris.common.lenient import _lenient_client as lenient_client


I'm unsure what the consequences of this. By importing _lenient client as lenient client, does this technically expose it as part of the public api? i.e. via iris.analysis.maths.lenient_client?

@lbdreyer Yup, it's probably best to keep this as _lenient_client

lbdreyer · 2020-08-10T13:51:02Z

lib/iris/analysis/maths.py

@@ -838,39 +730,59 @@ def _binary_op_common(
                           `cube` and `cube.data`
    """
    _assert_is_cube(cube)
+
+    skeleton = False


Could you add a comment here explaining what you mean by skeleton?

@lbdreyer As per the description task list, I've not finished adding comments or doc-strings, but yes, happy to add a comment for this

lbdreyer · 2020-08-10T13:55:29Z

lib/iris/analysis/maths.py

@@ -915,26 +827,57 @@ def _broadcast_cube_coord_data(cube, other, operation_name, dim=None):
    return points


+def _sanitise_metadata(cube, unit):


This seems like quite a general function that could be useful elsewhere for purposes other than just maths. I'm worried if it was hidden away here, we may forget it's here. Perhaps you could move it to iris.common.metadata?

@lbdreyer Hmmm not sure.

It does seem like a general function, but at the moment it's very specific to the metadata contract of maths, so I'm inclined to keep it as it is, and not expose it to the public API.

lbdreyer · 2020-08-10T15:49:52Z

lib/iris/common/resolve.py

+from iris.common import LENIENT
+
+
+__all__ = ["Resolve"]


I find this module to be a bit obscure. The name is not descriptive (iris.common.resolve.Resolve could really refer to anything) and there is no module docstring or docstring for the Resolve class. It requires reading a fair amount of the code before you can deduce what the intention behind it is.
My suggestion (in order of preference):

Move the Resolve class to the iris.common.metada. Afterall it is about resolving metadata, so it makes sense to live there. Does it justify its own module

Rename this module to iris.common.resolve_metadata and add a module docstring.

@lbdreyer For me, this class doesn't belong in iris.common.metadata.

It's certainly using iris.common.metadata but it's primarily dealing with higher order objects, such as cubes, coordinates and factories. So I believe it stands in its own right.

Naming things can be the hardest part, so I'm open to suggestions, but I'd be keen to steer clear of incorporating the word metadata in the module or class name... as a term I think the use of metadata is already overloaded within iris.

Hmmm, I'll have a think...

Also, as I said previously, I've not finished fully commenting the code or adding doc-strings, so that will come.

lbdreyer · 2020-08-10T15:51:40Z

lib/iris/common/resolve.py

+            self(lhs, rhs)
+
+    def __call__(self, lhs, rhs):
+        self._init(lhs, rhs)


I don't understand the reason behind the added complexity of including a init, call and _init
when you only ever do instantiate it in iris.analysis.math._binary_op_common:

iris/lib/iris/analysis/maths.py

Lines 739 to 742 in 14a3bfb

elif isinstance(other, iris.cube.Cube):

# prepare to resolve the cube operands and associated coordinate

# metadata into the resultant cube.

resolve = Resolve(cube, other)

So why not just have a init solely?

@lbdreyer I don't want to restrict a Resolve instance to being only bound to the cubes that it's provided at instantiation.

I also want to support the pattern of creating a single resolve instance and calling it multiple times e.g.,

result = Resolve(lhs, rhs).cube(data)

and

resolver = Resolve() for ...: result = resolver(lhs, rhs).cube(data)

lbdreyer · 2020-08-10T16:01:27Z

lib/iris/common/resolve.py

+            items_dim=[], items_aux=[], items_scalar=[]
+        )
+        self.category_common = _CategoryItems(
+            items_dim=[], items_aux=[], items_scalar=[]


Why not just do this in the self._init function where you originally set these to None?

@lbdreyer Yup, I could do... but I was adopting the style of declaring all the public and private members in the Resolve._init and configuring them within the context of where they are populated

lbdreyer · 2020-08-10T16:31:28Z

lib/iris/common/resolve.py

+        coordinates common to both cubes.
+
+        This is only applicable to coordinates that are members of the
+        'aux_coords' or dim_coords' of the participating cubes.


missing an extra quote

Suggested change

'aux_coords' or dim_coords' of the participating cubes.

'aux_coords' or 'dim_coords' of the participating cubes.

@lbdreyer Nice spot 👀

lbdreyer · 2020-08-10T21:24:31Z

lib/iris/common/resolve.py

+
+        # Map RHS cube to LHS cube, or smaller to larger cube rank.
+        if self.map_rhs_to_lhs:
+            src_cube = self.rhs_cube


I'm finding the lhs/rhs and src/tgt thing really hard to follow. It doesn't seem consistent?
Here if self.map_rhs_to_lhs is True, then src=rhs and tgt=lhs. However, in self._metadata_resolve, on lines 783-789, if self.map_rhs_to_lhs is True, you set the args such that src=lhs and tgt=rhs

As a suggestion, why not set lhs/rhs as src/tgt once, in the init, then you only have to worry about src/tgt and no longer have to keep checking self.map_rhs_to_lhs

@lbdreyer That's understandable. I totally get that...

But these concepts are pretty core to Resolve, and it took me a while to settle on the best way to do this. But there are two important concepts at play here, that are subtly related, but different.

Essentially, the lhs and rhs refer to the cube operands, and these never change. They bind to the order of the cubes that are presented to Resolve.__init__ or Resolve.__call__. In the context of cube arithmetic, these are the lhs and the rhs operands of the binary arithmetic operation. It's important to maintain the order of the cubes that are given to Resolve, and not swap them around, ever.

During the process of preparing to reconcile or resolve the lhs and rhs cubes into a single resultant cube, one cube must map to the other i.e., you could think of it as, one of the cubes will be the target for an in-place operation. Typically, it's the cube with the higher dimensionality, in that case that's the tgt and the other is the src, but this may change due to broadcasting and free dimensions (i.e., anonymous dimensions with no coordinate metadata describing them).

Also, this situation gets more involved when the lhs and rhs have the same dimensionality... the point is, the src and tgt are not akin to the lhs and rhs, as the mapping of src to tgt can change depending on several factors which are determined at runtime as the lhs and rhs are compared and analysed.

Fundamentally, the use of src and tgt, and it's ability to change, is paramount to provisioning commutative results.

We should talk more about this, to help clarify... also more doc-strings and comments would help.

@lbdreyer Also, the Resolve.mapping, is tightly coupled to the concept of src and tgt, where the Resolve.mapping.keys() are the src cube dimensions, and the Resolve.mapping.values() are the tgt cube dimensions, always.
The freedom and flexibility to invert this Resolve.mapping relationship, which is required at appropriate times, is totally independent of the lhs and rhs original operation ordering.
That's a simple, yet important concept, that makes quite complex behaviour simply possible.

lbdreyer · 2020-08-10T23:25:40Z

lib/iris/common/resolve.py

+from iris.common import LENIENT
+
+
+__all__ = ["Resolve"]


Why is this part of the public API? Why not just make it private for now?

@lbdreyer Simply because Resolve deprecates other functionality within iris, hence it requires to be part of the public API.

Also, its functionality is quite generic and not necessarily tied to cube arithmetic. It provides a very powerful way to combine two cubes into one resultant cube.

bjlittle · 2020-08-12T23:39:27Z

lib/iris/analysis/maths.py

    )


 def apply_ufunc(
-    ufunc, cube, other_cube=None, new_unit=None, new_name=None, in_place=False
+    ufunc, cube, other=None, new_unit=None, new_name=None, in_place=False


This is a breaking change to the API for apply_ufunc, but it is completely warranted, as changing the kwarg from other_cube to other aligns this with the rest of the iris.analysis.maths public API.

Consequently, there will be an appropriate entry in the whatsnew for the community for iris 3.0.0

bjlittle · 2020-08-13T01:22:46Z

lib/iris/analysis/maths.py

+                    f"provided data_func {self._data_func_name!r} only requires "
+                    "1 input"
+                )
+                logger.debug(dmsg)


Create a logging debug message rather than raising a ValueError, which now aligns the IFunc.__call__ behaviour for a data_func.nin == 1 to that of apply_ufunc with a ufunc.nin == 1.
This will have an appropriate whatsnew entry for the community.

abooton · 2020-08-13T18:46:12Z

@ldreyer has kindly carried out the review of the changes in this PR, and so I am merging this PR into the cube-arithmetic feature branch, with the caveat that the outstanding items (list in PR description) are to be addressed prior to the Iris3.0 release.

* initial cube arithmetic * support in-place cube resolve * fix non in-place broadcasting * remove temporary resolve scenario test * lenient/strict support for attributes dicts with numpy arrays * lenient/strict treatment of scalar coordinates * strict points/bounds matching * lenient/strict prepare local dim/aux/scalar coordinates * support extended broadcasting * always raise exception on points/bounds mismatch * ignore scalar points/bounds mismatches, lenient only * remove todos * tidy logger debugs * qualify src/tgt cube references in debug * Numpy rounding fix (SciTools#3758) ensure rounding is numpy like (maintains type) * avoid unittest.mock.sentinel copy issue * fast load np.int32 * fix cube maths doctest * fix iris.common.resolve logging configuration * fix prepare points/bounds + extra metadata cml * support mapping reversal based on free dims * var_name fix for lenient equality * add support for DimCoordMetadata * fix circular flag + support CoordMetadata and DimCoordMetadata exchange * fix circular issue for concatenate DimCoord->AuxCoord demotion * fix concatenate._CubeSignature sorted * minor tweaks * keep lenient_client private in maths * tidy maths * tidy iris.analysis.maths.IFunc * refactor IFunc test * polish in-place support * tidy metadata_resolve Co-authored-by: stephenworsley <[email protected]>

* PI-3478: Common metadata API (#3583) * common metadata api * rationalise _cube_coord_common into common * move state into metadata * MetadataFactory test coverage * temporarily pin back iris-grib * test coverage for iris.common.metadata._BaseMeta * test coverage for iris.common.mixin.LimitedAttributeDict * remove temporary iris-grib pin * review actions * Update lib/iris/tests/unit/common/metadata/test_BaseMetadata.py Co-Authored-By: lbdreyer <[email protected]> * [FB] [PI-3478] Lenient metadata (#3739) * add lenient infra-structure * add metadata lenient __eq__ support * complete __eq__, combine and difference support * explicit inherited lenient_service + support equal convenience * fix attributes difference + lenient kwargs * make lenient public + minor tidy * rename MetadataManagerFactory to metadata_manager_factory * extend lenient_client decorator to support services registration * add lenient test coverage * purge qualname usage in metadata.py * support global enable for lenient services * support partial mapping metadata assignment * purge Lenient.__setattr__ from api * add BaseMetadata compare test coverage * metadata rationalisation * add BaseMetadata difference test coverage * added context manager ephemeral comment clarification * add BaseMetadata __ne__ test coverage * standardise lenient decorator closure names * add BaseMetadata equal test coverage * half dunder context * add AncillaryVariableMetadata test coverage * add additional AncillaryVariableMetadata test coverage * add CellMeasureMetadata test coverage * Clarify lenient_service operation + simplify code. * add CoordMetadata test coverage * add CubeMetadata test coverage * metadata tests use self.cls * fix typo * fix context manager ephemeral services * add logging * Pin pillow to make graphics tests work again. (#3630) * Fixed tests since Numpy 1.18 deprecation of non-int num arguments for linspace. (#3655) * Switched use of datetime.weekday() to datetime.dayofwk. (#3687) * New image hashes for mpl 3x2 (#3682) * New image hash for iris.test.test_plot.TestSymbols.test_cloud_cover with matplotlib 3.2.0. * Further images changes for mpl3x2. * Yet more updated image results. * fix sentinel uniqueness test failure * remove redundant cdm mapping test * difference returns None for no difference * protect Lenient and LENIENT private * privitise lenient framework and add API veneer * add explicit maths feature default * review actions * review actions * trexfeathers review actions * stephenworsley review actions Co-authored-by: Patrick Peglar <[email protected]> Co-authored-by: Martin Yeo <[email protected]> * [FB] [PI-3478] Lenient cube arithmetic (#3774) * initial cube arithmetic * support in-place cube resolve * fix non in-place broadcasting * remove temporary resolve scenario test * lenient/strict support for attributes dicts with numpy arrays * lenient/strict treatment of scalar coordinates * strict points/bounds matching * lenient/strict prepare local dim/aux/scalar coordinates * support extended broadcasting * always raise exception on points/bounds mismatch * ignore scalar points/bounds mismatches, lenient only * remove todos * tidy logger debugs * qualify src/tgt cube references in debug * Numpy rounding fix (#3758) ensure rounding is numpy like (maintains type) * avoid unittest.mock.sentinel copy issue * fast load np.int32 * fix cube maths doctest * fix iris.common.resolve logging configuration * fix prepare points/bounds + extra metadata cml * support mapping reversal based on free dims * var_name fix for lenient equality * add support for DimCoordMetadata * fix circular flag + support CoordMetadata and DimCoordMetadata exchange * fix circular issue for concatenate DimCoord->AuxCoord demotion * fix concatenate._CubeSignature sorted * minor tweaks * keep lenient_client private in maths * tidy maths * tidy iris.analysis.maths.IFunc * refactor IFunc test * polish in-place support * tidy metadata_resolve Co-authored-by: stephenworsley <[email protected]> * rebase master fix-up for cube arithmetic * add missing new dependency to readthedocs.yml requirements Co-authored-by: lbdreyer <[email protected]> Co-authored-by: Patrick Peglar <[email protected]> Co-authored-by: Martin Yeo <[email protected]> Co-authored-by: stephenworsley <[email protected]>

bjlittle added 12 commits July 17, 2020 00:29

initial cube arithmetic

b723b92

support in-place cube resolve

03d84de

fix non in-place broadcasting

17c52f0

remove temporary resolve scenario test

3dd72c8

lenient/strict support for attributes dicts with numpy arrays

1d43315

lenient/strict treatment of scalar coordinates

197fea0

strict points/bounds matching

26934f0

lenient/strict prepare local dim/aux/scalar coordinates

d658619

support extended broadcasting

98f7875

always raise exception on points/bounds mismatch

389d617

ignore scalar points/bounds mismatches, lenient only

407f301

remove todos

0b64a7f

bjlittle added Type: Enhancement Experience: High Release: Major Type: Infrastructure Type: Technical Debt Feature: Arithmetic labels Aug 5, 2020

bjlittle added this to the v3.0.0 milestone Aug 5, 2020

lbdreyer mentioned this pull request Aug 5, 2020

[PI] Implement new cube arithmetic proposal #3478

Closed

53 tasks

bjlittle and others added 4 commits August 5, 2020 22:24

tidy logger debugs

f9f4a2e

qualify src/tgt cube references in debug

7a74615

Numpy rounding fix (SciTools#3758)

d56650d

ensure rounding is numpy like (maintains type)

avoid unittest.mock.sentinel copy issue

52a39e9

bjlittle force-pushed the lenient-metadata-maths branch from 590e62f to 52a39e9 Compare August 5, 2020 23:01

bjlittle added 3 commits August 6, 2020 00:43

fast load np.int32

b143e4c

fix cube maths doctest

c9156fe

fix iris.common.resolve logging configuration

2f80637

var_name fix for lenient equality

14a3bfb

bjlittle assigned lbdreyer Aug 10, 2020

lbdreyer reviewed Aug 10, 2020

View reviewed changes

bjlittle added 4 commits August 11, 2020 02:05

add support for DimCoordMetadata

d19a7f3

fix circular flag + support CoordMetadata and DimCoordMetadata exchange

7ca2223

fix circular issue for concatenate DimCoord->AuxCoord demotion

2eada82

fix concatenate._CubeSignature sorted

1c25997

bjlittle marked this pull request as ready for review August 12, 2020 20:28

bjlittle added 3 commits August 12, 2020 22:10

minor tweaks

fe98ed0

keep lenient_client private in maths

246921b

tidy maths

8e564fa

bjlittle commented Aug 12, 2020

View reviewed changes

tidy iris.analysis.maths.IFunc

99852d2

bjlittle commented Aug 13, 2020

View reviewed changes

bjlittle added 2 commits August 13, 2020 02:45

refactor IFunc test

f3e682e

polish in-place support

4ae36ae

bjlittle force-pushed the lenient-metadata-maths branch from 2856eb9 to 4ae36ae Compare August 13, 2020 11:57

tidy metadata_resolve

6720540

abooton merged commit 0e15178 into SciTools:cube-arithmetic Aug 13, 2020

bjlittle mentioned this pull request Aug 14, 2020

[PI-3478] Merge cube arithmetic feature branch #3785

Merged

This was referenced Aug 17, 2020

Cube broadcasting for addition/subtraction. #1887

Closed

Array broadcasting issues #2765

Closed

Lenient cube concatenation, merge and maths #1987

Open

bjlittle deleted the lenient-metadata-maths branch February 15, 2021 13:34

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

[FB] [PI-3478] Lenient cube arithmetic #3774

[FB] [PI-3478] Lenient cube arithmetic #3774

bjlittle commented Aug 5, 2020 •

edited

Loading

lbdreyer left a comment •

edited

Loading

lbdreyer Aug 10, 2020

bjlittle Aug 11, 2020

bjlittle Aug 12, 2020

lbdreyer Aug 10, 2020

bjlittle Aug 11, 2020

bjlittle Aug 12, 2020

lbdreyer Aug 10, 2020

bjlittle Aug 11, 2020

lbdreyer Aug 10, 2020

bjlittle Aug 11, 2020 •

edited

Loading

lbdreyer Aug 10, 2020

bjlittle Aug 11, 2020 •

edited

Loading

lbdreyer Aug 10, 2020

bjlittle Aug 11, 2020

bjlittle Aug 13, 2020

lbdreyer Aug 10, 2020

bjlittle Aug 11, 2020

bjlittle Aug 12, 2020

lbdreyer Aug 10, 2020

bjlittle Aug 11, 2020 •

edited

Loading

bjlittle Aug 11, 2020

lbdreyer Aug 10, 2020

bjlittle Aug 11, 2020 •

edited

Loading

bjlittle Aug 12, 2020 •

edited

Loading

bjlittle Aug 13, 2020

abooton commented Aug 13, 2020

		@@ -915,26 +827,57 @@ def _broadcast_cube_coord_data(cube, other, operation_name, dim=None):
		return points


		def _sanitise_metadata(cube, unit):

	elif isinstance(other, iris.cube.Cube):
	# prepare to resolve the cube operands and associated coordinate
	# metadata into the resultant cube.
	resolve = Resolve(cube, other)

	'aux_coords' or dim_coords' of the participating cubes.
	'aux_coords' or 'dim_coords' of the participating cubes.

[FB] [PI-3478] Lenient cube arithmetic #3774

[FB] [PI-3478] Lenient cube arithmetic #3774

Conversation

bjlittle commented Aug 5, 2020 • edited Loading

lbdreyer left a comment • edited Loading

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

bjlittle Aug 11, 2020 • edited Loading

Choose a reason for hiding this comment

Choose a reason for hiding this comment

bjlittle Aug 11, 2020 • edited Loading

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

bjlittle Aug 11, 2020 • edited Loading

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

bjlittle Aug 11, 2020 • edited Loading

Choose a reason for hiding this comment

bjlittle Aug 12, 2020 • edited Loading

Choose a reason for hiding this comment

Choose a reason for hiding this comment

abooton commented Aug 13, 2020

bjlittle commented Aug 5, 2020 •

edited

Loading

lbdreyer left a comment •

edited

Loading

bjlittle Aug 11, 2020 •

edited

Loading

bjlittle Aug 11, 2020 •

edited

Loading

bjlittle Aug 11, 2020 •

edited

Loading

bjlittle Aug 11, 2020 •

edited

Loading

bjlittle Aug 12, 2020 •

edited

Loading