From 5b93be6e1ff0a75bcb5efa801c7ac45370f0548b Mon Sep 17 00:00:00 2001 From: Patrick Peglar Date: Tue, 21 Nov 2023 14:50:59 +0000 Subject: [PATCH] Mergeback of "Feature _split_attrs" branch (#5152) * Split-attrs: Cube metadata refactortests (#4993) * Convert Test___eq__ to pytest. * Convert Test_combine to pytest. * Convert Test_difference to pytest. * Review changes. * Split attrs - tests for status quo (#4960) * Tests for attribute handling in netcdf load/save. * Tidy test functions. * Fix import order exception. * Add cf-global attributes test. * Towards more pytest-y implemenation. * Replace 'create_testcase' with fixture which also handles temporary directory. * Much tidy; use fixtures to parametrise over multiple attributes. * Fix warnings; begin data-style attrs tests. * Tests for data-style attributes. * Simplify setup fixture + improve docstring. * No parallel test runner, to avoid error for Python>3.8. * Fixed for new-style netcdf module. * Small review changes. * Rename attributes set 'data-style' as 'local-style'. * Simplify use of fixtures; clarify docstrings/comments and improve argument names. * Clarify testing sections for different attribute 'styles'. * Re-enable parallel testing. * Sorted params to avoid parallel testing bug - pytest#432. * Rename test functions to make alpha-order match order in class. * Split netcdf load/save attribute testing into separate sourcefile. * Add tests for loaded cube attributes; refactor to share code between Load and Roundtrip tests. * Add tests for attribute saving. * Fix method names in comments. * Clarify source of Conventions attributes. * Explain the test numbering in TestRoundtrip/TestLoad. * Remove obsolete test helper method. * Fix small typo; Fix numbering of testcases in TestSave. * Implement split cube attributes. (#5040) * Implement split cube attributes. * Test fixes. * Modify examples for simpler metadata printouts. * Added tests, small behaviour fixes. * Simplify copy. * Fix doctests. * Skip doctests with non-replicable outputs (from use of sets). * Tidy test comments, and add extra test. * Tiny typo. * Remove redundant redefinition of Cube.attributes. * Add CubeAttrsDict in module __all__ + improve docs coverage. * Review changes - small test changes. * More review changes. * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * Fix CubeAttrsDict example docstrings. * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * Odd small fixes. * Improved docstrings and comments; fix doctests. * Don't sidestep netcdf4 thread-safety. * Publicise LimitedAttributeDict, so CubeAttrsDict can refer to it. * Fix various internal + external links. * Update lib/iris/cube.py Co-authored-by: Martin Yeo <40734014+trexfeathers@users.noreply.github.com> * Update lib/iris/cube.py Co-authored-by: Martin Yeo <40734014+trexfeathers@users.noreply.github.com> * Update lib/iris/cube.py Co-authored-by: Martin Yeo <40734014+trexfeathers@users.noreply.github.com> * Update lib/iris/cube.py Co-authored-by: Martin Yeo <40734014+trexfeathers@users.noreply.github.com> * Streamline docs. * Review changes. --------- Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> Co-authored-by: Martin Yeo <40734014+trexfeathers@users.noreply.github.com> * Splitattrs ncload (#5384) * Distinguish local+global attributes in netcdf loads. * Small test fixes. * Small doctest fix. * Fix attribute load-save tests for new behaviour, and old-behaviour equivalence. * Split attrs docs (#5418) * Clarification in CubeAttrsDict examples. * CubeAttrsDict fix docstring typo. * Raise awareness of split attributes in user guide. * What's New entry. * Changes to metadata documentation. * Splitattrs ncsave redo (#5410) * Add docs and future switch, no function yet. * Typing enables code completion for Cube.attributes. * Make roundtrip checking more precise + improve some tests accordingly (cf. https://github.com/SciTools/iris/pull/5403). * Rework all tests to use common setup + results-checking code. * Saver supports split-attributes saving (no tests yet). * Tiny docs fix. * Explain test routines better. * Fix init of FUTURE object. * Remove spurious re-test of FUTURE.save_split_attrs. * Don't create Cube attrs of 'None' (n.b. but no effect as currently used). * Remove/repair refs to obsolete routines. * Check all warnings from save operations. * Remove TestSave test numbers. * More save cases: no match with missing, and different cube attribute types. * Run save/roundtrip tests both with+without split saves. * Fix. * Review changes. * Fix changed warning messages. * Move warnings checking from 'run' to 'check' phase. * Simplify and improve warnings checking code. * Fix wrong testcase. * Minor review changes. * Fix reverted code. * Use sets to simplify demoted-attributes code. * WIP * Working with iris 3.6.1, no errors TestSave or TestRoundtrip. * Interim save (incomplete?). * Different results form for split tests; working for roundtrip. * Check that all param lists are sorted. * Check matrix result-files compatibility; add test_save_matrix. * test_load_matrix added; two types of load result. * Finalise special-case attributes. * Small docs tweaks. * Add some more testcases, * Ensure valid sort-order for globals of possibly different types. * Initialise matrix results with legacy values from v3.6.1 -- all matching. * Add full current matrix results, i.e. snapshot current behaviours. * Review changes : rename some matrix testcases, for clarity. * Splitattrs ncsave redo commonmeta (#5538) * Define common-metadata operartions on split attribute dictionaries. * Tests for split-attributes handling in CubeMetadata operations. * Small tidy and clarify. * Common metadata ops support mixed split/unsplit attribute dicts. * Clarify with better naming, comments, docstrings. * Remove split-attrs handling to own sourcefile, and implement as a decorator. * Remove redundant tests duplicated by matrix testcases. * Newstyle split-attrs matrix testing, with fewer testcases. * Small improvements to comments + docstrings. * Fix logic for equals expectation; expand primary/secondary independence test. * Clarify result testing in metadata operations decorator. * Splitattrs equalise (#5586) * Add tests in advance for split-attributes handling cases. * Move dict conversion inside utility, for use elsewhere. * Add support for split-attributes to equalise_attributes. * Update lib/iris/util.py Co-authored-by: Martin Yeo <40734014+trexfeathers@users.noreply.github.com> * Update lib/iris/tests/unit/util/test_equalise_attributes.py Co-authored-by: Martin Yeo <40734014+trexfeathers@users.noreply.github.com> * Simplify and clarify equalise_attributes code. --------- Co-authored-by: Martin Yeo <40734014+trexfeathers@users.noreply.github.com> * Fix merge-fail messaging for attribute mismatches. (#5590) * Extra CubeAttrsDict methods to emulate dictionary behaviours. (#5592) * Extra CubeAttrsDict methods to emulate dictionary behaviours. * Don't use staticmethod on fixture. * Add Iris warning categories to saver warnings. * Type equality fixes for new flake8. * Licence header fixes. * Splitattrs ncsave deprecation (#5595) * Small improvement to split-attrs whatsnew. * Emit deprecation warning when saving without split-attrs enabled. * Stop legacy-split-attribute warnings from upsetting delayed-saving tests. --------- Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> Co-authored-by: Martin Yeo <40734014+trexfeathers@users.noreply.github.com> --- docs/src/further_topics/metadata.rst | 41 +- docs/src/userguide/iris_cubes.rst | 5 +- docs/src/whatsnew/latest.rst | 14 +- lib/iris/__init__.py | 17 +- lib/iris/_merge.py | 23 +- lib/iris/common/_split_attribute_dicts.py | 125 ++ lib/iris/common/metadata.py | 47 +- lib/iris/common/mixin.py | 33 +- lib/iris/cube.py | 380 +++- .../fileformats/_nc_load_rules/helpers.py | 4 +- lib/iris/fileformats/netcdf/loader.py | 7 +- lib/iris/fileformats/netcdf/saver.py | 224 ++- .../attrs_matrix_results_load.json | 1019 ++++++++++ .../attrs_matrix_results_roundtrip.json | 983 ++++++++++ .../attrs_matrix_results_save.json | 983 ++++++++++ .../integration/netcdf/test_delayed_save.py | 7 + .../integration/test_netcdf__loadsaveattrs.py | 1678 +++++++++++++++++ lib/iris/tests/test_merge.py | 82 + .../unit/common/metadata/test_CubeMetadata.py | 1193 +++++++----- .../common/mixin/test_LimitedAttributeDict.py | 2 +- lib/iris/tests/unit/cube/test_Cube.py | 28 +- .../tests/unit/cube/test_CubeAttrsDict.py | 407 ++++ .../helpers/test_build_cube_metadata.py | 9 +- .../unit/util/test_equalise_attributes.py | 113 +- lib/iris/util.py | 45 +- 25 files changed, 6945 insertions(+), 524 deletions(-) create mode 100644 lib/iris/common/_split_attribute_dicts.py create mode 100644 lib/iris/tests/integration/attrs_matrix_results_load.json create mode 100644 lib/iris/tests/integration/attrs_matrix_results_roundtrip.json create mode 100644 lib/iris/tests/integration/attrs_matrix_results_save.json create mode 100644 lib/iris/tests/integration/test_netcdf__loadsaveattrs.py create mode 100644 lib/iris/tests/unit/cube/test_CubeAttrsDict.py diff --git a/docs/src/further_topics/metadata.rst b/docs/src/further_topics/metadata.rst index a564b2ba68..10efcdf7fe 100644 --- a/docs/src/further_topics/metadata.rst +++ b/docs/src/further_topics/metadata.rst @@ -91,6 +91,16 @@ actual `data attribute`_ names of the metadata members on the Iris class. metadata members are Iris specific terms, rather than recognised `CF Conventions`_ terms. +.. note:: + + :class:`~iris.cube.Cube` :attr:`~iris.cube.Cube.attributes` implement the + concept of dataset-level and variable-level attributes, to enable correct + NetCDF loading and saving (see :class:`~iris.cube.CubeAttrsDict` and NetCDF + :func:`~iris.fileformats.netcdf.saver.save` for more). ``attributes`` on + the other classes do not have this distinction, but the ``attributes`` + members of ALL the classes still have the same interface, and can be + compared. + Common Metadata API =================== @@ -128,10 +138,12 @@ For example, given the following :class:`~iris.cube.Cube`, source 'Data from Met Office Unified Model 6.05' We can easily get all of the associated metadata of the :class:`~iris.cube.Cube` -using the ``metadata`` property: +using the ``metadata`` property (note the specialised +:class:`~iris.cube.CubeAttrsDict` for the :attr:`~iris.cube.Cube.attributes`, +as mentioned earlier): >>> cube.metadata - CubeMetadata(standard_name='air_temperature', long_name=None, var_name='air_temperature', units=Unit('K'), attributes={'Conventions': 'CF-1.5', 'STASH': STASH(model=1, section=3, item=236), 'Model scenario': 'A1B', 'source': 'Data from Met Office Unified Model 6.05'}, cell_methods=(CellMethod(method='mean', coord_names=('time',), intervals=('6 hour',), comments=()),)) + CubeMetadata(standard_name='air_temperature', long_name=None, var_name='air_temperature', units=Unit('K'), attributes=CubeAttrsDict(globals={'Conventions': 'CF-1.5'}, locals={'STASH': STASH(model=1, section=3, item=236), 'Model scenario': 'A1B', 'source': 'Data from Met Office Unified Model 6.05'}), cell_methods=(CellMethod(method='mean', coord_names=('time',), intervals=('6 hour',), comments=()),)) We can also inspect the ``metadata`` of the ``longitude`` :class:`~iris.coords.DimCoord` attached to the :class:`~iris.cube.Cube` in the same way: @@ -675,8 +687,8 @@ For example, consider the following :class:`~iris.common.metadata.CubeMetadata`, .. doctest:: metadata-combine - >>> cube.metadata # doctest: +SKIP - CubeMetadata(standard_name='air_temperature', long_name=None, var_name='air_temperature', units=Unit('K'), attributes={'Conventions': 'CF-1.5', 'STASH': STASH(model=1, section=3, item=236), 'Model scenario': 'A1B', 'source': 'Data from Met Office Unified Model 6.05'}, cell_methods=(CellMethod(method='mean', coord_names=('time',), intervals=('6 hour',), comments=()),)) + >>> cube.metadata + CubeMetadata(standard_name='air_temperature', long_name=None, var_name='air_temperature', units=Unit('K'), attributes=CubeAttrsDict(globals={'Conventions': 'CF-1.5'}, locals={'STASH': STASH(model=1, section=3, item=236), 'Model scenario': 'A1B', 'source': 'Data from Met Office Unified Model 6.05'}), cell_methods=(CellMethod(method='mean', coord_names=('time',), intervals=('6 hour',), comments=()),)) We can perform the **identity function** by comparing the metadata with itself, @@ -701,7 +713,7 @@ which is replaced with a **different value**, >>> metadata != cube.metadata True >>> metadata.combine(cube.metadata) # doctest: +SKIP - CubeMetadata(standard_name=None, long_name=None, var_name='air_temperature', units=Unit('K'), attributes={'STASH': STASH(model=1, section=3, item=236), 'source': 'Data from Met Office Unified Model 6.05', 'Model scenario': 'A1B', 'Conventions': 'CF-1.5'}, cell_methods=(CellMethod(method='mean', coord_names=('time',), intervals=('6 hour',), comments=()),)) + CubeMetadata(standard_name=None, long_name=None, var_name='air_temperature', units=Unit('K'), attributes={'STASH': STASH(model=1, section=3, item=236), 'Model scenario': 'A1B', 'source': 'Data from Met Office Unified Model 6.05', 'Conventions': 'CF-1.5'}, cell_methods=(CellMethod(method='mean', coord_names=('time',), intervals=('6 hour',), comments=()),)) The ``combine`` method combines metadata by performing a **strict** comparison between each of the associated metadata member values, @@ -724,7 +736,7 @@ Let's reinforce this behaviour, but this time by combining metadata where the >>> metadata != cube.metadata True >>> metadata.combine(cube.metadata).attributes - {'Model scenario': 'A1B'} + CubeAttrsDict(globals={}, locals={'Model scenario': 'A1B'}) The combined result for the ``attributes`` member only contains those **common keys** with **common values**. @@ -810,16 +822,17 @@ the ``from_metadata`` class method. For example, given the following .. doctest:: metadata-convert - >>> cube.metadata # doctest: +SKIP - CubeMetadata(standard_name='air_temperature', long_name=None, var_name='air_temperature', units=Unit('K'), attributes={'Conventions': 'CF-1.5', 'STASH': STASH(model=1, section=3, item=236), 'Model scenario': 'A1B', 'source': 'Data from Met Office Unified Model 6.05'}, cell_methods=(CellMethod(method='mean', coord_names=('time',), intervals=('6 hour',), comments=()),)) + >>> cube.metadata + CubeMetadata(standard_name='air_temperature', long_name=None, var_name='air_temperature', units=Unit('K'), attributes=CubeAttrsDict(globals={'Conventions': 'CF-1.5'}, locals={'STASH': STASH(model=1, section=3, item=236), 'Model scenario': 'A1B', 'source': 'Data from Met Office Unified Model 6.05'}), cell_methods=(CellMethod(method='mean', coord_names=('time',), intervals=('6 hour',), comments=()),)) We can easily convert it to a :class:`~iris.common.metadata.DimCoordMetadata` instance using ``from_metadata``, .. doctest:: metadata-convert - >>> DimCoordMetadata.from_metadata(cube.metadata) # doctest: +SKIP - DimCoordMetadata(standard_name='air_temperature', long_name=None, var_name='air_temperature', units=Unit('K'), attributes={'Conventions': 'CF-1.5', 'STASH': STASH(model=1, section=3, item=236), 'Model scenario': 'A1B', 'source': 'Data from Met Office Unified Model 6.05'}, coord_system=None, climatological=None, circular=None) + >>> newmeta = DimCoordMetadata.from_metadata(cube.metadata) + >>> print(newmeta) + DimCoordMetadata(standard_name=air_temperature, var_name=air_temperature, units=K, attributes={'Conventions': 'CF-1.5', 'STASH': STASH(model=1, section=3, item=236), 'Model scenario': 'A1B', 'source': 'Data from Met Office Unified Model 6.05'}) By examining :numref:`metadata members table`, we can see that the :class:`~iris.cube.Cube` and :class:`~iris.coords.DimCoord` container @@ -849,9 +862,9 @@ class instance, .. doctest:: metadata-convert - >>> longitude.metadata.from_metadata(cube.metadata) - DimCoordMetadata(standard_name='air_temperature', long_name=None, var_name='air_temperature', units=Unit('K'), attributes={'Conventions': 'CF-1.5', 'STASH': STASH(model=1, section=3, item=236), 'Model scenario': 'A1B', 'source': 'Data from Met Office Unified Model 6.05'}, coord_system=None, climatological=None, circular=None) - + >>> newmeta = longitude.metadata.from_metadata(cube.metadata) + >>> print(newmeta) + DimCoordMetadata(standard_name=air_temperature, var_name=air_temperature, units=K, attributes={'Conventions': 'CF-1.5', 'STASH': STASH(model=1, section=3, item=236), 'Model scenario': 'A1B', 'source': 'Data from Met Office Unified Model 6.05'}) .. _metadata assignment: @@ -978,7 +991,7 @@ Indeed, it's also possible to assign to the ``metadata`` property with a >>> longitude.metadata DimCoordMetadata(standard_name='longitude', long_name=None, var_name='longitude', units=Unit('degrees'), attributes={}, coord_system=GeogCS(6371229.0), climatological=False, circular=False) >>> longitude.metadata = cube.metadata - >>> longitude.metadata # doctest: +SKIP + >>> longitude.metadata DimCoordMetadata(standard_name='air_temperature', long_name=None, var_name='air_temperature', units=Unit('K'), attributes={'Conventions': 'CF-1.5', 'STASH': STASH(model=1, section=3, item=236), 'Model scenario': 'A1B', 'source': 'Data from Met Office Unified Model 6.05'}, coord_system=GeogCS(6371229.0), climatological=False, circular=False) Note that, only **common** metadata members will be assigned new associated diff --git a/docs/src/userguide/iris_cubes.rst b/docs/src/userguide/iris_cubes.rst index 267f97b0fc..03b5093efc 100644 --- a/docs/src/userguide/iris_cubes.rst +++ b/docs/src/userguide/iris_cubes.rst @@ -85,7 +85,10 @@ A cube consists of: data dimensions as the coordinate has dimensions. * an attributes dictionary which, other than some protected CF names, can - hold arbitrary extra metadata. + hold arbitrary extra metadata. This implements the concept of dataset-level + and variable-level attributes when loading and and saving NetCDF files (see + :class:`~iris.cube.CubeAttrsDict` and NetCDF + :func:`~iris.fileformats.netcdf.saver.save` for more). * a list of cell methods to represent operations which have already been applied to the data (e.g. "mean over time") * a list of coordinate "factories" used for deriving coordinates from the diff --git a/docs/src/whatsnew/latest.rst b/docs/src/whatsnew/latest.rst index bad64fccc4..93919216c7 100644 --- a/docs/src/whatsnew/latest.rst +++ b/docs/src/whatsnew/latest.rst @@ -29,6 +29,16 @@ This document explains the changes made to Iris for this release ✨ Features =========== +#. `@pp-mo`_, `@lbdreyer`_ and `@trexfeathers`_ improved + :class:`~iris.cube.Cube` :attr:`~iris.cube.Cube.attributes` handling to + better preserve the distinction between dataset-level and variable-level + attributes, allowing file-Cube-file round-tripping of NetCDF attributes. See + :class:`~iris.cube.CubeAttrsDict`, NetCDF + :func:`~iris.fileformats.netcdf.saver.save` and :data:`~iris.Future` for more. + (:pull:`5152`, `split attributes project`_) + +#. `@rcomer`_ rewrote :func:`~iris.util.broadcast_to_shape` so it now handles + lazy data. (:pull:`5307`) #. `@trexfeathers`_ and `@HGWright`_ (reviewer) sub-categorised all Iris' :class:`UserWarning`\s for richer filtering. The full index of @@ -45,7 +55,7 @@ This document explains the changes made to Iris for this release the year of December) instead of the following year (the default behaviour). (:pull:`5573`) - #. `@HGWright`_ added :attr:`~iris.coords.Coord.ignore_axis` to allow manual +#. `@HGWright`_ added :attr:`~iris.coords.Coord.ignore_axis` to allow manual intervention preventing :func:`~iris.util.guess_coord_axis` from acting on a coordinate. (:pull:`5551`) @@ -152,4 +162,4 @@ This document explains the changes made to Iris for this release .. _NEP29 Drop Schedule: https://numpy.org/neps/nep-0029-deprecation_policy.html#drop-schedule .. _codespell: https://github.com/codespell-project/codespell - +.. _split attributes project: https://github.com/orgs/SciTools/projects/5?pane=info diff --git a/lib/iris/__init__.py b/lib/iris/__init__.py index c29998cd6d..a10169b7bb 100644 --- a/lib/iris/__init__.py +++ b/lib/iris/__init__.py @@ -141,7 +141,9 @@ def callback(cube, field, filename): class Future(threading.local): """Run-time configuration controller.""" - def __init__(self, datum_support=False, pandas_ndim=False): + def __init__( + self, datum_support=False, pandas_ndim=False, save_split_attrs=False + ): """ A container for run-time options controls. @@ -163,6 +165,11 @@ def __init__(self, datum_support=False, pandas_ndim=False): pandas_ndim : bool, default=False See :func:`iris.pandas.as_data_frame` for details - opts in to the newer n-dimensional behaviour. + save_split_attrs : bool, default=False + Save "global" and "local" cube attributes to netcdf in appropriately + different ways : "global" ones are saved as dataset attributes, where + possible, while "local" ones are saved as data-variable attributes. + See :func:`iris.fileformats.netcdf.saver.save`. """ # The flag 'example_future_flag' is provided as a reference for the @@ -174,14 +181,18 @@ def __init__(self, datum_support=False, pandas_ndim=False): # self.__dict__['example_future_flag'] = example_future_flag self.__dict__["datum_support"] = datum_support self.__dict__["pandas_ndim"] = pandas_ndim + self.__dict__["save_split_attrs"] = save_split_attrs + # TODO: next major release: set IrisDeprecation to subclass # DeprecationWarning instead of UserWarning. def __repr__(self): # msg = ('Future(example_future_flag={})') # return msg.format(self.example_future_flag) - msg = "Future(datum_support={}, pandas_ndim={})" - return msg.format(self.datum_support, self.pandas_ndim) + msg = "Future(datum_support={}, pandas_ndim={}, save_split_attrs={})" + return msg.format( + self.datum_support, self.pandas_ndim, self.save_split_attrs + ) # deprecated_options = {'example_future_flag': 'warning',} deprecated_options = {} diff --git a/lib/iris/_merge.py b/lib/iris/_merge.py index bf22f57887..a8f079e70e 100644 --- a/lib/iris/_merge.py +++ b/lib/iris/_merge.py @@ -22,6 +22,9 @@ multidim_lazy_stack, ) from iris.common import CoordMetadata, CubeMetadata +from iris.common._split_attribute_dicts import ( + _convert_splitattrs_to_pairedkeys_dict as convert_splitattrs_to_pairedkeys_dict, +) import iris.coords import iris.cube import iris.exceptions @@ -390,8 +393,10 @@ def _defn_msgs(self, other_defn): ) ) if self_defn.attributes != other_defn.attributes: - diff_keys = set(self_defn.attributes.keys()) ^ set( - other_defn.attributes.keys() + attrs_1, attrs_2 = self_defn.attributes, other_defn.attributes + diff_keys = sorted( + set(attrs_1.globals) ^ set(attrs_2.globals) + | set(attrs_1.locals) ^ set(attrs_2.locals) ) if diff_keys: msgs.append( @@ -399,14 +404,16 @@ def _defn_msgs(self, other_defn): + ", ".join(repr(key) for key in diff_keys) ) else: + attrs_1, attrs_2 = [ + convert_splitattrs_to_pairedkeys_dict(dic) + for dic in (attrs_1, attrs_2) + ] diff_attrs = [ - repr(key) - for key in self_defn.attributes - if np.all( - self_defn.attributes[key] != other_defn.attributes[key] - ) + repr(key[1]) + for key in attrs_1 + if np.all(attrs_1[key] != attrs_2[key]) ] - diff_attrs = ", ".join(diff_attrs) + diff_attrs = ", ".join(sorted(diff_attrs)) msgs.append( "cube.attributes values differ for keys: {}".format( diff_attrs diff --git a/lib/iris/common/_split_attribute_dicts.py b/lib/iris/common/_split_attribute_dicts.py new file mode 100644 index 0000000000..3927974053 --- /dev/null +++ b/lib/iris/common/_split_attribute_dicts.py @@ -0,0 +1,125 @@ +# Copyright Iris contributors +# +# This file is part of Iris and is released under the BSD license. +# See LICENSE in the root of the repository for full licensing details. +""" +Dictionary operations for dealing with the CubeAttrsDict "split"-style attribute +dictionaries. + +The idea here is to convert a split-dictionary into a "plain" one for calculations, +whose keys are all pairs of the form ('global', ) or ('local', ). +And to convert back again after the operation, if the result is a dictionary. + +For "strict" operations this clearly does all that is needed. For lenient ones, +we _might_ want for local+global attributes of the same name to interact. +However, on careful consideration, it seems that this is not actually desirable for +any of the common-metadata operations. +So, we simply treat "global" and "local" attributes of the same name as entirely +independent. Which happily is also the easiest to code, and to explain. +""" +from collections.abc import Mapping, Sequence +from functools import wraps + + +def _convert_splitattrs_to_pairedkeys_dict(dic): + """ + Convert a split-attributes dictionary to a "normal" dict. + + Transform a :class:`~iris.cube.CubeAttributesDict` "split" attributes dictionary + into a 'normal' :class:`dict`, with paired keys of the form ('global', name) or + ('local', name). + + If the input is *not* a split-attrs dict, it is converted to one before + transforming it. This will assign its keys to global/local depending on a standard + set of choices (see :class:`~iris.cube.CubeAttributesDict`). + """ + from iris.cube import CubeAttrsDict + + # Convert input to CubeAttrsDict + if not hasattr(dic, "globals") or not hasattr(dic, "locals"): + dic = CubeAttrsDict(dic) + + def _global_then_local_items(dic): + # Routine to produce global, then local 'items' in order, and with all keys + # "labelled" as local or global type, to ensure they are all unique. + for key, value in dic.globals.items(): + yield ("global", key), value + for key, value in dic.locals.items(): + yield ("local", key), value + + return dict(_global_then_local_items(dic)) + + +def _convert_pairedkeys_dict_to_splitattrs(dic): + """ + Convert an input with global/local paired keys back into a split-attrs dict. + + For now, this is always and only a :class:`iris.cube.CubeAttrsDict`. + """ + from iris.cube import CubeAttrsDict + + result = CubeAttrsDict() + for key, value in dic.items(): + keytype, keyname = key + if keytype == "global": + result.globals[keyname] = value + else: + assert keytype == "local" + result.locals[keyname] = value + return result + + +def adjust_for_split_attribute_dictionaries(operation): + """ + Decorator to make a function of attribute-dictionaries work with split attributes. + + The wrapped function of attribute-dictionaries is currently always one of "equals", + "combine" or "difference", with signatures like : + equals(left: dict, right: dict) -> bool + combine(left: dict, right: dict) -> dict + difference(left: dict, right: dict) -> None | (dict, dict) + + The results of the wrapped operation are either : + * for "equals" (or "__eq__") : a boolean + * for "combine" : a (converted) attributes-dictionary + * for "difference" : a list of (None or "pair"), where a pair contains two + dictionaries + + Before calling the wrapped operation, its inputs (left, right) are modified by + converting any "split" dictionaries to a form where the keys are pairs + of the form ("global", name) or ("local", name). + + After calling the wrapped operation, for "combine" or "difference", the result can + contain a dictionary or dictionaries. These are then transformed back from the + 'converted' form to split-attribute dictionaries, before returning. + + "Split" dictionaries are all of class :class:`~iris.cube.CubeAttrsDict`, since + the only usage of 'split' attribute dictionaries is in Cubes (i.e. they are not + used for cube components). + """ + + @wraps(operation) + def _inner_function(*args, **kwargs): + # Convert all inputs into 'pairedkeys' type dicts + args = [_convert_splitattrs_to_pairedkeys_dict(arg) for arg in args] + + result = operation(*args, **kwargs) + + # Convert known specific cases of 'pairedkeys' dicts in the result, and convert + # those back into split-attribute dictionaries. + if isinstance(result, Mapping): + # Fix a result which is a single dictionary -- for "combine" + result = _convert_pairedkeys_dict_to_splitattrs(result) + elif isinstance(result, Sequence) and len(result) == 2: + # Fix a result which is a pair of dictionaries -- for "difference" + left, right = result + left, right = ( + _convert_pairedkeys_dict_to_splitattrs(left), + _convert_pairedkeys_dict_to_splitattrs(right), + ) + result = result.__class__([left, right]) + # ELSE: leave other types of result unchanged. E.G. None, bool + + return result + + return _inner_function diff --git a/lib/iris/common/metadata.py b/lib/iris/common/metadata.py index 8d60171331..f88a2e57b5 100644 --- a/lib/iris/common/metadata.py +++ b/lib/iris/common/metadata.py @@ -20,6 +20,7 @@ from xxhash import xxh64_hexdigest from ..config import get_logger +from ._split_attribute_dicts import adjust_for_split_attribute_dictionaries from .lenient import _LENIENT from .lenient import _lenient_service as lenient_service from .lenient import _qualname as qualname @@ -241,7 +242,11 @@ def __str__(self): field_strings = [] for field in self._fields: value = getattr(self, field) - if value is None or isinstance(value, (str, dict)) and not value: + if ( + value is None + or isinstance(value, (str, Mapping)) + and not value + ): continue field_strings.append(f"{field}={value}") @@ -1250,6 +1255,46 @@ def _check(item): return result + # + # Override each of the attribute-dict operations in BaseMetadata, to enable + # them to deal with split-attribute dictionaries correctly. + # There are 6 of these, for (equals/combine/difference) * (lenient/strict). + # Each is overridden with a *wrapped* version of the parent method, using the + # "@adjust_for_split_attribute_dictionaries" decorator, which converts any + # split-attribute dictionaries in the inputs to ordinary dicts, and likewise + # re-converts any dictionaries in the return value. + # + + @staticmethod + @adjust_for_split_attribute_dictionaries + def _combine_lenient_attributes(left, right): + return BaseMetadata._combine_lenient_attributes(left, right) + + @staticmethod + @adjust_for_split_attribute_dictionaries + def _combine_strict_attributes(left, right): + return BaseMetadata._combine_strict_attributes(left, right) + + @staticmethod + @adjust_for_split_attribute_dictionaries + def _compare_lenient_attributes(left, right): + return BaseMetadata._compare_lenient_attributes(left, right) + + @staticmethod + @adjust_for_split_attribute_dictionaries + def _compare_strict_attributes(left, right): + return BaseMetadata._compare_strict_attributes(left, right) + + @staticmethod + @adjust_for_split_attribute_dictionaries + def _difference_lenient_attributes(left, right): + return BaseMetadata._difference_lenient_attributes(left, right) + + @staticmethod + @adjust_for_split_attribute_dictionaries + def _difference_strict_attributes(left, right): + return BaseMetadata._difference_strict_attributes(left, right) + class DimCoordMetadata(CoordMetadata): """ diff --git a/lib/iris/common/mixin.py b/lib/iris/common/mixin.py index f3b42fc02d..a1b1e4647b 100644 --- a/lib/iris/common/mixin.py +++ b/lib/iris/common/mixin.py @@ -16,7 +16,7 @@ from .metadata import BaseMetadata -__all__ = ["CFVariableMixin"] +__all__ = ["CFVariableMixin", "LimitedAttributeDict"] def _get_valid_standard_name(name): @@ -52,7 +52,29 @@ def _get_valid_standard_name(name): class LimitedAttributeDict(dict): - _forbidden_keys = ( + """ + A specialised 'dict' subclass, which forbids (errors) certain attribute names. + + Used for the attribute dictionaries of all Iris data objects (that is, + :class:`CFVariableMixin` and its subclasses). + + The "excluded" attributes are those which either :mod:`netCDF4` or Iris intpret and + control with special meaning, which therefore should *not* be defined as custom + 'user' attributes on Iris data objects such as cubes. + + For example : "coordinates", "grid_mapping", "scale_factor". + + The 'forbidden' attributes are those listed in + :data:`iris.common.mixin.LimitedAttributeDict.CF_ATTRS_FORBIDDEN` . + + All the forbidden attributes are amongst those listed in + `Appendix A of the CF Conventions: `_ + -- however, not *all* of them, since not all are interpreted by Iris. + + """ + + #: Attributes with special CF meaning, forbidden in Iris attribute dictionaries. + CF_ATTRS_FORBIDDEN = ( "standard_name", "long_name", "units", @@ -77,7 +99,7 @@ def __init__(self, *args, **kwargs): dict.__init__(self, *args, **kwargs) # Check validity of keys for key in self.keys(): - if key in self._forbidden_keys: + if key in self.CF_ATTRS_FORBIDDEN: raise ValueError(f"{key!r} is not a permitted attribute") def __eq__(self, other): @@ -98,11 +120,12 @@ def __ne__(self, other): return not self == other def __setitem__(self, key, value): - if key in self._forbidden_keys: + if key in self.CF_ATTRS_FORBIDDEN: raise ValueError(f"{key!r} is not a permitted attribute") dict.__setitem__(self, key, value) def update(self, other, **kwargs): + """Standard ``dict.update()`` operation.""" # Gather incoming keys keys = [] if hasattr(other, "keys"): @@ -114,7 +137,7 @@ def update(self, other, **kwargs): # Check validity of keys for key in keys: - if key in self._forbidden_keys: + if key in self.CF_ATTRS_FORBIDDEN: raise ValueError(f"{key!r} is not a permitted attribute") dict.update(self, other, **kwargs) diff --git a/lib/iris/cube.py b/lib/iris/cube.py index 3a36a035c0..8aa0b452d5 100644 --- a/lib/iris/cube.py +++ b/lib/iris/cube.py @@ -9,11 +9,20 @@ """ from collections import OrderedDict -from collections.abc import Container, Iterable, Iterator, MutableMapping import copy from copy import deepcopy from functools import partial, reduce +import itertools import operator +from typing import ( + Container, + Iterable, + Iterator, + Mapping, + MutableMapping, + Optional, + Union, +) import warnings from xml.dom.minidom import Document import zlib @@ -34,12 +43,13 @@ import iris.aux_factory from iris.common import CFVariableMixin, CubeMetadata, metadata_manager_factory from iris.common.metadata import metadata_filter +from iris.common.mixin import LimitedAttributeDict import iris.coord_systems import iris.coords import iris.exceptions import iris.util -__all__ = ["Cube", "CubeList"] +__all__ = ["Cube", "CubeAttrsDict", "CubeList"] # The XML namespace to use for CubeML documents @@ -789,6 +799,352 @@ def _is_single_item(testee): return isinstance(testee, str) or not isinstance(testee, Iterable) +class CubeAttrsDict(MutableMapping): + """ + A :class:`dict`\\-like object for :attr:`iris.cube.Cube.attributes`, + providing unified user access to combined cube "local" and "global" attributes + dictionaries, with the access behaviour of an ordinary (single) dictionary. + + Properties :attr:`globals` and :attr:`locals` are regular + :class:`~iris.common.mixin.LimitedAttributeDict`\\s, which can be accessed and + modified separately. The :class:`CubeAttrsDict` itself contains *no* additional + state, but simply provides a 'combined' view of both global + local attributes. + + All the read- and write-type methods, such as ``get()``, ``update()``, ``values()``, + behave according to the logic documented for : :meth:`__getitem__`, + :meth:`__setitem__` and :meth:`__iter__`. + + Notes + ----- + For type testing, ``issubclass(CubeAttrsDict, Mapping)`` is ``True``, but + ``issubclass(CubeAttrsDict, dict)`` is ``False``. + + Examples + -------- + + >>> from iris.cube import Cube + >>> cube = Cube([0]) + >>> # CF defines 'history' as global by default. + >>> cube.attributes.update({"history": "from test-123", "mycode": 3}) + >>> print(cube.attributes) + {'history': 'from test-123', 'mycode': 3} + >>> print(repr(cube.attributes)) + CubeAttrsDict(globals={'history': 'from test-123'}, locals={'mycode': 3}) + + >>> cube.attributes['history'] += ' +added' + >>> print(repr(cube.attributes)) + CubeAttrsDict(globals={'history': 'from test-123 +added'}, locals={'mycode': 3}) + + >>> cube.attributes.locals['history'] = 'per-variable' + >>> print(cube.attributes) + {'history': 'per-variable', 'mycode': 3} + >>> print(repr(cube.attributes)) + CubeAttrsDict(globals={'history': 'from test-123 +added'}, locals={'mycode': 3, 'history': 'per-variable'}) + + """ + + # TODO: Create a 'further topic' / 'tech paper' on NetCDF I/O, including + # discussion of attribute handling. + + def __init__( + self, + combined: Optional[Union[Mapping, str]] = "__unspecified", + locals: Optional[Mapping] = None, + globals: Optional[Mapping] = None, + ): + """ + Create a cube attributes dictionary. + + We support initialisation from a single generic mapping input, using the default + global/local assignment rules explained at :meth:`__setattr__`, or from + two separate mappings. Two separate dicts can be passed in the ``locals`` + and ``globals`` args, **or** via a ``combined`` arg which has its own + ``.globals`` and ``.locals`` properties -- so this allows passing an existing + :class:`CubeAttrsDict`, which will be copied. + + Parameters + ---------- + combined : dict + values to init both 'self.globals' and 'self.locals'. If 'combined' itself + has attributes named 'locals' and 'globals', these are used to update the + respective content (after initially setting the individual ones). + Otherwise, 'combined' is treated as a generic mapping, applied as + ``self.update(combined)``, + i.e. it will set locals and/or globals with the same logic as + :meth:`~iris.cube.CubeAttrsDict.__setitem__` . + locals : dict + initial content for 'self.locals' + globals : dict + initial content for 'self.globals' + + Examples + -------- + + >>> from iris.cube import CubeAttrsDict + >>> # CF defines 'history' as global by default. + >>> CubeAttrsDict({'history': 'data-story', 'comment': 'this-cube'}) + CubeAttrsDict(globals={'history': 'data-story'}, locals={'comment': 'this-cube'}) + + >>> CubeAttrsDict(locals={'history': 'local-history'}) + CubeAttrsDict(globals={}, locals={'history': 'local-history'}) + + >>> CubeAttrsDict(globals={'x': 'global'}, locals={'x': 'local'}) + CubeAttrsDict(globals={'x': 'global'}, locals={'x': 'local'}) + + >>> x1 = CubeAttrsDict(globals={'x': 1}, locals={'y': 2}) + >>> x2 = CubeAttrsDict(x1) + >>> x2 + CubeAttrsDict(globals={'x': 1}, locals={'y': 2}) + + """ + # First initialise locals + globals, defaulting to empty. + self.locals = locals + self.globals = globals + # Update with combined, if present. + if not isinstance(combined, str) or combined != "__unspecified": + # Treat a single input with 'locals' and 'globals' properties as an + # existing CubeAttrsDict, and update from its content. + # N.B. enforce deep copying, consistent with general Iris usage. + if hasattr(combined, "globals") and hasattr(combined, "locals"): + # Copy a mapping with globals/locals, like another 'CubeAttrsDict' + self.globals.update(deepcopy(combined.globals)) + self.locals.update(deepcopy(combined.locals)) + else: + # Treat any arbitrary single input value as a mapping (dict), and + # update from it. + self.update(dict(deepcopy(combined))) + + # + # Ensure that the stored local/global dictionaries are "LimitedAttributeDicts". + # + @staticmethod + def _normalise_attrs( + attributes: Optional[Mapping], + ) -> LimitedAttributeDict: + # Convert an input attributes arg into a standard form. + # N.B. content is always a LimitedAttributeDict, and a deep copy of input. + # Allow arg of None, etc. + if not attributes: + attributes = {} + else: + attributes = deepcopy(attributes) + + # Ensure the expected mapping type. + attributes = LimitedAttributeDict(attributes) + return attributes + + @property + def locals(self) -> LimitedAttributeDict: + return self._locals + + @locals.setter + def locals(self, attributes: Optional[Mapping]): + self._locals = self._normalise_attrs(attributes) + + @property + def globals(self) -> LimitedAttributeDict: + return self._globals + + @globals.setter + def globals(self, attributes: Optional[Mapping]): + self._globals = self._normalise_attrs(attributes) + + # + # Provide a serialisation interface + # + def __getstate__(self): + return (self.locals, self.globals) + + def __setstate__(self, state): + self.locals, self.globals = state + + # + # Support comparison -- required because default operation only compares a single + # value at each key. + # + def __eq__(self, other): + # For equality, require both globals + locals to match exactly. + # NOTE: array content works correctly, since 'locals' and 'globals' are always + # iris.common.mixin.LimitedAttributeDict, which gets this right. + other = CubeAttrsDict(other) + result = self.locals == other.locals and self.globals == other.globals + return result + + # + # Provide methods duplicating those for a 'dict', but which are *not* provided by + # MutableMapping, for compatibility with code which expected a cube.attributes to be + # a :class:`~iris.common.mixin.LimitedAttributeDict`. + # The extra required methods are : + # 'copy', 'update', '__ior__', '__or__', '__ror__' and 'fromkeys'. + # + def copy(self): + """ + Return a copy. + + Implemented with deep copying, consistent with general Iris usage. + + """ + return CubeAttrsDict(self) + + def update(self, *args, **kwargs): + """ + Update by adding items from a mapping arg, or keyword-values. + + If the argument is a split dictionary, preserve the local/global nature of its + keys. + """ + if args and hasattr(args[0], "globals") and hasattr(args[0], "locals"): + dic = args[0] + self.globals.update(dic.globals) + self.locals.update(dic.locals) + else: + super().update(*args) + super().update(**kwargs) + + def __or__(self, arg): + """Implement 'or' via 'update'.""" + if not isinstance(arg, Mapping): + return NotImplemented + new_dict = self.copy() + new_dict.update(arg) + return new_dict + + def __ior__(self, arg): + """Implement 'ior' via 'update'.""" + self.update(arg) + return self + + def __ror__(self, arg): + """ + Implement 'ror' via 'update'. + + This needs to promote, such that the result is a CubeAttrsDict. + """ + if not isinstance(arg, Mapping): + return NotImplemented + result = CubeAttrsDict(arg) + result.update(self) + return result + + @classmethod + def fromkeys(cls, iterable, value=None): + """ + Create a new object with keys taken from an argument, all set to one value. + + If the argument is a split dictionary, preserve the local/global nature of its + keys. + """ + if hasattr(iterable, "globals") and hasattr(iterable, "locals"): + # When main input is a split-attrs dict, create global/local parts from its + # global/local keys + result = cls( + globals=dict.fromkeys(iterable.globals, value), + locals=dict.fromkeys(iterable.locals, value), + ) + else: + # Create from a dict.fromkeys, using default classification of the keys. + result = cls(dict.fromkeys(iterable, value)) + return result + + # + # The remaining methods are sufficient to generate a complete standard Mapping + # API. See - + # https://docs.python.org/3/reference/datamodel.html#emulating-container-types. + # + + def __iter__(self): + """ + Define the combined iteration order. + + Result is: all global keys, then all local ones, but omitting duplicates. + + """ + # NOTE: this means that in the "summary" view, attributes present in both + # locals+globals are listed first, amongst the globals, even though they appear + # with the *value* from locals. + # Otherwise follows order of insertion, as is normal for dicts. + return itertools.chain( + self.globals.keys(), + (x for x in self.locals.keys() if x not in self.globals), + ) + + def __len__(self): + # Return the number of keys in the 'combined' view. + return len(list(iter(self))) + + def __getitem__(self, key): + """ + Fetch an item from the "combined attributes". + + If the name is present in *both* ``self.locals`` and ``self.globals``, then + the local value is returned. + + """ + if key in self.locals: + store = self.locals + else: + store = self.globals + return store[key] + + def __setitem__(self, key, value): + """ + Assign an attribute value. + + This may be assigned in either ``self.locals`` or ``self.globals``, chosen as + follows: + + * If there is an existing setting in either ``.locals`` or ``.globals``, then + that is updated (i.e. overwritten). + + * If it is present in *both*, only + ``.locals`` is updated. + + * If there is *no* existing attribute, it is usually created in ``.locals``. + **However** a handful of "known normally global" cases, as defined by CF, + go into ``.globals`` instead. + At present these are : ('conventions', 'featureType', 'history', 'title'). + See `CF Conventions, Appendix A: `_ . + + """ + # If an attribute of this name is already present, update that + # (the local one having priority). + if key in self.locals: + store = self.locals + elif key in self.globals: + store = self.globals + else: + # If NO existing attribute, create local unless it is a "known global" one. + from iris.fileformats.netcdf.saver import _CF_GLOBAL_ATTRS + + if key in _CF_GLOBAL_ATTRS: + store = self.globals + else: + store = self.locals + + store[key] = value + + def __delitem__(self, key): + """ + Remove an attribute. + + Delete from both local + global. + + """ + if key in self.locals: + del self.locals[key] + if key in self.globals: + del self.globals[key] + + def __str__(self): + # Print it just like a "normal" dictionary. + # Convert to a normal dict to do that. + return str(dict(self)) + + def __repr__(self): + # Special repr form, showing "real" contents. + return f"CubeAttrsDict(globals={self.globals}, locals={self.locals})" + + class Cube(CFVariableMixin): """ A single Iris cube of data and metadata. @@ -985,8 +1341,8 @@ def __init__( self.cell_methods = cell_methods - #: A dictionary, with a few restricted keys, for arbitrary - #: Cube metadata. + #: A dictionary for arbitrary Cube metadata. + #: A few keys are restricted - see :class:`CubeAttrsDict`. self.attributes = attributes # Coords @@ -1044,6 +1400,22 @@ def _names(self): """ return self._metadata_manager._names + # + # Ensure that .attributes is always a :class:`CubeAttrsDict`. + # + @property + def attributes(self) -> CubeAttrsDict: + return super().attributes + + @attributes.setter + def attributes(self, attributes: Optional[Mapping]): + """ + An override to CfVariableMixin.attributes.setter, which ensures that Cube + attributes are stored in a way which distinguishes global + local ones. + + """ + self._metadata_manager.attributes = CubeAttrsDict(attributes or {}) + def _dimensional_metadata(self, name_or_dimensional_metadata): """ Return a single _DimensionalMetadata instance that matches the given diff --git a/lib/iris/fileformats/_nc_load_rules/helpers.py b/lib/iris/fileformats/_nc_load_rules/helpers.py index 71e59feda0..7044b3a993 100644 --- a/lib/iris/fileformats/_nc_load_rules/helpers.py +++ b/lib/iris/fileformats/_nc_load_rules/helpers.py @@ -482,9 +482,9 @@ def build_cube_metadata(engine): # Set the cube global attributes. for attr_name, attr_value in cf_var.cf_group.global_attributes.items(): try: - cube.attributes[str(attr_name)] = attr_value + cube.attributes.globals[str(attr_name)] = attr_value except ValueError as e: - msg = "Skipping global attribute {!r}: {}" + msg = "Skipping disallowed global attribute {!r}: {}" warnings.warn( msg.format(attr_name, str(e)), category=_WarnComboIgnoringLoad, diff --git a/lib/iris/fileformats/netcdf/loader.py b/lib/iris/fileformats/netcdf/loader.py index 030427a2b9..f0ed111687 100644 --- a/lib/iris/fileformats/netcdf/loader.py +++ b/lib/iris/fileformats/netcdf/loader.py @@ -167,8 +167,13 @@ def attribute_predicate(item): return item[0] not in _CF_ATTRS tmpvar = filter(attribute_predicate, cf_var.cf_attrs_unused()) + attrs_dict = iris_object.attributes + if hasattr(attrs_dict, "locals"): + # Treat cube attributes (i.e. a CubeAttrsDict) as a special case. + # These attrs are "local" (i.e. on the variable), so record them as such. + attrs_dict = attrs_dict.locals for attr_name, attr_value in tmpvar: - _set_attributes(iris_object.attributes, attr_name, attr_value) + _set_attributes(attrs_dict, attr_name, attr_value) def _get_actual_dtype(cf_var): diff --git a/lib/iris/fileformats/netcdf/saver.py b/lib/iris/fileformats/netcdf/saver.py index 6409d8c311..fcbc9a5383 100644 --- a/lib/iris/fileformats/netcdf/saver.py +++ b/lib/iris/fileformats/netcdf/saver.py @@ -27,6 +27,7 @@ from dask.delayed import Delayed import numpy as np +from iris._deprecation import warn_deprecated from iris._lazy_data import _co_realise_lazy_arrays, is_lazy_data from iris.aux_factory import ( AtmosphereSigmaFactory, @@ -560,6 +561,10 @@ def write( matching keys will become attributes on the data variable rather than global attributes. + .. Note:: + + Has no effect if :attr:`iris.FUTURE.save_split_attrs` is ``True``. + * unlimited_dimensions (iterable of strings and/or :class:`iris.coords.Coord` objects): List of coordinate names (or coordinate objects) @@ -652,6 +657,9 @@ def write( 3 files that do not use HDF5. """ + # TODO: when iris.FUTURE.save_split_attrs defaults to True, we can deprecate the + # "local_keys" arg, and finally remove it when we finally remove the + # save_split_attrs switch. if unlimited_dimensions is None: unlimited_dimensions = [] @@ -728,20 +736,23 @@ def write( # aux factory in the cube. self._add_aux_factories(cube, cf_var_cube, cube_dimensions) - # Add data variable-only attribute names to local_keys. - if local_keys is None: - local_keys = set() - else: - local_keys = set(local_keys) - local_keys.update(_CF_DATA_ATTRS, _UKMO_DATA_ATTRS) - - # Add global attributes taking into account local_keys. - global_attributes = { - k: v - for k, v in cube.attributes.items() - if (k not in local_keys and k.lower() != "conventions") - } - self.update_global_attributes(global_attributes) + if not iris.FUTURE.save_split_attrs: + # In the "old" way, we update global attributes as we go. + # Add data variable-only attribute names to local_keys. + if local_keys is None: + local_keys = set() + else: + local_keys = set(local_keys) + local_keys.update(_CF_DATA_ATTRS, _UKMO_DATA_ATTRS) + + # Add global attributes taking into account local_keys. + cube_attributes = cube.attributes + global_attributes = { + k: v + for k, v in cube_attributes.items() + if (k not in local_keys and k.lower() != "conventions") + } + self.update_global_attributes(global_attributes) if cf_profile_available: cf_patch = iris.site_configuration.get("cf_patch") @@ -797,6 +808,9 @@ def update_global_attributes(self, attributes=None, **kwargs): CF global attributes to be updated. """ + # TODO: when when iris.FUTURE.save_split_attrs is removed, this routine will + # only be called once: it can reasonably be renamed "_set_global_attributes", + # and the 'kwargs' argument can be removed. if attributes is not None: # Handle sequence e.g. [('fruit', 'apple'), ...]. if not hasattr(attributes, "keys"): @@ -2266,6 +2280,8 @@ def _create_cf_data_variable( """ Create CF-netCDF data variable for the cube and any associated grid mapping. + # TODO: when iris.FUTURE.save_split_attrs is removed, the 'local_keys' arg can + # be removed. Args: @@ -2290,6 +2306,8 @@ def _create_cf_data_variable( The newly created CF-netCDF data variable. """ + # TODO: when iris.FUTURE.save_split_attrs is removed, the 'local_keys' arg can + # be removed. # Get the values in a form which is valid for the file format. data = self._ensure_valid_dtype(cube.core_data(), "cube", cube) @@ -2378,16 +2396,20 @@ def set_packing_ncattrs(cfvar): if cube.units.calendar: _setncattr(cf_var, "calendar", cube.units.calendar) - # Add data variable-only attribute names to local_keys. - if local_keys is None: - local_keys = set() + if iris.FUTURE.save_split_attrs: + attr_names = cube.attributes.locals.keys() else: - local_keys = set(local_keys) - local_keys.update(_CF_DATA_ATTRS, _UKMO_DATA_ATTRS) + # Add data variable-only attribute names to local_keys. + if local_keys is None: + local_keys = set() + else: + local_keys = set(local_keys) + local_keys.update(_CF_DATA_ATTRS, _UKMO_DATA_ATTRS) + + # Add any cube attributes whose keys are in local_keys as + # CF-netCDF data variable attributes. + attr_names = set(cube.attributes).intersection(local_keys) - # Add any cube attributes whose keys are in local_keys as - # CF-netCDF data variable attributes. - attr_names = set(cube.attributes).intersection(local_keys) for attr_name in sorted(attr_names): # Do not output 'conventions' attribute. if attr_name.lower() == "conventions": @@ -2673,9 +2695,15 @@ def save( Save cube(s) to a netCDF file, given the cube and the filename. * Iris will write CF 1.7 compliant NetCDF files. - * The attributes dictionaries on each cube in the saved cube list - will be compared and common attributes saved as NetCDF global - attributes where appropriate. + * **If split-attribute saving is disabled**, i.e. + :data:`iris.FUTURE`\\ ``.save_split_attrs`` is ``False``, then attributes + dictionaries on each cube in the saved cube list will be compared, and common + attributes saved as NetCDF global attributes where appropriate. + + Or, **when split-attribute saving is enabled**, then ``cube.attributes.locals`` + are always saved as attributes of data-variables, and ``cube.attributes.globals`` + are saved as global (dataset) attributes, where possible. + Since the 2 types are now distinguished : see :class:`~iris.cube.CubeAttrsDict`. * Keyword arguments specifying how to save the data are applied to each cube. To use different settings for different cubes, use the NetCDF Context manager (:class:`~Saver`) directly. @@ -2708,6 +2736,8 @@ def save( An interable of cube attribute keys. Any cube attributes with matching keys will become attributes on the data variable rather than global attributes. + **NOTE:** this is *ignored* if 'split-attribute saving' is **enabled**, + i.e. when ``iris.FUTURE.save_split_attrs`` is ``True``. * unlimited_dimensions (iterable of strings and/or :class:`iris.coords.Coord` objects): @@ -2846,26 +2876,127 @@ def save( else: cubes = cube - if local_keys is None: + # Decide which cube attributes will be saved as "global" attributes + # NOTE: in 'legacy' mode, when iris.FUTURE.save_split_attrs == False, this code + # section derives a common value for 'local_keys', which is passed to 'Saver.write' + # when saving each input cube. The global attributes are then created by a call + # to "Saver.update_global_attributes" within each 'Saver.write' call (which is + # obviously a bit redundant!), plus an extra one to add 'Conventions'. + # HOWEVER, in `split_attrs` mode (iris.FUTURE.save_split_attrs == False), this code + # instead constructs a 'global_attributes' dictionary, and outputs that just once, + # after writing all the input cubes. + if iris.FUTURE.save_split_attrs: + # We don't actually use 'local_keys' in this case. + # TODO: can remove this when the iris.FUTURE.save_split_attrs is removed. local_keys = set() + + # Find any collisions in the cube global attributes and "demote" all those to + # local attributes (where possible, else warn they are lost). + # N.B. "collision" includes when not all cubes *have* that attribute. + global_names = set() + for cube in cubes: + global_names |= set(cube.attributes.globals.keys()) + + # Fnd any global attributes which are not the same on *all* cubes. + def attr_values_equal(val1, val2): + # An equality test which also works when some values are numpy arrays (!) + # As done in :meth:`iris.common.mixin.LimitedAttributeDict.__eq__`. + match = val1 == val2 + try: + match = bool(match) + except ValueError: + match = match.all() + return match + + cube0 = cubes[0] + invalid_globals = set( + [ + attrname + for attrname in global_names + if not all( + attr_values_equal( + cube.attributes.globals.get(attrname), + cube0.attributes.globals.get(attrname), + ) + for cube in cubes[1:] + ) + ] + ) + + # Establish all the global attributes which we will write to the file (at end). + global_attributes = { + attr: cube0.attributes.globals.get(attr) + for attr in global_names - invalid_globals + } + if invalid_globals: + # Some cubes have different global attributes: modify cubes as required. + warnings.warn( + f"Saving the cube global attributes {sorted(invalid_globals)} as local " + "(i.e. data-variable) attributes, where possible, since they are not " + "the same on all input cubes.", + category=iris.exceptions.IrisSaveWarning, + ) + cubes = cubes.copy() # avoiding modifying the actual input arg. + for i_cube in range(len(cubes)): + # We iterate over cube *index*, so we can replace the list entries with + # with cube *copies* -- just to avoid changing our call args. + cube = cubes[i_cube] + demote_attrs = set(cube.attributes.globals) & invalid_globals + if any(demote_attrs): + # Catch any demoted attrs where there is already a local version + blocked_attrs = demote_attrs & set(cube.attributes.locals) + if blocked_attrs: + warnings.warn( + f"Global cube attributes {sorted(blocked_attrs)} " + f'of cube "{cube.name()}" were not saved, overlaid ' + "by existing local attributes with the same names.", + category=iris.exceptions.IrisSaveWarning, + ) + demote_attrs -= blocked_attrs + if demote_attrs: + # This cube contains some 'demoted' global attributes. + # Replace input cube with a copy, so we can modify attributes. + cube = cube.copy() + cubes[i_cube] = cube + for attr in demote_attrs: + # move global to local + value = cube.attributes.globals.pop(attr) + cube.attributes.locals[attr] = value + else: - local_keys = set(local_keys) - - # Determine the attribute keys that are common across all cubes and - # thereby extend the collection of local_keys for attributes - # that should be attributes on data variables. - attributes = cubes[0].attributes - common_keys = set(attributes) - for cube in cubes[1:]: - keys = set(cube.attributes) - local_keys.update(keys.symmetric_difference(common_keys)) - common_keys.intersection_update(keys) - different_value_keys = [] - for key in common_keys: - if np.any(attributes[key] != cube.attributes[key]): - different_value_keys.append(key) - common_keys.difference_update(different_value_keys) - local_keys.update(different_value_keys) + # Legacy mode: calculate "local_keys" to control which attributes are local + # and which global. + # TODO: when iris.FUTURE.save_split_attrs is removed, this section can also be + # removed + message = ( + "Saving to netcdf with legacy-style attribute handling for backwards " + "compatibility.\n" + "This mode is deprecated since Iris 3.8, and will eventually be removed.\n" + "Please consider enabling the new split-attributes handling mode, by " + "setting 'iris.FUTURE.save_split_attrs = True'." + ) + warn_deprecated(message) + + if local_keys is None: + local_keys = set() + else: + local_keys = set(local_keys) + + # Determine the attribute keys that are common across all cubes and + # thereby extend the collection of local_keys for attributes + # that should be attributes on data variables. + attributes = cubes[0].attributes + common_keys = set(attributes) + for cube in cubes[1:]: + keys = set(cube.attributes) + local_keys.update(keys.symmetric_difference(common_keys)) + common_keys.intersection_update(keys) + different_value_keys = [] + for key in common_keys: + if np.any(attributes[key] != cube.attributes[key]): + different_value_keys.append(key) + common_keys.difference_update(different_value_keys) + local_keys.update(different_value_keys) def is_valid_packspec(p): """Only checks that the datatype is valid.""" @@ -2967,7 +3098,12 @@ def is_valid_packspec(p): warnings.warn(msg, category=iris.exceptions.IrisCfSaveWarning) # Add conventions attribute. - sman.update_global_attributes(Conventions=conventions) + if iris.FUTURE.save_split_attrs: + # In the "new way", we just create all the global attributes at once. + global_attributes["Conventions"] = conventions + sman.update_global_attributes(global_attributes) + else: + sman.update_global_attributes(Conventions=conventions) if compute: # No more to do, since we used Saver(compute=True). diff --git a/lib/iris/tests/integration/attrs_matrix_results_load.json b/lib/iris/tests/integration/attrs_matrix_results_load.json new file mode 100644 index 0000000000..a1d37708a9 --- /dev/null +++ b/lib/iris/tests/integration/attrs_matrix_results_load.json @@ -0,0 +1,1019 @@ +{ + "case_singlevar_localonly": { + "input": "G-La", + "localstyle": { + "legacy": [ + "G-La" + ], + "newstyle": [ + "G-La" + ] + }, + "globalstyle": { + "legacy": [ + "G-La" + ], + "newstyle": [ + "G-La" + ] + }, + "userstyle": { + "legacy": [ + "G-La" + ], + "newstyle": [ + "G-La" + ] + } + }, + "case_singlevar_globalonly": { + "input": "GaL-", + "localstyle": { + "legacy": [ + "G-La" + ], + "newstyle": [ + "GaL-" + ] + }, + "globalstyle": { + "legacy": [ + "G-La" + ], + "newstyle": [ + "GaL-" + ] + }, + "userstyle": { + "legacy": [ + "G-La" + ], + "newstyle": [ + "GaL-" + ] + } + }, + "case_singlevar_glsame": { + "input": "GaLa", + "localstyle": { + "legacy": [ + "G-La" + ], + "newstyle": [ + "GaLa" + ] + }, + "globalstyle": { + "legacy": [ + "G-La" + ], + "newstyle": [ + "GaLa" + ] + }, + "userstyle": { + "legacy": [ + "G-La" + ], + "newstyle": [ + "GaLa" + ] + } + }, + "case_singlevar_gldiffer": { + "input": "GaLb", + "localstyle": { + "legacy": [ + "G-Lb" + ], + "newstyle": [ + "GaLb" + ] + }, + "globalstyle": { + "legacy": [ + "G-Lb" + ], + "newstyle": [ + "GaLb" + ] + }, + "userstyle": { + "legacy": [ + "G-Lb" + ], + "newstyle": [ + "GaLb" + ] + } + }, + "case_multivar_same_noglobal": { + "input": "G-Laa", + "localstyle": { + "legacy": [ + "G-Laa" + ], + "newstyle": [ + "G-Laa" + ] + }, + "globalstyle": { + "legacy": [ + "G-Laa" + ], + "newstyle": [ + "G-Laa" + ] + }, + "userstyle": { + "legacy": [ + "G-Laa" + ], + "newstyle": [ + "G-Laa" + ] + } + }, + "case_multivar_same_sameglobal": { + "input": "GaLaa", + "localstyle": { + "legacy": [ + "G-Laa" + ], + "newstyle": [ + "GaLaa" + ] + }, + "globalstyle": { + "legacy": [ + "G-Laa" + ], + "newstyle": [ + "GaLaa" + ] + }, + "userstyle": { + "legacy": [ + "G-Laa" + ], + "newstyle": [ + "GaLaa" + ] + } + }, + "case_multivar_same_diffglobal": { + "input": "GaLbb", + "localstyle": { + "legacy": [ + "G-Lbb" + ], + "newstyle": [ + "GaLbb" + ] + }, + "globalstyle": { + "legacy": [ + "G-Lbb" + ], + "newstyle": [ + "GaLbb" + ] + }, + "userstyle": { + "legacy": [ + "G-Lbb" + ], + "newstyle": [ + "GaLbb" + ] + } + }, + "case_multivar_differ_noglobal": { + "input": "G-Lab", + "localstyle": { + "legacy": [ + "G-Lab" + ], + "newstyle": [ + "G-Lab" + ] + }, + "globalstyle": { + "legacy": [ + "G-Lab" + ], + "newstyle": [ + "G-Lab" + ] + }, + "userstyle": { + "legacy": [ + "G-Lab" + ], + "newstyle": [ + "G-Lab" + ] + } + }, + "case_multivar_differ_diffglobal": { + "input": "GaLbc", + "localstyle": { + "legacy": [ + "G-Lbc" + ], + "newstyle": [ + "GaLbc" + ] + }, + "globalstyle": { + "legacy": [ + "G-Lbc" + ], + "newstyle": [ + "GaLbc" + ] + }, + "userstyle": { + "legacy": [ + "G-Lbc" + ], + "newstyle": [ + "GaLbc" + ] + } + }, + "case_multivar_differ_sameglobal": { + "input": "GaLab", + "localstyle": { + "legacy": [ + "G-Lab" + ], + "newstyle": [ + "GaLab" + ] + }, + "globalstyle": { + "legacy": [ + "G-Lab" + ], + "newstyle": [ + "GaLab" + ] + }, + "userstyle": { + "legacy": [ + "G-Lab" + ], + "newstyle": [ + "GaLab" + ] + } + }, + "case_multivar_1none_noglobal": { + "input": "G-La-", + "localstyle": { + "legacy": [ + "G-La-" + ], + "newstyle": [ + "G-La-" + ] + }, + "globalstyle": { + "legacy": [ + "G-La-" + ], + "newstyle": [ + "G-La-" + ] + }, + "userstyle": { + "legacy": [ + "G-La-" + ], + "newstyle": [ + "G-La-" + ] + } + }, + "case_multivar_1none_diffglobal": { + "input": "GaLb-", + "localstyle": { + "legacy": [ + "G-Lba" + ], + "newstyle": [ + "GaLb-" + ] + }, + "globalstyle": { + "legacy": [ + "G-Lba" + ], + "newstyle": [ + "GaLb-" + ] + }, + "userstyle": { + "legacy": [ + "G-Lba" + ], + "newstyle": [ + "GaLb-" + ] + } + }, + "case_multivar_1none_sameglobal": { + "input": "GaLa-", + "localstyle": { + "legacy": [ + "G-Laa" + ], + "newstyle": [ + "GaLa-" + ] + }, + "globalstyle": { + "legacy": [ + "G-Laa" + ], + "newstyle": [ + "GaLa-" + ] + }, + "userstyle": { + "legacy": [ + "G-Laa" + ], + "newstyle": [ + "GaLa-" + ] + } + }, + "case_multisource_gsame_lnone": { + "input": [ + "GaL-", + "GaL-" + ], + "localstyle": { + "legacy": [ + "G-Laa" + ], + "newstyle": [ + "GaL--" + ] + }, + "globalstyle": { + "legacy": [ + "G-Laa" + ], + "newstyle": [ + "GaL--" + ] + }, + "userstyle": { + "legacy": [ + "G-Laa" + ], + "newstyle": [ + "GaL--" + ] + } + }, + "case_multisource_gsame_lallsame": { + "input": [ + "GaLa", + "GaLa" + ], + "localstyle": { + "legacy": [ + "G-Laa" + ], + "newstyle": [ + "GaLaa" + ] + }, + "globalstyle": { + "legacy": [ + "G-Laa" + ], + "newstyle": [ + "GaLaa" + ] + }, + "userstyle": { + "legacy": [ + "G-Laa" + ], + "newstyle": [ + "GaLaa" + ] + } + }, + "case_multisource_gsame_l1same1none": { + "input": [ + "GaLa", + "GaL-" + ], + "localstyle": { + "legacy": [ + "G-Laa" + ], + "newstyle": [ + "GaLa-" + ] + }, + "globalstyle": { + "legacy": [ + "G-Laa" + ], + "newstyle": [ + "GaLa-" + ] + }, + "userstyle": { + "legacy": [ + "G-Laa" + ], + "newstyle": [ + "GaLa-" + ] + } + }, + "case_multisource_gsame_l1same1other": { + "input": [ + "GaLa", + "GaLb" + ], + "localstyle": { + "legacy": [ + "G-Lab" + ], + "newstyle": [ + "GaLab" + ] + }, + "globalstyle": { + "legacy": [ + "G-Lab" + ], + "newstyle": [ + "GaLab" + ] + }, + "userstyle": { + "legacy": [ + "G-Lab" + ], + "newstyle": [ + "GaLab" + ] + } + }, + "case_multisource_gsame_lallother": { + "input": [ + "GaLb", + "GaLb" + ], + "localstyle": { + "legacy": [ + "G-Lbb" + ], + "newstyle": [ + "GaLbb" + ] + }, + "globalstyle": { + "legacy": [ + "G-Lbb" + ], + "newstyle": [ + "GaLbb" + ] + }, + "userstyle": { + "legacy": [ + "G-Lbb" + ], + "newstyle": [ + "GaLbb" + ] + } + }, + "case_multisource_gsame_lalldiffer": { + "input": [ + "GaLb", + "GaLc" + ], + "localstyle": { + "legacy": [ + "G-Lbc" + ], + "newstyle": [ + "GaLbc" + ] + }, + "globalstyle": { + "legacy": [ + "G-Lbc" + ], + "newstyle": [ + "GaLbc" + ] + }, + "userstyle": { + "legacy": [ + "G-Lbc" + ], + "newstyle": [ + "GaLbc" + ] + } + }, + "case_multisource_gnone_l1one1none": { + "input": [ + "G-La", + "G-L-" + ], + "localstyle": { + "legacy": [ + "G-La-" + ], + "newstyle": [ + "G-La-" + ] + }, + "globalstyle": { + "legacy": [ + "G-La-" + ], + "newstyle": [ + "G-La-" + ] + }, + "userstyle": { + "legacy": [ + "G-La-" + ], + "newstyle": [ + "G-La-" + ] + } + }, + "case_multisource_gnone_l1one1same": { + "input": [ + "G-La", + "G-La" + ], + "localstyle": { + "legacy": [ + "G-Laa" + ], + "newstyle": [ + "G-Laa" + ] + }, + "globalstyle": { + "legacy": [ + "G-Laa" + ], + "newstyle": [ + "G-Laa" + ] + }, + "userstyle": { + "legacy": [ + "G-Laa" + ], + "newstyle": [ + "G-Laa" + ] + } + }, + "case_multisource_gnone_l1one1other": { + "input": [ + "G-La", + "G-Lb" + ], + "localstyle": { + "legacy": [ + "G-Lab" + ], + "newstyle": [ + "G-Lab" + ] + }, + "globalstyle": { + "legacy": [ + "G-Lab" + ], + "newstyle": [ + "G-Lab" + ] + }, + "userstyle": { + "legacy": [ + "G-Lab" + ], + "newstyle": [ + "G-Lab" + ] + } + }, + "case_multisource_g1none_lnone": { + "input": [ + "GaL-", + "G-L-" + ], + "localstyle": { + "legacy": [ + "G-La-" + ], + "newstyle": [ + "G-L-", + "GaL-" + ] + }, + "globalstyle": { + "legacy": [ + "G-La-" + ], + "newstyle": [ + "G-L-", + "GaL-" + ] + }, + "userstyle": { + "legacy": [ + "G-La-" + ], + "newstyle": [ + "G-L-", + "GaL-" + ] + } + }, + "case_multisource_g1none_l1same1none": { + "input": [ + "GaLa", + "G-L-" + ], + "localstyle": { + "legacy": [ + "G-La-" + ], + "newstyle": [ + "G-L-", + "GaLa" + ] + }, + "globalstyle": { + "legacy": [ + "G-La-" + ], + "newstyle": [ + "G-L-", + "GaLa" + ] + }, + "userstyle": { + "legacy": [ + "G-La-" + ], + "newstyle": [ + "G-L-", + "GaLa" + ] + } + }, + "case_multisource_g1none_l1none1same": { + "input": [ + "GaL-", + "G-La" + ], + "localstyle": { + "legacy": [ + "G-Laa" + ], + "newstyle": [ + "G-La", + "GaL-" + ] + }, + "globalstyle": { + "legacy": [ + "G-Laa" + ], + "newstyle": [ + "G-La", + "GaL-" + ] + }, + "userstyle": { + "legacy": [ + "G-Laa" + ], + "newstyle": [ + "G-La", + "GaL-" + ] + } + }, + "case_multisource_g1none_l1diff1none": { + "input": [ + "GaLb", + "G-L-" + ], + "localstyle": { + "legacy": [ + "G-Lb-" + ], + "newstyle": [ + "G-L-", + "GaLb" + ] + }, + "globalstyle": { + "legacy": [ + "G-Lb-" + ], + "newstyle": [ + "G-L-", + "GaLb" + ] + }, + "userstyle": { + "legacy": [ + "G-Lb-" + ], + "newstyle": [ + "G-L-", + "GaLb" + ] + } + }, + "case_multisource_g1none_l1none1diff": { + "input": [ + "GaL-", + "G-Lb" + ], + "localstyle": { + "legacy": [ + "G-Lab" + ], + "newstyle": [ + "G-Lb", + "GaL-" + ] + }, + "globalstyle": { + "legacy": [ + "G-Lab" + ], + "newstyle": [ + "G-Lb", + "GaL-" + ] + }, + "userstyle": { + "legacy": [ + "G-Lab" + ], + "newstyle": [ + "G-Lb", + "GaL-" + ] + } + }, + "case_multisource_g1none_lallsame": { + "input": [ + "GaLa", + "G-La" + ], + "localstyle": { + "legacy": [ + "G-Laa" + ], + "newstyle": [ + "G-La", + "GaLa" + ] + }, + "globalstyle": { + "legacy": [ + "G-Laa" + ], + "newstyle": [ + "G-La", + "GaLa" + ] + }, + "userstyle": { + "legacy": [ + "G-Laa" + ], + "newstyle": [ + "G-La", + "GaLa" + ] + } + }, + "case_multisource_g1none_lallother": { + "input": [ + "GaLc", + "G-Lc" + ], + "localstyle": { + "legacy": [ + "G-Lcc" + ], + "newstyle": [ + "G-Lc", + "GaLc" + ] + }, + "globalstyle": { + "legacy": [ + "G-Lcc" + ], + "newstyle": [ + "G-Lc", + "GaLc" + ] + }, + "userstyle": { + "legacy": [ + "G-Lcc" + ], + "newstyle": [ + "G-Lc", + "GaLc" + ] + } + }, + "case_multisource_gdiff_lnone": { + "input": [ + "GaL-", + "GbL-" + ], + "localstyle": { + "legacy": [ + "G-Lab" + ], + "newstyle": [ + "GaL-", + "GbL-" + ] + }, + "globalstyle": { + "legacy": [ + "G-Lab" + ], + "newstyle": [ + "GaL-", + "GbL-" + ] + }, + "userstyle": { + "legacy": [ + "G-Lab" + ], + "newstyle": [ + "GaL-", + "GbL-" + ] + } + }, + "case_multisource_gdiff_l1same1none": { + "input": [ + "GaLa", + "GbL-" + ], + "localstyle": { + "legacy": [ + "G-Lab" + ], + "newstyle": [ + "GaLa", + "GbL-" + ] + }, + "globalstyle": { + "legacy": [ + "G-Lab" + ], + "newstyle": [ + "GaLa", + "GbL-" + ] + }, + "userstyle": { + "legacy": [ + "G-Lab" + ], + "newstyle": [ + "GaLa", + "GbL-" + ] + } + }, + "case_multisource_gdiff_l1diff1none": { + "input": [ + "GaLb", + "GcL-" + ], + "localstyle": { + "legacy": [ + "G-Lbc" + ], + "newstyle": [ + "GaLb", + "GcL-" + ] + }, + "globalstyle": { + "legacy": [ + "G-Lbc" + ], + "newstyle": [ + "GaLb", + "GcL-" + ] + }, + "userstyle": { + "legacy": [ + "G-Lbc" + ], + "newstyle": [ + "GaLb", + "GcL-" + ] + } + }, + "case_multisource_gdiff_lallsame": { + "input": [ + "GaLa", + "GbLb" + ], + "localstyle": { + "legacy": [ + "G-Lab" + ], + "newstyle": [ + "GaLa", + "GbLb" + ] + }, + "globalstyle": { + "legacy": [ + "G-Lab" + ], + "newstyle": [ + "GaLa", + "GbLb" + ] + }, + "userstyle": { + "legacy": [ + "G-Lab" + ], + "newstyle": [ + "GaLa", + "GbLb" + ] + } + }, + "case_multisource_gdiff_lallother": { + "input": [ + "GaLc", + "GbLc" + ], + "localstyle": { + "legacy": [ + "G-Lcc" + ], + "newstyle": [ + "GaLc", + "GbLc" + ] + }, + "globalstyle": { + "legacy": [ + "G-Lcc" + ], + "newstyle": [ + "GaLc", + "GbLc" + ] + }, + "userstyle": { + "legacy": [ + "G-Lcc" + ], + "newstyle": [ + "GaLc", + "GbLc" + ] + } + } +} \ No newline at end of file diff --git a/lib/iris/tests/integration/attrs_matrix_results_roundtrip.json b/lib/iris/tests/integration/attrs_matrix_results_roundtrip.json new file mode 100644 index 0000000000..3446c7f312 --- /dev/null +++ b/lib/iris/tests/integration/attrs_matrix_results_roundtrip.json @@ -0,0 +1,983 @@ +{ + "case_singlevar_localonly": { + "input": "G-La", + "localstyle": { + "unsplit": [ + "G-La" + ], + "split": [ + "G-La" + ] + }, + "globalstyle": { + "unsplit": [ + "GaL-" + ], + "split": [ + "G-La" + ] + }, + "userstyle": { + "unsplit": [ + "GaL-" + ], + "split": [ + "G-La" + ] + } + }, + "case_singlevar_globalonly": { + "input": "GaL-", + "localstyle": { + "unsplit": [ + "G-La" + ], + "split": [ + "GaL-" + ] + }, + "globalstyle": { + "unsplit": [ + "GaL-" + ], + "split": [ + "GaL-" + ] + }, + "userstyle": { + "unsplit": [ + "GaL-" + ], + "split": [ + "GaL-" + ] + } + }, + "case_singlevar_glsame": { + "input": "GaLa", + "localstyle": { + "unsplit": [ + "G-La" + ], + "split": [ + "GaLa" + ] + }, + "globalstyle": { + "unsplit": [ + "GaL-" + ], + "split": [ + "GaLa" + ] + }, + "userstyle": { + "unsplit": [ + "GaL-" + ], + "split": [ + "GaLa" + ] + } + }, + "case_singlevar_gldiffer": { + "input": "GaLb", + "localstyle": { + "unsplit": [ + "G-Lb" + ], + "split": [ + "GaLb" + ] + }, + "globalstyle": { + "unsplit": [ + "GbL-" + ], + "split": [ + "GaLb" + ] + }, + "userstyle": { + "unsplit": [ + "GbL-" + ], + "split": [ + "GaLb" + ] + } + }, + "case_multivar_same_noglobal": { + "input": "G-Laa", + "localstyle": { + "unsplit": [ + "G-Laa" + ], + "split": [ + "G-Laa" + ] + }, + "globalstyle": { + "unsplit": [ + "GaL--" + ], + "split": [ + "G-Laa" + ] + }, + "userstyle": { + "unsplit": [ + "GaL--" + ], + "split": [ + "G-Laa" + ] + } + }, + "case_multivar_same_sameglobal": { + "input": "GaLaa", + "localstyle": { + "unsplit": [ + "G-Laa" + ], + "split": [ + "GaLaa" + ] + }, + "globalstyle": { + "unsplit": [ + "GaL--" + ], + "split": [ + "GaLaa" + ] + }, + "userstyle": { + "unsplit": [ + "GaL--" + ], + "split": [ + "GaLaa" + ] + } + }, + "case_multivar_same_diffglobal": { + "input": "GaLbb", + "localstyle": { + "unsplit": [ + "G-Lbb" + ], + "split": [ + "GaLbb" + ] + }, + "globalstyle": { + "unsplit": [ + "GbL--" + ], + "split": [ + "GaLbb" + ] + }, + "userstyle": { + "unsplit": [ + "GbL--" + ], + "split": [ + "GaLbb" + ] + } + }, + "case_multivar_differ_noglobal": { + "input": "G-Lab", + "localstyle": { + "unsplit": [ + "G-Lab" + ], + "split": [ + "G-Lab" + ] + }, + "globalstyle": { + "unsplit": [ + "G-Lab" + ], + "split": [ + "G-Lab" + ] + }, + "userstyle": { + "unsplit": [ + "G-Lab" + ], + "split": [ + "G-Lab" + ] + } + }, + "case_multivar_differ_diffglobal": { + "input": "GaLbc", + "localstyle": { + "unsplit": [ + "G-Lbc" + ], + "split": [ + "GaLbc" + ] + }, + "globalstyle": { + "unsplit": [ + "G-Lbc" + ], + "split": [ + "GaLbc" + ] + }, + "userstyle": { + "unsplit": [ + "G-Lbc" + ], + "split": [ + "GaLbc" + ] + } + }, + "case_multivar_differ_sameglobal": { + "input": "GaLab", + "localstyle": { + "unsplit": [ + "G-Lab" + ], + "split": [ + "GaLab" + ] + }, + "globalstyle": { + "unsplit": [ + "G-Lab" + ], + "split": [ + "GaLab" + ] + }, + "userstyle": { + "unsplit": [ + "G-Lab" + ], + "split": [ + "GaLab" + ] + } + }, + "case_multivar_1none_noglobal": { + "input": "G-La-", + "localstyle": { + "unsplit": [ + "G-La-" + ], + "split": [ + "G-La-" + ] + }, + "globalstyle": { + "unsplit": [ + "G-La-" + ], + "split": [ + "G-La-" + ] + }, + "userstyle": { + "unsplit": [ + "G-La-" + ], + "split": [ + "G-La-" + ] + } + }, + "case_multivar_1none_diffglobal": { + "input": "GaLb-", + "localstyle": { + "unsplit": [ + "G-Lba" + ], + "split": [ + "GaLb-" + ] + }, + "globalstyle": { + "unsplit": [ + "G-Lba" + ], + "split": [ + "GaLb-" + ] + }, + "userstyle": { + "unsplit": [ + "G-Lba" + ], + "split": [ + "GaLb-" + ] + } + }, + "case_multivar_1none_sameglobal": { + "input": "GaLa-", + "localstyle": { + "unsplit": [ + "G-Laa" + ], + "split": [ + "GaLa-" + ] + }, + "globalstyle": { + "unsplit": [ + "GaL--" + ], + "split": [ + "GaLa-" + ] + }, + "userstyle": { + "unsplit": [ + "GaL--" + ], + "split": [ + "GaLa-" + ] + } + }, + "case_multisource_gsame_lnone": { + "input": [ + "GaL-", + "GaL-" + ], + "localstyle": { + "unsplit": [ + "G-Laa" + ], + "split": [ + "GaL--" + ] + }, + "globalstyle": { + "unsplit": [ + "GaL--" + ], + "split": [ + "GaL--" + ] + }, + "userstyle": { + "unsplit": [ + "GaL--" + ], + "split": [ + "GaL--" + ] + } + }, + "case_multisource_gsame_lallsame": { + "input": [ + "GaLa", + "GaLa" + ], + "localstyle": { + "unsplit": [ + "G-Laa" + ], + "split": [ + "GaLaa" + ] + }, + "globalstyle": { + "unsplit": [ + "GaL--" + ], + "split": [ + "GaLaa" + ] + }, + "userstyle": { + "unsplit": [ + "GaL--" + ], + "split": [ + "GaLaa" + ] + } + }, + "case_multisource_gsame_l1same1none": { + "input": [ + "GaLa", + "GaL-" + ], + "localstyle": { + "unsplit": [ + "G-Laa" + ], + "split": [ + "GaLa-" + ] + }, + "globalstyle": { + "unsplit": [ + "GaL--" + ], + "split": [ + "GaLa-" + ] + }, + "userstyle": { + "unsplit": [ + "GaL--" + ], + "split": [ + "GaLa-" + ] + } + }, + "case_multisource_gsame_l1same1other": { + "input": [ + "GaLa", + "GaLb" + ], + "localstyle": { + "unsplit": [ + "G-Lab" + ], + "split": [ + "GaLab" + ] + }, + "globalstyle": { + "unsplit": [ + "G-Lab" + ], + "split": [ + "GaLab" + ] + }, + "userstyle": { + "unsplit": [ + "G-Lab" + ], + "split": [ + "GaLab" + ] + } + }, + "case_multisource_gsame_lallother": { + "input": [ + "GaLb", + "GaLb" + ], + "localstyle": { + "unsplit": [ + "G-Lbb" + ], + "split": [ + "GaLbb" + ] + }, + "globalstyle": { + "unsplit": [ + "GbL--" + ], + "split": [ + "GaLbb" + ] + }, + "userstyle": { + "unsplit": [ + "GbL--" + ], + "split": [ + "GaLbb" + ] + } + }, + "case_multisource_gsame_lalldiffer": { + "input": [ + "GaLb", + "GaLc" + ], + "localstyle": { + "unsplit": [ + "G-Lbc" + ], + "split": [ + "GaLbc" + ] + }, + "globalstyle": { + "unsplit": [ + "G-Lbc" + ], + "split": [ + "GaLbc" + ] + }, + "userstyle": { + "unsplit": [ + "G-Lbc" + ], + "split": [ + "GaLbc" + ] + } + }, + "case_multisource_gnone_l1one1none": { + "input": [ + "G-La", + "G-L-" + ], + "localstyle": { + "unsplit": [ + "G-La-" + ], + "split": [ + "G-La-" + ] + }, + "globalstyle": { + "unsplit": [ + "G-La-" + ], + "split": [ + "G-La-" + ] + }, + "userstyle": { + "unsplit": [ + "G-La-" + ], + "split": [ + "G-La-" + ] + } + }, + "case_multisource_gnone_l1one1same": { + "input": [ + "G-La", + "G-La" + ], + "localstyle": { + "unsplit": [ + "G-Laa" + ], + "split": [ + "G-Laa" + ] + }, + "globalstyle": { + "unsplit": [ + "GaL--" + ], + "split": [ + "G-Laa" + ] + }, + "userstyle": { + "unsplit": [ + "GaL--" + ], + "split": [ + "G-Laa" + ] + } + }, + "case_multisource_gnone_l1one1other": { + "input": [ + "G-La", + "G-Lb" + ], + "localstyle": { + "unsplit": [ + "G-Lab" + ], + "split": [ + "G-Lab" + ] + }, + "globalstyle": { + "unsplit": [ + "G-Lab" + ], + "split": [ + "G-Lab" + ] + }, + "userstyle": { + "unsplit": [ + "G-Lab" + ], + "split": [ + "G-Lab" + ] + } + }, + "case_multisource_g1none_lnone": { + "input": [ + "GaL-", + "G-L-" + ], + "localstyle": { + "unsplit": [ + "G-La-" + ], + "split": [ + "G-La-" + ] + }, + "globalstyle": { + "unsplit": [ + "G-La-" + ], + "split": [ + "G-La-" + ] + }, + "userstyle": { + "unsplit": [ + "G-La-" + ], + "split": [ + "G-La-" + ] + } + }, + "case_multisource_g1none_l1same1none": { + "input": [ + "GaLa", + "G-L-" + ], + "localstyle": { + "unsplit": [ + "G-La-" + ], + "split": [ + "G-La-" + ] + }, + "globalstyle": { + "unsplit": [ + "G-La-" + ], + "split": [ + "G-La-" + ] + }, + "userstyle": { + "unsplit": [ + "G-La-" + ], + "split": [ + "G-La-" + ] + } + }, + "case_multisource_g1none_l1none1same": { + "input": [ + "GaL-", + "G-La" + ], + "localstyle": { + "unsplit": [ + "G-Laa" + ], + "split": [ + "G-Laa" + ] + }, + "globalstyle": { + "unsplit": [ + "GaL--" + ], + "split": [ + "G-Laa" + ] + }, + "userstyle": { + "unsplit": [ + "GaL--" + ], + "split": [ + "G-Laa" + ] + } + }, + "case_multisource_g1none_l1diff1none": { + "input": [ + "GaLb", + "G-L-" + ], + "localstyle": { + "unsplit": [ + "G-Lb-" + ], + "split": [ + "G-Lb-" + ] + }, + "globalstyle": { + "unsplit": [ + "G-Lb-" + ], + "split": [ + "G-Lb-" + ] + }, + "userstyle": { + "unsplit": [ + "G-Lb-" + ], + "split": [ + "G-Lb-" + ] + } + }, + "case_multisource_g1none_l1none1diff": { + "input": [ + "GaL-", + "G-Lb" + ], + "localstyle": { + "unsplit": [ + "G-Lab" + ], + "split": [ + "G-Lab" + ] + }, + "globalstyle": { + "unsplit": [ + "G-Lab" + ], + "split": [ + "G-Lab" + ] + }, + "userstyle": { + "unsplit": [ + "G-Lab" + ], + "split": [ + "G-Lab" + ] + } + }, + "case_multisource_g1none_lallsame": { + "input": [ + "GaLa", + "G-La" + ], + "localstyle": { + "unsplit": [ + "G-Laa" + ], + "split": [ + "G-Laa" + ] + }, + "globalstyle": { + "unsplit": [ + "GaL--" + ], + "split": [ + "G-Laa" + ] + }, + "userstyle": { + "unsplit": [ + "GaL--" + ], + "split": [ + "G-Laa" + ] + } + }, + "case_multisource_g1none_lallother": { + "input": [ + "GaLc", + "G-Lc" + ], + "localstyle": { + "unsplit": [ + "G-Lcc" + ], + "split": [ + "G-Lcc" + ] + }, + "globalstyle": { + "unsplit": [ + "GcL--" + ], + "split": [ + "G-Lcc" + ] + }, + "userstyle": { + "unsplit": [ + "GcL--" + ], + "split": [ + "G-Lcc" + ] + } + }, + "case_multisource_gdiff_lnone": { + "input": [ + "GaL-", + "GbL-" + ], + "localstyle": { + "unsplit": [ + "G-Lab" + ], + "split": [ + "G-Lab" + ] + }, + "globalstyle": { + "unsplit": [ + "G-Lab" + ], + "split": [ + "G-Lab" + ] + }, + "userstyle": { + "unsplit": [ + "G-Lab" + ], + "split": [ + "G-Lab" + ] + } + }, + "case_multisource_gdiff_l1same1none": { + "input": [ + "GaLa", + "GbL-" + ], + "localstyle": { + "unsplit": [ + "G-Lab" + ], + "split": [ + "G-Lab" + ] + }, + "globalstyle": { + "unsplit": [ + "G-Lab" + ], + "split": [ + "G-Lab" + ] + }, + "userstyle": { + "unsplit": [ + "G-Lab" + ], + "split": [ + "G-Lab" + ] + } + }, + "case_multisource_gdiff_l1diff1none": { + "input": [ + "GaLb", + "GcL-" + ], + "localstyle": { + "unsplit": [ + "G-Lbc" + ], + "split": [ + "G-Lbc" + ] + }, + "globalstyle": { + "unsplit": [ + "G-Lbc" + ], + "split": [ + "G-Lbc" + ] + }, + "userstyle": { + "unsplit": [ + "G-Lbc" + ], + "split": [ + "G-Lbc" + ] + } + }, + "case_multisource_gdiff_lallsame": { + "input": [ + "GaLa", + "GbLb" + ], + "localstyle": { + "unsplit": [ + "G-Lab" + ], + "split": [ + "G-Lab" + ] + }, + "globalstyle": { + "unsplit": [ + "G-Lab" + ], + "split": [ + "G-Lab" + ] + }, + "userstyle": { + "unsplit": [ + "G-Lab" + ], + "split": [ + "G-Lab" + ] + } + }, + "case_multisource_gdiff_lallother": { + "input": [ + "GaLc", + "GbLc" + ], + "localstyle": { + "unsplit": [ + "G-Lcc" + ], + "split": [ + "G-Lcc" + ] + }, + "globalstyle": { + "unsplit": [ + "GcL--" + ], + "split": [ + "G-Lcc" + ] + }, + "userstyle": { + "unsplit": [ + "GcL--" + ], + "split": [ + "G-Lcc" + ] + } + } +} \ No newline at end of file diff --git a/lib/iris/tests/integration/attrs_matrix_results_save.json b/lib/iris/tests/integration/attrs_matrix_results_save.json new file mode 100644 index 0000000000..3446c7f312 --- /dev/null +++ b/lib/iris/tests/integration/attrs_matrix_results_save.json @@ -0,0 +1,983 @@ +{ + "case_singlevar_localonly": { + "input": "G-La", + "localstyle": { + "unsplit": [ + "G-La" + ], + "split": [ + "G-La" + ] + }, + "globalstyle": { + "unsplit": [ + "GaL-" + ], + "split": [ + "G-La" + ] + }, + "userstyle": { + "unsplit": [ + "GaL-" + ], + "split": [ + "G-La" + ] + } + }, + "case_singlevar_globalonly": { + "input": "GaL-", + "localstyle": { + "unsplit": [ + "G-La" + ], + "split": [ + "GaL-" + ] + }, + "globalstyle": { + "unsplit": [ + "GaL-" + ], + "split": [ + "GaL-" + ] + }, + "userstyle": { + "unsplit": [ + "GaL-" + ], + "split": [ + "GaL-" + ] + } + }, + "case_singlevar_glsame": { + "input": "GaLa", + "localstyle": { + "unsplit": [ + "G-La" + ], + "split": [ + "GaLa" + ] + }, + "globalstyle": { + "unsplit": [ + "GaL-" + ], + "split": [ + "GaLa" + ] + }, + "userstyle": { + "unsplit": [ + "GaL-" + ], + "split": [ + "GaLa" + ] + } + }, + "case_singlevar_gldiffer": { + "input": "GaLb", + "localstyle": { + "unsplit": [ + "G-Lb" + ], + "split": [ + "GaLb" + ] + }, + "globalstyle": { + "unsplit": [ + "GbL-" + ], + "split": [ + "GaLb" + ] + }, + "userstyle": { + "unsplit": [ + "GbL-" + ], + "split": [ + "GaLb" + ] + } + }, + "case_multivar_same_noglobal": { + "input": "G-Laa", + "localstyle": { + "unsplit": [ + "G-Laa" + ], + "split": [ + "G-Laa" + ] + }, + "globalstyle": { + "unsplit": [ + "GaL--" + ], + "split": [ + "G-Laa" + ] + }, + "userstyle": { + "unsplit": [ + "GaL--" + ], + "split": [ + "G-Laa" + ] + } + }, + "case_multivar_same_sameglobal": { + "input": "GaLaa", + "localstyle": { + "unsplit": [ + "G-Laa" + ], + "split": [ + "GaLaa" + ] + }, + "globalstyle": { + "unsplit": [ + "GaL--" + ], + "split": [ + "GaLaa" + ] + }, + "userstyle": { + "unsplit": [ + "GaL--" + ], + "split": [ + "GaLaa" + ] + } + }, + "case_multivar_same_diffglobal": { + "input": "GaLbb", + "localstyle": { + "unsplit": [ + "G-Lbb" + ], + "split": [ + "GaLbb" + ] + }, + "globalstyle": { + "unsplit": [ + "GbL--" + ], + "split": [ + "GaLbb" + ] + }, + "userstyle": { + "unsplit": [ + "GbL--" + ], + "split": [ + "GaLbb" + ] + } + }, + "case_multivar_differ_noglobal": { + "input": "G-Lab", + "localstyle": { + "unsplit": [ + "G-Lab" + ], + "split": [ + "G-Lab" + ] + }, + "globalstyle": { + "unsplit": [ + "G-Lab" + ], + "split": [ + "G-Lab" + ] + }, + "userstyle": { + "unsplit": [ + "G-Lab" + ], + "split": [ + "G-Lab" + ] + } + }, + "case_multivar_differ_diffglobal": { + "input": "GaLbc", + "localstyle": { + "unsplit": [ + "G-Lbc" + ], + "split": [ + "GaLbc" + ] + }, + "globalstyle": { + "unsplit": [ + "G-Lbc" + ], + "split": [ + "GaLbc" + ] + }, + "userstyle": { + "unsplit": [ + "G-Lbc" + ], + "split": [ + "GaLbc" + ] + } + }, + "case_multivar_differ_sameglobal": { + "input": "GaLab", + "localstyle": { + "unsplit": [ + "G-Lab" + ], + "split": [ + "GaLab" + ] + }, + "globalstyle": { + "unsplit": [ + "G-Lab" + ], + "split": [ + "GaLab" + ] + }, + "userstyle": { + "unsplit": [ + "G-Lab" + ], + "split": [ + "GaLab" + ] + } + }, + "case_multivar_1none_noglobal": { + "input": "G-La-", + "localstyle": { + "unsplit": [ + "G-La-" + ], + "split": [ + "G-La-" + ] + }, + "globalstyle": { + "unsplit": [ + "G-La-" + ], + "split": [ + "G-La-" + ] + }, + "userstyle": { + "unsplit": [ + "G-La-" + ], + "split": [ + "G-La-" + ] + } + }, + "case_multivar_1none_diffglobal": { + "input": "GaLb-", + "localstyle": { + "unsplit": [ + "G-Lba" + ], + "split": [ + "GaLb-" + ] + }, + "globalstyle": { + "unsplit": [ + "G-Lba" + ], + "split": [ + "GaLb-" + ] + }, + "userstyle": { + "unsplit": [ + "G-Lba" + ], + "split": [ + "GaLb-" + ] + } + }, + "case_multivar_1none_sameglobal": { + "input": "GaLa-", + "localstyle": { + "unsplit": [ + "G-Laa" + ], + "split": [ + "GaLa-" + ] + }, + "globalstyle": { + "unsplit": [ + "GaL--" + ], + "split": [ + "GaLa-" + ] + }, + "userstyle": { + "unsplit": [ + "GaL--" + ], + "split": [ + "GaLa-" + ] + } + }, + "case_multisource_gsame_lnone": { + "input": [ + "GaL-", + "GaL-" + ], + "localstyle": { + "unsplit": [ + "G-Laa" + ], + "split": [ + "GaL--" + ] + }, + "globalstyle": { + "unsplit": [ + "GaL--" + ], + "split": [ + "GaL--" + ] + }, + "userstyle": { + "unsplit": [ + "GaL--" + ], + "split": [ + "GaL--" + ] + } + }, + "case_multisource_gsame_lallsame": { + "input": [ + "GaLa", + "GaLa" + ], + "localstyle": { + "unsplit": [ + "G-Laa" + ], + "split": [ + "GaLaa" + ] + }, + "globalstyle": { + "unsplit": [ + "GaL--" + ], + "split": [ + "GaLaa" + ] + }, + "userstyle": { + "unsplit": [ + "GaL--" + ], + "split": [ + "GaLaa" + ] + } + }, + "case_multisource_gsame_l1same1none": { + "input": [ + "GaLa", + "GaL-" + ], + "localstyle": { + "unsplit": [ + "G-Laa" + ], + "split": [ + "GaLa-" + ] + }, + "globalstyle": { + "unsplit": [ + "GaL--" + ], + "split": [ + "GaLa-" + ] + }, + "userstyle": { + "unsplit": [ + "GaL--" + ], + "split": [ + "GaLa-" + ] + } + }, + "case_multisource_gsame_l1same1other": { + "input": [ + "GaLa", + "GaLb" + ], + "localstyle": { + "unsplit": [ + "G-Lab" + ], + "split": [ + "GaLab" + ] + }, + "globalstyle": { + "unsplit": [ + "G-Lab" + ], + "split": [ + "GaLab" + ] + }, + "userstyle": { + "unsplit": [ + "G-Lab" + ], + "split": [ + "GaLab" + ] + } + }, + "case_multisource_gsame_lallother": { + "input": [ + "GaLb", + "GaLb" + ], + "localstyle": { + "unsplit": [ + "G-Lbb" + ], + "split": [ + "GaLbb" + ] + }, + "globalstyle": { + "unsplit": [ + "GbL--" + ], + "split": [ + "GaLbb" + ] + }, + "userstyle": { + "unsplit": [ + "GbL--" + ], + "split": [ + "GaLbb" + ] + } + }, + "case_multisource_gsame_lalldiffer": { + "input": [ + "GaLb", + "GaLc" + ], + "localstyle": { + "unsplit": [ + "G-Lbc" + ], + "split": [ + "GaLbc" + ] + }, + "globalstyle": { + "unsplit": [ + "G-Lbc" + ], + "split": [ + "GaLbc" + ] + }, + "userstyle": { + "unsplit": [ + "G-Lbc" + ], + "split": [ + "GaLbc" + ] + } + }, + "case_multisource_gnone_l1one1none": { + "input": [ + "G-La", + "G-L-" + ], + "localstyle": { + "unsplit": [ + "G-La-" + ], + "split": [ + "G-La-" + ] + }, + "globalstyle": { + "unsplit": [ + "G-La-" + ], + "split": [ + "G-La-" + ] + }, + "userstyle": { + "unsplit": [ + "G-La-" + ], + "split": [ + "G-La-" + ] + } + }, + "case_multisource_gnone_l1one1same": { + "input": [ + "G-La", + "G-La" + ], + "localstyle": { + "unsplit": [ + "G-Laa" + ], + "split": [ + "G-Laa" + ] + }, + "globalstyle": { + "unsplit": [ + "GaL--" + ], + "split": [ + "G-Laa" + ] + }, + "userstyle": { + "unsplit": [ + "GaL--" + ], + "split": [ + "G-Laa" + ] + } + }, + "case_multisource_gnone_l1one1other": { + "input": [ + "G-La", + "G-Lb" + ], + "localstyle": { + "unsplit": [ + "G-Lab" + ], + "split": [ + "G-Lab" + ] + }, + "globalstyle": { + "unsplit": [ + "G-Lab" + ], + "split": [ + "G-Lab" + ] + }, + "userstyle": { + "unsplit": [ + "G-Lab" + ], + "split": [ + "G-Lab" + ] + } + }, + "case_multisource_g1none_lnone": { + "input": [ + "GaL-", + "G-L-" + ], + "localstyle": { + "unsplit": [ + "G-La-" + ], + "split": [ + "G-La-" + ] + }, + "globalstyle": { + "unsplit": [ + "G-La-" + ], + "split": [ + "G-La-" + ] + }, + "userstyle": { + "unsplit": [ + "G-La-" + ], + "split": [ + "G-La-" + ] + } + }, + "case_multisource_g1none_l1same1none": { + "input": [ + "GaLa", + "G-L-" + ], + "localstyle": { + "unsplit": [ + "G-La-" + ], + "split": [ + "G-La-" + ] + }, + "globalstyle": { + "unsplit": [ + "G-La-" + ], + "split": [ + "G-La-" + ] + }, + "userstyle": { + "unsplit": [ + "G-La-" + ], + "split": [ + "G-La-" + ] + } + }, + "case_multisource_g1none_l1none1same": { + "input": [ + "GaL-", + "G-La" + ], + "localstyle": { + "unsplit": [ + "G-Laa" + ], + "split": [ + "G-Laa" + ] + }, + "globalstyle": { + "unsplit": [ + "GaL--" + ], + "split": [ + "G-Laa" + ] + }, + "userstyle": { + "unsplit": [ + "GaL--" + ], + "split": [ + "G-Laa" + ] + } + }, + "case_multisource_g1none_l1diff1none": { + "input": [ + "GaLb", + "G-L-" + ], + "localstyle": { + "unsplit": [ + "G-Lb-" + ], + "split": [ + "G-Lb-" + ] + }, + "globalstyle": { + "unsplit": [ + "G-Lb-" + ], + "split": [ + "G-Lb-" + ] + }, + "userstyle": { + "unsplit": [ + "G-Lb-" + ], + "split": [ + "G-Lb-" + ] + } + }, + "case_multisource_g1none_l1none1diff": { + "input": [ + "GaL-", + "G-Lb" + ], + "localstyle": { + "unsplit": [ + "G-Lab" + ], + "split": [ + "G-Lab" + ] + }, + "globalstyle": { + "unsplit": [ + "G-Lab" + ], + "split": [ + "G-Lab" + ] + }, + "userstyle": { + "unsplit": [ + "G-Lab" + ], + "split": [ + "G-Lab" + ] + } + }, + "case_multisource_g1none_lallsame": { + "input": [ + "GaLa", + "G-La" + ], + "localstyle": { + "unsplit": [ + "G-Laa" + ], + "split": [ + "G-Laa" + ] + }, + "globalstyle": { + "unsplit": [ + "GaL--" + ], + "split": [ + "G-Laa" + ] + }, + "userstyle": { + "unsplit": [ + "GaL--" + ], + "split": [ + "G-Laa" + ] + } + }, + "case_multisource_g1none_lallother": { + "input": [ + "GaLc", + "G-Lc" + ], + "localstyle": { + "unsplit": [ + "G-Lcc" + ], + "split": [ + "G-Lcc" + ] + }, + "globalstyle": { + "unsplit": [ + "GcL--" + ], + "split": [ + "G-Lcc" + ] + }, + "userstyle": { + "unsplit": [ + "GcL--" + ], + "split": [ + "G-Lcc" + ] + } + }, + "case_multisource_gdiff_lnone": { + "input": [ + "GaL-", + "GbL-" + ], + "localstyle": { + "unsplit": [ + "G-Lab" + ], + "split": [ + "G-Lab" + ] + }, + "globalstyle": { + "unsplit": [ + "G-Lab" + ], + "split": [ + "G-Lab" + ] + }, + "userstyle": { + "unsplit": [ + "G-Lab" + ], + "split": [ + "G-Lab" + ] + } + }, + "case_multisource_gdiff_l1same1none": { + "input": [ + "GaLa", + "GbL-" + ], + "localstyle": { + "unsplit": [ + "G-Lab" + ], + "split": [ + "G-Lab" + ] + }, + "globalstyle": { + "unsplit": [ + "G-Lab" + ], + "split": [ + "G-Lab" + ] + }, + "userstyle": { + "unsplit": [ + "G-Lab" + ], + "split": [ + "G-Lab" + ] + } + }, + "case_multisource_gdiff_l1diff1none": { + "input": [ + "GaLb", + "GcL-" + ], + "localstyle": { + "unsplit": [ + "G-Lbc" + ], + "split": [ + "G-Lbc" + ] + }, + "globalstyle": { + "unsplit": [ + "G-Lbc" + ], + "split": [ + "G-Lbc" + ] + }, + "userstyle": { + "unsplit": [ + "G-Lbc" + ], + "split": [ + "G-Lbc" + ] + } + }, + "case_multisource_gdiff_lallsame": { + "input": [ + "GaLa", + "GbLb" + ], + "localstyle": { + "unsplit": [ + "G-Lab" + ], + "split": [ + "G-Lab" + ] + }, + "globalstyle": { + "unsplit": [ + "G-Lab" + ], + "split": [ + "G-Lab" + ] + }, + "userstyle": { + "unsplit": [ + "G-Lab" + ], + "split": [ + "G-Lab" + ] + } + }, + "case_multisource_gdiff_lallother": { + "input": [ + "GaLc", + "GbLc" + ], + "localstyle": { + "unsplit": [ + "G-Lcc" + ], + "split": [ + "G-Lcc" + ] + }, + "globalstyle": { + "unsplit": [ + "GcL--" + ], + "split": [ + "G-Lcc" + ] + }, + "userstyle": { + "unsplit": [ + "GcL--" + ], + "split": [ + "G-Lcc" + ] + } + } +} \ No newline at end of file diff --git a/lib/iris/tests/integration/netcdf/test_delayed_save.py b/lib/iris/tests/integration/netcdf/test_delayed_save.py index d3f2ce22c4..c8c218000c 100644 --- a/lib/iris/tests/integration/netcdf/test_delayed_save.py +++ b/lib/iris/tests/integration/netcdf/test_delayed_save.py @@ -23,6 +23,13 @@ class Test__lazy_stream_data: + # Ensure all saves are done with split-atttribute saving, + # -- because some of these tests are sensitive to unexpected warnings. + @pytest.fixture(autouse=True) + def all_saves_with_split_attrs(self): + with iris.FUTURE.context(save_split_attrs=True): + yield + @pytest.fixture(autouse=True) def output_path(self, tmp_path): # A temporary output netcdf-file path, **unique to each test call**. diff --git a/lib/iris/tests/integration/test_netcdf__loadsaveattrs.py b/lib/iris/tests/integration/test_netcdf__loadsaveattrs.py new file mode 100644 index 0000000000..b09b408827 --- /dev/null +++ b/lib/iris/tests/integration/test_netcdf__loadsaveattrs.py @@ -0,0 +1,1678 @@ +# Copyright Iris contributors +# +# This file is part of Iris and is released under the BSD license. +# See LICENSE in the root of the repository for full licensing details. +""" +Integration tests for loading and saving netcdf file attributes. + +Notes: +(1) attributes in netCDF files can be either "global attributes", or variable +("local") type. + +(2) in CF terms, this testcode classifies specific attributes (names) as either +"global" = names recognised by convention as normally stored in a file-global +setting; "local" = recognised names specifying details of variable data +encoding, which only make sense as a "local" attribute (i.e. on a variable), +and "user" = any additional attributes *not* recognised in conventions, which +might be recorded either globally or locally. + +""" +import inspect +import json +import os +from pathlib import Path +import re +from typing import Iterable, List, Optional, Union +import warnings + +import numpy as np +import pytest + +import iris +import iris.coord_systems +from iris.coords import DimCoord +from iris.cube import Cube +import iris.fileformats.netcdf +import iris.fileformats.netcdf._thread_safe_nc as threadsafe_nc4 + +# First define the known controlled attribute names defined by netCDf and CF conventions +# +# Note: certain attributes are "normally" global (e.g. "Conventions"), whilst others +# will only usually appear on a data-variable (e.g. "scale_factor"", "coordinates"). +# I'm calling these 'global-style' and 'local-style'. +# Any attributes either belongs to one of these 2 groups, or neither. Those 3 distinct +# types may then have different behaviour in Iris load + save. + +# A list of "global-style" attribute names : those which should be global attributes by +# default (i.e. file- or group-level, *not* attached to a variable). + +_GLOBAL_TEST_ATTRS = set(iris.fileformats.netcdf.saver._CF_GLOBAL_ATTRS) +# Remove this one, which has peculiar behaviour + is tested separately +# N.B. this is not the same as 'Conventions', but is caught in the crossfire when that +# one is processed. +_GLOBAL_TEST_ATTRS -= set(["conventions"]) +_GLOBAL_TEST_ATTRS = sorted(_GLOBAL_TEST_ATTRS) + + +# Define a fixture to parametrise tests over the 'global-style' test attributes. +# This just provides a more concise way of writing parametrised tests. +@pytest.fixture(params=_GLOBAL_TEST_ATTRS) +def global_attr(request): + # N.B. "request" is a standard PyTest fixture + return request.param # Return the name of the attribute to test. + + +# A list of "local-style" attribute names : those which should be variable attributes +# by default (aka "local", "variable" or "data" attributes) . +_LOCAL_TEST_ATTRS = ( + iris.fileformats.netcdf.saver._CF_DATA_ATTRS + + iris.fileformats.netcdf.saver._UKMO_DATA_ATTRS +) + + +# Define a fixture to parametrise over the 'local-style' test attributes. +# This just provides a more concise way of writing parametrised tests. +@pytest.fixture(params=_LOCAL_TEST_ATTRS) +def local_attr(request): + # N.B. "request" is a standard PyTest fixture + return request.param # Return the name of the attribute to test. + + +# Define whether to parametrise over split-attribute saving +# Just for now, so that we can run against legacy code. +_SPLIT_SAVE_SUPPORTED = hasattr(iris.FUTURE, "save_split_attrs") +_SPLIT_PARAM_VALUES = [False, True] +_SPLIT_PARAM_IDS = ["nosplit", "split"] +_MATRIX_LOAD_RESULTSTYLES = ["legacy", "newstyle"] +if not _SPLIT_SAVE_SUPPORTED: + _SPLIT_PARAM_VALUES.remove(True) + _SPLIT_PARAM_IDS.remove("split") + _MATRIX_LOAD_RESULTSTYLES.remove("newstyle") + + +_SKIP_WARNCHECK = "_no_warnings_check" + + +def check_captured_warnings( + expected_keys: List[str], + captured_warnings: List[warnings.WarningMessage], + allow_possible_legacy_warning: bool = False, +): + """ + Compare captured warning messages with a list of regexp-matches. + + We allow them to occur in any order, and replace each actual result in the list + with its matching regexp, if any, as this makes failure results much easier to + comprehend. + + """ + # TODO: when iris.FUTURE.save_split_attrs is removed, we can remove the + # 'allow_possible_legacy_warning' arg. + + if expected_keys is None: + expected_keys = [] + elif hasattr(expected_keys, "upper"): + # Handle a single string + if expected_keys == _SKIP_WARNCHECK: + # No check at all in this case + return + expected_keys = [expected_keys] + + if allow_possible_legacy_warning: + # Allow but do not require a "saving without split-attributes" warning. + legacy_message_key = ( + "Saving to netcdf with legacy-style attribute handling for backwards " + "compatibility." + ) + expected_keys.append(legacy_message_key) + + expected_keys = [re.compile(key) for key in expected_keys] + found_results = [str(warning.message) for warning in captured_warnings] + remaining_keys = expected_keys.copy() + for i_message, message in enumerate(found_results.copy()): + for key in remaining_keys: + if key.search(message): + # Hit : replace one message in the list with its matching "key" + found_results[i_message] = key + # remove the matching key + remaining_keys.remove(key) + # skip on to next message + break + + if allow_possible_legacy_warning: + # Remove any unused "legacy attribute saving" key. + # N.B. this is the *only* key we will tolerate not being used. + expected_keys = [ + key for key in expected_keys if key != legacy_message_key + ] + + assert set(found_results) == set(expected_keys) + + +class MixinAttrsTesting: + @staticmethod + def _calling_testname(): + """ + Search up the callstack for a function named "test_*", and return the name for + use as a test identifier. + + Idea borrowed from :meth:`iris.tests.IrisTest.result_path`. + + Returns + ------- + test_name : str + Returns a string, with the initial "test_" removed. + """ + test_name = None + stack = inspect.stack() + for frame in stack[1:]: + full_name = frame[3] + if full_name.startswith("test_"): + # Return the name with the initial "test_" removed. + test_name = full_name.replace("test_", "") + break + # Search should not fail, unless we were called from an inappropriate place? + assert test_name is not None + return test_name + + @pytest.fixture(autouse=True) + def make_tempdir(self, tmp_path_factory): + """ + Automatically-run fixture to activate the 'tmp_path_factory' fixture on *every* + test: Make a directory for temporary files, and record it on the test instance. + + N.B. "tmp_path_factory" is a standard PyTest fixture, which provides a dirpath + *shared* by all tests. This is a bit quicker and more debuggable than having a + directory per-testcase. + """ + # Store the temporary directory path on the test instance + self.tmpdir = str(tmp_path_factory.getbasetemp()) + + def _testfile_path(self, basename: str) -> str: + # Make a filepath in the temporary directory, based on the name of the calling + # test method, and the "self.attrname" it sets up. + testname = self._calling_testname() + # Turn that into a suitable temporary filename + ext_name = getattr(self, "testname_extension", "") + if ext_name: + basename = basename + "_" + ext_name + path_str = f"{self.tmpdir}/{self.__class__.__name__}__test_{testname}-{self.attrname}__{basename}.nc" + return path_str + + @staticmethod + def _default_vars_and_attrvalues(vars_and_attrvalues): + # Simple default strategy : turn a simple value into {'var': value} + if not isinstance(vars_and_attrvalues, dict): + # Treat single non-dict argument as a value for a single variable + vars_and_attrvalues = {"var": vars_and_attrvalues} + return vars_and_attrvalues + + def create_testcase_files_or_cubes( + self, + attr_name: str, + global_value_file1: Optional[str] = None, + var_values_file1: Union[None, str, dict] = None, + global_value_file2: Optional[str] = None, + var_values_file2: Union[None, str, dict] = None, + cubes: bool = False, + ): + """ + Create temporary input netcdf files, or cubes, with specific content. + + Creates a temporary netcdf test file (or two) with the given global and + variable-local attributes. Or build cubes, similarly. + If ``cubes`` is ``True``, save cubes in ``self.input_cubes``. + Else save filepaths in ``self.input_filepaths``. + + Note: 'var_values_file' args are dictionaries. The named variables are + created, with an attribute = the dictionary value, *except* that a dictionary + value of None means that a local attribute is _not_ created on the variable. + """ + # save attribute on the instance + self.attrname = attr_name + + if not cubes: + # Make some input file paths. + filepath1 = self._testfile_path("testfile") + filepath2 = self._testfile_path("testfile2") + + def make_file( + filepath: str, global_value=None, var_values=None + ) -> str: + ds = threadsafe_nc4.DatasetWrapper(filepath, "w") + if global_value is not None: + ds.setncattr(attr_name, global_value) + ds.createDimension("x", 3) + # Rationalise the per-variable requirements + # N.B. this *always* makes at least one variable, as otherwise we would + # load no cubes. + var_values = self._default_vars_and_attrvalues(var_values) + for var_name, value in var_values.items(): + v = ds.createVariable(var_name, int, ("x",)) + if value is not None: + v.setncattr(attr_name, value) + ds.close() + return filepath + + def make_cubes(var_name, global_value=None, var_values=None): + cubes = [] + var_values = self._default_vars_and_attrvalues(var_values) + for varname, local_value in var_values.items(): + cube = Cube(np.arange(3.0), var_name=var_name) + cubes.append(cube) + dimco = DimCoord(np.arange(3.0), var_name="x") + cube.add_dim_coord(dimco, 0) + if not hasattr(cube.attributes, "globals"): + # N.B. For now, also support oldstyle "single" cube attribute + # dictionaries, so that we can generate legacy results to compore + # with the "new world" results. + single_value = global_value + if local_value is not None: + single_value = local_value + if single_value is not None: + cube.attributes[attr_name] = single_value + else: + if global_value is not None: + cube.attributes.globals[attr_name] = global_value + if local_value is not None: + cube.attributes.locals[attr_name] = local_value + return cubes + + if cubes: + results = make_cubes("v1", global_value_file1, var_values_file1) + if global_value_file2 is not None or var_values_file2 is not None: + results.extend( + make_cubes("v2", global_value_file2, var_values_file2) + ) + else: + results = [ + make_file(filepath1, global_value_file1, var_values_file1) + ] + if global_value_file2 is not None or var_values_file2 is not None: + # Make a second testfile and add it to files-to-be-loaded. + results.append( + make_file(filepath2, global_value_file2, var_values_file2) + ) + + # Save results on the instance + if cubes: + self.input_cubes = results + else: + self.input_filepaths = results + return results + + def run_testcase( + self, + attr_name: str, + values: Union[List, List[List]], + create_cubes_or_files: str = "files", + ) -> None: + """ + Create testcase inputs (files or cubes) with specified attributes. + + Parameters + ---------- + attr_name : str + name for all attributes created in this testcase. + Also saved as ``self.attrname``, as used by ``fetch_results``. + values : list + list, or lists, of values for created attributes, each containing one global + and one-or-more local attribute values as [global, local1, local2...] + create_cubes_or_files : str, default "files" + create either cubes or testfiles. + + If ``create_cubes_or_files`` == "files", create one temporary netCDF file per + values-list, and record in ``self.input_filepaths``. + Else if ``create_cubes_or_files`` == "cubes", create sets of cubes with common + global values and store all of them to ``self.input_cubes``. + + """ + # Save common attribute-name on the instance + self.attrname = attr_name + + # Standardise input to a list-of-lists, each inner list = [global, *locals] + assert isinstance(values, list) + if not isinstance(values[0], list): + values = [values] + assert len(values) in (1, 2) + assert len(values[0]) > 1 + + # Decode into global1, *locals1, and optionally global2, *locals2 + global1 = values[0][0] + vars1 = {} + i_var = 0 + for value in values[0][1:]: + vars1[f"var_{i_var}"] = value + i_var += 1 + if len(values) == 1: + global2 = None + vars2 = None + else: + assert len(values) == 2 + global2 = values[1][0] + vars2 = {} + for value in values[1][1:]: + vars2[f"var_{i_var}"] = value + i_var += 1 + + # Create test files or cubes (and store data on the instance) + assert create_cubes_or_files in ("cubes", "files") + make_cubes = create_cubes_or_files == "cubes" + self.create_testcase_files_or_cubes( + attr_name=attr_name, + global_value_file1=global1, + var_values_file1=vars1, + global_value_file2=global2, + var_values_file2=vars2, + cubes=make_cubes, + ) + + def fetch_results( + self, + filepath: str = None, + cubes: Iterable[Cube] = None, + oldstyle_combined: bool = False, + ): + """ + Return testcase results from an output file or cubes in a standardised form. + + Unpick the global+local values of the attribute ``self.attrname``, resulting + from a test operation. + A file result is always [global_value, *local_values] + A cubes result is [*[global_value, *local_values]] (over different global vals) + + When ``oldstyle_combined`` is ``True``, simulate the "legacy" style results, + that is when each cube had a single combined attribute dictionary. + This enables us to check against former behaviour, by combining results into a + single dictionary. N.B. per-cube single results are then returned in the form: + [None, cube1, cube2...]. + N.B. if results are from a *file*, this key has **no effect**. + + """ + attr_name = self.attrname + if filepath is not None: + # Fetch global and local values from a file + try: + ds = threadsafe_nc4.DatasetWrapper(filepath) + global_result = ( + ds.getncattr(attr_name) + if attr_name in ds.ncattrs() + else None + ) + # Fetch local attr value from all data variables : In our testcases, + # that is all *except* dimcoords (ones named after dimensions). + local_vars_results = [ + ( + var.name, + ( + var.getncattr(attr_name) + if attr_name in var.ncattrs() + else None + ), + ) + for var in ds.variables.values() + if var.name not in ds.dimensions + ] + finally: + ds.close() + # This version always returns a single result set [global, local1[, local2]] + # Return global, plus locals sorted by varname + local_vars_results = sorted(local_vars_results, key=lambda x: x[0]) + results = [global_result] + [val for _, val in local_vars_results] + else: + assert cubes is not None + # Sort result cubes according to a standard ordering. + cubes = sorted(cubes, key=lambda cube: cube.name()) + # Fetch globals and locals from cubes. + # This way returns *multiple* result 'sets', one for each global value + if oldstyle_combined or not _SPLIT_SAVE_SUPPORTED: + # Use all-combined dictionaries in place of actual cubes' attributes + cube_attr_dicts = [dict(cube.attributes) for cube in cubes] + # Return results as if all cubes had global=None + results = [ + [None] + + [ + cube_attr_dict.get(attr_name, None) + for cube_attr_dict in cube_attr_dicts + ] + ] + else: + # Return a result-set for each occurring global value (possibly + # including a 'None'). + global_values = set( + cube.attributes.globals.get(attr_name, None) + for cube in cubes + ) + results = [ + [globalval] + + [ + cube.attributes.locals.get(attr_name, None) + for cube in cubes + if cube.attributes.globals.get(attr_name, None) + == globalval + ] + for globalval in sorted(global_values, key=str) + ] + return results + + +# Define all the testcases for different parameter input structures : +# - combinations of matching+differing, global+local params +# - these are interpreted differently for the 3 main test types : Load/Save/Roundtrip +_MATRIX_TESTCASE_INPUTS = { + "case_singlevar_localonly": "G-La", + "case_singlevar_globalonly": "GaL-", + "case_singlevar_glsame": "GaLa", + "case_singlevar_gldiffer": "GaLb", + "case_multivar_same_noglobal": "G-Laa", + "case_multivar_same_sameglobal": "GaLaa", + "case_multivar_same_diffglobal": "GaLbb", + "case_multivar_differ_noglobal": "G-Lab", + "case_multivar_differ_diffglobal": "GaLbc", + "case_multivar_differ_sameglobal": "GaLab", + "case_multivar_1none_noglobal": "G-La-", + "case_multivar_1none_diffglobal": "GaLb-", + "case_multivar_1none_sameglobal": "GaLa-", + # Note: the multi-set input cases are more complex. + # These are encoded as *pairs* of specs, for 2 different files, or cubes with + # independent global values. + # We assume that there can be nothing "special" about a var's interaction with + # another one from the same (as opposed to the "other") file. + "case_multisource_gsame_lnone": ["GaL-", "GaL-"], + "case_multisource_gsame_lallsame": ["GaLa", "GaLa"], + "case_multisource_gsame_l1same1none": ["GaLa", "GaL-"], + "case_multisource_gsame_l1same1other": ["GaLa", "GaLb"], + "case_multisource_gsame_lallother": ["GaLb", "GaLb"], + "case_multisource_gsame_lalldiffer": ["GaLb", "GaLc"], + "case_multisource_gnone_l1one1none": ["G-La", "G-L-"], + "case_multisource_gnone_l1one1same": ["G-La", "G-La"], + "case_multisource_gnone_l1one1other": ["G-La", "G-Lb"], + "case_multisource_g1none_lnone": ["GaL-", "G-L-"], + "case_multisource_g1none_l1same1none": ["GaLa", "G-L-"], + "case_multisource_g1none_l1none1same": ["GaL-", "G-La"], + "case_multisource_g1none_l1diff1none": ["GaLb", "G-L-"], + "case_multisource_g1none_l1none1diff": ["GaL-", "G-Lb"], + "case_multisource_g1none_lallsame": ["GaLa", "G-La"], + "case_multisource_g1none_lallother": ["GaLc", "G-Lc"], + "case_multisource_gdiff_lnone": ["GaL-", "GbL-"], + "case_multisource_gdiff_l1same1none": ["GaLa", "GbL-"], + "case_multisource_gdiff_l1diff1none": ["GaLb", "GcL-"], + "case_multisource_gdiff_lallsame": ["GaLa", "GbLb"], + "case_multisource_gdiff_lallother": ["GaLc", "GbLc"], +} +_MATRIX_TESTCASES = list(_MATRIX_TESTCASE_INPUTS.keys()) + +# +# Define the attrs against which all matrix tests are run +# +max_param_attrs = None +# max_param_attrs = 5 + +_MATRIX_ATTRNAMES = _LOCAL_TEST_ATTRS[:max_param_attrs] +_MATRIX_ATTRNAMES += _GLOBAL_TEST_ATTRS[:max_param_attrs] +_MATRIX_ATTRNAMES += ["user"] + +# remove special-cases, for now : all these behave irregularly (i.e. unlike the known +# "globalstyle", or "localstyle" generic cases). +# N.B. not including "Conventions", which is not in the globals list, so won't be +# matrix-tested unless we add it specifically. +# TODO: decide if any of these need to be tested, as separate test-styles. +_SPECIAL_ATTRS = [ + "ukmo__process_flags", + "missing_value", + "standard_error_multiplier", + "STASH", + "um_stash_source", +] +_MATRIX_ATTRNAMES = [ + attr for attr in _MATRIX_ATTRNAMES if attr not in _SPECIAL_ATTRS +] + + +# +# A routine to work "backwards" from an attribute name to its "style", i.e. type category. +# Possible styles are "globalstyle", "localstyle", "userstyle". +# +_ATTR_STYLES = ["localstyle", "globalstyle", "userstyle"] + + +def deduce_attr_style(attrname: str) -> str: + # Extract the attribute "style type" from an attr_param name + if attrname in _LOCAL_TEST_ATTRS: + style = "localstyle" + elif attrname in _GLOBAL_TEST_ATTRS: + style = "globalstyle" + else: + assert attrname == "user" + style = "userstyle" + return style + + +# +# Decode a matrix "input spec" to codes for global + local values. +# +def decode_matrix_input(input_spec): + # Decode a matrix-test input specification, like "GaLbc", into lists of values. + # E.G. "GaLbc" -> ["a", "b", "c"] + # ["GaLbc", "GbLbc"] -> [["a", "b", "c"], ["b", "b", c"]] + # N.B. in this form "values" are all one-character strings. + def decode_specstring(spec: str) -> List[Union[str, None]]: + # Decode an input spec-string to input/output attribute values + assert spec[0] == "G" and spec[2] == "L" + allvals = spec[1] + spec[3:] + result = [None if valchar == "-" else valchar for valchar in allvals] + return result + + if isinstance(input_spec, str): + # Single-source spec (one cube or one file) + vals = decode_specstring(input_spec) + result = [vals] + else: + # Dual-source spec (two files, or sets of cubes with a common global value) + vals_A = decode_specstring(input_spec[0]) + vals_B = decode_specstring(input_spec[1]) + result = [vals_A, vals_B] + + return result + + +def encode_matrix_result(results: List[List[str]]) -> List[str]: + # Re-code a set of output results, [*[global-value, *local-values]] as a list of + # strings, like ["GaL-b"] or ["GaLabc", "GbLabc"]. + # N.B. again assuming that all values are just one-character strings, or None. + assert isinstance(results, Iterable) and len(results) >= 1 + if not isinstance(results[0], list): + results = [results] + assert all( + all(val is None or isinstance(val, str) for val in vals) + for vals in results + ) + + # Translate "None" values to "-" + def valrep(val): + return "-" if val is None else val + + results = list( + "".join(["G", valrep(vals[0]), "L"] + list(map(valrep, vals[1:]))) + for vals in results + ) + return results + + +# +# The "expected" matrix test results are stored in JSON files (one for each test-type). +# We can also save the found results. +# +_MATRIX_TESTTYPES = ("load", "save", "roundtrip") + + +@pytest.fixture(autouse=True, scope="session") +def matrix_results(): + matrix_filepaths = { + testtype: ( + Path(__file__).parent / f"attrs_matrix_results_{testtype}.json" + ) + for testtype in _MATRIX_TESTTYPES + } + # An environment variable can trigger saving of the results. + save_matrix_results = bool( + int(os.environ.get("SAVEALL_MATRIX_RESULTS", "0")) + ) + + matrix_results = {} + for testtype in _MATRIX_TESTTYPES: + # Either fetch from file, or initialise, a results matrix for each test type + # (load/save/roundtrip). + input_path = matrix_filepaths[testtype] + if input_path.exists(): + # Load from file with json. + with open(input_path) as file_in: + testtype_results = json.load(file_in) + # Check compatibility (in case we changed the test-specs list) + assert set(testtype_results.keys()) == set(_MATRIX_TESTCASES) + assert all( + testtype_results[key]["input"] == _MATRIX_TESTCASE_INPUTS[key] + for key in _MATRIX_TESTCASES + ) + else: + # Create empty matrix results content (for one test-type) + testtype_results = {} + for testcase in _MATRIX_TESTCASES: + test_case_results = {} + testtype_results[testcase] = test_case_results + # Every testcase dict has an "input" slot with the test input spec, + # basically just to help human readability. + test_case_results["input"] = _MATRIX_TESTCASE_INPUTS[testcase] + for attrstyle in _ATTR_STYLES: + if testtype == "load": + # "load" test results have a "legacy" result (as for a single + # combined attrs dictionary), and a "newstyle" result (with + # the new split dictionary). + test_case_results[attrstyle] = { + "legacy": None, + "newstyle": None, + } + else: + # "save"/"roundtrip"-type results record 2 result sets, + # (unsplit/split) for each attribute-style + # - i.e. when saved without/with split_attrs_saving enabled. + test_case_results[attrstyle] = { + "unsplit": None, + "split": None, + } + + # Build complete data: matrix_results[TESTTYPES][TESTCASES][ATTR_STYLES] + matrix_results[testtype] = testtype_results + + # Pass through to all the tests : they can also update it, if enabled. + yield save_matrix_results, matrix_results + + if save_matrix_results: + for testtype in _MATRIX_TESTTYPES: + output_path = matrix_filepaths[testtype] + results = matrix_results[testtype] + with open(output_path, "w") as file_out: + json.dump(results, file_out, indent=2) + + +class TestRoundtrip(MixinAttrsTesting): + """ + Test handling of attributes in roundtrip netcdf-iris-netcdf. + + This behaviour should be (almost) unchanged by the adoption of + split-attribute handling. + + NOTE: the tested combinations in the 'TestLoad' test all match tests here, but not + *all* of the tests here are useful there. To avoid confusion (!) the ones which are + paralleled in TestLoad there have the identical test-names. However, as the tests + are all numbered that means there are missing numbers there. + The tests are numbered only so it is easier to review the discovered test list + (which is sorted). + + """ + + # Parametrise all tests over split/unsplit saving. + @pytest.fixture( + params=_SPLIT_PARAM_VALUES, ids=_SPLIT_PARAM_IDS, autouse=True + ) + def do_split(self, request): + do_split = request.param + self.save_split_attrs = do_split + return do_split + + def run_roundtrip_testcase(self, attr_name, values): + """ + Initialise the testcase from the passed-in controls, configure the input + files and run a save-load roundtrip to produce the output file. + + The name of the attribute, and the input and output temporary filepaths are + stored on the instance, where "self.check_roundtrip_results()" can get them. + + """ + self.run_testcase( + attr_name=attr_name, values=values, create_cubes_or_files="files" + ) + self.result_filepath = self._testfile_path("result") + + with warnings.catch_warnings(record=True) as captured_warnings: + # Do a load+save to produce a testable output result in a new file. + cubes = iris.load(self.input_filepaths) + # Ensure stable result order. + cubes = sorted(cubes, key=lambda cube: cube.name()) + do_split = getattr(self, "save_split_attrs", False) + kwargs = ( + dict(save_split_attrs=do_split) + if _SPLIT_SAVE_SUPPORTED + else dict() + ) + with iris.FUTURE.context(**kwargs): + iris.save(cubes, self.result_filepath) + + self.captured_warnings = captured_warnings + + def check_roundtrip_results(self, expected, expected_warnings=None): + """ + Run checks on the generated output file. + + The counterpart to :meth:`run_roundtrip_testcase`, with similar arguments. + Check existence (or not) of a global attribute, and a number of local + (variable) attributes. + Values of 'None' mean to check that the relevant global/local attribute does + *not* exist. + + Also check the warnings captured during the testcase run. + """ + # N.B. there is only ever one result-file, but it can contain various variables + # which came from different input files. + results = self.fetch_results(filepath=self.result_filepath) + assert results == expected + check_captured_warnings( + expected_warnings, + self.captured_warnings, + # N.B. only allow a legacy-attributes warning when NOT saving split-attrs + allow_possible_legacy_warning=not self.save_split_attrs, + ) + + ####################################################### + # Tests on "user-style" attributes. + # This means any arbitrary attribute which a user might have added -- i.e. one with + # a name which is *not* recognised in the netCDF or CF conventions. + # + + def test_01_userstyle_single_global(self): + self.run_roundtrip_testcase( + attr_name="myname", values=["single-value", None] + ) + # Default behaviour for a general global user-attribute. + # It simply remains global. + self.check_roundtrip_results(["single-value", None]) + + def test_02_userstyle_single_local(self, do_split): + # Default behaviour for a general local user-attribute. + # It results in a "promoted" global attribute. + self.run_roundtrip_testcase( + attr_name="myname", # A generic "user" attribute with no special handling + values=[None, "single-value"], + ) + if do_split: + expected = [None, "single-value"] + else: + expected = ["single-value", None] + self.check_roundtrip_results(expected) + + def test_03_userstyle_multiple_different(self, do_split): + # Default behaviour for general user-attributes. + # The global attribute is lost because there are local ones. + self.run_roundtrip_testcase( + attr_name="random", # A generic "user" attribute with no special handling + values=[ + ["common_global", "f1v1", "f1v2"], + ["common_global", "x1", "x2"], + ], + ) + expected_result = ["common_global", "f1v1", "f1v2", "x1", "x2"] + if not do_split: + # in legacy mode, global is lost + expected_result[0] = None + # just check they are all there and distinct + self.check_roundtrip_results(expected_result) + + def test_04_userstyle_matching_promoted(self, do_split): + # matching local user-attributes are "promoted" to a global one. + # (but not when saving split attributes) + input_values = ["global_file1", "same-value", "same-value"] + self.run_roundtrip_testcase( + attr_name="random", + values=input_values, + ) + if do_split: + expected = input_values + else: + expected = ["same-value", None, None] + self.check_roundtrip_results(expected) + + def test_05_userstyle_matching_crossfile_promoted(self, do_split): + # matching user-attributes are promoted, even across input files. + # (but not when saving split attributes) + self.run_roundtrip_testcase( + attr_name="random", + values=[ + ["global_file1", "same-value", "same-value"], + [None, "same-value", "same-value"], + ], + ) + if do_split: + # newstyle saves: locals are preserved, mismathced global is *lost* + expected_result = [ + None, + "same-value", + "same-value", + "same-value", + "same-value", + ] + # warnings about the clash + expected_warnings = [ + "Saving.* global attributes.* as local", + 'attributes.* of cube "var_0" were not saved', + 'attributes.* of cube "var_1" were not saved', + ] + else: + # oldstyle saves: matching locals promoted, override original global + expected_result = ["same-value", None, None, None, None] + expected_warnings = None + + self.check_roundtrip_results(expected_result, expected_warnings) + + def test_06_userstyle_nonmatching_remainlocal(self, do_split): + # Non-matching user attributes remain 'local' to the individual variables. + input_values = ["global_file1", "value-1", "value-2"] + if do_split: + # originals are preserved + expected_result = input_values + else: + # global is lost + expected_result = [None, "value-1", "value-2"] + self.run_roundtrip_testcase(attr_name="random", values=input_values) + self.check_roundtrip_results(expected_result) + + ####################################################### + # Tests on "Conventions" attribute. + # Note: the usual 'Conventions' behaviour is already tested elsewhere + # - see :class:`TestConventionsAttributes` above + # + # TODO: the name 'conventions' (lower-case) is also listed in _CF_GLOBAL_ATTRS, but + # we have excluded it from the global-attrs testing here. We probably still need to + # test what that does, though it's inclusion might simply be a mistake. + # + + def test_07_conventions_var_local(self): + # What happens if 'Conventions' appears as a variable-local attribute. + # N.B. this is not good CF, but we'll see what happens anyway. + self.run_roundtrip_testcase( + attr_name="Conventions", + values=[None, "user_set"], + ) + self.check_roundtrip_results(["CF-1.7", None]) + + def test_08_conventions_var_both(self): + # What happens if 'Conventions' appears as both global + local attribute. + self.run_roundtrip_testcase( + attr_name="Conventions", + values=["global-setting", "local-setting"], + ) + # standard content from Iris save + self.check_roundtrip_results(["CF-1.7", None]) + + ####################################################### + # Tests on "global" style attributes + # = those specific ones which 'ought' only to be global (except on collisions) + # + def test_09_globalstyle__global(self, global_attr): + attr_content = f"Global tracked {global_attr}" + self.run_roundtrip_testcase( + attr_name=global_attr, + values=[attr_content, None], + ) + self.check_roundtrip_results([attr_content, None]) + + def test_10_globalstyle__local(self, global_attr, do_split): + # Strictly, not correct CF, but let's see what it does with it. + attr_content = f"Local tracked {global_attr}" + input_values = [None, attr_content] + self.run_roundtrip_testcase( + attr_name=global_attr, + values=input_values, + ) + if do_split: + # remains local as supplied, but there is a warning + expected_result = input_values + expected_warning = f"'{global_attr}'.* should only be a CF global" + else: + # promoted to global + expected_result = [attr_content, None] + expected_warning = None + self.check_roundtrip_results(expected_result, expected_warning) + + def test_11_globalstyle__both(self, global_attr, do_split): + attr_global = f"Global-{global_attr}" + attr_local = f"Local-{global_attr}" + input_values = [attr_global, attr_local] + self.run_roundtrip_testcase( + attr_name=global_attr, + values=input_values, + ) + if do_split: + # remains local as supplied, but there is a warning + expected_result = input_values + expected_warning = "should only be a CF global" + else: + # promoted to global, no local value, original global lost + expected_result = [attr_local, None] + expected_warning = None + self.check_roundtrip_results(expected_result, expected_warning) + + def test_12_globalstyle__multivar_different(self, global_attr): + # Multiple *different* local settings are retained, not promoted + attr_1 = f"Local-{global_attr}-1" + attr_2 = f"Local-{global_attr}-2" + expect_warning = "should only be a CF global attribute" + # A warning should be raised when writing the result. + self.run_roundtrip_testcase( + attr_name=global_attr, + values=[None, attr_1, attr_2], + ) + self.check_roundtrip_results([None, attr_1, attr_2], expect_warning) + + def test_13_globalstyle__multivar_same(self, global_attr, do_split): + # Multiple *same* local settings are promoted to a common global one + attrval = f"Locally-defined-{global_attr}" + input_values = [None, attrval, attrval] + self.run_roundtrip_testcase( + attr_name=global_attr, + values=input_values, + ) + if do_split: + # remains local, but with a warning + expected_warning = "should only be a CF global" + expected_result = input_values + else: + # promoted to global + expected_warning = None + expected_result = [attrval, None, None] + self.check_roundtrip_results(expected_result, expected_warning) + + def test_14_globalstyle__multifile_different(self, global_attr, do_split): + # Different global attributes from multiple files are retained as local ones + attr_1 = f"Global-{global_attr}-1" + attr_2 = f"Global-{global_attr}-2" + self.run_roundtrip_testcase( + attr_name=global_attr, + values=[[attr_1, None], [attr_2, None]], + ) + # A warning should be raised when writing the result. + expected_warnings = ["should only be a CF global attribute"] + if do_split: + # An extra warning, only when saving with split-attributes. + expected_warnings = ["Saving.* as local"] + expected_warnings + self.check_roundtrip_results([None, attr_1, attr_2], expected_warnings) + + def test_15_globalstyle__multifile_same(self, global_attr): + # Matching global-type attributes in multiple files are retained as global + attrval = f"Global-{global_attr}" + self.run_roundtrip_testcase( + attr_name=global_attr, values=[[attrval, None], [attrval, None]] + ) + self.check_roundtrip_results([attrval, None, None]) + + ####################################################### + # Tests on "local" style attributes + # = those specific ones which 'ought' to appear attached to a variable, rather than + # being global + # + + @pytest.mark.parametrize("origin_style", ["input_global", "input_local"]) + def test_16_localstyle(self, local_attr, origin_style, do_split): + # local-style attributes should *not* get 'promoted' to global ones + # Set the name extension to avoid tests with different 'style' params having + # collisions over identical testfile names + self.testname_extension = origin_style + + attrval = f"Attr-setting-{local_attr}" + if local_attr == "missing_value": + # Special-cases : 'missing_value' type must be compatible with the variable + attrval = 303 + elif local_attr == "ukmo__process_flags": + # What this does when a GLOBAL attr seems to be weird + unintended. + # 'this' --> 't h i s' + attrval = "process" + # NOTE: it's also supposed to handle vector values - which we are not + # testing. + + # NOTE: results *should* be the same whether the original attribute is written + # as global or a variable attribute + if origin_style == "input_global": + # Record in source as a global attribute + values = [attrval, None] + else: + assert origin_style == "input_local" + # Record in source as a variable-local attribute + values = [None, attrval] + self.run_roundtrip_testcase(attr_name=local_attr, values=values) + + if ( + local_attr in ("missing_value", "standard_error_multiplier") + and origin_style == "input_local" + ): + # These ones are actually discarded by roundtrip. + # Not clear why, but for now this captures the facts. + expect_global = None + expect_var = None + else: + expect_global = None + if ( + local_attr == "ukmo__process_flags" + and origin_style == "input_global" + and not do_split + ): + # This is very odd behaviour + surely unintended. + # It's supposed to handle vector values (which we are not checking). + # But the weird behaviour only applies to the 'global' test, which is + # obviously not normal usage anyway. + attrval = "p r o c e s s" + expect_var = attrval + + if local_attr == "STASH" and ( + origin_style == "input_local" or not do_split + ): + # A special case, output translates this to a different attribute name. + self.attrname = "um_stash_source" + + expected_result = [expect_global, expect_var] + if do_split and origin_style == "input_global": + # The result is simply the "other way around" + expected_result = expected_result[::-1] + self.check_roundtrip_results(expected_result) + + @pytest.mark.parametrize("testcase", _MATRIX_TESTCASES[:max_param_attrs]) + @pytest.mark.parametrize("attrname", _MATRIX_ATTRNAMES) + def test_roundtrip_matrix( + self, testcase, attrname, matrix_results, do_split + ): + do_saves, matrix_results = matrix_results + split_param = "split" if do_split else "unsplit" + testcase_spec = matrix_results["roundtrip"][testcase] + input_spec = testcase_spec["input"] + values = decode_matrix_input(input_spec) + + self.run_roundtrip_testcase(attrname, values) + results = self.fetch_results(filepath=self.result_filepath) + result_spec = encode_matrix_result(results) + + attr_style = deduce_attr_style(attrname) + expected = testcase_spec[attr_style][split_param] + + if do_saves: + testcase_spec[attr_style][split_param] = result_spec + if expected is not None: + assert result_spec == expected + + +class TestLoad(MixinAttrsTesting): + """ + Test loading of file attributes into Iris cube attribute dictionaries. + + Tests loading of various combinations to cube dictionaries, treated as a + single combined result (i.e. not split). This behaviour should be (almost) + conserved with the adoption of split attributes **except possibly for key + orderings** -- i.e. we test only up to dictionary equality. + + NOTE: the tested combinations are identical to the roundtrip test. Test numbering + is kept the same, so some (which are inapplicable for this) are missing. + + """ + + def run_load_testcase(self, attr_name, values): + self.run_testcase( + attr_name=attr_name, values=values, create_cubes_or_files="files" + ) + + def check_load_results(self, expected, oldstyle_combined=False): + if not _SPLIT_SAVE_SUPPORTED and not oldstyle_combined: + # Don't check "newstyle" in the old world -- just skip it. + return + result_cubes = iris.load(self.input_filepaths) + results = self.fetch_results( + cubes=result_cubes, oldstyle_combined=oldstyle_combined + ) + # Standardise expected form to list(lists). + assert isinstance(expected, list) + if not isinstance(expected[0], list): + expected = [expected] + assert results == expected + + ####################################################### + # Tests on "user-style" attributes. + # This means any arbitrary attribute which a user might have added -- i.e. one with + # a name which is *not* recognised in the netCDF or CF conventions. + # + + def test_01_userstyle_single_global(self): + self.run_load_testcase( + attr_name="myname", values=["single_value", None, None] + ) + # Legacy-equivalent result check (single attributes dict per cube) + self.check_load_results( + [None, "single_value", "single_value"], + oldstyle_combined=True, + ) + # Full new-style results check + self.check_load_results(["single_value", None, None]) + + def test_02_userstyle_single_local(self): + # Default behaviour for a general local user-attribute. + # It is attached to only the specific cube. + self.run_load_testcase( + attr_name="myname", # A generic "user" attribute with no special handling + values=[None, "single-value", None], + ) + self.check_load_results( + [None, "single-value", None], oldstyle_combined=True + ) + self.check_load_results([None, "single-value", None]) + + def test_03_userstyle_multiple_different(self): + # Default behaviour for differing local user-attributes. + # The global attribute is simply lost, because there are local ones. + self.run_load_testcase( + attr_name="random", # A generic "user" attribute with no special handling + values=[ + ["global_file1", "f1v1", "f1v2"], + ["global_file2", "x1", "x2"], + ], + ) + self.check_load_results( + [None, "f1v1", "f1v2", "x1", "x2"], + oldstyle_combined=True, + ) + self.check_load_results( + [["global_file1", "f1v1", "f1v2"], ["global_file2", "x1", "x2"]] + ) + + def test_04_userstyle_multiple_same(self): + # Nothing special to note in this case + # TODO: ??remove?? + self.run_load_testcase( + attr_name="random", + values=["global_file1", "same-value", "same-value"], + ) + self.check_load_results( + oldstyle_combined=True, expected=[None, "same-value", "same-value"] + ) + self.check_load_results(["global_file1", "same-value", "same-value"]) + + ####################################################### + # Tests on "Conventions" attribute. + # Note: the usual 'Conventions' behaviour is already tested elsewhere + # - see :class:`TestConventionsAttributes` above + # + # TODO: the name 'conventions' (lower-case) is also listed in _CF_GLOBAL_ATTRS, but + # we have excluded it from the global-attrs testing here. We probably still need to + # test what that does, though it's inclusion might simply be a mistake. + # + + def test_07_conventions_var_local(self): + # What happens if 'Conventions' appears as a variable-local attribute. + # N.B. this is not good CF, but we'll see what happens anyway. + self.run_load_testcase( + attr_name="Conventions", + values=[None, "user_set"], + ) + # Legacy result + self.check_load_results([None, "user_set"], oldstyle_combined=True) + # Newstyle result + self.check_load_results([None, "user_set"]) + + def test_08_conventions_var_both(self): + # What happens if 'Conventions' appears as both global + local attribute. + self.run_load_testcase( + attr_name="Conventions", + values=["global-setting", "local-setting"], + ) + # (#1): legacy result : the global version gets lost. + self.check_load_results( + [None, "local-setting"], oldstyle_combined=True + ) + # (#2): newstyle results : retain both. + self.check_load_results(["global-setting", "local-setting"]) + + ####################################################### + # Tests on "global" style attributes + # = those specific ones which 'ought' only to be global (except on collisions) + # + + def test_09_globalstyle__global(self, global_attr): + attr_content = f"Global tracked {global_attr}" + self.run_load_testcase( + attr_name=global_attr, values=[attr_content, None] + ) + # (#1) legacy + self.check_load_results([None, attr_content], oldstyle_combined=True) + # (#2) newstyle : global status preserved. + self.check_load_results([attr_content, None]) + + def test_10_globalstyle__local(self, global_attr): + # Strictly, not correct CF, but let's see what it does with it. + attr_content = f"Local tracked {global_attr}" + self.run_load_testcase( + attr_name=global_attr, + values=[None, attr_content], + ) + # (#1): legacy result = treated the same as a global setting + self.check_load_results([None, attr_content], oldstyle_combined=True) + # (#2): newstyle result : remains local + self.check_load_results( + [None, attr_content], + ) + + def test_11_globalstyle__both(self, global_attr): + attr_global = f"Global-{global_attr}" + attr_local = f"Local-{global_attr}" + self.run_load_testcase( + attr_name=global_attr, + values=[attr_global, attr_local], + ) + # (#1) legacy result : promoted local setting "wins" + self.check_load_results([None, attr_local], oldstyle_combined=True) + # (#2) newstyle result : both retained + self.check_load_results([attr_global, attr_local]) + + def test_12_globalstyle__multivar_different(self, global_attr): + # Multiple *different* local settings are retained + attr_1 = f"Local-{global_attr}-1" + attr_2 = f"Local-{global_attr}-2" + self.run_load_testcase( + attr_name=global_attr, + values=[None, attr_1, attr_2], + ) + # (#1): legacy values, for cube.attributes viewed as a single dict + self.check_load_results([None, attr_1, attr_2], oldstyle_combined=True) + # (#2): exact results, with newstyle "split" cube attrs + self.check_load_results([None, attr_1, attr_2]) + + def test_14_globalstyle__multifile_different(self, global_attr): + # Different global attributes from multiple files + attr_1 = f"Global-{global_attr}-1" + attr_2 = f"Global-{global_attr}-2" + self.run_load_testcase( + attr_name=global_attr, + values=[[attr_1, None, None], [attr_2, None, None]], + ) + # (#1) legacy : multiple globals retained as local ones + self.check_load_results( + [None, attr_1, attr_1, attr_2, attr_2], oldstyle_combined=True + ) + # (#1) newstyle : result same as input + self.check_load_results([[attr_1, None, None], [attr_2, None, None]]) + + ####################################################### + # Tests on "local" style attributes + # = those specific ones which 'ought' to appear attached to a variable, rather than + # being global + # + + @pytest.mark.parametrize("origin_style", ["input_global", "input_local"]) + def test_16_localstyle(self, local_attr, origin_style): + # local-style attributes should *not* get 'promoted' to global ones + # Set the name extension to avoid tests with different 'style' params having + # collisions over identical testfile names + self.testname_extension = origin_style + + attrval = f"Attr-setting-{local_attr}" + if local_attr == "missing_value": + # Special-case : 'missing_value' type must be compatible with the variable + attrval = 303 + elif local_attr == "ukmo__process_flags": + # Another special case : the handling of this one is "unusual". + attrval = "process" + + # Create testfiles and load them, which should always produce a single cube. + if origin_style == "input_global": + # Record in source as a global attribute + values = [attrval, None] + else: + assert origin_style == "input_local" + # Record in source as a variable-local attribute + values = [None, attrval] + + self.run_load_testcase(attr_name=local_attr, values=values) + + # Work out the expected result. + result_value = attrval + # ... there are some special cases + if origin_style == "input_local": + if local_attr == "ukmo__process_flags": + # Some odd special behaviour here. + result_value = (result_value,) + elif local_attr in ("standard_error_multiplier", "missing_value"): + # For some reason, these ones never appear on the cube + result_value = None + + # NOTE: **legacy** result is the same, whether the original attribute was + # provided as a global or local attribute ... + expected_result_legacy = [None, result_value] + + # While 'newstyle' results preserve the input type local/global. + if origin_style == "input_local": + expected_result_newstyle = [None, result_value] + else: + expected_result_newstyle = [result_value, None] + + # (#1): legacy values, for cube.attributes viewed as a single dict + self.check_load_results(expected_result_legacy, oldstyle_combined=True) + # (#2): exact results, with newstyle "split" cube attrs + self.check_load_results(expected_result_newstyle) + + @pytest.mark.parametrize("testcase", _MATRIX_TESTCASES[:max_param_attrs]) + @pytest.mark.parametrize("attrname", _MATRIX_ATTRNAMES) + @pytest.mark.parametrize("resultstyle", _MATRIX_LOAD_RESULTSTYLES) + def test_load_matrix( + self, testcase, attrname, matrix_results, resultstyle + ): + do_saves, matrix_results = matrix_results + testcase_spec = matrix_results["load"][testcase] + input_spec = testcase_spec["input"] + values = decode_matrix_input(input_spec) + + self.run_load_testcase(attrname, values) + + result_cubes = iris.load(self.input_filepaths) + do_combined = resultstyle == "legacy" + results = self.fetch_results( + cubes=result_cubes, oldstyle_combined=do_combined + ) + result_spec = encode_matrix_result(results) + + attr_style = deduce_attr_style(attrname) + expected = testcase_spec[attr_style][resultstyle] + + if do_saves: + testcase_spec[attr_style][resultstyle] = result_spec + if expected is not None: + assert result_spec == expected + + +class TestSave(MixinAttrsTesting): + """ + Test saving from cube attributes dictionary (various categories) into files. + + """ + + # Parametrise all tests over split/unsplit saving. + @pytest.fixture( + params=_SPLIT_PARAM_VALUES, ids=_SPLIT_PARAM_IDS, autouse=True + ) + def do_split(self, request): + do_split = request.param + self.save_split_attrs = do_split + return do_split + + def run_save_testcase(self, attr_name: str, values: list): + # Create input cubes. + self.run_testcase( + attr_name=attr_name, + values=values, + create_cubes_or_files="cubes", + ) + + # Save input cubes to a temporary result file. + with warnings.catch_warnings(record=True) as captured_warnings: + self.result_filepath = self._testfile_path("result") + do_split = getattr(self, "save_split_attrs", False) + kwargs = ( + dict(save_split_attrs=do_split) + if _SPLIT_SAVE_SUPPORTED + else dict() + ) + with iris.FUTURE.context(**kwargs): + iris.save(self.input_cubes, self.result_filepath) + + self.captured_warnings = captured_warnings + + def run_save_testcase_legacytype(self, attr_name: str, values: list): + """ + Legacy-type means : before cubes had split attributes. + + This just means we have only one "set" of cubes, with ***no*** distinct global + attribute. + """ + if not isinstance(values, list): + # Translate single input value to list-of-1 + values = [values] + + self.run_save_testcase(attr_name, [None] + values) + + def check_save_results( + self, expected: list, expected_warnings: List[str] = None + ): + results = self.fetch_results(filepath=self.result_filepath) + assert results == expected + check_captured_warnings( + expected_warnings, + self.captured_warnings, + # N.B. only allow a legacy-attributes warning when NOT saving split-attrs + allow_possible_legacy_warning=not self.save_split_attrs, + ) + + def test_userstyle__single(self, do_split): + self.run_save_testcase_legacytype("random", "value-x") + if do_split: + # result as input values + expected_result = [None, "value-x"] + else: + # in legacy mode, promoted = stored as a *global* by default. + expected_result = ["value-x", None] + self.check_save_results(expected_result) + + def test_userstyle__multiple_same(self, do_split): + self.run_save_testcase_legacytype("random", ["value-x", "value-x"]) + if do_split: + # result as input values + expected_result = [None, "value-x", "value-x"] + else: + # in legacy mode, promoted = stored as a *global* by default. + expected_result = ["value-x", None, None] + self.check_save_results(expected_result) + + def test_userstyle__multiple_different(self): + # Clashing values are stored as locals on the individual variables. + self.run_save_testcase_legacytype("random", ["value-A", "value-B"]) + self.check_save_results([None, "value-A", "value-B"]) + + def test_userstyle__multiple_onemissing(self): + # Multiple user-type, with one missing, behave like different values. + self.run_save_testcase_legacytype( + "random", + ["value", None], + ) + # Stored as locals when there are differing values. + self.check_save_results([None, "value", None]) + + def test_Conventions__single(self): + self.run_save_testcase_legacytype("Conventions", "x") + # Always discarded + replaced by a single global setting. + self.check_save_results(["CF-1.7", None]) + + def test_Conventions__multiple_same(self): + self.run_save_testcase_legacytype( + "Conventions", ["same-value", "same-value"] + ) + # Always discarded + replaced by a single global setting. + self.check_save_results(["CF-1.7", None, None]) + + def test_Conventions__multiple_different(self): + self.run_save_testcase_legacytype( + "Conventions", ["value-A", "value-B"] + ) + # Always discarded + replaced by a single global setting. + self.check_save_results(["CF-1.7", None, None]) + + def test_globalstyle__single(self, global_attr, do_split): + self.run_save_testcase_legacytype(global_attr, ["value"]) + if do_split: + # result as input values + expected_warning = "should only be a CF global" + expected_result = [None, "value"] + else: + # in legacy mode, promoted + expected_warning = None + expected_result = ["value", None] + self.check_save_results(expected_result, expected_warning) + + def test_globalstyle__multiple_same(self, global_attr, do_split): + # Multiple global-type with same values are made global. + self.run_save_testcase_legacytype( + global_attr, + ["value-same", "value-same"], + ) + if do_split: + # result as input values + expected_result = [None, "value-same", "value-same"] + expected_warning = "should only be a CF global attribute" + else: + # in legacy mode, promoted + expected_result = ["value-same", None, None] + expected_warning = None + self.check_save_results(expected_result, expected_warning) + + def test_globalstyle__multiple_different(self, global_attr): + # Multiple global-type with different values become local, with warning. + self.run_save_testcase_legacytype(global_attr, ["value-A", "value-B"]) + # *Only* stored as locals when there are differing values. + msg_regexp = ( + f"'{global_attr}' is being added as CF data variable attribute," + f".* should only be a CF global attribute." + ) + self.check_save_results( + [None, "value-A", "value-B"], expected_warnings=msg_regexp + ) + + def test_globalstyle__multiple_onemissing(self, global_attr): + # Multiple global-type, with one missing, behave like different values. + self.run_save_testcase_legacytype( + global_attr, ["value", "value", None] + ) + # Stored as locals when there are differing values. + msg_regexp = ( + f"'{global_attr}' is being added as CF data variable attribute," + f".* should only be a CF global attribute." + ) + self.check_save_results( + [None, "value", "value", None], expected_warnings=msg_regexp + ) + + def test_localstyle__single(self, local_attr): + self.run_save_testcase_legacytype(local_attr, ["value"]) + + # Defaults to local + expected_results = [None, "value"] + # .. but a couple of special cases + if local_attr == "ukmo__process_flags": + # A particular, really weird case + expected_results = [None, "v a l u e"] + elif local_attr == "STASH": + # A special case : the stored name is different + self.attrname = "um_stash_source" + + self.check_save_results(expected_results) + + def test_localstyle__multiple_same(self, local_attr): + self.run_save_testcase_legacytype( + local_attr, ["value-same", "value-same"] + ) + + # They remain separate + local + expected_results = [None, "value-same", "value-same"] + if local_attr == "ukmo__process_flags": + # A particular, really weird case + expected_results = [ + None, + "v a l u e - s a m e", + "v a l u e - s a m e", + ] + elif local_attr == "STASH": + # A special case : the stored name is different + self.attrname = "um_stash_source" + + self.check_save_results(expected_results) + + def test_localstyle__multiple_different(self, local_attr): + self.run_save_testcase_legacytype(local_attr, ["value-A", "value-B"]) + # Different values are treated just the same as matching ones. + expected_results = [None, "value-A", "value-B"] + if local_attr == "ukmo__process_flags": + # A particular, really weird case + expected_results = [ + None, + "v a l u e - A", + "v a l u e - B", + ] + elif local_attr == "STASH": + # A special case : the stored name is different + self.attrname = "um_stash_source" + self.check_save_results(expected_results) + + # + # Test handling of newstyle independent global+local cube attributes. + # + def test_globallocal_clashing(self, do_split): + # A cube has clashing local + global attrs. + original_values = ["valueA", "valueB"] + self.run_save_testcase("userattr", original_values) + expected_result = original_values.copy() + if not do_split: + # in legacy mode, "promote" = lose the local one + expected_result[0] = expected_result[1] + expected_result[1] = None + self.check_save_results(expected_result) + + def test_globallocal_oneeach_same(self, do_split): + # One cube with global attr, another with identical local one. + self.run_save_testcase( + "userattr", values=[[None, "value"], ["value", None]] + ) + if do_split: + expected = [None, "value", "value"] + expected_warning = ( + r"Saving the cube global attributes \['userattr'\] as local" + ) + else: + # N.B. legacy code sees only two equal values (and promotes). + expected = ["value", None, None] + expected_warning = None + + self.check_save_results(expected, expected_warning) + + def test_globallocal_oneeach_different(self, do_split): + # One cube with global attr, another with a *different* local one. + self.run_save_testcase( + "userattr", [[None, "valueA"], ["valueB", None]] + ) + if do_split: + warning = ( + r"Saving the cube global attributes \['userattr'\] as local" + ) + else: + # N.B. legacy code does not warn of global-to-local "demotion". + warning = None + self.check_save_results([None, "valueA", "valueB"], warning) + + def test_globallocal_one_other_clashingglobals(self, do_split): + # Two cubes with both, second cube has a clashing global attribute. + self.run_save_testcase( + "userattr", + values=[["valueA", "valueB"], ["valueXXX", "valueB"]], + ) + if do_split: + expected = [None, "valueB", "valueB"] + expected_warnings = [ + "Saving.* global attributes.* as local", + 'attributes.* of cube "v1" were not saved', + 'attributes.* of cube "v2" were not saved', + ] + else: + # N.B. legacy code sees only the locals, and promotes them. + expected = ["valueB", None, None] + expected_warnings = None + self.check_save_results(expected, expected_warnings) + + def test_globallocal_one_other_clashinglocals(self, do_split): + # Two cubes with both, second cube has a clashing local attribute. + inputs = [["valueA", "valueB"], ["valueA", "valueXXX"]] + if do_split: + expected = ["valueA", "valueB", "valueXXX"] + else: + # N.B. legacy code sees only the locals. + expected = [None, "valueB", "valueXXX"] + self.run_save_testcase("userattr", values=inputs) + self.check_save_results(expected) + + @pytest.mark.parametrize("testcase", _MATRIX_TESTCASES[:max_param_attrs]) + @pytest.mark.parametrize("attrname", _MATRIX_ATTRNAMES) + def test_save_matrix(self, testcase, attrname, matrix_results, do_split): + do_saves, matrix_results = matrix_results + split_param = "split" if do_split else "unsplit" + testcase_spec = matrix_results["save"][testcase] + input_spec = testcase_spec["input"] + values = decode_matrix_input(input_spec) + + self.run_save_testcase(attrname, values) + results = self.fetch_results(filepath=self.result_filepath) + result_spec = encode_matrix_result(results) + + attr_style = deduce_attr_style(attrname) + expected = testcase_spec[attr_style][split_param] + + if do_saves: + testcase_spec[attr_style][split_param] = result_spec + if expected is not None: + assert result_spec == expected diff --git a/lib/iris/tests/test_merge.py b/lib/iris/tests/test_merge.py index 054fd3a20b..7c11fde55d 100644 --- a/lib/iris/tests/test_merge.py +++ b/lib/iris/tests/test_merge.py @@ -21,6 +21,7 @@ from iris._lazy_data import as_lazy_data from iris.coords import AuxCoord, DimCoord import iris.cube +from iris.cube import CubeAttrsDict import iris.exceptions import iris.tests.stock @@ -1107,5 +1108,86 @@ def test_ancillary_variable_error_msg(self): _ = iris.cube.CubeList([cube1, cube2]).merge_cube() +class TestCubeMerge__split_attributes__error_messages(tests.IrisTest): + """ + Specific tests for the detection and wording of attribute-mismatch errors. + + In particular, the adoption of 'split' attributes with the new + :class:`iris.cube.CubeAttrsDict` introduces some more subtle possible discrepancies + in attributes, where this has also impacted the messaging, so this aims to probe + those cases. + """ + + def _check_merge_error(self, attrs_1, attrs_2, expected_message): + """ + Check the error from a merge failure caused by a mismatch of attributes. + + Build a pair of cubes with given attributes, merge them + check for a match + to the expected error message. + """ + cube_1 = iris.cube.Cube( + [0], + aux_coords_and_dims=[(AuxCoord([1], long_name="x"), None)], + attributes=attrs_1, + ) + cube_2 = iris.cube.Cube( + [0], + aux_coords_and_dims=[(AuxCoord([2], long_name="x"), None)], + attributes=attrs_2, + ) + with self.assertRaisesRegex( + iris.exceptions.MergeError, expected_message + ): + iris.cube.CubeList([cube_1, cube_2]).merge_cube() + + def test_keys_differ__single(self): + self._check_merge_error( + attrs_1=dict(a=1, b=2), + attrs_2=dict(a=1), + # Note: matching key 'a' does *not* appear in the message + expected_message="cube.attributes keys differ: 'b'", + ) + + def test_keys_differ__multiple(self): + self._check_merge_error( + attrs_1=dict(a=1, b=2), + attrs_2=dict(a=1, c=2), + expected_message="cube.attributes keys differ: 'b', 'c'", + ) + + def test_values_differ__single(self): + self._check_merge_error( + attrs_1=dict(a=1, b=2), # Note: matching key 'a' does not appear + attrs_2=dict(a=1, b=3), + expected_message="cube.attributes values differ for keys: 'b'", + ) + + def test_values_differ__multiple(self): + self._check_merge_error( + attrs_1=dict(a=1, b=2), + attrs_2=dict(a=12, b=22), + expected_message="cube.attributes values differ for keys: 'a', 'b'", + ) + + def test_splitattrs_keys_local_global_mismatch(self): + # Since Cube.attributes is now a "split-attributes" dictionary, it is now + # possible to have "cube1.attributes != cube1.attributes", but also + # "set(cube1.attributes.keys()) == set(cube2.attributes.keys())". + # I.E. it is now necessary to specifically compare ".globals" and ".locals" to + # see *what* differs between two attributes dictionaries. + self._check_merge_error( + attrs_1=CubeAttrsDict(globals=dict(a=1), locals=dict(b=2)), + attrs_2=CubeAttrsDict(locals=dict(a=2)), + expected_message="cube.attributes keys differ: 'a', 'b'", + ) + + def test_splitattrs_keys_local_match_masks_global_mismatch(self): + self._check_merge_error( + attrs_1=CubeAttrsDict(globals=dict(a=1), locals=dict(a=3)), + attrs_2=CubeAttrsDict(globals=dict(a=2), locals=dict(a=3)), + expected_message="cube.attributes values differ for keys: 'a'", + ) + + if __name__ == "__main__": tests.main() diff --git a/lib/iris/tests/unit/common/metadata/test_CubeMetadata.py b/lib/iris/tests/unit/common/metadata/test_CubeMetadata.py index 382607dca5..4425ba62d7 100644 --- a/lib/iris/tests/unit/common/metadata/test_CubeMetadata.py +++ b/lib/iris/tests/unit/common/metadata/test_CubeMetadata.py @@ -15,8 +15,11 @@ import unittest.mock as mock from unittest.mock import sentinel +import pytest + from iris.common.lenient import _LENIENT, _qualname from iris.common.metadata import BaseMetadata, CubeMetadata +from iris.cube import CubeAttrsDict def _make_metadata( @@ -90,9 +93,360 @@ def test_bases(self): self.assertTrue(issubclass(self.cls, BaseMetadata)) -class Test___eq__(tests.IrisTest): - def setUp(self): - self.values = dict( +@pytest.fixture(params=CubeMetadata._fields) +def fieldname(request): + """Parametrize testing over all CubeMetadata field names.""" + return request.param + + +@pytest.fixture(params=["strict", "lenient"]) +def op_leniency(request): + """Parametrize testing over strict or lenient operation.""" + return request.param + + +@pytest.fixture(params=["primaryAA", "primaryAX", "primaryAB"]) +def primary_values(request): + """ + Parametrize over the possible non-trivial pairs of operation values. + + The parameters all provide two attribute values which are the left- and right-hand + arguments to the tested operation. The attribute values are single characters from + the end of the parameter name -- except that "X" denotes a "missing" attribute. + + The possible cases are: + + * one side has a value and the other is missing + * left and right have the same non-missing value + * left and right have different non-missing values + """ + return request.param + + +@pytest.fixture(params=[False, True], ids=["primaryLocal", "primaryGlobal"]) +def primary_is_global_not_local(request): + """Parametrize split-attribute testing over "global" or "local" attribute types.""" + return request.param + + +@pytest.fixture(params=[False, True], ids=["leftrightL2R", "leftrightR2L"]) +def order_reversed(request): + """Parametrize split-attribute testing over "left OP right" or "right OP left".""" + return request.param + + +# Define the expected results for split-attribute testing. +# This dictionary records the expected results for the various possible arrangements of +# values of a single attribute in the "left" and "right" inputs of a CubeMetadata +# operation. +# The possible operations are "equal", "combine" or "difference", and may all be +# performed "strict" or "lenient". +# N.B. the *same* results should also apply when left+right are swapped, with a suitable +# adjustment to the result value. Likewise, results should be the same for either +# global- or local-style attributes. +_ALL_RESULTS = { + "equal": { + "primaryAA": {"lenient": True, "strict": True}, + "primaryAX": {"lenient": True, "strict": False}, + "primaryAB": {"lenient": False, "strict": False}, + }, + "combine": { + "primaryAA": {"lenient": "A", "strict": "A"}, + "primaryAX": {"lenient": "A", "strict": None}, + "primaryAB": {"lenient": None, "strict": None}, + }, + "difference": { + "primaryAA": {"lenient": None, "strict": None}, + "primaryAX": {"lenient": None, "strict": ("A", None)}, + "primaryAB": {"lenient": ("A", "B"), "strict": ("A", "B")}, + }, +} +# A fixed attribute name used for all the split-attribute testing. +_TEST_ATTRNAME = "_test_attr_" + + +def extract_attribute_value(split_dict, extract_global): + """ + Extract a test-attribute value from a split-attribute dictionary. + + Parameters + ---------- + split_dict : CubeAttrsDict + a split dictionary from an operation result + extract_global : bool + whether to extract values of the global, or local, `_TEST_ATTRNAME` attribute + + Returns + ------- + str | None + """ + if extract_global: + result = split_dict.globals.get(_TEST_ATTRNAME, None) + else: + result = split_dict.locals.get(_TEST_ATTRNAME, None) + return result + + +def extract_result_value(input, extract_global): + """ + Extract the values(s) of the main test attribute from an operation result. + + Parameters + ---------- + input : bool | CubeMetadata + an operation result : the structure varies for the three different operations. + extract_global : bool + whether to return values of a global, or local, `_TEST_ATTRNAME` attribute. + + Returns + ------- + None | bool | str | tuple[None | str] + result value(s) + """ + if not isinstance(input, CubeMetadata): + # Result is either boolean (for "equals") or a None (for "difference"). + result = input + else: + # Result is a CubeMetadata. Get the value(s) of the required attribute. + result = input.attributes + + if isinstance(result, CubeAttrsDict): + result = extract_attribute_value(result, extract_global) + else: + # For "difference", input.attributes is a *pair* of dictionaries. + assert isinstance(result, tuple) + result = tuple( + [ + extract_attribute_value(dic, extract_global) + for dic in result + ] + ) + if result == (None, None): + # This value occurs when the desired attribute is *missing* from a + # difference result, but other (secondary) attributes were *different*. + # We want only differences of the *target* attribute, so convert these + # to a plain 'no difference', for expected-result testing purposes. + result = None + + return result + + +def make_attrsdict(value): + """ + Return a dictionary containing a test attribute with the given value. + + If the value is "X", the attribute is absent (result is empty dict). + """ + if value == "X": + # Translate an "X" input as "missing". + result = {} + else: + result = {_TEST_ATTRNAME: value} + return result + + +def check_splitattrs_testcase( + operation_name: str, + check_is_lenient: bool, + primary_inputs: str = "AA", # two character values + secondary_inputs: str = "XX", # two character values + check_global_not_local: bool = True, + check_reversed: bool = False, +): + """ + Test a metadata operation with split-attributes against known expected results. + + Parameters + ---------- + operation_name : str + One of "equal", "combine" or "difference. + check_is_lenient : bool + Whether the tested operation is performed 'lenient' or 'strict'. + primary_inputs : str + A pair of characters defining left + right attribute values for the operands of + the operation. + secondary_inputs : str + A further pair of values for an attribute of the same name but "other" type + ( i.e. global/local when the main test is local/global ). + check_global_not_local : bool + If `True` then the primary operands, and the tested result values, are *global* + attributes, and the secondary ones are local. + Otherwise, the other way around. + check_reversed : bool + If True, the left and right operands are exchanged, and the expected value + modified according. + + Notes + ----- + The expected result of an operation is mostly defined by : the operation applied; + the main "primary" inputs; and the lenient/strict mode. + + In the case of the "equals" operation, however, the expected result is simply + set to `False` if the secondary inputs do not match. + + Calling with different values for the keywords aims to show that the main operation + has the expected value, from _ALL_RESULTS, the ***same in essentially all cases*** + ( though modified in specific ways for some factors ). + + This regularity also demonstrates the required independence over the other + test-factors, i.e. global/local attribute type, and right-left order. + """ + # Just for comfort, check that inputs are all one of a few single characters. + assert all( + (item in list("ABCDX")) for item in (primary_inputs + secondary_inputs) + ) + # Interpret "primary" and "secondary" inputs as "global" and "local" attributes. + if check_global_not_local: + global_values, local_values = primary_inputs, secondary_inputs + else: + local_values, global_values = primary_inputs, secondary_inputs + + # Form 2 inputs to the operation : Make left+right split-attribute input + # dictionaries, with both the primary and secondary attribute value settings. + input_dicts = [ + CubeAttrsDict( + globals=make_attrsdict(global_value), + locals=make_attrsdict(local_value), + ) + for global_value, local_value in zip(global_values, local_values) + ] + # Make left+right CubeMetadata with those attributes, other fields all blank. + input_l, input_r = [ + CubeMetadata( + **{ + field: attrs if field == "attributes" else None + for field in CubeMetadata._fields + } + ) + for attrs in input_dicts + ] + + if check_reversed: + # Swap the inputs to perform a 'reversed' calculation. + input_l, input_r = input_r, input_l + + # Run the actual operation + result = getattr(input_l, operation_name)( + input_r, lenient=check_is_lenient + ) + + if operation_name == "difference" and check_reversed: + # Adjust the result of a "reversed" operation to the 'normal' way round. + # ( N.B. only "difference" results are affected by reversal. ) + if isinstance(result, CubeMetadata): + result = result._replace(attributes=result.attributes[::-1]) + + # Extract, from the operation result, the value to be tested against "expected". + result = extract_result_value(result, check_global_not_local) + + # Get the *expected* result for this operation. + which = "lenient" if check_is_lenient else "strict" + primary_key = "primary" + primary_inputs + expected = _ALL_RESULTS[operation_name][primary_key][which] + if operation_name == "equal" and expected: + # Account for the equality cases made `False` by mismatched secondary values. + left, right = secondary_inputs + secondaries_same = left == right or ( + check_is_lenient and "X" in (left, right) + ) + if not secondaries_same: + expected = False + + # Check that actual extracted operation result matches the "expected" one. + assert result == expected + + +class MixinSplitattrsMatrixTests: + """ + Define split-attributes tests to perform on all the metadata operations. + + This is inherited by the testclass for each operation : + i.e. Test___eq__, Test_combine and Test_difference + """ + + # Define the operation name : set in each inheritor + operation_name = None + + def test_splitattrs_cases( + self, + op_leniency, + primary_values, + primary_is_global_not_local, + order_reversed, + ): + """ + Check the basic operation against the expected result from _ALL_RESULTS. + + Parametrisation checks this for all combinations of various factors : + + * possible arrangements of the primary values + * strict and lenient + * global- and local-type attributes + * left-to-right or right-to-left operation order. + """ + primary_inputs = primary_values[-2:] + check_is_lenient = {"strict": False, "lenient": True}[op_leniency] + check_splitattrs_testcase( + operation_name=self.operation_name, + check_is_lenient=check_is_lenient, + primary_inputs=primary_inputs, + secondary_inputs="XX", + check_global_not_local=primary_is_global_not_local, + check_reversed=order_reversed, + ) + + @pytest.mark.parametrize( + "secondary_values", + [ + "secondaryXX", + "secondaryCX", + "secondaryXC", + "secondaryCC", + "secondaryCD", + ] + # NOTE: test CX as well as XC, since primary choices has "AX" but not "XA". + ) + def test_splitattrs_global_local_independence( + self, + op_leniency, + primary_values, + secondary_values, + ): + """ + Check that results are (mostly) independent of the "other" type attributes. + + The operation on attributes of the 'primary' type (global/local) should be + basically unaffected by those of the 'secondary' type (--> local/global). + + This is not really true for equality, so we adjust those results to compensate. + See :func:`check_splitattrs_testcase` for explanations. + + Notes + ----- + We provide this *separate* test for global/local attribute independence, + parametrized over selected relevant arrangements of the 'secondary' values. + We *don't* test with reversed order or "local" primary inputs, because matrix + testing over *all* relevant factors produces too many possible combinations. + """ + primary_inputs = primary_values[-2:] + secondary_inputs = secondary_values[-2:] + check_is_lenient = {"strict": False, "lenient": True}[op_leniency] + check_splitattrs_testcase( + operation_name=self.operation_name, + check_is_lenient=check_is_lenient, + primary_inputs=primary_inputs, + secondary_inputs=secondary_inputs, + check_global_not_local=True, + check_reversed=False, + ) + + +class Test___eq__(MixinSplitattrsMatrixTests): + operation_name = "equal" + + @pytest.fixture(autouse=True) + def setup(self): + self.lvalues = dict( standard_name=sentinel.standard_name, long_name=sentinel.long_name, var_name=sentinel.var_name, @@ -101,17 +455,19 @@ def setUp(self): attributes=dict(), cell_methods=sentinel.cell_methods, ) + # Setup another values tuple with all-distinct content objects. + self.rvalues = deepcopy(self.lvalues) self.dummy = sentinel.dummy self.cls = CubeMetadata def test_wraps_docstring(self): - self.assertEqual(BaseMetadata.__eq__.__doc__, self.cls.__eq__.__doc__) + assert self.cls.__eq__.__doc__ == BaseMetadata.__eq__.__doc__ def test_lenient_service(self): qualname___eq__ = _qualname(self.cls.__eq__) - self.assertIn(qualname___eq__, _LENIENT) - self.assertTrue(_LENIENT[qualname___eq__]) - self.assertTrue(_LENIENT[self.cls.__eq__]) + assert qualname___eq__ in _LENIENT + assert _LENIENT[qualname___eq__] + assert _LENIENT[self.cls.__eq__] def test_call(self): other = sentinel.other @@ -122,107 +478,114 @@ def test_call(self): ) as mocker: result = metadata.__eq__(other) - self.assertEqual(return_value, result) - self.assertEqual(1, mocker.call_count) - (arg,), kwargs = mocker.call_args - self.assertEqual(other, arg) - self.assertEqual(dict(), kwargs) - - def test_op_lenient_same(self): - lmetadata = self.cls(**self.values) - rmetadata = self.cls(**self.values) - - with mock.patch("iris.common.metadata._LENIENT", return_value=True): - self.assertTrue(lmetadata.__eq__(rmetadata)) - self.assertTrue(rmetadata.__eq__(lmetadata)) - - def test_op_lenient_same_none(self): - lmetadata = self.cls(**self.values) - right = self.values.copy() - right["var_name"] = None - rmetadata = self.cls(**right) - - with mock.patch("iris.common.metadata._LENIENT", return_value=True): - self.assertTrue(lmetadata.__eq__(rmetadata)) - self.assertTrue(rmetadata.__eq__(lmetadata)) - - def test_op_lenient_same_cell_methods_none(self): - lmetadata = self.cls(**self.values) - right = self.values.copy() - right["cell_methods"] = None - rmetadata = self.cls(**right) - - with mock.patch("iris.common.metadata._LENIENT", return_value=True): - self.assertFalse(lmetadata.__eq__(rmetadata)) - self.assertFalse(rmetadata.__eq__(lmetadata)) - - def test_op_lenient_different(self): - lmetadata = self.cls(**self.values) - right = self.values.copy() - right["units"] = self.dummy - rmetadata = self.cls(**right) - - with mock.patch("iris.common.metadata._LENIENT", return_value=True): - self.assertFalse(lmetadata.__eq__(rmetadata)) - self.assertFalse(rmetadata.__eq__(lmetadata)) - - def test_op_lenient_different_cell_methods(self): - lmetadata = self.cls(**self.values) - right = self.values.copy() - right["cell_methods"] = self.dummy - rmetadata = self.cls(**right) - - with mock.patch("iris.common.metadata._LENIENT", return_value=True): - self.assertFalse(lmetadata.__eq__(rmetadata)) - self.assertFalse(rmetadata.__eq__(lmetadata)) - - def test_op_strict_same(self): - lmetadata = self.cls(**self.values) - rmetadata = self.cls(**self.values) - - with mock.patch("iris.common.metadata._LENIENT", return_value=False): - self.assertTrue(lmetadata.__eq__(rmetadata)) - self.assertTrue(rmetadata.__eq__(lmetadata)) - - def test_op_strict_different(self): - lmetadata = self.cls(**self.values) - right = self.values.copy() - right["long_name"] = self.dummy - rmetadata = self.cls(**right) - - with mock.patch("iris.common.metadata._LENIENT", return_value=False): - self.assertFalse(lmetadata.__eq__(rmetadata)) - self.assertFalse(rmetadata.__eq__(lmetadata)) - - def test_op_strict_different_cell_methods(self): - lmetadata = self.cls(**self.values) - right = self.values.copy() - right["cell_methods"] = self.dummy - rmetadata = self.cls(**right) - - with mock.patch("iris.common.metadata._LENIENT", return_value=False): - self.assertFalse(lmetadata.__eq__(rmetadata)) - self.assertFalse(rmetadata.__eq__(lmetadata)) - - def test_op_strict_different_none(self): - lmetadata = self.cls(**self.values) - right = self.values.copy() - right["long_name"] = None - rmetadata = self.cls(**right) - - with mock.patch("iris.common.metadata._LENIENT", return_value=False): - self.assertFalse(lmetadata.__eq__(rmetadata)) - self.assertFalse(rmetadata.__eq__(lmetadata)) - - def test_op_strict_different_measure_none(self): - lmetadata = self.cls(**self.values) - right = self.values.copy() - right["cell_methods"] = None - rmetadata = self.cls(**right) - - with mock.patch("iris.common.metadata._LENIENT", return_value=False): - self.assertFalse(lmetadata.__eq__(rmetadata)) - self.assertFalse(rmetadata.__eq__(lmetadata)) + assert return_value == result + assert mocker.call_args_list == [mock.call(other)] + + def test_op_same(self, op_leniency): + # Check op all-same content, but all-new data. + # NOTE: test for both strict/lenient, should both work the same. + is_lenient = op_leniency == "lenient" + lmetadata = self.cls(**self.lvalues) + rmetadata = self.cls(**self.rvalues) + + with mock.patch( + "iris.common.metadata._LENIENT", return_value=is_lenient + ): + # Check equality both l==r and r==l. + assert lmetadata.__eq__(rmetadata) + assert rmetadata.__eq__(lmetadata) + + def test_op_different__none(self, fieldname, op_leniency): + # One side has field=value, and the other field=None, both strict + lenient. + if fieldname == "attributes": + # Must be a dict, cannot be None. + pytest.skip() + else: + is_lenient = op_leniency == "lenient" + lmetadata = self.cls(**self.lvalues) + self.rvalues.update({fieldname: None}) + rmetadata = self.cls(**self.rvalues) + if fieldname in ("cell_methods", "standard_name", "units"): + # These ones are compared strictly + expect_success = False + elif fieldname in ("var_name", "long_name"): + # For other 'normal' fields : lenient succeeds, strict does not. + expect_success = is_lenient + else: + # Ensure we are handling all the different field cases + raise ValueError( + f"{self.__name__} unhandled fieldname : {fieldname}" + ) + + with mock.patch( + "iris.common.metadata._LENIENT", return_value=is_lenient + ): + # Check equality both l==r and r==l. + assert lmetadata.__eq__(rmetadata) == expect_success + assert rmetadata.__eq__(lmetadata) == expect_success + + def test_op_different__value(self, fieldname, op_leniency): + # Compare when a given field value is changed, both strict + lenient. + if fieldname == "attributes": + # Dicts have more possibilities: handled separately. + pytest.skip() + else: + is_lenient = op_leniency == "lenient" + lmetadata = self.cls(**self.lvalues) + self.rvalues.update({fieldname: self.dummy}) + rmetadata = self.cls(**self.rvalues) + if fieldname in ( + "cell_methods", + "standard_name", + "units", + "long_name", + ): + # These ones are compared strictly + expect_success = False + elif fieldname == "var_name": + # For other 'normal' fields : lenient succeeds, strict does not. + expect_success = is_lenient + else: + # Ensure we are handling all the different field cases + raise ValueError( + f"{self.__name__} unhandled fieldname : {fieldname}" + ) + + with mock.patch( + "iris.common.metadata._LENIENT", return_value=is_lenient + ): + # Check equality both l==r and r==l. + assert lmetadata.__eq__(rmetadata) == expect_success + assert rmetadata.__eq__(lmetadata) == expect_success + + def test_op_different__attribute_extra(self, op_leniency): + # Check when one set of attributes has an extra entry. + is_lenient = op_leniency == "lenient" + lmetadata = self.cls(**self.lvalues) + self.rvalues["attributes"]["_extra_"] = 1 + rmetadata = self.cls(**self.rvalues) + # This counts as equal *only* in the lenient case. + expect_success = is_lenient + with mock.patch( + "iris.common.metadata._LENIENT", return_value=is_lenient + ): + # Check equality both l==r and r==l. + assert lmetadata.__eq__(rmetadata) == expect_success + assert rmetadata.__eq__(lmetadata) == expect_success + + def test_op_different__attribute_value(self, op_leniency): + # lhs and rhs have different values for an attribute, both strict + lenient. + is_lenient = op_leniency == "lenient" + self.lvalues["attributes"]["_extra_"] = mock.sentinel.value1 + self.rvalues["attributes"]["_extra_"] = mock.sentinel.value2 + lmetadata = self.cls(**self.lvalues) + rmetadata = self.cls(**self.rvalues) + with mock.patch( + "iris.common.metadata._LENIENT", return_value=is_lenient + ): + # This should ALWAYS fail. + assert not lmetadata.__eq__(rmetadata) + assert not rmetadata.__eq__(lmetadata) class Test___lt__(tests.IrisTest): @@ -256,9 +619,12 @@ def test__ignore_attributes_cell_methods(self): self.assertFalse(result) -class Test_combine(tests.IrisTest): - def setUp(self): - self.values = dict( +class Test_combine(MixinSplitattrsMatrixTests): + operation_name = "combine" + + @pytest.fixture(autouse=True) + def setup(self): + self.lvalues = dict( standard_name=sentinel.standard_name, long_name=sentinel.long_name, var_name=sentinel.var_name, @@ -266,20 +632,20 @@ def setUp(self): attributes=sentinel.attributes, cell_methods=sentinel.cell_methods, ) + # Get a second copy with all-new objects. + self.rvalues = deepcopy(self.lvalues) self.dummy = sentinel.dummy self.cls = CubeMetadata self.none = self.cls(*(None,) * len(self.cls._fields)) def test_wraps_docstring(self): - self.assertEqual( - BaseMetadata.combine.__doc__, self.cls.combine.__doc__ - ) + assert self.cls.combine.__doc__ == BaseMetadata.combine.__doc__ def test_lenient_service(self): qualname_combine = _qualname(self.cls.combine) - self.assertIn(qualname_combine, _LENIENT) - self.assertTrue(_LENIENT[qualname_combine]) - self.assertTrue(_LENIENT[self.cls.combine]) + assert qualname_combine in _LENIENT + assert _LENIENT[qualname_combine] + assert _LENIENT[self.cls.combine] def test_lenient_default(self): other = sentinel.other @@ -289,11 +655,8 @@ def test_lenient_default(self): ) as mocker: result = self.none.combine(other) - self.assertEqual(return_value, result) - self.assertEqual(1, mocker.call_count) - (arg,), kwargs = mocker.call_args - self.assertEqual(other, arg) - self.assertEqual(dict(lenient=None), kwargs) + assert return_value == result + assert mocker.call_args_list == [mock.call(other, lenient=None)] def test_lenient(self): other = sentinel.other @@ -304,149 +667,165 @@ def test_lenient(self): ) as mocker: result = self.none.combine(other, lenient=lenient) - self.assertEqual(return_value, result) - self.assertEqual(1, mocker.call_count) - (arg,), kwargs = mocker.call_args - self.assertEqual(other, arg) - self.assertEqual(dict(lenient=lenient), kwargs) + assert return_value == result + assert mocker.call_args_list == [mock.call(other, lenient=lenient)] + + def test_op_same(self, op_leniency): + # Result is same as either input, both strict + lenient. + is_lenient = op_leniency == "lenient" + lmetadata = self.cls(**self.lvalues) + rmetadata = self.cls(**self.rvalues) + expected = self.lvalues + + with mock.patch( + "iris.common.metadata._LENIENT", return_value=is_lenient + ): + # Check both l+r and r+l + assert lmetadata.combine(rmetadata)._asdict() == expected + assert rmetadata.combine(lmetadata)._asdict() == expected + + def test_op_different__none(self, fieldname, op_leniency): + # One side has field=value, and the other field=None, both strict + lenient. + if fieldname == "attributes": + # Can't be None : Tested separately + pytest.skip() + + is_lenient = op_leniency == "lenient" + + lmetadata = self.cls(**self.lvalues) + # Cancel one setting in the rhs argument. + self.rvalues[fieldname] = None + rmetadata = self.cls(**self.rvalues) + + if fieldname in ("cell_methods", "units"): + # NB cell-methods and units *always* strict behaviour. + # strict form : take only those which both have set + strict_result = True + elif fieldname in ("standard_name", "long_name", "var_name"): + strict_result = not is_lenient + else: + # Ensure we are handling all the different field cases + raise ValueError( + f"{self.__name__} unhandled fieldname : {fieldname}" + ) - def test_op_lenient_same(self): - lmetadata = self.cls(**self.values) - rmetadata = self.cls(**self.values) - expected = self.values - - with mock.patch("iris.common.metadata._LENIENT", return_value=True): - self.assertEqual(expected, lmetadata.combine(rmetadata)._asdict()) - self.assertEqual(expected, rmetadata.combine(lmetadata)._asdict()) - - def test_op_lenient_same_none(self): - lmetadata = self.cls(**self.values) - right = self.values.copy() - right["var_name"] = None - rmetadata = self.cls(**right) - expected = self.values - - with mock.patch("iris.common.metadata._LENIENT", return_value=True): - self.assertEqual(expected, lmetadata.combine(rmetadata)._asdict()) - self.assertEqual(expected, rmetadata.combine(lmetadata)._asdict()) - - def test_op_lenient_same_cell_methods_none(self): - lmetadata = self.cls(**self.values) - right = self.values.copy() - right["cell_methods"] = None - rmetadata = self.cls(**right) - expected = right.copy() - - with mock.patch("iris.common.metadata._LENIENT", return_value=True): - self.assertEqual(expected, lmetadata.combine(rmetadata)._asdict()) - self.assertEqual(expected, rmetadata.combine(lmetadata)._asdict()) - - def test_op_lenient_different(self): - lmetadata = self.cls(**self.values) - right = self.values.copy() - right["units"] = self.dummy - rmetadata = self.cls(**right) - expected = self.values.copy() - expected["units"] = None - - with mock.patch("iris.common.metadata._LENIENT", return_value=True): - self.assertEqual(expected, lmetadata.combine(rmetadata)._asdict()) - self.assertEqual(expected, rmetadata.combine(lmetadata)._asdict()) - - def test_op_lenient_different_cell_methods(self): - lmetadata = self.cls(**self.values) - right = self.values.copy() - right["cell_methods"] = self.dummy - rmetadata = self.cls(**right) - expected = self.values.copy() - expected["cell_methods"] = None - - with mock.patch("iris.common.metadata._LENIENT", return_value=True): - self.assertEqual(expected, lmetadata.combine(rmetadata)._asdict()) - self.assertEqual(expected, rmetadata.combine(lmetadata)._asdict()) - - def test_op_strict_same(self): - lmetadata = self.cls(**self.values) - rmetadata = self.cls(**self.values) - expected = self.values.copy() - - with mock.patch("iris.common.metadata._LENIENT", return_value=False): - self.assertEqual(expected, lmetadata.combine(rmetadata)._asdict()) - self.assertEqual(expected, rmetadata.combine(lmetadata)._asdict()) - - def test_op_strict_different(self): - lmetadata = self.cls(**self.values) - right = self.values.copy() - right["long_name"] = self.dummy - rmetadata = self.cls(**right) - expected = self.values.copy() - expected["long_name"] = None - - with mock.patch("iris.common.metadata._LENIENT", return_value=False): - self.assertEqual(expected, lmetadata.combine(rmetadata)._asdict()) - self.assertEqual(expected, rmetadata.combine(lmetadata)._asdict()) - - def test_op_strict_different_cell_methods(self): - lmetadata = self.cls(**self.values) - right = self.values.copy() - right["cell_methods"] = self.dummy - rmetadata = self.cls(**right) - expected = self.values.copy() - expected["cell_methods"] = None - - with mock.patch("iris.common.metadata._LENIENT", return_value=False): - self.assertEqual(expected, lmetadata.combine(rmetadata)._asdict()) - self.assertEqual(expected, rmetadata.combine(lmetadata)._asdict()) - - def test_op_strict_different_none(self): - lmetadata = self.cls(**self.values) - right = self.values.copy() - right["long_name"] = None - rmetadata = self.cls(**right) - expected = self.values.copy() - expected["long_name"] = None - - with mock.patch("iris.common.metadata._LENIENT", return_value=False): - self.assertEqual(expected, lmetadata.combine(rmetadata)._asdict()) - self.assertEqual(expected, rmetadata.combine(lmetadata)._asdict()) - - def test_op_strict_different_cell_methods_none(self): - lmetadata = self.cls(**self.values) - right = self.values.copy() - right["cell_methods"] = None - rmetadata = self.cls(**right) - expected = self.values.copy() - expected["cell_methods"] = None - - with mock.patch("iris.common.metadata._LENIENT", return_value=False): - self.assertEqual(expected, lmetadata.combine(rmetadata)._asdict()) - self.assertEqual(expected, rmetadata.combine(lmetadata)._asdict()) - - -class Test_difference(tests.IrisTest): - def setUp(self): - self.values = dict( + if strict_result: + # include only those which both have + expected = self.rvalues + else: + # also include those which only 1 has + expected = self.lvalues + + with mock.patch( + "iris.common.metadata._LENIENT", return_value=is_lenient + ): + # Check both l+r and r+l + assert lmetadata.combine(rmetadata)._asdict() == expected + assert rmetadata.combine(lmetadata)._asdict() == expected + + def test_op_different__value(self, fieldname, op_leniency): + # One field has different value for lhs/rhs, both strict + lenient. + if fieldname == "attributes": + # Attribute behaviours are tested separately + pytest.skip() + + is_lenient = op_leniency == "lenient" + + self.lvalues[fieldname] = mock.sentinel.value1 + self.rvalues[fieldname] = mock.sentinel.value2 + lmetadata = self.cls(**self.lvalues) + rmetadata = self.cls(**self.rvalues) + + # In all cases, this field should be None in the result : leniency has no effect + expected = self.lvalues.copy() + expected[fieldname] = None + + with mock.patch( + "iris.common.metadata._LENIENT", return_value=is_lenient + ): + # Check both l+r and r+l + assert lmetadata.combine(rmetadata)._asdict() == expected + assert rmetadata.combine(lmetadata)._asdict() == expected + + def test_op_different__attribute_extra(self, op_leniency): + # One field has an extra attribute, both strict + lenient. + is_lenient = op_leniency == "lenient" + + self.lvalues["attributes"] = {"_a_common_": mock.sentinel.dummy} + self.rvalues["attributes"] = self.lvalues["attributes"].copy() + self.rvalues["attributes"]["_extra_"] = mock.sentinel.testvalue + lmetadata = self.cls(**self.lvalues) + rmetadata = self.cls(**self.rvalues) + + if is_lenient: + # the extra attribute should appear in the result .. + expected = self.rvalues + else: + # .. it should not + expected = self.lvalues + + with mock.patch( + "iris.common.metadata._LENIENT", return_value=is_lenient + ): + # Check both l+r and r+l + assert lmetadata.combine(rmetadata)._asdict() == expected + assert rmetadata.combine(lmetadata)._asdict() == expected + + def test_op_different__attribute_value(self, op_leniency): + # lhs and rhs have different values for an attribute, both strict + lenient. + is_lenient = op_leniency == "lenient" + + self.lvalues["attributes"] = { + "_a_common_": self.dummy, + "_b_common_": mock.sentinel.value1, + } + self.lvalues["attributes"] = { + "_a_common_": self.dummy, + "_b_common_": mock.sentinel.value2, + } + lmetadata = self.cls(**self.lvalues) + rmetadata = self.cls(**self.rvalues) + + # Result has entirely EMPTY attributes (whether strict or lenient). + # TODO: is this maybe a mistake of the existing implementation ? + expected = self.lvalues.copy() + expected["attributes"] = None + + with mock.patch( + "iris.common.metadata._LENIENT", return_value=is_lenient + ): + # Check both l+r and r+l + assert lmetadata.combine(rmetadata)._asdict() == expected + assert rmetadata.combine(lmetadata)._asdict() == expected + + +class Test_difference(MixinSplitattrsMatrixTests): + operation_name = "difference" + + @pytest.fixture(autouse=True) + def setup(self): + self.lvalues = dict( standard_name=sentinel.standard_name, long_name=sentinel.long_name, var_name=sentinel.var_name, units=sentinel.units, - attributes=sentinel.attributes, + attributes=dict(), # MUST be a dict cell_methods=sentinel.cell_methods, ) + # Make a copy with all-different objects in it. + self.rvalues = deepcopy(self.lvalues) self.dummy = sentinel.dummy self.cls = CubeMetadata self.none = self.cls(*(None,) * len(self.cls._fields)) def test_wraps_docstring(self): - self.assertEqual( - BaseMetadata.difference.__doc__, self.cls.difference.__doc__ - ) + assert self.cls.difference.__doc__ == BaseMetadata.difference.__doc__ def test_lenient_service(self): qualname_difference = _qualname(self.cls.difference) - self.assertIn(qualname_difference, _LENIENT) - self.assertTrue(_LENIENT[qualname_difference]) - self.assertTrue(_LENIENT[self.cls.difference]) + assert qualname_difference in _LENIENT + assert _LENIENT[qualname_difference] + assert _LENIENT[self.cls.difference] def test_lenient_default(self): other = sentinel.other @@ -456,11 +835,8 @@ def test_lenient_default(self): ) as mocker: result = self.none.difference(other) - self.assertEqual(return_value, result) - self.assertEqual(1, mocker.call_count) - (arg,), kwargs = mocker.call_args - self.assertEqual(other, arg) - self.assertEqual(dict(lenient=None), kwargs) + assert return_value == result + assert mocker.call_args_list == [mock.call(other, lenient=None)] def test_lenient(self): other = sentinel.other @@ -471,178 +847,149 @@ def test_lenient(self): ) as mocker: result = self.none.difference(other, lenient=lenient) - self.assertEqual(return_value, result) - self.assertEqual(1, mocker.call_count) - (arg,), kwargs = mocker.call_args - self.assertEqual(other, arg) - self.assertEqual(dict(lenient=lenient), kwargs) - - def test_op_lenient_same(self): - lmetadata = self.cls(**self.values) - rmetadata = self.cls(**self.values) - - with mock.patch("iris.common.metadata._LENIENT", return_value=True): - self.assertIsNone(lmetadata.difference(rmetadata)) - self.assertIsNone(rmetadata.difference(lmetadata)) - - def test_op_lenient_same_none(self): - lmetadata = self.cls(**self.values) - right = self.values.copy() - right["var_name"] = None - rmetadata = self.cls(**right) - - with mock.patch("iris.common.metadata._LENIENT", return_value=True): - self.assertIsNone(lmetadata.difference(rmetadata)) - self.assertIsNone(rmetadata.difference(lmetadata)) - - def test_op_lenient_same_cell_methods_none(self): - lmetadata = self.cls(**self.values) - right = self.values.copy() - right["cell_methods"] = None - rmetadata = self.cls(**right) - lexpected = deepcopy(self.none)._asdict() - lexpected["cell_methods"] = (sentinel.cell_methods, None) - rexpected = deepcopy(self.none)._asdict() - rexpected["cell_methods"] = (None, sentinel.cell_methods) - - with mock.patch("iris.common.metadata._LENIENT", return_value=True): - self.assertEqual( - lexpected, lmetadata.difference(rmetadata)._asdict() - ) - self.assertEqual( - rexpected, rmetadata.difference(lmetadata)._asdict() - ) - - def test_op_lenient_different(self): - left = self.values.copy() - lmetadata = self.cls(**left) - right = self.values.copy() - right["units"] = self.dummy - rmetadata = self.cls(**right) - lexpected = deepcopy(self.none)._asdict() - lexpected["units"] = (left["units"], right["units"]) - rexpected = deepcopy(self.none)._asdict() - rexpected["units"] = lexpected["units"][::-1] - - with mock.patch("iris.common.metadata._LENIENT", return_value=True): - self.assertEqual( - lexpected, lmetadata.difference(rmetadata)._asdict() - ) - self.assertEqual( - rexpected, rmetadata.difference(lmetadata)._asdict() - ) - - def test_op_lenient_different_cell_methods(self): - left = self.values.copy() - lmetadata = self.cls(**left) - right = self.values.copy() - right["cell_methods"] = self.dummy - rmetadata = self.cls(**right) - lexpected = deepcopy(self.none)._asdict() - lexpected["cell_methods"] = ( - left["cell_methods"], - right["cell_methods"], - ) - rexpected = deepcopy(self.none)._asdict() - rexpected["cell_methods"] = lexpected["cell_methods"][::-1] - - with mock.patch("iris.common.metadata._LENIENT", return_value=True): - self.assertEqual( - lexpected, lmetadata.difference(rmetadata)._asdict() - ) - self.assertEqual( - rexpected, rmetadata.difference(lmetadata)._asdict() - ) - - def test_op_strict_same(self): - lmetadata = self.cls(**self.values) - rmetadata = self.cls(**self.values) - - with mock.patch("iris.common.metadata._LENIENT", return_value=False): - self.assertIsNone(lmetadata.difference(rmetadata)) - self.assertIsNone(rmetadata.difference(lmetadata)) - - def test_op_strict_different(self): - left = self.values.copy() - lmetadata = self.cls(**left) - right = self.values.copy() - right["long_name"] = self.dummy - rmetadata = self.cls(**right) - lexpected = deepcopy(self.none)._asdict() - lexpected["long_name"] = (left["long_name"], right["long_name"]) - rexpected = deepcopy(self.none)._asdict() - rexpected["long_name"] = lexpected["long_name"][::-1] - - with mock.patch("iris.common.metadata._LENIENT", return_value=False): - self.assertEqual( - lexpected, lmetadata.difference(rmetadata)._asdict() - ) - self.assertEqual( - rexpected, rmetadata.difference(lmetadata)._asdict() - ) - - def test_op_strict_different_cell_methods(self): - left = self.values.copy() - lmetadata = self.cls(**left) - right = self.values.copy() - right["cell_methods"] = self.dummy - rmetadata = self.cls(**right) - lexpected = deepcopy(self.none)._asdict() - lexpected["cell_methods"] = ( - left["cell_methods"], - right["cell_methods"], - ) - rexpected = deepcopy(self.none)._asdict() - rexpected["cell_methods"] = lexpected["cell_methods"][::-1] - - with mock.patch("iris.common.metadata._LENIENT", return_value=False): - self.assertEqual( - lexpected, lmetadata.difference(rmetadata)._asdict() - ) - self.assertEqual( - rexpected, rmetadata.difference(lmetadata)._asdict() + assert return_value == result + assert mocker.call_args_list == [mock.call(other, lenient=lenient)] + + def test_op_same(self, op_leniency): + is_lenient = op_leniency == "lenient" + lmetadata = self.cls(**self.lvalues) + rmetadata = self.cls(**self.rvalues) + + with mock.patch( + "iris.common.metadata._LENIENT", return_value=is_lenient + ): + assert lmetadata.difference(rmetadata) is None + assert rmetadata.difference(lmetadata) is None + + def test_op_different__none(self, fieldname, op_leniency): + # One side has field=value, and the other field=None, both strict + lenient. + if fieldname in ("attributes",): + # These cannot properly be set to 'None'. Tested elsewhere. + pytest.skip() + + is_lenient = op_leniency == "lenient" + + lmetadata = self.cls(**self.lvalues) + self.rvalues[fieldname] = None + rmetadata = self.cls(**self.rvalues) + + if fieldname in ("units", "cell_methods"): + # These ones are always "strict" + strict_result = True + elif fieldname in ("standard_name", "long_name", "var_name"): + strict_result = not is_lenient + else: + # Ensure we are handling all the different field cases + raise ValueError( + f"{self.__name__} unhandled fieldname : {fieldname}" ) - def test_op_strict_different_none(self): - left = self.values.copy() - lmetadata = self.cls(**left) - right = self.values.copy() - right["long_name"] = None - rmetadata = self.cls(**right) - lexpected = deepcopy(self.none)._asdict() - lexpected["long_name"] = (left["long_name"], right["long_name"]) - rexpected = deepcopy(self.none)._asdict() - rexpected["long_name"] = lexpected["long_name"][::-1] - - with mock.patch("iris.common.metadata._LENIENT", return_value=False): - self.assertEqual( - lexpected, lmetadata.difference(rmetadata)._asdict() + if strict_result: + diffentry = tuple( + [getattr(mm, fieldname) for mm in (lmetadata, rmetadata)] ) - self.assertEqual( - rexpected, rmetadata.difference(lmetadata)._asdict() - ) - - def test_op_strict_different_measure_none(self): - left = self.values.copy() - lmetadata = self.cls(**left) - right = self.values.copy() - right["cell_methods"] = None - rmetadata = self.cls(**right) - lexpected = deepcopy(self.none)._asdict() - lexpected["cell_methods"] = ( - left["cell_methods"], - right["cell_methods"], + # NOTE: in these cases, the difference metadata will fail an == operation, + # because of the 'None' entries. + # But we can use metadata._asdict() and test that. + lexpected = self.none._asdict() + lexpected[fieldname] = diffentry + rexpected = lexpected.copy() + rexpected[fieldname] = diffentry[::-1] + + with mock.patch( + "iris.common.metadata._LENIENT", return_value=is_lenient + ): + if strict_result: + assert lmetadata.difference(rmetadata)._asdict() == lexpected + assert rmetadata.difference(lmetadata)._asdict() == rexpected + else: + # Expect NO differences + assert lmetadata.difference(rmetadata) is None + assert rmetadata.difference(lmetadata) is None + + def test_op_different__value(self, fieldname, op_leniency): + # One field has different value for lhs/rhs, both strict + lenient. + if fieldname == "attributes": + # Attribute behaviours are tested separately + pytest.skip() + + self.lvalues[fieldname] = mock.sentinel.value1 + self.rvalues[fieldname] = mock.sentinel.value2 + lmetadata = self.cls(**self.lvalues) + rmetadata = self.cls(**self.rvalues) + + # In all cases, this field should show a difference : leniency has no effect + ldiff_values = (mock.sentinel.value1, mock.sentinel.value2) + ldiff_metadata = self.none._asdict() + ldiff_metadata[fieldname] = ldiff_values + rdiff_metadata = self.none._asdict() + rdiff_metadata[fieldname] = ldiff_values[::-1] + + # Check both l+r and r+l + assert lmetadata.difference(rmetadata)._asdict() == ldiff_metadata + assert rmetadata.difference(lmetadata)._asdict() == rdiff_metadata + + def test_op_different__attribute_extra(self, op_leniency): + # One field has an extra attribute, both strict + lenient. + is_lenient = op_leniency == "lenient" + self.lvalues["attributes"] = {"_a_common_": self.dummy} + lmetadata = self.cls(**self.lvalues) + rvalues = deepcopy(self.lvalues) + rvalues["attributes"]["_b_extra_"] = mock.sentinel.extra + rmetadata = self.cls(**rvalues) + + if not is_lenient: + # In this case, attributes returns a "difference dictionary" + diffentry = tuple([{}, {"_b_extra_": mock.sentinel.extra}]) + lexpected = self.none._asdict() + lexpected["attributes"] = diffentry + rexpected = lexpected.copy() + rexpected["attributes"] = diffentry[::-1] + + with mock.patch( + "iris.common.metadata._LENIENT", return_value=is_lenient + ): + if is_lenient: + # It recognises no difference + assert lmetadata.difference(rmetadata) is None + assert rmetadata.difference(lmetadata) is None + else: + # As calculated above + assert lmetadata.difference(rmetadata)._asdict() == lexpected + assert rmetadata.difference(lmetadata)._asdict() == rexpected + + def test_op_different__attribute_value(self, op_leniency): + # lhs and rhs have different values for an attribute, both strict + lenient. + is_lenient = op_leniency == "lenient" + self.lvalues["attributes"] = { + "_a_common_": self.dummy, + "_b_extra_": mock.sentinel.value1, + } + lmetadata = self.cls(**self.lvalues) + self.rvalues["attributes"] = { + "_a_common_": self.dummy, + "_b_extra_": mock.sentinel.value2, + } + rmetadata = self.cls(**self.rvalues) + + # In this case, attributes returns a "difference dictionary" + diffentry = tuple( + [ + {"_b_extra_": mock.sentinel.value1}, + {"_b_extra_": mock.sentinel.value2}, + ] ) - rexpected = deepcopy(self.none)._asdict() - rexpected["cell_methods"] = lexpected["cell_methods"][::-1] - - with mock.patch("iris.common.metadata._LENIENT", return_value=False): - self.assertEqual( - lexpected, lmetadata.difference(rmetadata)._asdict() - ) - self.assertEqual( - rexpected, rmetadata.difference(lmetadata)._asdict() - ) + lexpected = self.none._asdict() + lexpected["attributes"] = diffentry + rexpected = lexpected.copy() + rexpected["attributes"] = diffentry[::-1] + + with mock.patch( + "iris.common.metadata._LENIENT", return_value=is_lenient + ): + # As calculated above -- same for both strict + lenient + assert lmetadata.difference(rmetadata)._asdict() == lexpected + assert rmetadata.difference(lmetadata)._asdict() == rexpected class Test_equal(tests.IrisTest): diff --git a/lib/iris/tests/unit/common/mixin/test_LimitedAttributeDict.py b/lib/iris/tests/unit/common/mixin/test_LimitedAttributeDict.py index 7416bb9da5..d29a120f35 100644 --- a/lib/iris/tests/unit/common/mixin/test_LimitedAttributeDict.py +++ b/lib/iris/tests/unit/common/mixin/test_LimitedAttributeDict.py @@ -20,7 +20,7 @@ class Test(tests.IrisTest): def setUp(self): - self.forbidden_keys = LimitedAttributeDict._forbidden_keys + self.forbidden_keys = LimitedAttributeDict.CF_ATTRS_FORBIDDEN self.emsg = "{!r} is not a permitted attribute" def test__invalid_keys(self): diff --git a/lib/iris/tests/unit/cube/test_Cube.py b/lib/iris/tests/unit/cube/test_Cube.py index b1eed4743e..5e513c2bd0 100644 --- a/lib/iris/tests/unit/cube/test_Cube.py +++ b/lib/iris/tests/unit/cube/test_Cube.py @@ -33,7 +33,7 @@ CellMethod, DimCoord, ) -from iris.cube import Cube +from iris.cube import Cube, CubeAttrsDict import iris.exceptions from iris.exceptions import ( AncillaryVariableNotFoundError, @@ -3436,5 +3436,31 @@ def test_fail_assign_duckcellmethod(self): self.cube.cell_methods = (test_object,) +class TestAttributesProperty: + def test_attrs_type(self): + # Cube attributes are always of a special dictionary type. + cube = Cube([0], attributes={"a": 1}) + assert type(cube.attributes) is CubeAttrsDict + assert cube.attributes == {"a": 1} + + def test_attrs_remove(self): + # Wiping attributes replaces the stored object + cube = Cube([0], attributes={"a": 1}) + attrs = cube.attributes + cube.attributes = None + assert cube.attributes is not attrs + assert type(cube.attributes) is CubeAttrsDict + assert cube.attributes == {} + + def test_attrs_clear(self): + # Clearing attributes leaves the same object + cube = Cube([0], attributes={"a": 1}) + attrs = cube.attributes + cube.attributes.clear() + assert cube.attributes is attrs + assert type(cube.attributes) is CubeAttrsDict + assert cube.attributes == {} + + if __name__ == "__main__": tests.main() diff --git a/lib/iris/tests/unit/cube/test_CubeAttrsDict.py b/lib/iris/tests/unit/cube/test_CubeAttrsDict.py new file mode 100644 index 0000000000..615de7b8e6 --- /dev/null +++ b/lib/iris/tests/unit/cube/test_CubeAttrsDict.py @@ -0,0 +1,407 @@ +# Copyright Iris contributors +# +# This file is part of Iris and is released under the BSD license. +# See LICENSE in the root of the repository for full licensing details. +"""Unit tests for the `iris.cube.CubeAttrsDict` class.""" + +import pickle + +import numpy as np +import pytest + +from iris.common.mixin import LimitedAttributeDict +from iris.cube import CubeAttrsDict +from iris.fileformats.netcdf.saver import _CF_GLOBAL_ATTRS + + +@pytest.fixture +def sample_attrs() -> CubeAttrsDict: + return CubeAttrsDict( + locals={"a": 1, "z": "this"}, globals={"b": 2, "z": "that"} + ) + + +def check_content(attrs, locals=None, globals=None, matches=None): + """ + Check a CubeAttrsDict for expected properties. + + Its ".globals" and ".locals" must match 'locals' and 'globals' args + -- except that, if 'matches' is provided, it is a CubeAttrsDict, whose + locals/globals *replace* the 'locals'/'globals' arguments. + + Check that the result is a CubeAttrsDict and, for both local + global parts, + * parts match for *equality* (==) but are *non-identical* (is not) + * order of keys matches expected (N.B. which is *not* required for equality) + """ + assert isinstance(attrs, CubeAttrsDict) + attr_locals, attr_globals = attrs.locals, attrs.globals + assert type(attr_locals) is LimitedAttributeDict + assert type(attr_globals) is LimitedAttributeDict + if matches: + locals, globals = matches.locals, matches.globals + + def check(arg, content): + if not arg: + arg = {} + if not isinstance(arg, LimitedAttributeDict): + arg = LimitedAttributeDict(arg) + # N.B. if 'arg' is an actual given LimitedAttributeDict, it is not changed.. + # .. we proceed to ensure that the stored content is equal but NOT the same + assert content == arg + assert content is not arg + assert list(content.keys()) == list(arg.keys()) + + check(locals, attr_locals) + check(globals, attr_globals) + + +class Test___init__: + def test_empty(self): + attrs = CubeAttrsDict() + check_content(attrs, None, None) + + def test_from_combined_dict(self): + attrs = CubeAttrsDict({"q": 3, "history": "something"}) + check_content(attrs, locals={"q": 3}, globals={"history": "something"}) + + def test_from_separate_dicts(self): + locals = {"q": 3} + globals = {"history": "something"} + attrs = CubeAttrsDict(locals=locals, globals=globals) + check_content(attrs, locals=locals, globals=globals) + + def test_from_cubeattrsdict(self, sample_attrs): + result = CubeAttrsDict(sample_attrs) + check_content(result, matches=sample_attrs) + + def test_from_cubeattrsdict_like(self): + class MyDict: + pass + + mydict = MyDict() + locals, globals = {"a": 1}, {"b": 2} + mydict.locals = locals + mydict.globals = globals + attrs = CubeAttrsDict(mydict) + check_content(attrs, locals=locals, globals=globals) + + +class Test_OddMethods: + def test_pickle(self, sample_attrs): + bytes = pickle.dumps(sample_attrs) + result = pickle.loads(bytes) + check_content(result, matches=sample_attrs) + + def test_clear(self, sample_attrs): + sample_attrs.clear() + check_content(sample_attrs, {}, {}) + + def test_del(self, sample_attrs): + # 'z' is in both locals+globals. Delete removes both. + assert "z" in sample_attrs.keys() + del sample_attrs["z"] + assert "z" not in sample_attrs.keys() + + def test_copy(self, sample_attrs): + copy = sample_attrs.copy() + assert copy is not sample_attrs + check_content(copy, matches=sample_attrs) + + @pytest.fixture(params=["regular_arg", "split_arg"]) + def update_testcase(self, request): + lhs = CubeAttrsDict(globals={"a": 1, "b": 2}, locals={"b": 3, "c": 4}) + if request.param == "split_arg": + # A set of "update settings", with global/local-specific keys. + rhs = CubeAttrsDict( + globals={"a": 1001, "x": 1007}, + # NOTE: use a global-default key here, to check that type is preserved + locals={"b": 1003, "history": 1099}, + ) + expected_result = CubeAttrsDict( + globals={"a": 1001, "b": 2, "x": 1007}, + locals={"b": 1003, "c": 4, "history": 1099}, + ) + else: + assert request.param == "regular_arg" + # A similar set of update values in a regular dict (so not local/global) + rhs = {"a": 1001, "x": 1007, "b": 1003, "history": 1099} + expected_result = CubeAttrsDict( + globals={"a": 1001, "b": 2, "history": 1099}, + locals={"b": 1003, "c": 4, "x": 1007}, + ) + return lhs, rhs, expected_result + + def test_update(self, update_testcase): + testval, updater, expected = update_testcase + testval.update(updater) + check_content(testval, matches=expected) + + def test___or__(self, update_testcase): + testval, updater, expected = update_testcase + original = testval.copy() + result = testval | updater + assert result is not testval + assert testval == original + check_content(result, matches=expected) + + def test___ior__(self, update_testcase): + testval, updater, expected = update_testcase + testval |= updater + check_content(testval, matches=expected) + + def test___ror__(self): + # Check the "or" operation, when lhs is a regular dictionary + lhs = {"a": 1, "b": 2, "history": 3} + rhs = CubeAttrsDict( + globals={"a": 1001, "x": 1007}, + # NOTE: use a global-default key here, to check that type is preserved + locals={"b": 1003, "history": 1099}, + ) + # The lhs should be promoted to a CubeAttrsDict, and then combined. + expected = CubeAttrsDict( + globals={"history": 3, "a": 1001, "x": 1007}, + locals={"a": 1, "b": 1003, "history": 1099}, + ) + result = lhs | rhs + check_content(result, matches=expected) + + @pytest.mark.parametrize("value", [1, None]) + @pytest.mark.parametrize("inputtype", ["regular_arg", "split_arg"]) + def test__fromkeys(self, value, inputtype): + if inputtype == "regular_arg": + # Check when input is a plain iterable of key-names + keys = ["a", "b", "history"] + # Result has keys assigned local/global via default mechanism. + expected = CubeAttrsDict( + globals={"history": value}, + locals={"a": value, "b": value}, + ) + else: + assert inputtype == "split_arg" + # Check when input is a CubeAttrsDict + keys = CubeAttrsDict( + globals={"a": 1}, locals={"b": 2, "history": 3} + ) + # The result preserves the input keys' local/global identity + # N.B. "history" would be global by default (cf. "regular_arg" case) + expected = CubeAttrsDict( + globals={"a": value}, + locals={"b": value, "history": value}, + ) + result = CubeAttrsDict.fromkeys(keys, value) + check_content(result, matches=expected) + + def test_to_dict(self, sample_attrs): + result = dict(sample_attrs) + expected = sample_attrs.globals.copy() + expected.update(sample_attrs.locals) + assert result == expected + + def test_array_copies(self): + array = np.array([3, 2, 1, 4]) + map = {"array": array} + attrs = CubeAttrsDict(map) + check_content(attrs, globals=None, locals=map) + attrs_array = attrs["array"] + assert np.all(attrs_array == array) + assert attrs_array is not array + + def test__str__(self, sample_attrs): + result = str(sample_attrs) + assert result == "{'b': 2, 'z': 'this', 'a': 1}" + + def test__repr__(self, sample_attrs): + result = repr(sample_attrs) + expected = ( + "CubeAttrsDict(" + "globals={'b': 2, 'z': 'that'}, " + "locals={'a': 1, 'z': 'this'})" + ) + assert result == expected + + +class TestEq: + def test_eq_empty(self): + attrs_1 = CubeAttrsDict() + attrs_2 = CubeAttrsDict() + assert attrs_1 == attrs_2 + + def test_eq_nonempty(self, sample_attrs): + attrs_1 = sample_attrs + attrs_2 = sample_attrs.copy() + assert attrs_1 == attrs_2 + + @pytest.mark.parametrize("aspect", ["locals", "globals"]) + def test_ne_missing(self, sample_attrs, aspect): + attrs_1 = sample_attrs + attrs_2 = sample_attrs.copy() + del getattr(attrs_2, aspect)["z"] + assert attrs_1 != attrs_2 + assert attrs_2 != attrs_1 + + @pytest.mark.parametrize("aspect", ["locals", "globals"]) + def test_ne_different(self, sample_attrs, aspect): + attrs_1 = sample_attrs + attrs_2 = sample_attrs.copy() + getattr(attrs_2, aspect)["z"] = 99 + assert attrs_1 != attrs_2 + assert attrs_2 != attrs_1 + + def test_ne_locals_vs_globals(self): + attrs_1 = CubeAttrsDict(locals={"a": 1}) + attrs_2 = CubeAttrsDict(globals={"a": 1}) + assert attrs_1 != attrs_2 + assert attrs_2 != attrs_1 + + def test_eq_dict(self): + # A CubeAttrsDict can be equal to a plain dictionary (which would create it) + vals_dict = {"a": 1, "b": 2, "history": "this"} + attrs = CubeAttrsDict(vals_dict) + assert attrs == vals_dict + assert vals_dict == attrs + + def test_ne_dict_local_global(self): + # Dictionary equivalence fails if the local/global assignments are wrong. + # sample dictionary + vals_dict = {"title": "b"} + # these attrs are *not* the same, because 'title' is global by default + attrs = CubeAttrsDict(locals={"title": "b"}) + assert attrs != vals_dict + assert vals_dict != attrs + + def test_empty_not_none(self): + # An empty CubeAttrsDict is not None, and does not compare to 'None' + # N.B. this for compatibility with the LimitedAttributeDict + attrs = CubeAttrsDict() + assert attrs is not None + with pytest.raises(TypeError, match="iterable"): + # Cannot *compare* to None (or anything non-iterable) + # N.B. not actually testing against None, as it upsets black (!) + attrs == 0 + + def test_empty_eq_iterables(self): + # An empty CubeAttrsDict is "equal" to various empty containers + attrs = CubeAttrsDict() + assert attrs == {} + assert attrs == [] + assert attrs == () + + +class TestDictOrderBehaviour: + def test_ordering(self): + attrs = CubeAttrsDict({"a": 1, "b": 2}) + assert list(attrs.keys()) == ["a", "b"] + # Remove, then reinstate 'a' : it will go to the back + del attrs["a"] + attrs["a"] = 1 + assert list(attrs.keys()) == ["b", "a"] + + def test_globals_locals_ordering(self): + # create attrs with a global attribute set *before* a local one .. + attrs = CubeAttrsDict() + attrs.globals.update(dict(a=1, m=3)) + attrs.locals.update(dict(f=7, z=4)) + # .. and check key order of combined attrs + assert list(attrs.keys()) == ["a", "m", "f", "z"] + + def test_locals_globals_nonalphabetic_order(self): + # create the "same" thing with locals before globals, *and* different key order + attrs = CubeAttrsDict() + attrs.locals.update(dict(z=4, f=7)) + attrs.globals.update(dict(m=3, a=1)) + # .. this shows that the result is not affected either by alphabetical key + # order, or the order of adding locals/globals + # I.E. result is globals-in-create-order, then locals-in-create-order + assert list(attrs.keys()) == ["m", "a", "z", "f"] + + +class TestSettingBehaviours: + def test_add_localtype(self): + attrs = CubeAttrsDict() + # Any attribute not recognised as global should go into 'locals' + attrs["z"] = 3 + check_content(attrs, locals={"z": 3}) + + @pytest.mark.parametrize("attrname", _CF_GLOBAL_ATTRS) + def test_add_globaltype(self, attrname): + # These specific attributes are recognised as belonging in 'globals' + attrs = CubeAttrsDict() + attrs[attrname] = "this" + check_content(attrs, globals={attrname: "this"}) + + def test_overwrite_local(self): + attrs = CubeAttrsDict({"a": 1}) + attrs["a"] = 2 + check_content(attrs, locals={"a": 2}) + + @pytest.mark.parametrize("attrname", _CF_GLOBAL_ATTRS) + def test_overwrite_global(self, attrname): + attrs = CubeAttrsDict({attrname: 1}) + attrs[attrname] = 2 + check_content(attrs, globals={attrname: 2}) + + @pytest.mark.parametrize("global_attrname", _CF_GLOBAL_ATTRS) + def test_overwrite_forced_local(self, global_attrname): + attrs = CubeAttrsDict(locals={global_attrname: 1}) + # The attr *remains* local, even though it would be created global by default + attrs[global_attrname] = 2 + check_content(attrs, locals={global_attrname: 2}) + + def test_overwrite_forced_global(self): + attrs = CubeAttrsDict(globals={"data": 1}) + # The attr remains global, even though it would be created local by default + attrs["data"] = 2 + check_content(attrs, globals={"data": 2}) + + def test_overwrite_both(self): + attrs = CubeAttrsDict(locals={"z": 1}, globals={"z": 1}) + # Where both exist, it will always update the local one + attrs["z"] = 2 + check_content(attrs, locals={"z": 2}, globals={"z": 1}) + + def test_local_global_masking(self, sample_attrs): + # initially, local 'z' masks the global one + assert sample_attrs["z"] == sample_attrs.locals["z"] + # remove local, global will show + del sample_attrs.locals["z"] + assert sample_attrs["z"] == sample_attrs.globals["z"] + # re-set local + sample_attrs.locals["z"] = "new" + assert sample_attrs["z"] == "new" + # change the global, makes no difference + sample_attrs.globals["z"] == "other" + assert sample_attrs["z"] == "new" + + @pytest.mark.parametrize("globals_or_locals", ("globals", "locals")) + @pytest.mark.parametrize( + "value_type", + ("replace", "emptylist", "emptytuple", "none", "zero", "false"), + ) + def test_replace_subdict(self, globals_or_locals, value_type): + # Writing to attrs.xx always replaces content with a *new* LimitedAttributeDict + locals, globals = {"a": 1}, {"b": 2} + attrs = CubeAttrsDict(locals=locals, globals=globals) + # Snapshot old + write new value, of either locals or globals + old_content = getattr(attrs, globals_or_locals) + value = { + "replace": {"qq": 77}, + "emptytuple": (), + "emptylist": [], + "none": None, + "zero": 0, + "false": False, + }[value_type] + setattr(attrs, globals_or_locals, value) + # check new content is expected type and value + new_content = getattr(attrs, globals_or_locals) + assert isinstance(new_content, LimitedAttributeDict) + assert new_content is not old_content + if value_type != "replace": + value = {} + assert new_content == value + # Check expected whole: i.e. either globals or locals was replaced with value + if globals_or_locals == "globals": + globals = value + else: + locals = value + check_content(attrs, locals=locals, globals=globals) diff --git a/lib/iris/tests/unit/fileformats/nc_load_rules/helpers/test_build_cube_metadata.py b/lib/iris/tests/unit/fileformats/nc_load_rules/helpers/test_build_cube_metadata.py index e2297be69e..973e10217b 100644 --- a/lib/iris/tests/unit/fileformats/nc_load_rules/helpers/test_build_cube_metadata.py +++ b/lib/iris/tests/unit/fileformats/nc_load_rules/helpers/test_build_cube_metadata.py @@ -41,7 +41,7 @@ def _make_engine(global_attributes=None, standard_name=None, long_name=None): return engine -class TestInvalidGlobalAttributes(tests.IrisTest): +class TestGlobalAttributes(tests.IrisTest): def test_valid(self): global_attributes = { "Conventions": "CF-1.5", @@ -50,7 +50,7 @@ def test_valid(self): engine = _make_engine(global_attributes) build_cube_metadata(engine) expected = global_attributes - self.assertEqual(engine.cube.attributes, expected) + self.assertEqual(engine.cube.attributes.globals, expected) def test_invalid(self): global_attributes = { @@ -64,13 +64,14 @@ def test_invalid(self): # Check for a warning. self.assertEqual(warn.call_count, 1) self.assertIn( - "Skipping global attribute 'calendar'", warn.call_args[0][0] + "Skipping disallowed global attribute 'calendar'", + warn.call_args[0][0], ) # Check resulting attributes. The invalid entry 'calendar' # should be filtered out. global_attributes.pop("calendar") expected = global_attributes - self.assertEqual(engine.cube.attributes, expected) + self.assertEqual(engine.cube.attributes.globals, expected) class TestCubeName(tests.IrisTest): diff --git a/lib/iris/tests/unit/util/test_equalise_attributes.py b/lib/iris/tests/unit/util/test_equalise_attributes.py index a4198160a9..de5308a7fa 100644 --- a/lib/iris/tests/unit/util/test_equalise_attributes.py +++ b/lib/iris/tests/unit/util/test_equalise_attributes.py @@ -13,8 +13,13 @@ import numpy as np -from iris.cube import Cube +from iris.coords import AuxCoord +from iris.cube import Cube, CubeAttrsDict import iris.tests.stock +from iris.tests.unit.common.metadata.test_CubeMetadata import ( + _TEST_ATTRNAME, + make_attrsdict, +) from iris.util import equalise_attributes @@ -152,5 +157,111 @@ def test_complex_somecommon(self): ) +class TestSplitattributes: + """ + Extra testing for cases where attributes differ specifically by type + + That is, where there is a new possibility of 'mismatch' due to the newer "typing" + of attributes as global or local. + + Specifically, it is now possible that although + "cube1.attributes.keys() == cube2.attributes.keys()", + AND "cube1.attributes[k] == cube2.attributes[k]" for all keys, + YET STILL (possibly) "cube1.attributes != cube2.attributes" + """ + + @staticmethod + def _sample_splitattrs_cube(attr_global_local): + attrs = CubeAttrsDict( + globals=make_attrsdict(attr_global_local[0]), + locals=make_attrsdict(attr_global_local[1]), + ) + return Cube([0], attributes=attrs) + + @staticmethod + def check_equalised_result(cube1, cube2): + equalise_attributes([cube1, cube2]) + # Note: "X" represents a missing attribute, as in test_CubeMetadata + return [ + ( + cube1.attributes.globals.get(_TEST_ATTRNAME, "X") + + cube1.attributes.locals.get(_TEST_ATTRNAME, "X") + ), + ( + cube2.attributes.globals.get(_TEST_ATTRNAME, "X") + + cube2.attributes.locals.get(_TEST_ATTRNAME, "X") + ), + ] + + def test__global_and_local__bothsame(self): + # A trivial case showing that the original globals+locals are both preserved. + cube1 = self._sample_splitattrs_cube("AB") + cube2 = self._sample_splitattrs_cube("AB") + result = self.check_equalised_result(cube1, cube2) + assert result == ["AB", "AB"] + + def test__globals_different(self): + cube1 = self._sample_splitattrs_cube("AX") + cube2 = self._sample_splitattrs_cube("BX") + result = self.check_equalised_result(cube1, cube2) + assert result == ["XX", "XX"] + + def test__locals_different(self): + cube1 = self._sample_splitattrs_cube("XA") + cube2 = self._sample_splitattrs_cube("XB") + result = self.check_equalised_result(cube1, cube2) + assert result == ["XX", "XX"] + + def test__oneglobal_onelocal__different(self): + cube1 = self._sample_splitattrs_cube("AX") + cube2 = self._sample_splitattrs_cube("XB") + result = self.check_equalised_result(cube1, cube2) + assert result == ["XX", "XX"] + + # This case fails without the split-attributes fix. + def test__oneglobal_onelocal__same(self): + cube1 = self._sample_splitattrs_cube("AX") + cube2 = self._sample_splitattrs_cube("XA") + result = self.check_equalised_result(cube1, cube2) + assert result == ["XX", "XX"] + + def test__sameglobals_onelocal__different(self): + cube1 = self._sample_splitattrs_cube("AB") + cube2 = self._sample_splitattrs_cube("AX") + result = self.check_equalised_result(cube1, cube2) + assert result == ["XX", "XX"] + + # This case fails without the split-attributes fix. + def test__sameglobals_onelocal__same(self): + cube1 = self._sample_splitattrs_cube("AA") + cube2 = self._sample_splitattrs_cube("AX") + result = self.check_equalised_result(cube1, cube2) + assert result == ["XX", "XX"] + + # This case fails without the split-attributes fix. + def test__differentglobals_samelocals(self): + cube1 = self._sample_splitattrs_cube("AC") + cube2 = self._sample_splitattrs_cube("BC") + result = self.check_equalised_result(cube1, cube2) + assert result == ["XX", "XX"] + + +class TestNonCube: + # Just to assert that we can do operations on non-cube components (like Coords), + # in fact effectively, anything with a ".attributes". + # Even though the docstring does not admit this, we test it because we put in + # special code to preserve it when adding the split-attribute handling. + def test(self): + attrs = [1, 1, 2] + coords = [ + AuxCoord([0], attributes={"a": attr, "b": "all_the_same"}) + for attr in attrs + ] + equalise_attributes(coords) + assert all( + coord.attributes == {"b": "all_the_same"} for coord in coords + ) + + if __name__ == "__main__": tests.main() diff --git a/lib/iris/util.py b/lib/iris/util.py index 4509f2885b..10a58fdef0 100644 --- a/lib/iris/util.py +++ b/lib/iris/util.py @@ -2071,24 +2071,50 @@ def equalise_attributes(cubes): See more at :doc:`/userguide/real_and_lazy_data`. """ - removed = [] + # deferred import to avoid circularity problem + from iris.common._split_attribute_dicts import ( + _convert_splitattrs_to_pairedkeys_dict, + ) + + cube_attrs = [cube.attributes for cube in cubes] + + # Convert all the input dictionaries to ones with 'paired' keys, so each key + # becomes a pair, ('local'/'global', attribute-name), making them specific to each + # "type", i.e. global or local. + # This is needed to ensure that afterwards all cubes will have identical + # attributes, E.G. it treats an attribute which is global on one cube and local + # on another as *not* the same. This is essential to its use in making merges work. + # + # This approach does also still function with "ordinary" dictionaries, or + # :class:`iris.common.mixin.LimitedAttributeDict`, though somewhat inefficiently, + # so the routine works on *other* objects bearing attributes, i.e. not just Cubes. + # That is also important since the original code allows that (though the docstring + # does not admit it). + cube_attrs = [ + _convert_splitattrs_to_pairedkeys_dict(dic) for dic in cube_attrs + ] + # Work out which attributes are identical across all the cubes. - common_keys = list(cubes[0].attributes.keys()) + common_keys = list(cube_attrs[0].keys()) keys_to_remove = set(common_keys) - for cube in cubes[1:]: - cube_keys = list(cube.attributes.keys()) + for attrs in cube_attrs[1:]: + cube_keys = list(attrs.keys()) keys_to_remove.update(cube_keys) common_keys = [ key for key in common_keys - if ( - key in cube_keys - and np.all(cube.attributes[key] == cubes[0].attributes[key]) - ) + if (key in cube_keys and np.all(attrs[key] == cube_attrs[0][key])) ] keys_to_remove.difference_update(common_keys) - # Remove all the other attributes. + # Convert back from the resulting 'paired' keys set, extracting just the + # attribute-name parts, as a set of names to be discarded. + # Note: we don't care any more what type (global/local) these were : we will + # simply remove *all* attributes with those names. + keys_to_remove = set(key_pair[1] for key_pair in keys_to_remove) + + # Remove all the non-matching attributes. + removed = [] for cube in cubes: deleted_attributes = { key: cube.attributes.pop(key) @@ -2096,6 +2122,7 @@ def equalise_attributes(cubes): if key in cube.attributes } removed.append(deleted_attributes) + return removed