Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Splitattrs ncsave redo #5410

Merged

Conversation

pp-mo
Copy link
Member

@pp-mo pp-mo commented Aug 1, 2023

Closes #4986

Rework of #5404 with ...

  • new unified testcase setup + checking code for existing tests.
  • saver split-attributes control (but no testing yet)

STILL TO DO :

  • add additional TestSave testcases, specific to newstyle global+local cube attributes.
  • parametrise TestSave and TestRoundtrip to test with/without FUTURE.save_split_attrs enabled
    • (and capture the differences)
  • add "matrix" testing for all combinations of (attribute class, test operation, with/without split-attrs-saving)
    • against a captured results "snapshot" (to detect future changes)

@pp-mo pp-mo changed the base branch from main to FEATURE_split_attrs August 1, 2023 18:44
@pp-mo pp-mo force-pushed the splitattrs_ncsave_redo branch from d0a330a to 068bd48 Compare August 1, 2023 18:45
@codecov
Copy link

codecov bot commented Aug 1, 2023

Codecov Report

Attention: 6 lines in your changes are missing coverage. Please review.

Comparison is base (68eaa53) 89.41% compared to head (eea99d1) 89.42%.
Report is 1 commits behind head on FEATURE_split_attrs.

Additional details and impacted files
@@                   Coverage Diff                   @@
##           FEATURE_split_attrs    #5410      +/-   ##
=======================================================
+ Coverage                89.41%   89.42%   +0.01%     
=======================================================
  Files                       89       89              
  Lines                    22500    22543      +43     
  Branches                  5396     5410      +14     
=======================================================
+ Hits                     20119    20160      +41     
- Misses                    1636     1638       +2     
  Partials                   745      745              
Files Coverage Δ
lib/iris/cube.py 90.98% <100.00%> (+0.01%) ⬆️
lib/iris/__init__.py 88.97% <50.00%> (+0.08%) ⬆️
lib/iris/fileformats/netcdf/saver.py 89.48% <93.84%> (+0.22%) ⬆️

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

@pp-mo pp-mo force-pushed the splitattrs_ncsave_redo branch 2 times, most recently from 5d82b1f to 8ddc88d Compare August 2, 2023 23:02
@pp-mo pp-mo mentioned this pull request Aug 2, 2023
@pp-mo pp-mo force-pushed the splitattrs_ncsave_redo branch from ed17fb5 to 78aaebb Compare August 2, 2023 23:31
@pp-mo pp-mo mentioned this pull request Aug 2, 2023
@trexfeathers trexfeathers self-requested a review August 3, 2023 13:38
@trexfeathers trexfeathers self-assigned this Aug 3, 2023
lib/iris/__init__.py Outdated Show resolved Hide resolved
@pp-mo pp-mo force-pushed the splitattrs_ncsave_redo branch from b07af3f to 54f344c Compare August 3, 2023 16:24
Copy link
Contributor

@trexfeathers trexfeathers left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

OK here are some more comments. I still need to look at the tests.

The changes here demonstrate how much better it is to now have the CubeAttrsDict managing things, allowing the user to see what is happening during their scripts, rather than a load of 'magic' happening during save.

lib/iris/fileformats/netcdf/saver.py Outdated Show resolved Hide resolved
lib/iris/fileformats/netcdf/saver.py Show resolved Hide resolved
lib/iris/fileformats/netcdf/saver.py Outdated Show resolved Hide resolved
lib/iris/fileformats/netcdf/saver.py Outdated Show resolved Hide resolved
lib/iris/fileformats/netcdf/saver.py Outdated Show resolved Hide resolved
lib/iris/fileformats/netcdf/saver.py Outdated Show resolved Hide resolved
lib/iris/fileformats/netcdf/saver.py Outdated Show resolved Hide resolved
lib/iris/fileformats/netcdf/saver.py Show resolved Hide resolved
lib/iris/fileformats/netcdf/saver.py Outdated Show resolved Hide resolved
lib/iris/fileformats/netcdf/saver.py Show resolved Hide resolved
Copy link
Contributor

@trexfeathers trexfeathers left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

OK I have read through the tests as they are currently and they make sense. I hope to spend some more time thinking about anything that needs to be added (different to just reading what is in front of me!)

@pp-mo pp-mo force-pushed the splitattrs_ncsave_redo branch from d305f30 to 19a2956 Compare August 21, 2023 10:55
@pp-mo
Copy link
Member Author

pp-mo commented Aug 22, 2023

N.B. the pass for commit 8987a26 tests the new code against snapshotted behaviours from v3.6.1.
This shows that the new code does not introduce any behaviour changes to the programmed testcases, when reading cube attributes as a single dict and not enabling save-split-attrs.

However, the matrix testing does not currently cover all the "special-case" attributes which the loader and saver may treat differently, e.g. "Conventions" or "STASH".
If it is thought necessary, we could still add these to matrix testing (as further unique attribute "styles"), or we could add specific tests for (some usecases of) them .

@pp-mo
Copy link
Member Author

pp-mo commented Aug 22, 2023

The "matrix" testing obviously adds a lot of tests (~2000). And these are integrational tests all of which read+write test netcdf files. Which obviously raises some performance concerns,
However, the measured effects on performance are not seen to be that big. Perhaps they parallelise well ? ...

timings and counts of total tests :

/python 3.11 3.10 3.9 n-tests
commits
(PR)
cbd7167 07:46 04:50 04:25 9172
8987a26 08:12 05:51 05:44 9172
(main)
687f4a 10:19 06:30 07:16 6995
ff897e 08:10 05:55 06:01
368c2d 07:25 05:09 05:05
(v3.6.1)
d6c6c9 06:00 03:54 04:12 6981

@pp-mo pp-mo marked this pull request as ready for review August 22, 2023 09:48
@pp-mo pp-mo linked an issue Aug 22, 2023 that may be closed by this pull request
Copy link
Contributor

@trexfeathers trexfeathers left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've looked at the changes from fb343ae onwards and it looks good to me! I'm happy that you have written something that can capture any change in behaviour going forward, as well as demonstrating how behaviour has changed as part of this work. Great job!

lib/iris/tests/integration/test_netcdf__loadsaveattrs.py Outdated Show resolved Hide resolved
@trexfeathers
Copy link
Contributor

@pp-mo is there anything left outstanding that you want to do before I merge this?

@pp-mo
Copy link
Member Author

pp-mo commented Oct 4, 2023

@pp-mo is there anything left outstanding that you want to do before I merge this?

No nothing specific. If you're happy then we can bank it, which will be a relief !
I will then re-merge from latest main, and submit a proper PR for #5444 - which I think is the final piece of the jigsaw.

@trexfeathers trexfeathers merged commit fa7962e into SciTools:FEATURE_split_attrs Oct 10, 2023
15 checks passed
@trexfeathers
Copy link
Contributor

Alright, that's merged. Ready for the next tranche you have mentioned 📭

@pp-mo
Copy link
Member Author

pp-mo commented Oct 10, 2023

@trexfeathers fantastic !
Coming right up ...

ESadek-MO pushed a commit that referenced this pull request Nov 21, 2023
* Split-attrs: Cube metadata refactortests (#4993)

* Convert Test___eq__ to pytest.

* Convert Test_combine to pytest.

* Convert Test_difference to pytest.

* Review changes.

* Split attrs - tests for status quo (#4960)

* Tests for attribute handling in netcdf load/save.

* Tidy test functions.

* Fix import order exception.

* Add cf-global attributes test.

* Towards more pytest-y implemenation.

* Replace 'create_testcase' with fixture which also handles temporary directory.

* Much tidy; use fixtures to parametrise over multiple attributes.

* Fix warnings; begin data-style attrs tests.

* Tests for data-style attributes.

* Simplify setup fixture + improve docstring.

* No parallel test runner, to avoid error for Python>3.8.

* Fixed for new-style netcdf module.

* Small review changes.

* Rename attributes set 'data-style' as 'local-style'.

* Simplify use of fixtures; clarify docstrings/comments and improve argument names.

* Clarify testing sections for different attribute 'styles'.

* Re-enable parallel testing.

* Sorted params to avoid parallel testing bug - pytest#432.

* Rename test functions to make alpha-order match order in class.

* Split netcdf load/save attribute testing into separate sourcefile.

* Add tests for loaded cube attributes; refactor to share code between Load and Roundtrip tests.

* Add tests for attribute saving.

* Fix method names in comments.

* Clarify source of Conventions attributes.

* Explain the test numbering in TestRoundtrip/TestLoad.

* Remove obsolete test helper method.

* Fix small typo; Fix numbering of testcases in TestSave.

* Implement split cube attributes. (#5040)

* Implement split cube attributes.

* Test fixes.

* Modify examples for simpler metadata printouts.

* Added tests, small behaviour fixes.

* Simplify copy.

* Fix doctests.

* Skip doctests with non-replicable outputs (from use of sets).

* Tidy test comments, and add extra test.

* Tiny typo.

* Remove redundant redefinition of Cube.attributes.

* Add CubeAttrsDict in module __all__ + improve docs coverage.

* Review changes - small test changes.

* More review changes.

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* Fix CubeAttrsDict example docstrings.

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* Odd small fixes.

* Improved docstrings and comments; fix doctests.

* Don't sidestep netcdf4 thread-safety.

* Publicise LimitedAttributeDict, so CubeAttrsDict can refer to it.

* Fix various internal + external links.

* Update lib/iris/cube.py

Co-authored-by: Martin Yeo <[email protected]>

* Update lib/iris/cube.py

Co-authored-by: Martin Yeo <[email protected]>

* Update lib/iris/cube.py

Co-authored-by: Martin Yeo <[email protected]>

* Update lib/iris/cube.py

Co-authored-by: Martin Yeo <[email protected]>

* Streamline docs.

* Review changes.

---------

Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Co-authored-by: Martin Yeo <[email protected]>

* Splitattrs ncload (#5384)

* Distinguish local+global attributes in netcdf loads.

* Small test fixes.

* Small doctest fix.

* Fix attribute load-save tests for new behaviour, and old-behaviour equivalence.

* Split attrs docs (#5418)

* Clarification in CubeAttrsDict examples.

* CubeAttrsDict fix docstring typo.

* Raise awareness of split attributes in user guide.

* What's New entry.

* Changes to metadata documentation.

* Splitattrs ncsave redo (#5410)

* Add docs and future switch, no function yet.

* Typing enables code completion for Cube.attributes.

* Make roundtrip checking more precise + improve some tests accordingly (cf. #5403).

* Rework all tests to use common setup + results-checking code.

* Saver supports split-attributes saving (no tests yet).

* Tiny docs fix.

* Explain test routines better.

* Fix init of FUTURE object.

* Remove spurious re-test of FUTURE.save_split_attrs.

* Don't create Cube attrs of 'None' (n.b. but no effect as currently used).

* Remove/repair refs to obsolete routines.

* Check all warnings from save operations.

* Remove TestSave test numbers.

* More save cases: no match with missing, and different cube attribute types.

* Run save/roundtrip tests both with+without split saves.

* Fix.

* Review changes.

* Fix changed warning messages.

* Move warnings checking from 'run' to 'check' phase.

* Simplify and improve warnings checking code.

* Fix wrong testcase.

* Minor review changes.

* Fix reverted code.

* Use sets to simplify demoted-attributes code.

* WIP

* Working with iris 3.6.1, no errors TestSave or TestRoundtrip.

* Interim save (incomplete?).

* Different results form for split tests; working for roundtrip.

* Check that all param lists are sorted.

* Check matrix result-files compatibility; add test_save_matrix.

* test_load_matrix added; two types of load result.

* Finalise special-case attributes.

* Small docs tweaks.

* Add some more testcases,

* Ensure valid sort-order for globals of possibly different types.

* Initialise matrix results with legacy values from v3.6.1 -- all matching.

* Add full current matrix results, i.e. snapshot current behaviours.

* Review changes : rename some matrix testcases, for clarity.

* Splitattrs ncsave redo commonmeta (#5538)

* Define common-metadata operartions on split attribute dictionaries.

* Tests for split-attributes handling in CubeMetadata operations.

* Small tidy and clarify.

* Common metadata ops support mixed split/unsplit attribute dicts.

* Clarify with better naming, comments, docstrings.

* Remove split-attrs handling to own sourcefile, and implement as a decorator.

* Remove redundant tests duplicated by matrix testcases.

* Newstyle split-attrs matrix testing, with fewer testcases.

* Small improvements to comments + docstrings.

* Fix logic for equals expectation; expand primary/secondary independence test.

* Clarify result testing in metadata operations decorator.

* Splitattrs equalise (#5586)

* Add tests in advance for split-attributes handling cases.

* Move dict conversion inside utility, for use elsewhere.

* Add support for split-attributes to equalise_attributes.

* Update lib/iris/util.py

Co-authored-by: Martin Yeo <[email protected]>

* Update lib/iris/tests/unit/util/test_equalise_attributes.py

Co-authored-by: Martin Yeo <[email protected]>

* Simplify and clarify equalise_attributes code.

---------

Co-authored-by: Martin Yeo <[email protected]>

* Fix merge-fail messaging for attribute mismatches. (#5590)

* Extra CubeAttrsDict methods to emulate dictionary behaviours. (#5592)

* Extra CubeAttrsDict methods to emulate dictionary behaviours.

* Don't use staticmethod on fixture.

* Add Iris warning categories to saver warnings.

* Type equality fixes for new flake8.

* Licence header fixes.

* Splitattrs ncsave deprecation (#5595)

* Small improvement to split-attrs whatsnew.

* Emit deprecation warning when saving without split-attrs enabled.

* Stop legacy-split-attribute warnings from upsetting delayed-saving tests.

---------

Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Co-authored-by: Martin Yeo <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
No open projects
Status: 🏁 Done
Development

Successfully merging this pull request may close these issues.

split-attrs: netcdf saver
2 participants