Refactor steps in blending code #443

sidekock · 2024-12-06T16:28:56Z

This is the refactored version of the blending code of steps, see issue #440

sidekock · 2024-12-06T16:33:39Z

Hi reviewers,
The blending code is finally refactored. The following things still need to happen, but I would like to do them during the review process:

* Documentation needs to be updated to the new structure (please let me know where more documentation is needed).
* @RubenImhoff, could you please take a look at the `# TODO`s? I added some, and some remain from the original version. I would like to remove most of them if possible. A few are still there as reminders so that I don't forget the final checks.

Looking forward to all the feedback!

codecov · 2024-12-06T16:40:03Z

Codecov Report

Attention: Patch coverage is 72.22222% with 10 lines in your changes missing coverage. Please review.

Project coverage is 84.22%. Comparing base (8b5333c) to head (1b82512).
Report is 2 commits behind head on master.

Files with missing lines	Patch %	Lines
pysteps/nowcasts/steps.py	72.22%	10 Missing ⚠️

Additional details and impacted files

@@            Coverage Diff             @@
##           master     #443      +/-   ##
==========================================
+ Coverage   84.03%   84.22%   +0.19%     
==========================================
  Files         160      160              
  Lines       13031    13243     +212     
==========================================
+ Hits        10950    11154     +204     
- Misses       2081     2089       +8

Flag	Coverage Δ
unit_tests	`84.22% <72.22%> (+0.19%)`	⬆️

sidekock · 2024-12-06T16:59:23Z

Also, @ladc and I found a possible bug in the code that needs to be fixed. I added a test to the test files but commented it out for the moment as this would break the tests (also in the old version of the code)

RubenImhoff · 2024-12-07T07:34:59Z

@sidekock, fantastic work! I'll try to give it a good review before the Christmas break.

dnerini · 2024-12-15T12:58:50Z

> Also, @ladc and I found a possible bug in the code that needs to be fixed. I added a test to the test files but commented it out for the moment as this would break the tests (also in the old version of the code)

could you say a bit more about this bug @sidekock ? is it something that we should address in a separate bug fix?

sidekock · 2024-12-15T14:53:12Z

> > Also, @ladc and I found a possible bug in the code that needs to be fixed. I added a test to the test files but commented it out for the moment as this would break the tests (also in the old version of the code)
>
> could you say a bit more about this bug @sidekock ? is it something that we should address in a separate bug fix?

I should check it to be sure, but I don't have my computer with me right now. It has to do with iterating over model numbers versus over duplicated model numbers. We missed this issue in the tests because it does not happen frequently. One case where this could fail is when you have 2 NWP members but want to generate a blended nowcast with 4 members. This is a small bugfix I can do, but I wanted to wait until this issue is tackled fully. It is one of the `# TODO` comments I added to the code, so if you want you can take a look through the code.

The specific TODO is this one:
"TODO: check if j is the best accessor for this variable"
The type of test it failed (which is not yet present in the master branch) is this one:

n_models=2, timesteps=3, n_ens_members=4, n_cascade_levels=8, mask_method="incremental", probmatching_method="cdf", blend_nwp_members=True&False, weights_method="spn", decomposed_nwp=True, expected_n_ens_members=2, zero_radar=False, zero_nwp=False, smooth_radar_mask_range=0, resample_distribution=False),
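
To make the suspected failure mode concrete, here is a minimal sketch. It is not the pysteps implementation: the helper `assign_models_to_members`, the array `per_model_values`, and the round-robin (modulo) assignment of NWP models to ensemble members are all assumptions for illustration. It only shows why indexing a per-model array with the ensemble-member index `j` can break when there are more ensemble members than NWP members.

```python
import numpy as np


def assign_models_to_members(n_models: int, n_ens_members: int) -> np.ndarray:
    """Hypothetical helper: map each ensemble member j to an NWP model index.

    Round-robin assignment is an assumption, not necessarily what pysteps does.
    """
    return np.arange(n_ens_members) % n_models


n_models, n_ens_members = 2, 4
per_model_values = np.array([10.0, 20.0])  # one entry per NWP model (illustrative)

model_of_member = assign_models_to_members(n_models, n_ens_members)

# Going through the member-to-model mapping is valid for every member j.
values_per_member = [
    per_model_values[model_of_member[j]] for j in range(n_ens_members)
]


def naive_lookup(j: int) -> float:
    """Index the per-model array directly with the member index j.

    This raises IndexError as soon as j >= n_models.
    """
    return per_model_values[j]
```

With 2 models and 4 members, `naive_lookup(2)` and `naive_lookup(3)` fail, while the mapped lookup works for all four members. Whether `j` or a mapped index is the right accessor in the real code is exactly what the TODO asks.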

mats-knmi · 2024-12-16T15:44:20Z

I have already noted some initial things I noticed, but other than that it looks really good, much more readable than before.
I will try to find some more time to look into this more in depth (it is a really big file to review haha).
Regarding docstrings: I think what we did for the steps code still makes sense here right? Adding a docstring to StepsBlendingConfig and one to compute_forecast to explain what all the variables do?
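
As an illustration of the kind of docstring meant here, a minimal sketch follows. `ExampleBlendingConfig` is a hypothetical, cut-down stand-in and not the real `StepsBlendingConfig`; the three fields are borrowed from the test parameters quoted earlier in this thread, and their defaults and descriptions are assumptions. Using `frozen=True` is likewise only one possible choice.

```python
from dataclasses import dataclass, FrozenInstanceError


@dataclass(frozen=True)
class ExampleBlendingConfig:
    """Hypothetical, cut-down configuration object for a blended forecast.

    Attributes
    ----------
    n_ens_members: int
        Number of ensemble members to generate.
    n_cascade_levels: int
        Number of cascade levels used in the decomposition.
    mask_method: str
        Name of the precipitation masking method, for example "incremental".
    """

    n_ens_members: int = 4
    n_cascade_levels: int = 8
    mask_method: str = "incremental"


cfg = ExampleBlendingConfig(n_ens_members=2)
```

Documenting every field in one place keeps the description next to the definition, so the public function only needs to refer to the config class instead of repeating each parameter.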

sidekock · 2024-12-16T20:17:45Z

> I have already noted some initial things I noticed, but other than that it looks really good, much more readable than before. I will try to find some more time to look into this more in depth (it is a really big file to review haha). Regarding docstrings: I think what we did for the steps code still makes sense here right? Adding a docstring to StepsBlendingConfig and one to compute_forecast to explain what all the variables do?

Indeed a very big file :) I'll do the docstrings in the next few days and try to tackle the duplicate comments at that point too.

mats-knmi · 2024-12-17T07:38:14Z

pysteps/blending/steps.py

+                # 8.5 Blend the cascades
+                final_blended_forecast_single_member = []
+                for t_sub in self.__state.subtimesteps:
+                    # TODO: does it make sense to use sub time steps - check if it works?


Regarding the sub timesteps: I would love it if we could make it so that pysteps can just handle a variable timestep.

When you would go for example from a 5 to 15 minute timestep you could aggregate the previous 3 precipitation fields and multiply the last motion field by 3 and you would then probably just be able to continue in a 15 minute timestep from there right?

This would be much preferable to the sub timestep, which still has to do all the computations at the smaller timesteps. Also it should make the code easier to follow maybe

I don't think this is part of the refactoring, but maybe something for after

Indeed, but I think the refactoring makes it a lot easier to do this.
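
The idea of switching to a coarser timestep can be sketched as follows. This is only an illustration under stated assumptions, not existing pysteps functionality: `coarsen_timestep` is a hypothetical helper, the motion field is assumed to be expressed in pixels per timestep, and "aggregate" is taken to mean averaging when the fields are rain rates and summing when they are accumulations.

```python
import numpy as np


def coarsen_timestep(precip_fields, velocity, factor, fields_are_rates=True):
    """Hypothetical helper: move from a fine to a coarser timestep.

    precip_fields : array of shape (t, m, n), most recent field last.
    velocity      : array of shape (2, m, n), assumed in pixels per fine timestep.
    factor        : ratio of coarse to fine timestep (3 for 5 min -> 15 min).
    """
    precip_fields = np.asarray(precip_fields, dtype=float)
    if precip_fields.shape[0] < factor:
        raise ValueError("need at least `factor` fields to aggregate")

    latest = precip_fields[-factor:]
    if fields_are_rates:
        # Rates (e.g. mm/h): the coarse-step rate is the mean of the fine-step rates.
        aggregated = latest.mean(axis=0)
    else:
        # Accumulations (e.g. mm per step): the coarse-step total is the sum.
        aggregated = latest.sum(axis=0)

    # Displacement per coarse timestep is `factor` times the fine-step displacement.
    coarse_velocity = np.asarray(velocity, dtype=float) * factor
    return aggregated, coarse_velocity


# Three 5-minute fields and a uniform motion of 1 pixel per 5 minutes.
fields = np.stack([np.full((2, 2), v) for v in (1.0, 2.0, 3.0)])
motion = np.ones((2, 2, 2))
rate_15min, motion_15min = coarsen_timestep(fields, motion, factor=3)
```

After this step the forecast loop could in principle continue directly at the 15-minute timestep, without computing the intermediate 5-minute steps.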

mats-knmi · 2024-12-17T09:57:42Z

I just scrolled through the entire code. I did not read everything, but I mostly tried to follow along. I have added some additional comments on the things I found; other than that it is good to go from my side (after @RubenImhoff has also had a look).

RubenImhoff

@sidekock, super nice work! As @mats-knmi already mentioned, it is a lot of code, so I'm sure I've missed things. My main recommendation would be to do some comparison tests to ensure all functionality still operates the same.

In addition, some overall points I had:

* Docstrings: already mentioned, but good to make these as clear as possible, so that we end up with a blending module that is more understandable than the old version was. :)
* Also, don't forget to sometimes add some short docstrings line-by-line in the individual functions.
* Question: Which internal functions in this piece of code overlap (more or less) with `nowcasts/steps.py`? In other words, could we move some functions into a separate utils script or so?

RubenImhoff · 2024-12-16T15:58:27Z

pysteps/blending/steps.py

-      can be given as float32. They will then be converted to float64 before computations
-      to minimize loss in precision.
+# TODO: compare old and new version of the code, run a benchmark to compare the two
+# TODO: look at the documentation and try to improve it, lots of things are now combined together


Are you still planning to do that as part of this PR?

Indeed, that is the goal. Based on the feedback, I would first change the code and then, once it is in a final state, compare master to this branch and afterwards focus on the documentation.

RubenImhoff · 2024-12-16T17:41:07Z

pysteps/blending/steps.py

+        """
+        ###
+        # 8. Start the forecasting loop
+        ###


The comments above can be removed.

RubenImhoff · 2024-12-16T17:41:55Z

pysteps/blending/steps.py

+                self.__perturb_blend_and_advect_extrapolation_and_noise_to_current_timestep(
+                    t, j, worker_state
+                )
+                # 8.5 Blend the cascades


Suggested change

-                # 8.5 Blend the cascades
+                # Blend the cascades

RubenImhoff · 2024-12-17T10:22:50Z

pysteps/blending/steps.py

+        # latest extrapolated radar rainfall field blended with the
+        # nwp model(s) rainfall forecast fields as 'benchmark'.
+
+        # 8.7.1 first blend the extrapolated rainfall field (the field


Suggested change

-        # 8.7.1 first blend the extrapolated rainfall field (the field
+        # First blend the extrapolated rainfall field (the field

Perhaps refer to this by the variable name used in the class, instead of "extrapolated rainfall field".

RubenImhoff · 2024-12-17T10:23:43Z

pysteps/blending/steps.py

+            precip_forecast_probability_matching_blended
+        )
+
+        # 8.7.2. Apply the masking and prob. matching


Suggested change

-        # 8.7.2. Apply the masking and prob. matching
+        # Apply the masking and probability matching

RubenImhoff · 2024-12-17T10:35:49Z

pysteps/blending/steps.py

+    return out
+
+
+def calculate_weights_bps(correlations):


Now that we are cleaning up the code, would it make sense to move calculate_weights_bps and calculate_weights_spn to a separate module, called weights.py or so? At the same time, these weights are quite specific to the STEPS approach, so I'm not sure what is best. It would at least make the code here a little shorter and cleaner.

I am not sure where to move it best. Two other functions are used a few times as well and I don't know where to move them to... Some, I would say, could maybe be added to utils

I'll think about it. Maybe I'll put them in a separate weights.py module, but otherwise they're good where they are now.

sidekock · 2024-12-19T09:43:36Z

> @sidekock, super nice work! As @mats-knmi already mentioned, it is a lot of code, so I'm sure I've missed things. My main recommendation would be to do some comparison tests to ensure all functionality still operates the same.
>
> In addition, some overall points I had:
>
> * Docstrings: already mentioned, but good to make these as clear as possible, so that we end up with a blending module that is more understandable than the old version was. :)
> * Also, don't forget to sometimes add some short docstrings line-by-line in the individual functions.
> * Question: Which internal functions in this piece of code overlap (more or less) with `nowcasts/steps.py`? In other words, could we move some functions into a separate utils script or so?

Hi Ruben,
Docstrings are the thing to look at once the code has a finalized form, but that does not seem too far away now :). Regarding your last question: conceptually there is quite a bit of similarity between nowcasting and blending steps. However, very few pieces of code are actually duplicated, because we have additional NWP fields to take into account. Stylistically, I agree that we should keep duplicated code to a minimum, but for readability and understanding I also see the benefit of keeping all the steps of the workflow in one file. If this is really needed, we could also open a new issue for it, because I think it is probably something best done on its own.

sidekock added 20 commits November 18, 2024 11:28

Refactored all names in the steps blending code from old to new

2066f14

Made some name changes but test still do not pass

72d0fbc

Fixed naming changes, now the tests pass

1ce563e

Built the rough scaffolding for the blending class

fbe551b

Refactored untill no rain case

46a93e5

Added code to estimation of ar parameters of radar

1eede39

Next go, start with forecast loop #7

a18f1f6

Added some uniformity between nowcast and blending steps. Now at # 8.…

8d16c11

…4 for the refactoring

Small changes since prev commit

88df97d

All code is tranfered. Last part of the main loop needs to be refactored

7ee0020

Everything is refactored, no test ran as of yet

f387981

Old forecast function is updated to fit newly refactored code

760c185

Removed old code which is no longer used

8d8905a

6 more tests that fail

d6249f5

All tests pass, still need to fix TODOs

38702b3

Updated gitignore

5ff1713

Cleanup of params and state dataclasses, next step: better typing

d999501

Cleanup of params and state dataclasses, now all tests pass

ed20ecc

Added correct typing to all parts of params and state

701e726

Ready for pull request

b9de511

sidekock added the enhancement New feature or request label Dec 6, 2024

sidekock requested review from ladc, RubenImhoff and mats-knmi December 6, 2024 16:28

sidekock self-assigned this Dec 6, 2024

sidekock linked an issue Dec 6, 2024 that may be closed by this pull request

Refactor steps in blending code #440

Open

Made changes for Codacy review

38ed195

Added aditional tests which currently fail in master branch

32b656f

mats-knmi reviewed Dec 16, 2024

Update .gitignore

4fe9f78

Co-authored-by: mats-knmi

mats-knmi reviewed Dec 16, 2024

mats-knmi reviewed Dec 16, 2024

sidekock added 3 commits December 16, 2024 18:23

Used the __zero_precip_time in __zero_precipitation_forecast()

b31d55c

Changed typing hints to python 3.10+ version

cc02593

Added comments back to the State dataclass

4e4a148

mats-knmi reviewed Dec 17, 2024

mats-knmi reviewed Dec 17, 2024

mats-knmi reviewed Dec 17, 2024

Changed the self.__state.velocity_perturbations = [] to self.__params…

0f4e037

….velocity_perturbations = [] in __initialize_random_generators

mats-knmi reviewed Dec 17, 2024

RubenImhoff reviewed Dec 17, 2024

Added code changes as suggested by Ruben, comments and documentation …

9f413aa

…to come later

sidekock added 2 commits December 19, 2024 11:47

Added frozen functionality to dataclasses, removed reset_state and fi…

c72d953

…xed seed assingments

Added frozen dataclass to nowcast

00f057b

mats-knmi approved these changes Dec 19, 2024

The needed checks are done for this TODO so it can be removed

1b82512
