-
Notifications
You must be signed in to change notification settings - Fork 169
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Refactor steps in blending code #443
base: master
Are you sure you want to change the base?
Conversation
…4 for the refactoring
Hi reviewers,
|
Codecov ReportAttention: Patch coverage is
Additional details and impacted files@@ Coverage Diff @@
## master #443 +/- ##
==========================================
+ Coverage 84.03% 84.22% +0.19%
==========================================
Files 160 160
Lines 13031 13243 +212
==========================================
+ Hits 10950 11154 +204
- Misses 2081 2089 +8
Flags with carried forward coverage won't be shown. Click here to find out more. ☔ View full report in Codecov by Sentry. |
Also, @ladc and I found a possible bug in the code that needs to be fixed. I added a test to the test files but commented it out for the moment as this would break the tests (also in the old version of the code) |
@sidekock, fantastic work! I'll try to give it a good review before the Christmas break. |
could you say a bit more about this bug @sidekock ? is it something that we should address in a separate bug fix? |
I should check it to be sure but don't have my computer with me now but is has to do with iterations over model numbers vs over duplicated model numbers. We just missed this issue in the tests because it is something that does not frequently happen. A possible reason this could fail is when you have a 2 NWP members but want to generate a blended nowcast with 4 members. This is a small bugfix I can do but I wanted to wait until this issue is tackled fully. It is one of the #TODO comments I added to the code so if you want you can take a look through the code. The specific TODO is this one: n_models=2, timesteps=3, n_ens_members=4, n_cascade_levels=8, mask_method="incremental", probmatching_method="cdf", blend_nwp_members=True&False, weights_method="spn", decomposed_nwp=True, expected_n_ens_members=2, zero_radar=False, zero_nwp=False, smooth_radar_mask_range=0, resample_distribution=False), |
Co-authored-by: mats-knmi <[email protected]>
I have already noted some initial things I noticed, but other than that it looks really good, much more readable than before. |
Indeed a very big file :) Ill do the docstrings one of the next days and try to tackle the duplicate comments at that point too |
# 8.5 Blend the cascades | ||
final_blended_forecast_single_member = [] | ||
for t_sub in self.__state.subtimesteps: | ||
# TODO: does it make sense to use sub time steps - check if it works? |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Regarding the sub timesteps: I would love it if we could make it so that pysteps can just handle a variable timestep.
When you would go for example from a 5 to 15 minute timestep you could aggregate the previous 3 precipitation fields and multiply the last motion field by 3 and you would then probably just be able to continue in a 15 minute timestep from there right?
This would be much preferable to the sub timestep, which still has to do all the computations at the smaller timesteps. Also it should make the code easier to follow maybe
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't think this is part of the refactoring, but maybe something for after
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Indeed, but I think the refactoring makes the process a lot easier to to this.
I just scrolled though the entire code. I did not read everything, but I mostly tried to follow along. I have added some additonal comments to the things I found, other than that it is good to go from my part (after @RubenImhoff also has had a look). |
….velocity_perturbations = [] in __initialize_random_generators
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@sidekock, super nice work! As @mats-knmi already mentioned, it is a lot of code, so I'm sure I've missed things. My main recommendation would be do to some comparison tests to ensure all functionality still operates the same.
In addition, some overall points I had:
- Docstrings: already mentioned, but good to make this as clear as possible to in the end have a blending module that is more understandable than the old version was. :)
- Also, don't forget to sometimes add some short docstrings line-by-line in the individual functions.
- Question: What internal functions in this piece of code do have overlap (or more or less overlap) with
nowcasts/steps.py
. Meaning, could we put some functions separate in a utils script or so?
can be given as float32. They will then be converted to float64 before computations | ||
to minimize loss in precision. | ||
# TODO: compare old and new version of the code, run a benchmark to compare the two | ||
# TODO: look at the documentation and try to improve it, lots of things are now combined together |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Are you still planning to do that as part of this PR?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Indeed that is the goal. Based on the feedback on the code I would first change the code and the, when it is in a final state. I will first compare master to this branch and then focus on documentation
""" | ||
### | ||
# 8. Start the forecasting loop | ||
### |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The comments above can be removed.
self.__perturb_blend_and_advect_extrapolation_and_noise_to_current_timestep( | ||
t, j, worker_state | ||
) | ||
# 8.5 Blend the cascades |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
# 8.5 Blend the cascades | |
# Blend the cascades |
# latest extrapolated radar rainfall field blended with the | ||
# nwp model(s) rainfall forecast fields as 'benchmark'. | ||
|
||
# 8.7.1 first blend the extrapolated rainfall field (the field |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
# 8.7.1 first blend the extrapolated rainfall field (the field | |
# First blend the extrapolated rainfall field (the field |
Perhaps give this the variable name, instead of extrapolated rainfall field, as it occurs in the class.
precip_forecast_probability_matching_blended | ||
) | ||
|
||
# 8.7.2. Apply the masking and prob. matching |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
# 8.7.2. Apply the masking and prob. matching | |
# Apply the masking and probability matching |
pysteps/blending/steps.py
Outdated
return out | ||
|
||
|
||
def calculate_weights_bps(correlations): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Now that we are cleaning up the code, would it make sense to move calculate_weights_bps
and calculate_weights_spn
to a separate model, called weights.py
or so? However, at the same time, these weights are quite specific to the STEPS approach, so I'm not sure what is best. It would make the code here a little shorter and cleaner at least.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I am not sure where to move it best. Two other functions are used a few times as well and I don't know where to move them to... Some, I would say, could maybe be added to utils
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'll think about it. Maybe I'll put them in a separate weights.py
module, but otherwise they're good where they are now.
Hi ruben, |
This is the refactored version of the blending code of steps, see issue #440