Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

+Refactor spatial means and rescale some global integrals #778

Open
wants to merge 2 commits into
base: dev/gfdl
Choose a base branch
from

Conversation

Hallberg-NOAA
Copy link
Member

This PR consists of two commits that use the unscale arguments to reproducing_sums() to refactor the 6 global area mean functions (global_area_mean(), global_area_mean_u(), global_area_mean_v(), global_layer_mean(), global_volume_mean() and adjust_area_mean_to_zero()) and to revise the scaling of the returned values from the two global integral
routines(Global_area_integral() and global_mass_integral()), and then to use this revised capability to work exclusively with rescaled diagnostics of the surface fluxes.

When global_area_integral() and global_mass_integral() are called with a tmp_scale argument, now it is not just the variable that is being integrated that is returned in rescaled units, but also the area or mass, whereas previously it was only the variable that was returned in rescaled units but the area or mass multiplying them were in mks units of [m2] or [kg]. In other words, these routines now return variables in units of [L2 A ~> m2 a] and [R Z L2 A ~> kg a] (if tmp_scale is present) or [m2 a] and [kg a] (if it is not), but no longer in mixed units of [m2 A ~> m2 a] and [kg A ~> kg a]. As a result the code surrounding the 4 instances where global_area_integral or global_mass_integral were being called with tmp_scale arguments (in MOM.F90 and MOM_ice_shelf.F90) also had to be modified in the same commit.

This same commit also includes a rescaling in the units of the areaT_global and IareaT_global elements of the ocean_grid_type and dyn_horgrid_type to [L2 ~> m2] and [L-2 ~> m-2], respectively. These elements were only used in the area_mean functions, so it is sensible to make this change in the same commit where the area_mean functions were revised. Although the dyn_horgrid_type is shared between MOM6 and SIS2, these elements are not used in SIS2.

The second commit in this PR keeps 36 area integrated surface mass, heat or salt flux diagnostics in forcing_diagnostics() in rescaled units by replacing an unscale argument with a tmp_scale argument to global_area_integral(). It also adds the corresponding conversion arguments to the register_scalar_field() calls for each of these diagnostics, so that the documented units and conversion factors can be used to confirm the correctness of the documented units for these variables.

All answers and diagnostics are bitwise identical, but there are changes in the rescaled units of two elements each in two transparent types, and changes to the rescaling behavior of two publicly visible routines when they are called with tmp_scale arguments. This commits in this PR include:

  • b3ed3f541 Diagnose area integrated fluxes in rescaled units
  • 4e0c742de +Refactor the spatial mean calculations

@Hallberg-NOAA Hallberg-NOAA added the refactor Code cleanup with no changes in functionality or results label Dec 13, 2024
Copy link

@herrwang0 herrwang0 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have two general comments.

  1. It seems to me that the presences of optional intent(in) arguments scale/unscale and tmp_scale should be mutually exclusive. Currently, we can legitimately have a rescaling factor of [a2 A-2]. This is at best confusing. Both inputs are used in the following subroutines global_area_mean, global_area_integral,
    global_layer_mean, global_volume_mean, global_mass_integral.

  2. Local variable scalefac is defined inconsistently in the aforementioned subroutines. With the entanglement of tmp_scale and unscale/scale, scalefac may end up with four different units. IMO, tmp_scale should not be included in scalefac, in that way tmp_scale is always an explicit argument of reproducing_sum calls.

@@ -252,12 +246,14 @@ function global_layer_mean(var, h, G, GV, scale, tmp_scale, unscale)
tmpForSumming(:,:,:) = 0. ; weight(:,:,:) = 0.

do k=1,nz ; do j=js,je ; do i=is,ie
weight(i,j,k) = (GV%H_to_MKS * h(i,j,k)) * (G%US%L_to_m**2*G%areaT(i,j) * G%mask2dT(i,j))
weight(i,j,k) = (GV%H_to_MKS * h(i,j,k)) * (G%areaT(i,j) * G%mask2dT(i,j))

This comment was marked as resolved.

@@ -55,25 +55,23 @@ function global_area_mean(var, G, scale, tmp_scale, unscale)
! In the following comments, [A] is used to indicate the arbitrary, possibly rescaled units of the
! input array while [a] indicates the unscaled (e.g., mks) units that can be used with the reproducing sums
real, dimension(SZI_(G),SZJ_(G)) :: tmpForSumming ! An unscaled cell integral [a m2]
real :: scalefac ! An overall scaling factor for the areas and variable [a m2 A-1 L-2 ~> 1]
real :: scalefac ! A scaling factor for the variable that is not reversed [a A-1 ~> 1]

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It looks to me that if neither unscale nor scale is specified, the units of scalefac should be [nondim], and the unit of tmpForSumming would be either [A L2 ~> a m2] or [a L2 ~> a m2].

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is correct. Thank you for this insightful comment. I am correcting the code accordingly.

else
temp_scale = 1.0
scalefac = G%US%L_to_m**2
endif

This comment was marked as outdated.

@@ -300,23 +296,26 @@ function global_volume_mean(var, h, G, GV, scale, tmp_scale, unscale)
is = G%isc ; ie = G%iec ; js = G%jsc ; je = G%jec ; nz = GV%ke

temp_scale = 1.0 ; if (present(tmp_scale)) temp_scale = tmp_scale
scalefac = temp_scale

scalefac = 1.0

This comment was marked as resolved.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Very good point. I am modifying the code to avoid conflating scalefac and temp_scale.

enddo ; enddo ; enddo
else
do k=1,nz ; do j=js,je ; do i=is,ie
tmpForSumming(i,j) = tmpForSumming(i,j) + &
((GV%H_to_kg_m2 * h(i,j,k)) * (scalefac*G%areaT(i,j) * G%mask2dT(i,j)))
((GV%H_to_RZ * h(i,j,k)) * (scalefac*G%areaT(i,j) * G%mask2dT(i,j)))

This comment was marked as resolved.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Well spotted. The comments have been modified accordingly.

@Hallberg-NOAA
Copy link
Member Author

As noted, there are two sets of scaling factors that are used in these routines. The temporary scaling factors arguments (tmp_scale) are used exclusively to compensate for the rescaling for dimensional consistency. By contrast, although the net scaling factors (unscale or scale) had previously been used for the dimensional consistency factors in most cases, they are also used to change the units of variables - for example changing temperatures in to heat capacities or salinities into total salt contents. As a result, there are legitimate reasons to use both simultaneously.

Perhaps the documentation of the internal units needs to be cleaned up to describe all the various options, but I am still convinced that the code as revised is functionally correct.

@herrwang0

This comment was marked as resolved.

@Hallberg-NOAA
Copy link
Member Author

Hallberg-NOAA commented Jan 16, 2025

As I am working through the comments on this PR, it is becoming increasingly clear that correctly rescaling the units is going to be very confusing if both the scale and unscale arguments are present in the various global_..._mean() and global_..._integral() routines. I am coming around to the suggestion that there should be an error issued if someone tries to use both simultaneously. I do not believe that we actually have any instances yet where both are used, and one can always do something like a conversion from integrated temperature to heat by multiplication after the fact or via a conversion argument to a register_diag_field() call.

@Hallberg-NOAA Hallberg-NOAA force-pushed the refactor_spatial_means branch from b3ed3f5 to 9bacdb8 Compare January 16, 2025 19:56
@Hallberg-NOAA
Copy link
Member Author

I have now worked out how to very explicitly document all of the units of variables in the various spatial mean routines when both the unscale and tmp_scale arguments are present, and modified the comments in this commit accordingly, which will hopefully address many of your specific comments.

I have also corrected the double-counting bugs that you spotted, @herrwang0 .

I would appreciate it if you could take a look at the revised code and let me know if it addresses all of the points of confusion that you noted, or at least does as much as we can in a situation with so many combinations of options.

@Hallberg-NOAA Hallberg-NOAA force-pushed the refactor_spatial_means branch from 9bacdb8 to 45637b0 Compare January 16, 2025 20:04
Copy link

@herrwang0 herrwang0 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think the method of using A and B to distinguish tmp_scale and unscale is helpful and makes it much easier to keep track of the units. The double-counting issue is also fixed. So functionality-wise, I think this PR is ready to go.

There are only two very minor nitpicking issues.

  1. In two of the functions, the areal unscaling is included in scalefac, which seems inconsistent with the other functions.
  2. I have examined the units in local variable descriptions and found a few that's not complete or incorrect. In one case, a local variable could be a combination of six unique units, which could make the description quite cumbersome. Admittedly, I don't really have a better solution.

! In the following comments, [A ~> a] is used to indicate the arbitrary, possibly rescaled units of the
! input array while [a] indicates the unscaled (e.g., mks) units that can be used with the reproducing sums.
! [A ~> a] and [B ~> b] are the same units unless tmp_scale and unscale are both present.
real, dimension(SZI_(G),SZJ_(G)) :: tmpForSumming ! An unscaled cell integral [a L2 ~> a m2] or [B L2 ~> b m2]

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

tmpForSumming can also has a unit of [A L2 ~> a m2], when scalefac is [1].

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for noting this, I have modified this line as suggested.

Comment on lines 186 to 196
real, dimension(SZI_(G),SZJ_(G)) :: tmpForSumming ! An unscaled cell integral in [a m2] or [B L2 ~> b m2]
! or [m2] or [L2 ~> m2] if var is not present.
real :: scalefac ! An overall scaling factor for the areas and variable, in units of [a m2 A-1 L-2 ~> 1]
! or [1] or [B m2 A-1 L-2 ~> b a-1] or [B A-1 ~> b a-1] depending on which
! optional arguments are present.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think there are still a few errors in the units. Note that var is not an optional argument. Below is what I think is correct for crosschecking.

tmp_scale unscale scalefac tmpForSumming
T T [B A-1 ~> b a-1] [B L2 ~> b m2]
T F [1] [A L2 ~> a m2]
F T [a m2 A-1 L-2] [a m2]
F F [m2 L-2] [A m2 ~> a m2]

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A version of this table has been added in comments.


if ((temp_scale /= 0.0) .and. (temp_scale /= 1.0)) &
global_area_integral = global_area_integral / temp_scale
global_area_integral = reproducing_sum(tmpForSumming, unscale=temp_scale)

This comment was marked as resolved.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, the integrated area is also rescaled from [L2 ~> m2] to [m2] in the returned variable when tmp_scale is not present. This is deliberate and functionally important.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks! It appears to me the difference is when one is doing an integral or average. I agree the current form is correct.

Comment on lines 248 to 249
real, dimension(G%isc:G%iec,G%jsc:G%jec,SZK_(GV)) :: tmpForSumming ! An unscaled cell integral [L2 a m ~> a m3]
! or [L2 a kg m-2 ~> a kg] or [L2 B m ~> b m3] or [L2 B m ~> b m3]

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This one is extra-complicated. weight has two units (Boussinesq or not) and scalefac has three, so tmpForSumming has six.

Boussinesq weight tmp_scale unscale scalefac tmpForSumming
T [L2 m ~> m3] T T [B A-1 ~> b a-1] [L2 B m ~> b m3]
T [L2 m ~> m3] T F [1] [L2 A m ~> a m3]
T [L2 m ~> m3] F T [a A-1 ~>1] [L2 a m ~> a m3]
T [L2 m ~> m3] F F [1] [L2 A m ~> a m3]
F [L2 kg m-2 ~> kg] T T [B A-1 ~> b a-1] [L2 B kg m-2 ~> b kg]
F [L2 kg m-2 ~> kg] T F [1] [L2 A kg m-2 ~> a kg]
F [L2 kg m-2 ~> kg] F T [a A-1 ~>1] [L2 a kg m-2 ~> a kg]
F [L2 kg m-2 ~> kg] F F [1] [L2 A kg m-2 ~> a kg]

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I like this table so much that I have incorporated it directly into a comment in the code.

Comment on lines +324 to +360
real, dimension(SZI_(G),SZJ_(G)) :: tmpForSumming ! The volume or mass integral of the variable in a column
! [B L2 m ~> b m3] or [B L2 kg m-2 ~> b kg] or
! [L2 a m ~> a m3] or [L2 a kg m-2 ~> a kg]

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same as above, there should be six units for tmpForSumming (missing the two when scalefac is [1]).

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have added a table in a comment document all of these various units and when they occur.

Comment on lines 384 to 387
real, dimension(SZI_(G),SZJ_(G)) :: tmpForSumming ! The mass-weighted integral of the variable in a
! column [kg a] or [kg] or [B R Z L2 ~> kg b] or [R Z L2 ~> kg] if tmp_scale is present.
real :: scalefac ! An overall scaling factor for the cell mass and variable [a kg A-1 R-1 Z-1 L-2 ~> 1]
! or [1] or [B A-1 ~> b a-1] if tmp_scale is present.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The description of scalefac is missing one possible unit (kg R-1 Z-1 L-2).

tmp_scale unscale scalefac tmpForSumming
T T [B A-1 ~> b a-1] [B R Z L2 ~> kg b]
T F [1] [R Z L2 ~> kg]
F T [a kg A-1 R-1 Z-1 L-2 ~> 1] [kg a]
F F [kg R-1 Z-1 L-2 ~>1] [kg]

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have added a version of this table in the comments.

enddo ; enddo ; enddo
endif
global_sum = .true. ; if (present(on_PE_only)) global_sum = .not.on_PE_only
if (global_sum) then
global_mass_integral = reproducing_sum(tmpForSumming)
global_mass_integral = reproducing_sum(tmpForSumming, unscale=temp_scale)

This comment was marked as resolved.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The reason here is the same, and the code is correct as written.

@Hallberg-NOAA
Copy link
Member Author

The two functions where the areal unscaling is conditionally included in scalefac when tmp_scale is absent are global_area_integral() and global_mass_integral(), in which case it is not just the variable that is being integrated that is being rescaled with the call, but also the area or mass. Although this pattern is inconsistent with the other functions, it is both functionally important and correct.

@Hallberg-NOAA Hallberg-NOAA force-pushed the refactor_spatial_means branch from 45637b0 to d9d0979 Compare January 17, 2025 15:13
@Hallberg-NOAA
Copy link
Member Author

Thank you very much for your very thorough reviews, @herrwang0. I have now updated the comments in the code, several of them using your tables that were directly inspired by your comments. I am confident that the potentially confusing units of the variables in this PR are now as thoroughly documented as could be considered reasonable. I hope that you agree with me that all of your comments have been addressed and that this PR is now ready to go.

Copy link

@herrwang0 herrwang0 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I found three unit errors in the tables and have made the corrections (the corrected table in the comment sections should be in the right format and can be directly copied over to the code).

Thanks for the patience. I think we are almost there.

! True | True | [B A-1 ~> b a-1] | [B L2 ~> b m2] |
! True | False | [1] | [A L2 ~> a m2] |
! False | True | [a m2 A-1 L-2 ~> b a-1] | [a m2] |
! False | False | [m2 L-2 ~> 1] | [a m2] |

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The last row has an error in tmpForSumming.

!      False         |      False       | [m2 L-2 ~> 1]           | [A m2 ~> a m2]                      |

Variable is not unscaled outside of reproducing_sum when unscale is not present.

Comment on lines +267 to +287
!__________________________________________________________________________________________________
! Units of weight, scalefac and tmpForSumming, depending on the presence of optional arguments |
!_________________________________________________________________________________________________|
! Boussinesq | tmp_scale | unscale | weight units | scalefac units | tmpForSumming units |
! | present | present | | | |
!____________|___________|_________|___________________|__________________|_______________________!
! True | True | True | [L2 m ~> m3] | [B A-1 ~> b a-1] | [B L2 m ~> b m3] |
! True | True | False | [L2 m ~> m3] | [1] | [A L2 m ~> a m3] |
! True | False | True | [L2 m ~> m3] | [a A-1 ~> 1] | [L2 a m ~> a m3] |
! True | False | False | [L2 m ~> m3] | [1] | [L2 a m ~> a m3] |
! False | True | True | [L2 kg m-2 ~> kg] | [B A-1 ~> b a-1] | [B L2 kg m-2 ~> b kg] |
! False | True | False | [L2 kg m-2 ~> kg] | [1] | [A L2 kg m-2 ~> a kg] |
! False | False | True | [L2 kg m-2 ~> kg] | [a A-1 ~> 1] | [L2 a kg m-2 ~> a kg] |
! False | False | False | [L2 kg m-2 ~> kg] | [1] | [L2 a kg m-2 ~> a kg] |
!____________|___________|_________|___________________|__________________|_______________________!

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The table below corrects two errors in tmpForSumming. tmpForSumming does not include tmp_scale, so if unscale is not present, the variable intmpForSumming is not unscaled.

  !__________________________________________________________________________________________________
  ! Units of weight, scalefac and tmpForSumming, depending on the presence of optional arguments    |
  !_________________________________________________________________________________________________|
  ! Boussinesq | tmp_scale | unscale |  weight units     | scalefac units   | tmpForSumming units   |
  !            | present   | present |                   |                  |                       |
  !____________|___________|_________|___________________|__________________|_______________________!
  !   True     |  True     |  True   | [L2 m ~> m3]      | [B A-1 ~> b a-1] | [B L2 m ~> b m3]      |
  !   True     |  False    |  True   | [L2 m ~> m3]      | [a A-1 ~> 1]     | [L2 a m ~> a m3]      |
  !   True     |  either   |  False  | [L2 m ~> m3]      | [1]              | [A L2 m ~> a m3]      |
  !   False    |  True     |  True   | [L2 kg m-2 ~> kg] | [B A-1 ~> b a-1] | [B L2 kg m-2 ~> b kg] |
  !   False    |  False    |  True   | [L2 kg m-2 ~> kg] | [a A-1 ~> 1]     | [L2 a kg m-2 ~> a kg] |
  !   False    |  either   |  False  | [L2 kg m-2 ~> kg] | [1]              | [A L2 kg m-2 ~> a kg] |

! True | True | True | [B A-1 ~> b a-1] | [B R Z L2 ~> b kg] |
! True | True | False | [1] | [A R Z L2 ~> a kg] |
! True | False | True | [a kg A-1 R-1 Z-1 L-2 ~> 1] | [a kg] |
! True | False | False | [kg R-1 Z-1 L-2 ~> 1] | [a kg] |

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This line above should be:

  !  True   |  False    |  False  | [kg R-1 Z-1 L-2 ~> 1]       | [A kg ~> a kg]                 |


if ((temp_scale /= 0.0) .and. (temp_scale /= 1.0)) &
global_area_integral = global_area_integral / temp_scale
global_area_integral = reproducing_sum(tmpForSumming, unscale=temp_scale)

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks! It appears to me the difference is when one is doing an integral or average. I agree the current form is correct.

@Hallberg-NOAA
Copy link
Member Author

Hallberg-NOAA commented Jan 18, 2025

So this is perhaps not well documented in the comments, but without its unscale parameter, reproducing_sum() can only work properly with unscaled variables, so without the unscale or tmp_scale being passed to the routines in MOM_spatial_means.F90, the var argument must already be in arbitrary unscaled units (i.e., [a] and not [A ~> a]). With the last attempt at this PR, I had added [a] as possible units for var on line 163, but it was not explicit enough. In other words, the units in all the tables are correct, but the comments describing the var arguments are not explicit enough.

I have now revised this PR again to clarify this point.

  Refactored the spatial mean calculations in the functions global_area_mean(),
global_area_mean_u(), global_area_mean_v(), global_layer_mean(),
global_volume_mean() and adjust_area_mean_to_zero() to work in rescaled units by
making use of the unscale arguments to the reproducing_sum routines.  Comments
were also added to explain what the code does when both tmp_scale and unscale
arguments are provided, which effectively acts as to allow for changes in the
rescaled units.

  Global_area_integral() and global_mass_integral() were also similarly
refactored, but with the added difference that when a tmp_scale argument is
provided, the areas or masses in the integrals have units of [L2 A ~> m2 a] or
[R Z L2 A ~> kg a] instead of the mixed units of [m2 A ~> m2 a] or [kg A ~> kg
a].  As a result the code surrounding the 4 instances where global_area_integral
or global_mass_integral were being called with tmp_scale arguments (in MOM.F90
and MOM_ice_shelf.F90) also had to be modified in this same commit.  When the
optional var argument is absent from a call to global_mass_integral() so that it
is the mass itself that is returned, the values (if any) of scale, unscale
and tmp_scale are ignored, but the presence of tmp_scale determines whether the
mass is returned in [R Z L2 ~> kg] or [kg], consistent with the behavior that
would have been obtained if var had been an array of nondimensional 1s.

  This commit also includes a rescaling in the units of the areaT_global and
IareaT_global elements of the ocean_grid_type and dyn_horgrid_type to [L2 ~> m2]
and [L-2 ~> m-2], respectively.  Although the dyn_horgrid_type is shared between
MOM6 and SIS2, these elements are not used in SIS2.

  A total of 12 rescaling factors were eliminated or moved into unscale arguments
as a result of these changes.

  All answers are bitwise identical, but there are changes in the rescaled units
of two elements each in two transparent types, and changes to the rescaling
behavior of two publicly visible routines when they are called with tmp_scale
arguments.
  Keep 36 area integrated surface mass, heat or salt flux diagnostics in
forcing_diagnostics in rescaled units by replacing an unscale argument with a
tmp_scale argument to global_area_integral.  Also added the corresponding
conversion arguments to the register_scalar_field calls for each of these
diagnostics, so that the documented units and conversion factors can be used to
confirm the correctness of the documented units for these variables.  The
internal variables used for the integrated or averaged heat, mass or salt fluxes
were also separated for clarity about the units of these variables.  All answers
and diagnostics are bitwise identical.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
refactor Code cleanup with no changes in functionality or results
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants