cam6_4_043: Make RRTMGP default radiation in CAM7 #1178

Merged Oct 25, 2024 (12 commits)
5 changes: 4 additions & 1 deletion bld/configure
@@ -1074,9 +1074,12 @@ my $rad_pkg = 'none';
if ($phys_pkg =~ m/cam4|spcam_sam1mom/) {
$rad_pkg = 'camrt';
}
-elsif ($phys_pkg =~ m/cam5|cam6|cam7|spcam_m2005/) {
+elsif ($phys_pkg =~ m/cam5|cam6|spcam_m2005/) {
$rad_pkg = 'rrtmg';
}
+elsif ($phys_pkg =~ m/cam7/) {
+$rad_pkg = 'rrtmgp';
+}
# Allow the user to override the default via the commandline.
my $use_rrtmgp_gpu = 0;
if (defined $opts{'rad'}) {
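For readers skimming the hunk, the default selection after this change can be paraphrased as the shell sketch below. It is only an illustration under stated assumptions: the real logic is the Perl in bld/configure, the function name `default_rad_pkg` is hypothetical, and the `case` globs stand in for the Perl substring regex matches.

```shell
# Hypothetical helper mirroring the default-radiation choice in bld/configure.
# Prints the default radiation package for a given physics package name.
default_rad_pkg() {
    case "$1" in
        *cam4*|*spcam_sam1mom*)        echo camrt  ;;
        *cam5*|*cam6*|*spcam_m2005*)   echo rrtmg  ;;
        *cam7*)                        echo rrtmgp ;;
        *)                             echo none   ;;  # initial value of $rad_pkg
    esac
}

# Show the default for a few physics packages.
for phys in cam4 cam6 cam7; do
    printf '%s -> %s\n' "$phys" "$(default_rad_pkg "$phys")"
done
```

A `-rad` value given on the configure command line still overrides this default, as the comment in the hunk notes.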
2 changes: 1 addition & 1 deletion bld/namelist_files/namelist_defaults_cam.xml
@@ -2548,7 +2548,7 @@
<seasalt_emis_scale ver="mam7" >1.62D0</seasalt_emis_scale>
<seasalt_emis_scale ver="strat" >0.90D0</seasalt_emis_scale>
<seasalt_emis_scale ver="strat" clubb_sgs="1" >1.00D0</seasalt_emis_scale>
-<seasalt_emis_scale ver="strat" clubb_sgs="1" phys="cam7" >1.5D0</seasalt_emis_scale>
+<seasalt_emis_scale ver="strat" clubb_sgs="1" phys="cam7" >0.75D0</seasalt_emis_scale>
peverwhee marked this conversation as resolved.
<seasalt_emis_scale ver="strat" clubb_sgs="1" hgrid="1.9x2.5" phys="cam6">1.10D0</seasalt_emis_scale>
<seasalt_emis_scale ver="strat" spcam_clubb_sgs="1" >1.2D0</seasalt_emis_scale>
<seasalt_emis_scale ver="strat" clubb_sgs="1" silhs="1" >0.60D0</seasalt_emis_scale>
2 changes: 1 addition & 1 deletion bld/namelist_files/use_cases/1850_cam_lt.xml
@@ -14,7 +14,7 @@

<!-- Low top upper boundary conditions -->
<ubc_specifier>'Q:H2O->UBC_FILE'</ubc_specifier>
-<ubc_file_path>atm/cam/chem/ubc/b.e21.BWHIST.f09_g17.CMIP6-historical-WACCM.ensAvg123.cam.h0zm.H2O.185001-201412_c230509cdf5.nc</ubc_file_path>
+<ubc_file_path>atm/cam/chem/ubc/b.e21.BWHIST.f09_g17.CMIP6-historical-WACCM.ensAvg123.cam.h0zm.H2O.1849-2014_c240604.nc</ubc_file_path>
Collaborator

This file needs to be added to the svn inputdata repo. Please confirm that it has the mandatory metadata as described at: https://www2.cesm.ucar.edu/working_groups/Atmosphere/amwg_datasets.html

Collaborator Author

The updated ubc_file_path file has the required metadata and has been added to the svn inputdata repo.

<ubc_file_input_type>CYCLICAL</ubc_file_input_type>
<ubc_file_cycle_yr>1850</ubc_file_cycle_yr>

46 changes: 13 additions & 33 deletions cime_config/testdefs/testlist_cam.xml
@@ -34,14 +34,6 @@
<option name="wallclock">00:10:00</option>
</options>
</test>
-<test compset="F2000climo" grid="f10_f10_mg37" name="SMS_Lm13" testmods="cam/outfrq1m" supported="false">
-<machines>
-<machine name="derecho" compiler="intel" category="aux_cam"/>
-</machines>
-<options>
-<option name="wallclock">03:00:00</option>
-</options>
-</test>
<test compset="F2000climo" grid="f09_f09_mg17" name="PFS">
<machines>
<machine name="derecho" compiler="gnu" category="prebeta"/>
@@ -940,9 +932,16 @@
<option name="wallclock">00:20:00</option>
</options>
</test>
+<test compset="QPWmaC6" grid="f45_f45_mg37" name="ERP_Ln9_P24x2" testmods="cam/outfrq9s_mee_fluxes" supported="false">
+<machines>
+<machine name="izumi" compiler="gnu" category="aux_cam"/>
+</machines>
+<options>
+<option name="wallclock">00:20:00</option>
+</options>
+</test>
<test compset="QPWmaC6" grid="f45_f45_mg37" name="ERP_Ln9_P24x3" testmods="cam/outfrq9s_mee_fluxes" supported="false">
<machines>
-<machine name="derecho" compiler="intel" category="aux_cam"/>
<machine name="derecho" compiler="intel" category="waccm"/>
</machines>
<options>
@@ -1378,9 +1377,9 @@
</options>
</test>

-<test compset="F2000climo" grid="mpasa480_mpasa480" name="ERS_Ln9_P36x1" testmods="cam/outfrq9s_mpasa480">
+<test compset="F2000climo" grid="mpasa480_mpasa480" name="ERS_Ln9_P24x1" testmods="cam/outfrq9s_mpasa480">
<machines>
-<machine name="derecho" compiler="intel" category="aux_cam"/>
+<machine name="izumi" compiler="gnu" category="aux_cam"/>
</machines>
<options>
<option name="wallclock">00:45:00</option>
@@ -1771,13 +1770,13 @@
<option name="comment">CAM7 low top ~40 km</option>
</options>
</test>
-<test compset="FLTHIST" grid="ne30pg3_ne30pg3_mg17" name="ERP_D_Ln9" testmods="cam/outfrq9s_rrtmgp">
+<test compset="FLTHIST" grid="ne3pg3_ne3pg3_mg37" name="ERP_D_Ln9" testmods="cam/outfrq9s">
<machines>
-<machine name="derecho" compiler="intel" category="aux_cam"/>
+<machine name="izumi" compiler="gnu" category="aux_cam"/>
</machines>
<options>
<option name="wallclock">00:30:00</option>
-<option name="comment">CAM7 low top ~40 km w/ RRTMGP</option>
+<option name="comment">CAM7 low top ~40 km</option>
</options>
</test>
<test compset="FMTHIST" grid="ne30pg3_ne30pg3_mg17" name="SMS_D_Ln9" testmods="cam/outfrq9s">
@@ -1789,15 +1788,6 @@
<option name="comment">CAM7 mid top ~80 km</option>
</options>
</test>
-<test compset="FMTHIST" grid="ne30pg3_ne30pg3_mg17" name="SMS_D_Ln9" testmods="cam/outfrq9s_rrtmgp">
-<machines>
-<machine name="derecho" compiler="intel" category="prealpha"/>
-</machines>
-<options>
-<option name="wallclock">00:20:00</option>
-<option name="comment">CAM7 mid top ~80 km w/ RRTMGP</option>
-</options>
-</test>
<test compset="FMTHIST" grid="ne30pg3_ne30pg3_mg17" name="ERP_D_Ln9" testmods="cam/outfrq9s">
<machines>
<machine name="derecho" compiler="intel" category="prealpha"/>
@@ -2758,16 +2748,6 @@
<option name="wallclock">00:10:00</option>
</options>
</test>
-<test compset="F1850" grid="f10_f10_mg37" name="ERS_Ld3" testmods="cam/outfrq1d_14dec_ghg_cam7">
-<machines>
-<machine name="derecho" compiler="intel" category="aux_cam"/>
-<machine name="derecho" compiler="intel" category="ghg_cam"/>
-</machines>
-<options>
-<option name="wallclock">00:10:00</option>
-<option name="comment">Checks that exact restarts occur when crossing the December 16th date boundary with WACCM-SC chemistry in low top configuration</option>
-</options>
-</test>
<test compset="FWsc2000climo" grid="f10_f10_mg37" name="ERP_Ld3" testmods="cam/outfrq1d_14dec">
<machines>
<machine name="derecho" compiler="intel" category="waccm"/>
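The dotted test names used in doc/ChangeLog for this PR can be read off the `<test>` and `<machine>` attributes in this file. Below is a minimal sketch of that mapping, assuming the layout name.grid.compset.machine_compiler.testmods with "/" in testmods replaced by "-". The helper name `full_test_name` is hypothetical and is not part of CIME.

```shell
# Hypothetical helper: join testlist attributes into a dotted test name.
# Arguments: name grid compset machine compiler testmods
full_test_name() {
    # testmods is written with "/" in the XML and with "-" in the test name
    mods=$(printf '%s' "$6" | tr '/' '-')
    printf '%s.%s.%s.%s_%s.%s\n' "$1" "$2" "$3" "$4" "$5" "$mods"
}

# The MPAS test that this PR moves from derecho to izumi.
full_test_name ERS_Ln9_P24x1 mpasa480_mpasa480 F2000climo izumi gnu cam/outfrq9s_mpasa480
```

With the attributes of the relocated MPAS test, this prints the same name that the ChangeLog lists under the tests moved to izumi.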

This file was deleted.

This file was deleted.

@@ -0,0 +1,4 @@
./xmlchange ROF_NCPL=\$ATM_NCPL
./xmlchange GLC_NCPL=\$ATM_NCPL
./xmlchange RUN_STARTDATE="1999-12-31"
./xmlchange START_TOD="82800"
@@ -0,0 +1,5 @@
mfilt=1,1,1,1,1,1
ndens=1,1,1,1,1,1
nhtfrq=9,9,9,9,9,9
write_nstep0=.true.
inithist='ENDOFRUN'
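The start date and time of day set in the shell commands above place the run one hour before the end of 1999. As a rough check that a 9-step run (the Ln9 in the test name) crosses the year boundary, here is a small arithmetic sketch. The 1800-second step length is an assumption for illustration only, not a value taken from this PR, and `crosses_midnight` is a hypothetical helper.

```shell
# Sketch: does a run of NSTEPS steps starting at START_TOD (seconds since
# midnight) end after the following midnight?
# Arguments: start_tod nsteps dtime   (dtime = assumed seconds per model step)
crosses_midnight() {
    start_tod=$1; nsteps=$2; dtime=$3
    end_tod=$(( start_tod + nsteps * dtime ))
    [ "$end_tod" -gt 86400 ]   # 86400 s = one day
}

# START_TOD=82800 is 23:00:00; 9 steps of an assumed 1800 s each.
if crosses_midnight 82800 9 1800; then
    echo "run ends after midnight"
else
    echo "run ends before midnight"
fi
```

Under this arithmetic, any step longer than 400 s (3600 s divided by 9 steps) takes the run past midnight on 1999-12-31.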
@@ -21,7 +21,7 @@
! Set maxpatch_glcmec with GLC_NEC option
! Set glc_do_dynglacier with GLC_TWO_WAY_COUPLING env variable
!----------------------------------------------------------------------------------
-hist_nhtfrq = -24
+hist_nhtfrq = 9
hist_mfilt = 1
hist_ndens = 1

2 changes: 1 addition & 1 deletion cime_config/testdefs/testmods_dirs/cam/outfrq9s_mg3_default/shell_commands
@@ -3,6 +3,6 @@
./xmlchange ROOTPE='0'
./xmlchange ROF_NCPL=`./xmlquery --value ATM_NCPL`
./xmlchange GLC_NCPL=`./xmlquery --value ATM_NCPL`
-./xmlchange CAM_CONFIG_OPTS=' -microphys mg3' --append
+./xmlchange CAM_CONFIG_OPTS=' -microphys mg3 -rad rrtmg' --append
Collaborator

@brian-eaton Is there a reason not to use rrtmgp here?

Collaborator Author

The nvhpc test that uses these mods fails when radiation is set to either rrtmgp or rrtmgp_gpu; I tried both. The error looks like this:

deg0062.hsn.de.hpc.ucar.edu 110: Failing in Thread:1
deg0062.hsn.de.hpc.ucar.edu 110: Accelerator Fatal Error: call to cuStreamSynchronize returned error 700 (CUDA_ERROR_ILLEGAL_ADDRESS): Illegal address during kernel execution
deg0062.hsn.de.hpc.ucar.edu 110:  File: /glade/derecho/scratch/eaton/test-src/cam6_4_041_cam7/src/physics/rrtmgp/ext/rte-kernels/accel/mo_rte_solver_kernels.F90
deg0062.hsn.de.hpc.ucar.edu 110:  Function: sw_solver_2stream:573
deg0062.hsn.de.hpc.ucar.edu 110:  Line: 623

Since the purpose of this PR is just to change the default to rrtmgp for the LT and MT configurations, I didn't take the time to chase down the problem with this test and instead left it using rrtmg, as it was already doing. I can update the test in a future PR, or in this one if you know what the problem is.

Collaborator

Thanks Brian for your update. I see where the problem is. This folder is used by the GPU regression test, and the EarthWorks team has identified a problem running CAM with the default RRTMGP kernels on the GPU. To run it correctly, we need to update either the CAM->RRTMGP interface (like this EarthWorksOrg#25) or the RRTMGP kernel (like this EarthWorksOrg/rte-rrtmgp@ac0f76e). Otherwise, the rrtmgp_gpu option is not expected to work properly here. I am surprised that the rrtmgp option is not working either, as it should only turn on the RRTMGP CPU code, but the GPU tests may turn on some ACC directives accidentally. Do you have an error message for the rrtmgp option? The error message posted here seems to come from the rrtmgp_gpu option.

Collaborator Author

Hi Jian. I got a similar error message with the rrtmgp option (which in this PR is the default radiation for cam7). The test that's failing, and a sample of the error output:

ERS_Ln9_G4-a100-openacc.ne30pg3_ne30pg3_mg17.F2000dev.derecho_nvhpc.cam-outfrq9s_mg3_default

deg0011.hsn.de.hpc.ucar.edu 61: Failing in Thread:1
deg0011.hsn.de.hpc.ucar.edu 61: Accelerator Fatal Error: call to cuStreamSynchronize returned error 700 (CUDA_ERROR_ILLEGAL_ADDRESS): Illegal address during kernel execution
deg0011.hsn.de.hpc.ucar.edu 61:  File: /glade/derecho/scratch/eaton/test-src/cam6_4_041_cam7/src/physics/rrtmgp/ext/rte-frontend/mo_rte_lw.F90
deg0011.hsn.de.hpc.ucar.edu 61:  Function: rte_lw:70
deg0011.hsn.de.hpc.ucar.edu 61:  Line: 312

I'm running regression tests now for this PR and plan to leave this test as it has been, using rrtmg. It would be good if you could open an issue if you want to get this test running with the rrtmgp_gpu option.

Collaborator

Thanks Brian. The file you pointed to also has ACC directives, which means that even with the rrtmgp option some code will still be enabled on the GPU by the GPU regression test. I guess it won't work properly if only part of the RRTMGP GPU code is activated. I agree that we can just use the rrtmg option here and address this problem in a separate issue. I just want to understand what is going on here. Thanks for your help and clarification.

./xmlchange TIMER_DETAIL='6'
./xmlchange TIMER_LEVEL='999'
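The two error excerpts quoted in the review thread above point at different RRTMGP source files, which is what the discussion turns on. A small sketch for pulling those file names out of such a log is shown here. It assumes the log layout matches the quoted excerpts (a `File:` field followed by an absolute path), and `failing_sources` is a hypothetical helper, not an existing CAM or CIME tool.

```shell
# Sketch: list the unique source file names found in "File:" lines of an
# accelerator error log read from stdin.
failing_sources() {
    # Keep only the last path component of each "File: /path/to/name" line.
    sed -n 's|.*File: .*/\([^/]*\)$|\1|p' | sort -u
}

# Sample input copied from the error output quoted in the thread.
failing_sources <<'EOF'
deg0062.hsn.de.hpc.ucar.edu 110:  File: /glade/derecho/scratch/eaton/test-src/cam6_4_041_cam7/src/physics/rrtmgp/ext/rte-kernels/accel/mo_rte_solver_kernels.F90
deg0062.hsn.de.hpc.ucar.edu 110:  Function: sw_solver_2stream:573
deg0011.hsn.de.hpc.ucar.edu 61:  File: /glade/derecho/scratch/eaton/test-src/cam6_4_041_cam7/src/physics/rrtmgp/ext/rte-frontend/mo_rte_lw.F90
EOF
```

Lines without a `File:` field, such as the `Function:` and `Line:` entries, are ignored.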
103 changes: 103 additions & 0 deletions doc/ChangeLog
@@ -1,5 +1,108 @@
===============================================================

Tag name:
Originator(s): eaton
Date:
One-line Summary: Make RRTMGP default radiation in CAM7
Github PR URL: https://github.com/ESCOMP/CAM/pull/1178

Purpose of changes (include the issue number and title text for each relevant GitHub issue):

. Issue #1143 - turn RRTMGP on by default for CAM7 + some namelist defaults

. Remove some tests that added rrtmgp to the cam7 configuration. Not
needed since rrtmgp is now the default in cam7.

. Remove test of old cam7 development configuration (32 levels) which is no
longer needed.

. Remove 13 month F2000climo test. This was originally created to make
sure we didn't make changes that hurt the performance of our production
configuration for CMIP6 simulations. This is no longer needed.

. Issue #1154 - Create at least one CAM7 regression test on izumi
- add ERP_D_Ln9.ne3pg3_ne3pg3_mg37.FLTHIST.izumi_gnu.cam-outfrq9s_eoy


Describe any changes made to build system: none

Describe any changes made to the namelist:

. change default value of seasalt_emis_scale to 0.75 for cam7 (both lt and mt)

. update ubc_file_path for cam7 (lt only) to
atm/cam/chem/ubc/b.e21.BWHIST.f09_g17.CMIP6-historical-WACCM.ensAvg123.cam.h0zm.H2O.1849-2014_c240604.nc

List any changes to the defaults for the boundary datasets:

Describe any substantial timing or memory changes:

Code reviewed by:

List all files eliminated:

cime_config/testdefs/testmods_dirs/cam/outfrq1d_14dec_ghg_cam7/*
. test removed

List all files added and what they do:

cime_config/testdefs/testmods_dirs/cam/outfrq9s_eoy/*
. new test like outfrq9s, but add RUN_STARTDATE="1999-12-31" and
START_TOD="82800" so that the run goes over the end of year boundary.

List all existing files that have been modified, and describe the changes:

bld/configure
. set default radiation package for cam7 to rrtmgp

bld/namelist_files/namelist_defaults_cam.xml
. change default value of seasalt_emis_scale to 0.75 for cam7 (both lt and mt)

bld/namelist_files/use_cases/1850_cam_lt.xml
. update ubc_file_path to
atm/cam/chem/ubc/b.e21.BWHIST.f09_g17.CMIP6-historical-WACCM.ensAvg123.cam.h0zm.H2O.1849-2014_c240604.nc

cime_config/testdefs/testlist_cam.xml
. These tests which added rrtmgp to cam7 are no longer needed.
ERP_D_Ln9.ne30pg3_ne30pg3_mg17.FLTHIST.*_*.cam-outfrq9s_rrtmgp
SMS_D_Ln9.ne30pg3_ne30pg3_mg17.FMTHIST.*_*.cam-outfrq9s_rrtmgp
. Remove old cam7 configuration test which is no longer needed
ERS_Ld3.f10_f10_mg37.F1850.izumi_gnu.cam-outfrq1d_14dec_ghg_cam7
. Move these low resolution tests from derecho to izumi
ERP_Ln9_P24x2.f45_f45_mg37.QPWmaC6.izumi_gnu.cam-outfrq9s_mee_fluxes
ERS_Ln9_P24x1.mpasa480_mpasa480.F2000climo.izumi_gnu.cam-outfrq9s_mpasa480
. Add this low resolution CAM7-LT test to izumi
ERP_D_Ln9.ne3pg3_ne3pg3_mg37.FLTHIST.izumi_gnu.cam-outfrq9s_eoy
. Remove 13 month cam6 test which is no longer needed.
SMS_Lm13.f10_f10_mg37.F2000climo.derecho_intel.cam-outfrq1m

cime_config/testdefs/testmods_dirs/cam/outfrq9s_mg3_default/shell_commands
. add '-rad rrtmg' to CAM_CONFIG_OPTS. This test is using a non-standard
configuration of cam7, and this override is needed since the default
radiation scheme for cam7 has changed to rrtmgp.


If there were any failures reported from running test_driver.sh on any test
platform, and checkin with these failures has been OK'd by the gatekeeper,
then copy the lines from the td.*.status files for the failed tests to the
appropriate machine below. All failed tests must be justified.

derecho/intel/aux_cam:

derecho/nvhpc/aux_cam:

izumi/nag/aux_cam:

izumi/gnu/aux_cam:

CAM tag used for the baseline comparison tests if different than previous
tag:

Summarize any changes to answers: BFB except for cam7 configurations.

===============================================================
===============================================================

Tag name: cam6_4_042
Originator(s): pel, nusbaume
Date: Oct 9, 2024