-
Notifications
You must be signed in to change notification settings - Fork 319
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Workaround for transient Smallville tests #1673 + testing all new datasets #2318
Workaround for transient Smallville tests #1673 + testing all new datasets #2318
Conversation
This generates files for dynLakes, dynUrban, and dynPft tests.
I marked this PR as "blocker" to #1903. |
@@ -1314,7 +1314,7 @@ lnd/clm2/surfdata_esmf/ctsm5.2.0/surfdata_10x15_hist_78pfts_CMIP6_1850_c230517.n | |||
<fsurdat hgrid="4x5" sim_year="1850" use_crop=".true." > | |||
lnd/clm2/surfdata_esmf/ctsm5.2.0/surfdata_4x5_hist_78pfts_CMIP6_1850_c230517.nc</fsurdat> | |||
<fsurdat hgrid="1x1_smallvilleIA" sim_year="1850" > | |||
/glade/work/erik/ctsm_worktrees/ctsm5.2.mksurfdata/tools/mksurfdata_esmf/surfdata_1x1_smallvilleIA_hist_78pfts_CMIP6_1850_c230726.nc | |||
/glade/campaign/cesm/cesmdata/inputdata/lnd/clm2/surfdata_esmf/ctsm5.2.0/surfdata_1x1_smallvilleIA_hist_78pfts_CMIP6_1850-2015_c240103.nc |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Changed this one. I left the equivalent 2000 file unchanged for now, because this PR does not touch it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
...and we will be renaming these files before they are final.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
From review with @ekluzek
Remove /glade/campaign/.../inputdata/ here and possibly elsewhere.
@@ -1426,7 +1426,7 @@ use_crop=".true.">lnd/clm2/surfdata_map/ctsm5.1.dev052/landuse.timeseries_mpasa1 | |||
<!-- Note that this transient file only goes up to year 1855. | |||
This is sufficient for testing purposes, but could cause problems if you try to run beyond 1855. --> | |||
<flanduse_timeseries hgrid="1x1_smallvilleIA" sim_year_range="1850-2000" | |||
use_crop=".true." >lnd/clm2/surfdata_map/landuse.timeseries_1x1_smallvilleIA_hist_78pfts_simyr1850-1855_c160127.nc</flanduse_timeseries> | |||
use_crop=".true." >lnd/clm2/surfdata_esmf/ctsm5.2.0/landuse.timeseries_1x1_smallvilleIA_hist_78_CMIP6_1850-1855_dynPft_c240103.nc</flanduse_timeseries> |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this was the correct place to put the new dynPft file, though I'm not 100% sure.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, this is the right place. We just want to put it alongside the other standard files. No need for it to be in some non-standard place.
Tested: modify_smallville.sh Not tested: Makefile
This comment was marked as resolved.
This comment was marked as resolved.
if (.not. readvar) call endrun( msg=' ERROR: PCT_OCEAN NOT on surfdata file'//errMsg(sourcefile, __LINE__)) | ||
if (.not. readvar) call endrun( msg= & | ||
' ERROR: PCT_OCEAN NOT on surfdata file but required when running ctsm5.2 or newer; ' // & | ||
' you are advised to generate a new surfdata file using the mksurfdata_esmf tool ' // errMsg(sourcefile, __LINE__)) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Unrelated to this PR, this edit fixes #2218, which also needs to come in sooner rather than later.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
- Open new issue about fsurdat versioning and put in IF TIME ALLOWS and add "next"
@@ -46,6 +46,7 @@ SUBSETDATA_1X1_SMALL := --lat 40.6878 --lon 267.0228 --site 1x1_smallvilleIA \ | |||
--pctpft 6.5 1.5 1.6 1.7 1.8 1.9 1.5 1.6 1.7 1.8 1.9 1.5 1.5 1.5 1.5 1.5 1.5 1.5 1.5 1.5 1.5 1.5 1.5 1.5 1.5 1.5 1.5 1.5 1.5 1.5 1.5 1.5 1.5 1.5 1.5 1.5 1.5 1.5 1.5 1.5 1.5 1.5 1.5 1.5 1.5 1.5 1.5 1.5 1.5 1.5 1.5 1.5 1.5 1.5 1.5 1.5 1.5 1.5 1.5 1.5 1.5 1.5 | |||
# NOTE: The 1850 smallvilleIA site is constructed to start with 100% natural vegetation, so we can test transition to crops | |||
SUBSETDATA_1X1_SMALL1850 := --lat 40.6878 --lon 267.0228 --site 1x1_smallvilleIA --dompft 13 --pctpft 100 | |||
SUBSETDATA_1X1_SMALLTRANSIENT := --lat 40.6878 --lon 267.0228 --site 1x1_smallvilleIA |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@ekluzek's would-be-nice suggestion:
Consolidate SUBSETDATA_1X1_SMALL
then separate 1850 and TRANSIENT later.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Getting the Makefile working has likely resulted in additional clean-up issues, so I'm leaving this unresolved in favor of a more general clean-up in the Makefile.
@ekluzek agreed with me on this. I will remove the empty checkbox to avoid confusion.
tools/mksurfdata_esmf/Makefile
Outdated
# modify_smallville.sh to generate three modified landuse.timeseries files needed for testing. | ||
crop-smallville-transient : FORCE | ||
$(SUBSETDATA_POINT) --create-landuse $(SUBSETDATA_1X1_SMALLTRANSIENT) | ||
../modify_input_files/modify_smallville.sh |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
line 214 needs tab(s) like the line above it.
@slevis-lmwg is this ready to come into the ctsm5.2 branch? From our discussion you need to get the makefile working and test it all out, but it's mostly ready. |
Yes, mostly ready. Two things:
When I have updated the Makefile, look at #2201 for guidance on building mksurdata_esmf. |
@slevis-lmwg I merged the dev163 update to the ctsm5.2 branch, so you should update to it when you are ready. The tag was: alpha-ctsm5.2.mksrf.19_ctsm5.1.dev163 |
TODO Run the NEON tests from the test-suites to confirm that updates work from the last couple of commits. izumi PASS |
End of day tests on derecho (where things seems stuck in the queue right now):
|
@@ -2346,20 +2346,6 @@ | |||
<option name="comment" >Add at least one test of a FATES NEON site with PRISM precipitation</option> | |||
</options> | |||
</test> | |||
<test name="SMS_D_Lm1_Mmpi-serial" grid="CLM_USRDAT" compset="I1PtClm50SpRs" testmods="clm/USUMB_nuopc"> |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@slevis-lmwg I'd like you to reconsider this. The purpose of this one is to run a generic tower site and show that we can continue to do it. And that is a capability that we want to continue moving forward. NEON is highly curated so it doesn't provide a good example, and PLUMBER2 will be similar. But, we still want to show that someone can set something up for any random tower site.
Now, as I think about this, this site is setup to demonstrate PTCLM which we've removed. So it should be redone to use subset data and be based on it. That would be a good demonstration for this and for our current workflow.
I'll create an issue for this, and we can decide how this should happen. It wouldn't be hard to add to CTSM5.2, but it could also happen after CTSM5.2...
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Discussing how this should happen sounds good to me @ekluzek.
I'm encountering some other errors that I probably need your help with. I will start with these two:
- The next two fail in MODEL_BUILD as "intel" but I expect them to work as "gnu" based on a third test that behaved the same way:
ERS_D_Ld5_Mmpi-serial.1x1_mexicocityMEX.I1PtClm51SpRs.derecho_intel.clm-CLM1PTStartDate
SMS_D_Mmpi-serial_Ld5.5x5_amazon.I2000Clm51FatesRs.derecho_intel.clm-FatesCold
The error says:
ld: /opt/cray/pe/mpich/8.1.25/ofi/intel/19.0/lib/libmpi_intel.so.12: undefined reference to `fi_strerror@FABRIC_1.0'
ld: /opt/cray/pe/mpich/8.1.25/ofi/intel/19.0/lib/libmpi_intel.so.12: undefined reference to `fi_fabric@FABRIC_1.1'
ld: /opt/cray/pe/mpich/8.1.25/ofi/intel/19.0/lib/libmpi_intel.so.12: undefined reference to `fi_getinfo@FABRIC_1.3'
ld: /opt/cray/pe/mpich/8.1.25/ofi/intel/19.0/lib/libmpi_intel.so.12: undefined reference to `fi_dupinfo@FABRIC_1.3'
ld: /opt/cray/pe/mpich/8.1.25/ofi/intel/19.0/lib/libmpi_intel.so.12: undefined reference to `fi_freeinfo@FABRIC_1.3'
ld: /opt/cray/pe/mpich/8.1.25/ofi/intel/19.0/lib/libmpi_intel.so.12: undefined reference to `fi_version@FABRIC_1.0'
In the first test, I see "fi" in
/glade/work/slevis/git/mksurfdata_toolchain/cime_config/testdefs/testmods_dirs/clm/CLM1PTStartDate/shell_commands
Is that the "fi" that it's complaining about?
- Then we have
FAIL SMS_Ld12_Mmpi-serial.1x1_urbanc_alpha.I1PtClm51SpRs.derecho_intel.clm-output_sp_highfreq RUN
failing with this error:
Abort with message NetCDF: Variable not found in file /glade/derecho/scratch/csgteam/temp/spack/derecho/23.09/builds/spack-stage-parallelio-2.6.2-bpi7h2bnkshocep4hl3drfik4ebj44iu/spack-src/src/clib/pio_nc.c at line 1164
The lnd.log ends with
(shr_ndep_readnl) Read in ndep_inparm namelist from: drv_flds_in
LND: PIO numiotasks= 1
LND: PIO stride= 1
LND: PIO rearranger= 2
LND: PIO root= 1
The atm.log ends with
(shr_strdata_readstrm) reading file lb: /glade/campaign/cesm/cesmdata/inputdata/atm/datm7/CLM1PT_data/urbanc_alpha.c080416/clm1pt-0001-08.nc 23
UPDATE: I'm marking this as an EXPECTED FAILURE because datm.streams.xml lists
<file>/glade/campaign/cesm/cesmdata/inputdata/atm/datm7/CLM1PT_data/urbanc_alpha.c080416/clm1pt-0001-08.nc</file>
<file>/glade/campaign/cesm/cesmdata/inputdata/atm/datm7/CLM1PT_data/urbanc_alpha.c080416/clm1pt-0001-09.nc</file>
<file>/glade/campaign/cesm/cesmdata/inputdata/atm/datm7/CLM1PT_data/urbanc_alpha.c080416/clm1pt-0001-10.nc</file>
<file>/glade/campaign/cesm/cesmdata/inputdata/atm/datm7/CLM1PT_data/urbanc_alpha.c080416/clm1pt-0001-11.nc</file>
<file>/glade/campaign/cesm/cesmdata/inputdata/atm/datm7/CLM1PT_data/urbanc_alpha.c080416/clm1pt-0001-12.nc</file>
<file>/glade/campaign/cesm/cesmdata/inputdata/atm/datm7/CLM1PT_data/urbanc_alpha.c080416/clm1pt-0002-01.nc</file>
<file>/glade/campaign/cesm/cesmdata/inputdata/atm/datm7/CLM1PT_data/urbanc_alpha.c080416/clm1pt-0002-02.nc</file>
<file>/glade/campaign/cesm/cesmdata/inputdata/atm/datm7/CLM1PT_data/urbanc_alpha.c080416/clm1pt-0002-03.nc</file>
<file>/glade/campaign/cesm/cesmdata/inputdata/atm/datm7/CLM1PT_data/urbanc_alpha.c080416/clm1pt-0002-04.nc</file>
<file>/glade/campaign/cesm/cesmdata/inputdata/atm/datm7/CLM1PT_data/urbanc_alpha.c080416/clm1pt-0002-05.nc</file>
<file>/glade/campaign/cesm/cesmdata/inputdata/atm/datm7/CLM1PT_data/urbanc_alpha.c080416/clm1pt-0002-06.nc</file>
<file>/glade/campaign/cesm/cesmdata/inputdata/atm/datm7/CLM1PT_data/urbanc_alpha.c080416/clm1pt-0002-07.nc</file>
<file>/glade/campaign/cesm/cesmdata/inputdata/atm/datm7/CLM1PT_data/urbanc_alpha.c080416/clm1pt-0002-08.nc</file>
<file>/glade/campaign/cesm/cesmdata/inputdata/atm/datm7/CLM1PT_data/urbanc_alpha.c080416/clm1pt-0002-09.nc</file>
<file>/glade/campaign/cesm/cesmdata/inputdata/atm/datm7/CLM1PT_data/urbanc_alpha.c080416/clm1pt-0002-10.nc</file
<file>/glade/campaign/cesm/cesmdata/inputdata/atm/datm7/CLM1PT_data/urbanc_alpha.c080416/clm1pt-0002-11.nc</file>
while the corresponding directory (in /glade/campaign, as well as in /glade/p) contains only the first file and the last file and none of the rest.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
-
SMS_Ld5.f09_g17.ISSP460Clm50BgcCrop.derecho_intel.clm-ciso_dec2050Start
This one complainsERROR: No stream_entry presaero.SSP4-6.0 found'
I did not come across a relevant issue, bug, or PR in https://github.com/ESCOMP/CDEPS
I'm removing ssp4 tests, as recommended in the next post.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@slevis-lmwg the first one is due to: ESMCI/ccs_config_cesm#130 So you just need to change it non-DEBUG or to gnu.
The second one (2) we need to see what file it's failing on. That might be in one of the log files, or you might see at least what file was last opening in the log files. Then we need to figure out what variable it is that it's failing to find...
On the third one (3) this means there likely isn't Prescribed aerosol files for SSP4-6.0. So we probably just shoudn't do any SSP4-6.0 tests. I would look into CDEPS to make sure this is a bug there though.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I realize I'm coming to this kind of late, but regarding the single point test that @slevis-lmwg suggested removing the USUMB test created from PTCLM. This does seem like something we should remove?
To avoid mission creep, I suggest we postpone testing for particular single point features to a PR that involves the development of single point features (NEON, PLUMBER2, or others), but defer to @ekluzek and @slevis-lmwg, as you're more deeply involved here.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ahhh, if it's experimental, yes we should remove it from our testing, and label it as expected failure. We shouldn't worry about testing it, until they can show us what to do. And I don't want to have a test that requires a 100 nodes of Derecho as just getting through the queue to run takes too long and is expensive.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's dying in malloc, so it's definitely a memory issue. So it's going to take some work to figure it out. It's at initialization so it might be initialization needs more memory than at run-time.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@slevis-lmwg we should record this as an issue, so that people working on this can reference. And we'll be reminded of it being a problem. Could you start an issue for this?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
- Yes. TODO open issue.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
[...]
Next, I think found a bug in /ccs_config/component_grids_nuopc.xml. I had to change this line- <domain name="mpasa15-3conus"> + <domain name="mpasa15-3">
for the case to find the grid's mesh file. I'm making the following TODO checkbox:
- Open another issue in CDEPS or does this go somewhere else?
@ekluzek I will go ahead with this merge, as we discussed, but I wanted to bring attention to the above comment that has two aspects:
- Open an issue?
- Logistically what are the steps to resolve it?
Troubleshooting the LWISO failure here: UPDATE 1: I seem to have solved the error for ice1_grc. Now I get the same error with qflx_ice_dynbal_grc, which I later found is due to ice2_grc.
|
I found no evidence of this test having been run before when serching in /glade/campaign/cgd/tss/ctsm_baselines/ The test's datm.streams.xml lists 16 files of which only two exist in the corresponding datm directory (whether I look in /glade/campaign or /glade/p), so this test possibly never worked.
LWISO failure UPDATE
Note:
|
not sure where I'm supposed to reply, but if the crazy high resolution surface dataset doesn't work (and Brian can't make it work) I'd leave this as an expected failure and let someone who actually wants this dataset work on producing it. It doesn't need to be our problem right now. Was that you're question @slevis-lmwg? |
Thanks for your work on this @slevis-lmwg ! I'll comment in #2366 . |
alpha-ctsm5.2.mksrf.20_ctsm5.1.dev163
Script for generating landuse.timeseries and fsurdat files for Smallville dynLakes, dynUrban, and dynPft tests.
Specific notes
Contributors other than yourself, if any:
@ekluzek
CTSM Issues Fixed (include github issue #):
Partly addresses, does not fix #1673
Fixes #2218, though unrelated
Fixes #2319
Are answers expected to change (and if so in what way)?
Not relevant.
Any User Interface Changes (namelist or namelist defaults changes)?
Testing performed, if any:
Tests pointing to the new landuse.timeseries and fsurdat files passed (details posted in #1673)