Workaround for transient Smallville tests #1673 + testing all new datasets #2318
Task list completed / task-list-completed
2024-02-16 00:06:42
26 / 28 tasks completed
2 tasks still to be completed
Required Tasks
Task | Status |
I want to test the Makefile with this PR's changes. Run make all-subset for this. | Completed |
I would like to update the Makefile with all the grids that we need to generate. I might as well do that now because this is the last PR before I will generate all the datasets. Let me know if you disagree. | Completed |
Working in scratch to avoid running out of space: /glade/derecho/scratch/slevis/temp_work/ctsm5.2.mksurfdata/tools/mksurfdata_esmf | Incomplete |
4 jobs still in the queue | Incomplete |
On Monday look at which datasets worked and which didn't | Incomplete |
f10 hist now runs out of wallclock | Incomplete |
mpasa120, C96, f09, f19, hcru, ne16, ne30, ne3 SSPs now run out of wallclock | Incomplete |
Looks as though I have not generated explicit 1850-2015 landuse files except for f10, f45, ne3 (i.e. low-res). Although 1850-2015 is included in the SSP files that I generated, I think we decided to generate separate 1850-2015 files, so as to identify them easily. I will go back and do that. DONE, but we have now concluded that SSP2-4.5 will suffice. | Completed |
ne30 fsurdat files (1850, 2000) did not get generated, while landuse files were generated for 2 ne30 grids that we do not need. I will resolve the former and will not worry about the latter. | Completed |
+ numaIA + smallville should be renamed 1850. | Completed |
Related to the last point: Brazil and Numa should have been 2000 and Brazil also needs historical. Correct how files are named. Generate the files. | Completed |
1x1_numa needs 1850-2015 landuse file. DONE but turns out didn't need it. | Completed |
Change potveg file to And landuse files 78 to 78pfts. Correct how they are named. | Completed |
Files in /python/ctsm/test/testinputs may not need updating, unless we want to rename... | Completed |
fsurdat DONE | Incomplete |
landuse DONE | Incomplete |
1st and 2nd attempts, results below | Incomplete |
Many 1x1 cases failed due to the wrong date stamp in namelist_defaults. I have updated this, tested one of the cases, and it worked. | Incomplete |
I am concerned about a large number of failures with this error: | Incomplete |
The error is the same even when I use COLDSTART, so nothing to do with init_interp. | Incomplete |
Looking at the fsurdat file with ncview, I am not spotting a problem. | Incomplete |
I am thinking of ways to converge on the problem using git diff strategically, hoping to avoid time-consuming debugging... | Incomplete |
UPDATE: Solved by setting convert_ocean_to_land = .true., and I think I need to make this the default option for tests to pass in general. | Incomplete |
Four grids not tested: f10, ne16pg3, mpasa60-3conus, mpasa60-3centralUS. I added the first two. The third and fourth are not supported in ccs_config, yet. | Incomplete |
f09 appears in the ctsm_sci test-suite in 23 tests and f19 appears in the ctsm_sci test-suite in 19 tests. Is this necessary or could we reduce to just a couple the way we do for other grids? | Incomplete |
Two cases fail in the build the same way (another like this appears in aux_clm below). I do not recognize the problem. See discussion with Erik in posts below. | Completed |
RXCROPMATURITY_Lm61.f09_g17.IHistClm50BgcCrop.derecho_intel.clm-cropMonthOutput seems broken, because it modifies an fsurdat but then the model gives an error with it. Sam R. opened issue #2357, and I am running with the fix. | Completed |
SMS_Ld12_Mmpi-serial.1x1_urbanc_alpha.I1PtClm51SpRs.derecho_intel.clm-output_sp_highfreq: NetCDF: Variable not found in file /glade/derecho/scratch/csgteam/temp/spack/derecho/23.09/builds/spack-stage-parallelio-2.6.2-bpi7h2bnkshocep4hl3drfik4ebj44iu/spack-src/src/clib/pio_nc.c at line 1164. See discussion with Erik in posts below. | Completed |
SMS_Ld5.f09_g17.ISSP460Clm50BgcCrop.derecho_intel.clm-ciso_dec2050Start: ERROR: No stream_entry presaero.SSP4-6.0 found. See discussion with Erik in posts below. | Completed |
Solved this one as the smallville cases discussed below: SMS_Ly5_Mmpi-serial.1x1_smallvilleIA.IHistClm51BgcCropQianRs.derecho_intel.clm-gregorian_cropMonthOutput | Completed |
Set use_init_interp = .true. and rerun within test-suite. Tested one of them and it passed. The rest will get tested when I rerun the full test-suite: | Completed |
These two tests passed with use_init_interp = .true. When I get to the finidat task, I will try again without the change but pointing to the generated finidat files. | Completed |
Submitted these two from the test-suite pointing to newly generated finidat files: | Completed |
Submitted these manually (i.e., outside the test-suite). The first one gets the wrong PE layout from P144x1, so I changed it to P64x1: | Completed |
Rerunning within test-suite: | Completed |
Looks similar to something I fixed in #2053. Rerunning within the test-suite with debugging write statements: PEND LWISO_Ld10.f10_f10_mg37.I2000Clm50BgcCrop.derecho_gnu.clm-coldStart.GC.0204-153040de_gnu gives this error: | Completed |
The smallville dynLakes + dynUrban cases PASS with the recommendation from the error msg. I didn't confirm the rest of what's recommended: | Completed |
Two are I2000 cases that do not have fsurdat files; they work as I1850, which also makes them consistent with the spreadsheet. | Incomplete |
Five are IHist cases; they work with check_dynpft_consistency = .false., as I mentioned above for derecho. I tried them again without this change and pointing to the newly generated finidats, and they reverted to the same error asking me to set check_dynpft_consistency = .false. | Incomplete |
submitted aux_clm | Incomplete |
submitted ctsm_sci | Incomplete |
submitted three standalone tests marked as PEND in the long earlier post above. | Incomplete |
FAIL: Skip the endrun for ice1_grc and ice2_grc in the hopes of finding which subgrid term is the culprit. I tried that and found that the simulation actually completes successfully in that case. So on to the next ideas. | Incomplete |
FAIL: This second idea did not come up when I talked to Bill: Remove the "call truncate_small_values" for snocan in case that fixed a fluke error before, and we happen to not need that fix anymore. Bill is skeptical about the longevity of this outcome, even if it pans out. | Incomplete |
IN PROGRESS: Look at the terms making up ice_mass in TotalWaterAndHeatMod.F90's subroutines ComputeLiqIceMassNonLake and ComputeLiqIceMassLake to find the culprit. In doing so, I have found a source of negative ice_mass, likely in snocan. Depending on my findings, I may advocate for eliminating negative snocan as the solution. | Incomplete |
I used the line after the end if as a template for the change. Is there a reason not to use the max function for snocan as we use it for liqcan? | Incomplete |
The test passes; however, my troubleshooting reveals negative ice_mass (in the hundreds of kg/m2) in subr. ComputeLiqIceMassNonLake after subtracting dynbal_baseline_ice(c). I opened #2366 for this. | Incomplete |
Open new issue about fsurdat versioning and put in IF TIME ALLOWS and add "next" | Completed |
I'm unclear whether we want 16pft files or the more versatile 78pft files. | Incomplete |
I think the --vic option does not change the name of the fsurdat file, does it? If not, then the Makefile will overwrite (?) the vic files with the non-vic files (or vice versa). | Incomplete |
1x1_brazil is not among the 16-pft files | Incomplete |
Removed resolutions that should not look for an 1850 file and added a resolution that should look for a 2000 file. | Incomplete |
Moved smallville out of the 2000 cases because it should be looking for an 1850 file. | Incomplete |
1x1_brazil and 5x5_amazon are not among the 16-pft files | Incomplete |
The next two fail in MODEL_BUILD as "intel" but I expect them to work as "gnu" based on a third test that behaved the same way: | Completed |
Then we have | Completed |
SMS_Ld5.f09_g17.ISSP460Clm50BgcCrop.derecho_intel.clm-ciso_dec2050Start: This one complains "ERROR: No stream_entry presaero.SSP4-6.0 found" | Completed |
Solved. | Incomplete |
I updated the post (above) to include possible .nc files involved in the failure. | Incomplete |
I opened an issue in CDEPS. | Incomplete |
Open another issue in CDEPS or does this go somewhere else? | Incomplete |
Yes. TODO open issue. | Completed |
Open another issue in CDEPS or does this go somewhere else? | Incomplete |
Open an issue? | Incomplete |
Logistically what are the steps to resolve it? | Incomplete |
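
The snocan question in the task list above (whether to apply the max function to snocan the way it is applied to liqcan) can be illustrated with a minimal sketch. This is not the CTSM Fortran code: the function names, the list of terms, and the numbers are all hypothetical, and the sketch only shows how clamping one canopy term at zero keeps a summed ice mass from going negative.

```python
def clamp_non_negative(value, floor=0.0):
    """Return value, but never less than floor (the max-function idea)."""
    return max(value, floor)


def total_ice_mass(terms):
    """Sum the individual ice terms into one total (hypothetical helper)."""
    return sum(terms)


# Hypothetical inputs: a tiny negative canopy snow value, as can arise
# from floating-point roundoff, and two other ice terms that are zero.
snocan_raw = -1.0e-12
other_ice_terms = [0.0, 0.0]

# Without clamping, the small negative term makes the total negative.
unclamped_total = total_ice_mass(other_ice_terms + [snocan_raw])

# With clamping, the negative term is replaced by zero before summing.
clamped_total = total_ice_mass(other_ice_terms + [clamp_non_negative(snocan_raw)])
```

Clamping hides the negative value rather than explaining where it comes from, so it would not by itself account for the much larger negative ice_mass (hundreds of kg/m2) reported after subtracting dynbal_baseline_ice(c).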