Workaround for transient Smallville tests #1673 + testing all new datasets #2318

Merged
merged 56 commits into from
Feb 16, 2024

Updating expected failures (+ adding an unrelated comment)

494ad7c

Task list completed / task-list-completed Started 2024-02-16 00:06:42

26 / 28 tasks completed

2 tasks still to be completed

Required Tasks

Task Status
I want to test the Makefile with this PR's changes. Run make all-subset for this. Completed
I would like to update the Makefile with all the grids that we need to generate. I might as well do that now because this is the last PR before I will generate all the datasets. Let me know if you disagree. Completed
Working in scratch to avoid running out of space: /glade/derecho/scratch/slevis/temp_work/ctsm5.2.mksurfdata/tools/mksurfdata_esmf Incomplete
4 jobs still in the queue Incomplete
On Monday look at which datasets worked and which didn't Incomplete
f10 hist now runs out of wallclock Incomplete
mpasa120, C96, f09, f19, hcru, ne16, ne30, ne3 SSPs now run out of wallclock Incomplete
Looks as though I have not generated explicit 1850-2015 landuse files except for f10, f45, ne3 (i.e. low-res). Although 1850-2015 is included in the SSP files that I generated, I think we decided to generate separate 1850-2015 files, so as to identify them easily. I will go back and do that. DONE, but we have now concluded that SSP2-4.5 will suffice. Completed
ne30 fsurdat files (1850, 2000) did not get generated, while landuse files were generated for 2 ne30 grids that we do not need. I will resolve the former and will not worry about the latter. Completed
surfdata_1x1_brazil...1850-2015...nc + numaIA + smallville should be renamed 1850. Completed
Related to the last point: Brazil and Numa should have been 2000 and Brazil also needs historical. Correct how files are named. Generate the files. Completed
1x1_numa needs a 1850-2015 landuse file. DONE, but it turns out we didn't need it. Completed
Change the potveg file name to surfdata_0.9x1.25_hist_PtVeg_nourb_16pfts_cXXXXXX.nc, and in the landuse file names change 78 to 78pfts. Correct how they are named. Completed
Files in /python/ctsm/test/testinputs may not need updating, unless we want to rename... Completed
fsurdat DONE Incomplete
landuse DONE Incomplete
1st and 2nd attempts, results below Incomplete
Many 1x1 cases failed due to the wrong date stamp in namelist_defaults. I have updated this, tested one of the cases, and it worked. Incomplete
I am concerned about a large number of failures with this error: Incomplete
The error is the same even when I use COLDSTART, so nothing to do with init_interp. Incomplete
Looking at the fsurdat file with ncview, I am not spotting a problem. Incomplete
I am thinking of ways to converge on the problem using git diff strategically, hoping to avoid time consuming debugging... Incomplete
UPDATE: Solved by setting convert_ocean_to_land = .true., and I think I need to make this the default option for tests to pass in general. Incomplete
Four grids not tested: f10, ne16pg3, mpasa60-3conus, mpasa60-3centralUS. I added the first two. The third and fourth are not yet supported in ccs_config. Incomplete
f09 appears in 23 tests of the ctsm_sci test-suite and f19 appears in 19. Is this necessary, or could we reduce this to just a couple, the way we do for other grids? Incomplete
Two cases fail in the build the same way (another like this appears in aux_clm below). I do not recognize the problem. See discussion with Erik in posts below. Completed
RXCROPMATURITY_Lm61.f09_g17.IHistClm50BgcCrop.derecho_intel.clm-cropMonthOutput
seems broken, because it modifies an fsurdat but then the model gives an error with it. Sam R. opened issue #2357, and I am running with the fix. Completed
SMS_Ld12_Mmpi-serial.1x1_urbanc_alpha.I1PtClm51SpRs.derecho_intel.clm-output_sp_highfreq
NetCDF: Variable not found in file /glade/derecho/scratch/csgteam/temp/spack/derecho/23.09/builds/spack-stage-parallelio-2.6.2-bpi7h2bnkshocep4hl3drfik4ebj44iu/spack-src/src/clib/pio_nc.c at line 1164
See discussion with Erik in posts below. Completed
SMS_Ld5.f09_g17.ISSP460Clm50BgcCrop.derecho_intel.clm-ciso_dec2050Start
ERROR: No stream_entry presaero.SSP4-6.0 found See discussion with Erik in posts below. Completed
Solved this one the same way as the smallville cases discussed below:
SMS_Ly5_Mmpi-serial.1x1_smallvilleIA.IHistClm51BgcCropQianRs.derecho_intel.clm-gregorian_cropMonthOutput Completed
Set use_init_interp = .true. and rerun within test-suite. Tested one of them and it passed. The rest will get tested when I rerun the full test-suite: Completed
These two tests passed with use_init_interp = .true.. When I get to the finidat task, I will try again without the change but pointing to the generated finidat files. Completed
Submitted these two from the test-suite pointing to newly generated finidat files: Completed
Submitted these manually (i.e., outside the test-suite). The first one gets the wrong PE layout from P144x1, so I changed it to P64x1: Completed
Rerunning within test-suite: Completed
Looks similar to something I fixed in #2053. Rerunning within the test-suite with debugging write statements:
PEND LWISO_Ld10.f10_f10_mg37.I2000Clm50BgcCrop.derecho_gnu.clm-coldStart.GC.0204-153040de_gnu gives this error: Completed
The smallville dynLakes + dynUrban cases PASS with the recommendation from the error message. I didn't confirm the rest of what's recommended: Completed
Two are I2000 cases that do not have fsurdat files; they work as I1850, which also makes them consistent with the spreadsheet. Incomplete
Five are IHist cases; they work with check_dynpft_consistency = .false., as I mentioned above for derecho. Tried them again without this change and pointing to the newly generated finidats, which reverted to the same error asking me to set check_dynpft_consistency = .false.. Incomplete
submitted aux_clm Incomplete
submitted ctsm_sci Incomplete
submitted three standalone tests marked as PEND in the long earlier post above. Incomplete
FAIL: Skip the endrun for ice1_grc and ice2_grc in the hopes of finding which subgrid term is the culprit. I tried that and found that the simulation actually completes successfully in that case. So on to the next ideas. Incomplete
FAIL: This second idea did not come up when I talked to Bill: Remove the "call truncate_small_values" for snocan in case that fixed a fluke error before, and we happen to not need that fix anymore. Bill is skeptical about the longevity of this outcome, even if it pans out. Incomplete
IN PROGRESS: Look at the terms making up ice_mass in TotalWaterAndHeatMod.F90's subroutines ComputeLiqIceMassNonLake and ComputeLiqIceMassLake to find the culprit. In doing so, I have found a source of negative ice_mass, likely in snocan. Depending on my findings, I may advocate for eliminating negative snocan as the solution. Incomplete
I used the line after the end if as a template for the change. Is there a reason not to use the max function for snocan as we use it for liqcan? Incomplete
The test passes; however, my troubleshooting reveals negative ice_mass (in the hundreds of kg/m2) in subroutine ComputeLiqIceMassNonLake after subtracting dynbal_baseline_ice(c). I opened #2366 for this. Incomplete
Open new issue about fsurdat versioning and put in IF TIME ALLOWS and add "next" Completed
I'm unclear whether we want 16pft files or the more versatile 78pft files. Incomplete
I think the --vic option does not change the name of the fsurdat file, does it? If not, then the Makefile will overwrite (?) the vic files with the non-vic files (or vice versa). Incomplete
1x1_brazil is not among the 16-pft files Incomplete
Removed resolutions that should not look for an 1850 file and added a resolution that should look for a 2000 file. Incomplete
Moved smallville out of the 2000 cases because it should be looking for an 1850 file. Incomplete
1x1_brazil and 5x5_amazon are not among the 16-pft files Incomplete
The next two fail in MODEL_BUILD as "intel" but I expect them to work as "gnu" based on a third test that behaved the same way: Completed
Then we have Completed
SMS_Ld5.f09_g17.ISSP460Clm50BgcCrop.derecho_intel.clm-ciso_dec2050Start
This one complains: ERROR: No stream_entry presaero.SSP4-6.0 found Completed
Solved. Incomplete
I updated the post (above) to include possible .nc files involved in the failure. Incomplete
I opened an issue in CDEPS. Incomplete
Open another issue in CDEPS or does this go somewhere else? Incomplete
Yes. TODO open issue. Completed
Open another issue in CDEPS or does this go somewhere else? Incomplete
Open an issue? Incomplete
Logistically what are the steps to resolve it? Incomplete
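
On the question above about how often f09 and f19 appear in ctsm_sci, a per-grid tally of test names can be sketched as below. This is only a rough sketch, not part of the PR: it assumes test names follow the dot-separated TESTTYPE.GRID.COMPSET.MACHINE_COMPILER[.TESTMODS] layout seen in the names quoted in this task list, and that the grid alias itself contains no dot. The helper names grid_of and tally_grids are hypothetical, and the sample list is just the four test names quoted above, not the actual ctsm_sci suite.

```python
from collections import Counter


def grid_of(test_name):
    """Return the grid field (second dot-separated field) of a test name.

    Assumes the layout TESTTYPE.GRID.COMPSET.MACHINE_COMPILER[.TESTMODS].
    """
    fields = test_name.split(".")
    if len(fields) < 4:
        raise ValueError(f"unexpected test name: {test_name!r}")
    return fields[1]


def tally_grids(test_names):
    """Count how many test names use each grid."""
    return Counter(grid_of(name) for name in test_names)


# Sample input: the test names quoted in this task list (not the full suite).
tests = [
    "RXCROPMATURITY_Lm61.f09_g17.IHistClm50BgcCrop.derecho_intel.clm-cropMonthOutput",
    "SMS_Ld12_Mmpi-serial.1x1_urbanc_alpha.I1PtClm51SpRs.derecho_intel.clm-output_sp_highfreq",
    "SMS_Ld5.f09_g17.ISSP460Clm50BgcCrop.derecho_intel.clm-ciso_dec2050Start",
    "SMS_Ly5_Mmpi-serial.1x1_smallvilleIA.IHistClm51BgcCropQianRs.derecho_intel.clm-gregorian_cropMonthOutput",
]

counts = tally_grids(tests)
for grid, n in counts.most_common():
    print(grid, n)
```

Feeding it the real list of ctsm_sci test names would give the per-grid counts needed to decide which f09 and f19 tests could be trimmed.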