There's an upcoming PR to introduce the GFDL Land Model (LM4) in UFS (ufs-community/ufs-weather-model#2146). It cannot run with the -DDEBUG=ON flag when using the FMS module provided by spack-stack. There is no issue when using an unoptimized build of FMS 2023.04 compiled with debug flags, but such a build is not available to UFS in the modules.
To Reproduce
# recreate failed test
git clone -b feature/LM4 --recursive git@github.com:JustinPerket/ufs-weather-model.git ufs-LM4
cd ufs-LM4/tests
# change regression test to debug
# currently is: COMPILE | datm_cdeps_lm4 | intel | -DAPP=LND-LM4 | + hera orion gaea | fv3 |
# want:         COMPILE | datm_cdeps_lm4 | intel | -DAPP=LND-LM4 -DDEBUG=ON | + hera orion gaea | fv3 |
sed -i 's|-DAPP=LND-LM4|-DAPP=LND-LM4 -DDEBUG=ON|g' lm4_tests.conf
# run LM4 regression tests, resulting in crash
./rt.sh -k -l lm4_tests.conf
I'll mostly quote @J-Lentz's explanation from email:
because the [release build FMS module] is being used, the calculations inside the where clause in monin_obukhov_solve_zeta are speculatively executed without regard for which indices satisfy the masking condition, and in particular, calculations are performed for indices where division by zero occurs. As long as floating point exceptions are disabled, this is benign because the resulting NaN or infinity values are discarded due to the masking condition. But the FMS code inherits the floating point environment of the main program [UFS], and in particular, if [UFS] is built with the -fpe0 flag, then division by zero in the FMS code will trigger a fatal exception, regardless of whether FMS itself was built with -fpe0.
To avoid this issue, a UFS build with the CMake flag -DDEBUG=ON would need a debug build of FMS to be available from the spack-stack environment. It would be great to see this for the newer FMS version for spack-stack 1.6.0 on Hera and Gaea (#1215).
System:
Reproduced on Hera and Gaea.
Additional context
I tested using my own debug build of FMS, matching the spack-stack lua file options: -DGFS_PHYS=ON -DOPENMP=ON -DENABLE_QUAD_PRECISION=ON -DWITH_YAML=OFF -DCONSTANTS=GFS -D32BIT=ON -D64BIT=ON -DFPIC=ON -DUSE_DEPRECATED_IO=ON
but then added the debug flags -g -O0 -check -check noarg_temp_created -check nopointer -warn -warn noerrors -fpe0 -ftrapuv
Then I unloaded the FMS module and set FMS_ROOT to this build, and the debug UFS-LM4 regression test ran without issue.
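As a rough sketch of that workaround (not the exact commands that were run), the script below only assembles and prints a possible command sequence for a debug FMS build, using the options and Intel debug flags quoted above. The source directory `FMS`, the build directory `build-debug`, the install prefix, and the module name `fms` are assumptions. How the debug flags were actually handed to CMake is not stated here; passing them through `CMAKE_Fortran_FLAGS` is also an assumption.

```shell
#!/usr/bin/env bash
# Dry run: print the commands for a debug FMS build instead of executing them.
set -euo pipefail

# Hypothetical locations, not taken from the issue.
FMS_SRC="${FMS_SRC:-FMS}"
FMS_PREFIX="${FMS_PREFIX:-$PWD/fms-debug}"

# CMake options matching the spack-stack lua file, as quoted above.
fms_opts=(-DGFS_PHYS=ON -DOPENMP=ON -DENABLE_QUAD_PRECISION=ON -DWITH_YAML=OFF
          -DCONSTANTS=GFS -D32BIT=ON -D64BIT=ON -DFPIC=ON -DUSE_DEPRECATED_IO=ON)

# Intel debug flags, as quoted above.
debug_flags="-g -O0 -check -check noarg_temp_created -check nopointer -warn -warn noerrors -fpe0 -ftrapuv"

print_plan() {
  # Assumption: flags are passed via CMAKE_Fortran_FLAGS.
  echo "cmake -S ${FMS_SRC} -B build-debug -DCMAKE_INSTALL_PREFIX=${FMS_PREFIX} ${fms_opts[*]} -DCMAKE_Fortran_FLAGS=\"${debug_flags}\""
  echo "cmake --build build-debug"
  echo "cmake --install build-debug"
  echo "module unload fms   # module name is an assumption"
  echo "export FMS_ROOT=${FMS_PREFIX}"
}

print_plan
```

Printing the plan first makes it easy to review the flags before committing to a long build; the printed lines can then be run by hand on Hera or Gaea.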
Note that because of NOAA-GFDL/FMS#1532, the behavior of CMAKE_Fortran_FLAGS_DEBUG becomes more standard starting with FMS 2024.02. Then the FMS CMake build options I used are simply:
I don't think we want a blanket debug fms for all applications in the unified environment. That can have real consequences on runtime. I am thinking that the correct course of action is to work with the FMS developers to address this problem by coding it differently and/or providing the correct flags and directives (for FMS and/or the UFS) to prevent this from happening when FMS is compiled in release mode. In the meantime, we can absolutely provide a debug FMS version in addition to the default release fms version on dedicated systems in spack-stack 1.8.0 (which will have [email protected]).
Thanks Dom. Unless something else starts using FMS's Monin Obukhov interface, this seems like the issue is limited to the GFDL LM4. Perhaps in my upcoming PR, I could tweak UFS's CMakeLists.txt to use debug-built FMS libraries only if UFS is also debug, and there's a LM4 app?
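A minimal sketch of that selection rule, written as shell purely for illustration (the real change would live in UFS's CMakeLists.txt, which is not shown here). The function name `choose_fms_root` and both root paths are hypothetical.

```shell
#!/usr/bin/env bash
# Illustration of the proposed rule: use a debug FMS only for debug LM4 builds.
set -euo pipefail

# Hypothetical install locations for the two FMS builds.
FMS_RELEASE_ROOT="${FMS_RELEASE_ROOT:-/path/to/fms-release}"
FMS_DEBUG_ROOT="${FMS_DEBUG_ROOT:-/path/to/fms-debug}"

# choose_fms_root DEBUG APP
# Prints the debug root only when DEBUG is ON and the app name contains LM4;
# every other combination keeps the default release build.
choose_fms_root() {
  local debug="$1" app="$2"
  if [[ "$debug" == "ON" && "$app" == *LM4* ]]; then
    echo "$FMS_DEBUG_ROOT"
  else
    echo "$FMS_RELEASE_ROOT"
  fi
}

choose_fms_root ON  LND-LM4   # debug LM4 build: debug FMS
choose_fms_root ON  ATM       # debug build without LM4: release FMS
choose_fms_root OFF LND-LM4   # release LM4 build: release FMS
```

Scoping the switch this way would leave every other application on the release FMS, so the runtime cost of a debug FMS is limited to debug LM4 builds.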
The resulting crash occurs in this where statement within FMS monin_obukhov interface:
https://github.com/NOAA-GFDL/FMS/blob/7f585284/monin_obukhov/include/monin_obukhov_inter.inc#L227