Changes to build UFS-WM on MacOS platform with clang@15/[email protected] #2551
base: develop
@grantfirl new fv3 hash is NOAA-EMC/fv3atm@7d99880 |
no changes from develop branch for the build.sh script
Successfully tested the build of the code (S2SWA) with the hash 7d99880 on NOAA AWS MacOS instance. |
I can't review this because I don't have a mac to test on.
|
@natalie-perlin do you think you can combine ufs_macosx.gnu into ufs_macosx.gnu.lua? I mean like https://github.com/ufs-community/ufs-weather-model/blob/develop/modulefiles/ufs_wcoss2.intel.lua. Also, some test results or instructions would be helpful for people using a Mac, even in sequential mode. |
@natalie-perlin I've been able to build on my system (m2, sonoma 14.5) with a few modifications for uwm (instead of srw). I'm having problems actually building w/ the stack though, somewhere in the module loading (?). I've merged in your branch w/ a few changes like so:
Steps to build:
Which gives
|
@natalie-perlin I'm not sure about the thumbs up? Is that for the build vs the actual compile using the build (which is failing)... |
@DeniseWorthen - thank you for testing!..
Keeping the ufs_macosx.gnu as a bash file and not converting it to a *.lua modulefile (as Jong @jkbk2004 suggested, I think) also solves the issue with environment variables being available in a bash modulefile, but not in a *.lua file: |
@barlage - Are you using the older UFS model code, or the code from this PR?.. So yes, the previous discussion resolved that issue, but the current PR is the one that addresses and implements these changes, in particular to ./ufs-weather-model/CMakeLists.txt |
@natalie-perlin I also added to the end of the previous comment, i.e., it looks like the CXX linker is being used. Here's my set-up:
|
@barlage - Is it possible that other additional installed libraries are getting in the way? Contrary to the sequence of errors in #2371 (comment), there is no need to install llvm_openmp or to use the -DCMAKE_SHARED_LINKER_FLAGS="${llvm_openmp_ROOT}/lib/libomp.dylib" flag. |
@barlage - is there a chance to look at the spack-stack-1.8.0 build log, and at the ufs_model build log (with the BUILD_VERBOSE=1 option), to see which paths and libraries are being linked against? |
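A minimal sketch of how the link paths could be pulled out of such a verbose build log. The helper name `list_link_dirs` and the log file name are assumptions for illustration, not part of the PR; the helper only collects whitespace-separated tokens of the form `-L<dir>`.

```shell
# Hypothetical helper: list the unique -L<dir> search paths found in a build log.
# Usage: list_link_dirs <logfile>
list_link_dirs() {
    # Put every whitespace-separated token on its own line, keep the ones
    # that start with -L, strip the -L prefix, and de-duplicate.
    tr -s ' \t' '\n\n' < "$1" | grep '^-L.' | sed 's/^-L//' | sort -u
}

# Example (assumes the build was run with BUILD_VERBOSE=1, as asked above):
#   BUILD_VERBOSE=1 ./build.sh 2>&1 | tee log.build.ufs.001
#   list_link_dirs log.build.ufs.001
```

On macOS, `otool -L <executable>` additionally shows which dynamic libraries a built executable is linked against.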
I did not use any of these previous CMAKE flags. I'll rebuild ufs_weather_model and upload the build logs. |
Adding the logs from the SRW runs that are based on ATM-only UFS-WM, one machine uses openmpi/4.1.6 , another uses openmpi/5.0.3: x86_64, Sonoma OS, openmpi/4.1.6: SRW_log.fcst.001.txt |
Thank you!.. Let me look through the logs for any clues. |
@barlage - Thank you for the logs!.. They look totally fine and as expected. I've found the culprit that changes the runtime outcome: an additional LDFLAGS entry, " -Wl,-no_compact_unwind", added at a later stage in my testing. The flag "-Wl,-no_compact_unwind" prevents the ld warnings during the build, but results in the runtime error. If it is not used for linking, several warnings are generated, but the assembled ufs_model executable works fine. There may still be some other compiler/linker flags, other than "-Wl,-no_compact_unwind", that prevent the numerous linker warnings, but so far the warnings are not causing any issues at runtime. Will update my PR momentarily, after a couple more things to try. The bottom line: the ufs_macosx.gnu.lua should have the following (note there is no ldflags_add variable used):
|
What does this flag actually do at load time?
…On Wed, Jan 15, 2025 at 5:15 PM Natalie Perlin wrote:
As for the SRW, even if this flag is used, all other binaries except for the ufs_model work fine as well.
The bottom line, the ufs_macosx.gnu.lua should have the following (note there is no ldflags_add variable used):
-- Install prefixes of the image libraries, read from the environment
local libjpeg_ROOT = os.getenv("libjpeg_turbo_ROOT")
local jasper_ROOT = os.getenv("jasper_ROOT")
local libpng_ROOT = os.getenv("libpng_ROOT")
-- Start from any LDFLAGS already set (empty string if unset)
local ldflags0 = os.getenv("LDFLAGS") or ""
-- Append search paths and rpaths only when all three prefixes are defined
if jasper_ROOT and libpng_ROOT and libjpeg_ROOT then
  local ldflags1 = " -L" .. libjpeg_ROOT .. "/lib -ljpeg -Wl,-rpath," .. libjpeg_ROOT .. "/lib"
  local ldflags2 = " -L" .. jasper_ROOT .. "/lib -ljasper -Wl,-rpath," .. jasper_ROOT .. "/lib"
  local ldflags3 = " -L" .. libpng_ROOT .. "/lib -lpng -Wl,-rpath," .. libpng_ROOT .. "/lib"
  -- No ldflags_add term here: it is not defined and must not be concatenated
  local ldflags = ldflags0 .. ldflags1 .. ldflags2 .. ldflags3
  setenv("LDFLAGS", ldflags)
end
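Since keeping ufs_macosx.gnu as a bash file was also discussed in this thread, here is a rough sketch of the same LDFLAGS logic in shell form. It is an illustration only, not code from the PR: the function name `add_image_lib_ldflags` is hypothetical, and it assumes the three `*_ROOT` variables are exported by the loaded stack, as in the Lua snippet.

```shell
# Hypothetical shell counterpart of the Lua LDFLAGS block above.
add_image_lib_ldflags() {
    # Like the Lua `if`: do nothing unless all three install prefixes are set.
    if [ -n "${libjpeg_turbo_ROOT:-}" ] && [ -n "${jasper_ROOT:-}" ] && [ -n "${libpng_ROOT:-}" ]; then
        # Start from any existing LDFLAGS (empty if unset), then append
        # the library path, library name and rpath for each package.
        LDFLAGS="${LDFLAGS:-}"
        LDFLAGS="${LDFLAGS} -L${libjpeg_turbo_ROOT}/lib -ljpeg -Wl,-rpath,${libjpeg_turbo_ROOT}/lib"
        LDFLAGS="${LDFLAGS} -L${jasper_ROOT}/lib -ljasper -Wl,-rpath,${jasper_ROOT}/lib"
        LDFLAGS="${LDFLAGS} -L${libpng_ROOT}/lib -lpng -Wl,-rpath,${libpng_ROOT}/lib"
        export LDFLAGS
    fi
}
```

Calling `add_image_lib_ldflags` after the stack modules are loaded would extend LDFLAGS in the current shell; no `-Wl,-no_compact_unwind` term is added, in line with the note above.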
|
The flag "-Wl,-no_compact_unwind" acts at link time, or rather, it should not be used in this case. Below is what I found about the warnings generated at the linking stage when this flag is not used. All of them are related to the additional CXX libraries required by ESMF. Note that all ESMF unit tests pass successfully, as they use the mpicxx linker.
The warning:
(*) Compact unwind information is a data structure used by macOS for efficient stack unwinding during:
Why the Warning Occurs:
What Are the Consequences?
-- Crashes or undefined behavior when exceptions are thrown.
How to Resolve the Warning
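If the flag might still be carried over from an earlier test environment, one option is to filter it out of LDFLAGS before building. The sketch below is only an illustration (the helper name `strip_no_compact_unwind` is hypothetical, and it assumes flags are separated by whitespace and contain no glob characters). It does not silence the linker warnings; it only keeps the flag reported above as causing the runtime error out of the link line.

```shell
# Hypothetical helper: print the given flag string with every
# -Wl,-no_compact_unwind token removed.
strip_no_compact_unwind() {
    out=""
    for tok in $1; do
        case "$tok" in
            -Wl,-no_compact_unwind) ;;       # drop the offending flag
            *) out="${out:+$out }$tok" ;;    # keep every other token
        esac
    done
    printf '%s\n' "$out"
}

# Example:
#   LDFLAGS=$(strip_no_compact_unwind "${LDFLAGS:-}")
#   export LDFLAGS
```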
|
Good news! I just ran the first ever successful UFS simulation on my Mac laptop with the most recent @natalie-perlin commit (55e52e5), the only modification being the system: M3/Sonoma 14.7.2/Xcode 15.4/CLT 15.3 |
@barlage That's great! I'm inspired... A question--do you get innumerable pop up windows saying something about allowing incoming connections (something like that) when you run? |
@DeniseWorthen I didn't get anything like that with this simulation. |
@barlage - thank you so much for confirming it worked well for you!! |
@natalie-perlin I was able to compile on my system after your changes. I will try this weekend to actually run. |
@natalie-perlin I'm now able to build and run the ultra-low resolution (C12-9deg) coupled configuration I've been working on. This is for my M2, 14.5 studio (12 cores). There are some issues though.... I ended up rebuilding my 1.8, mostly because I ended up w/ two cmake installations and I was not able (even w/ spack uninstall /hash) to resolve it. In my rebuild, I believe I essentially was able to do everything in the instructions, w/ a few tweaks to the site/packages to align w/ uwm (eg not building metplus) Here are the issues I'm still seeing:
But the compile still succeeds so I'm not sure what exactly the issue is.
I also have available a DATM-S2SW configuration. I'll try w/ that tomorrow. |
Denise, thanks for testing! To answer your questions -
MacA/x86_64, Sonoma:
MacC/M2, Ventura:
As to testing the configuration builds, I have tested the following:
|
Thanks. The new C12-9deg configuration for S2S is meant to run on only 11 tasks, even on HPCs. It's obviously not designed to test science, but thought it might be small enough for what I have. EDIT: I'm not sure what I did differently, but I am now able to run a 1-day C12-9deg coupled configuration
|
@natalie-perlin I've also been able to build and run 1-day of my C12-9deg using my M4 (Sequoia 15.2, clang 16.0.0). I did set all the RELEASE flags to be -O1 to get it to compile. |
@DeniseWorthen it seems you've had success, but for completeness, my 1-day C24 ATM run took about 1 minute to run. I'm using the minimum (I believe) 7 processors and writing output every 20 minute time step, but only a few variables. |
@barlage Yes, I did reboot my system (for a different reason) but I'm not sure whether that was the "fix". Also, these are the pop-ups I see. I believe it has something to do w/ code-signing. It seems like a "sticky" setting for the same executable name, so that if I allow it the first time, I don't get further popups. |
@natalie-perlin The only issue I saw w/ my M4 install were a lot of messages of this type:
|
@DeniseWorthen - note that a PR to fix the openmpi module issue has been submitted: JCSDA/spack-stack#1465 |
@natalie-perlin Not to get ahead of ourselves, but do you have any ideas about the |
At which point of preparing system dependencies for spack-stack or spack-stack installation steps ( as outlined in GoogleDoc ) does this message appear? What system is used (M4, OS ...?, clang 16.0)? |
@natalie-perlin This was for M4 (Sequoia 15.2, clang 16.0.0), basically following the same steps as outlined in the document. The message appears 576 times in the install.log. I've uploaded the log to this location: https://drive.google.com/file/d/1ivZjq95Wp1Qo1fJHXlVDB61Xgmt9TLfF/view?usp=drive_link |
Description:
Updates to build UFS WM on MacOSX platforms, Ventura or Sonoma OS, [email protected], [email protected]
openmpi/5.0.3 (or 4.1.6, tested as well) is built as a part of the spack-stack-1.8.0.
Tested on three MacOS systems:
A: x86_64, Sonoma OS 14.7.2, XCode 15.4, [email protected], [email protected]
B: M1 , Sonoma OS 14.7.2, XCode 15.4, [email protected], [email protected]
C: (NOAA AWS MacOS instance) M2, Ventura OS 13.6.9, xcode-select command-line tools only; [email protected], [email protected]
Files changed with the options added for MacOS:
Running of the UFS-WM was tested as a part of the UFS-SRW App, successfully ran a standard community test. A corresponding PR in the UFS-SRW repo:
ufs-community/ufs-srweather-app#1171
TESTING THE BUILD:
NB: Set the path of your local spack-stack environment location as the env. variable stackpath in ./modulefiles/ufs_macosx.gnu.lua, and adjust versions of packages/modules if needed.
i) load the modulefiles:
ii) set env. variable:
export CMAKE_FLAGS="-DAPP= ... -DCCPP_SUITES="
and
iii) running the ./build.sh script, i.e.:
./build.sh 2>&1 | tee log.build.ufs.001
Or using the ./tests/compile.sh script, e.g.:
Log files from MacOS systems, built using ./compile.sh script
System A:
MacA.build_log.s2swa.txt
MacA_compile_s2swa.gnu_time.log.txt
MacA.modules.fv3_s2swa.gnu.lua.txt
System C:
MacC.build_log.s2swa.txt
MacC.compile_s2swa.gnu_time.log.txt
MacC.modules.fv3_s2swa.gnu.lua.txt
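For step ii) of the build instructions above, the CMAKE_FLAGS string could be assembled with a small helper that refuses empty values. This is a sketch under assumptions: `make_cmake_flags` is a hypothetical name, and the suite name in the example is a placeholder, not a value taken from this PR.

```shell
# Hypothetical helper: build the CMAKE_FLAGS string from an application
# name and a comma-separated list of CCPP suites.
make_cmake_flags() {
    # Fail early if either value is missing, instead of passing an
    # incomplete -DAPP= or -DCCPP_SUITES= to the build.
    if [ -z "${1:-}" ] || [ -z "${2:-}" ]; then
        echo "usage: make_cmake_flags <APP> <CCPP_SUITES>" >&2
        return 1
    fi
    printf '%s\n' "-DAPP=$1 -DCCPP_SUITES=$2"
}

# Example (S2SWA is the app tested in this PR; my_suite is a placeholder):
#   CMAKE_FLAGS=$(make_cmake_flags S2SWA my_suite) && export CMAKE_FLAGS
#   ./build.sh 2>&1 | tee log.build.ufs.001
```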
Priority:
(TBD)
Git Tracking
No regular testing of UFS-WM on MacOS systems is being done
This PR addresses the issues from #2371,
and uses the solution proposed.
UFSWM Blocking Dependencies:
Uses spack-stack-1.8.0
#2453
except for using mapl-2.40.3-esmf-8.6.0 required for the current ufs-wm build
At the moment, all the modules are loaded in the ufs_macosx.gnu.lua file; ufs_common.lua not used
Changes
Library Changes/Upgrades:
Directions for building spack-stack
Detailed directions to build spack-stack-1.8.0 with the software versions used in this PR:
https://docs.google.com/document/d/1Z0L7eujZGtyeZRzcgguyZPsZpkwb2Om7UhFqQPVtxnE/edit?usp=sharing
Testing Log: