Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Remove most special handling of MKL in CMake configuration #1149

Merged
merged 32 commits into from
May 31, 2024

Conversation

msimberg
Copy link
Collaborator

@msimberg msimberg commented May 28, 2024

  • From version 0.5 onwards the spack package uses spack's blas/lapack libraries
  • The DLAF_WITH_MKL_LEGACY option is completely removed
  • MKL is auto-detected

This is currently untested, but early feedback more than welcome and the general structure is in place!

The first version of the MKL auto-detection simply uses try_compile with LAPACK_LIBRARY and a call to mkl_set_num_threads_local. It uses the result of this try_compile as the default value of DLAF_WITH_MKL. It's important to note that currently this will only be tested on the first configuration, so if the first configuration is wrong, and one then changes LAPACK_LIBRARY to have the correct paths, DLAF_WITH_MKL will still be off.

On the other hand, the above method means that one can simply set DLAF_WITH_MKL manually to on if one expects MKL to be there, and compilation will then fail if it's not there (instead of quietly disabling it).

The logic could be changed/improved to have different behaviour in the above case.

An alternative method is to simply use __has_include in single_threaded_blas.cpp. This doesn't allow overriding and is silent in either case (MKL found or not found).

@msimberg msimberg added this to the v0.5.0 milestone May 28, 2024
@msimberg msimberg self-assigned this May 28, 2024
@msimberg
Copy link
Collaborator Author

cscs-ci run

CMakeLists.txt Outdated Show resolved Hide resolved
ci/.gitlab-ci.yml Outdated Show resolved Hide resolved
@msimberg
Copy link
Collaborator Author

cscs-ci run

Copy link
Collaborator

@rasolca rasolca left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The include directory is needed, otherwise non-spack builds are not possible.

Copy link
Collaborator

@albestro albestro left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@msimberg
Copy link
Collaborator Author

cscs-ci run

CMakeLists.txt Outdated Show resolved Hide resolved
@msimberg msimberg requested a review from albestro May 29, 2024 12:32
@msimberg
Copy link
Collaborator Author

I've changed back the CUDA CI to use -Werror=all-warnings since we now don't get linker warnings with the newest version of Umpire. We will still get the warnings on builds with older versions of Umpire, but outside of CI those warnings should not be fatal. I opened #1154 as a reminder to potentially look at the other warnings.

Copy link
Collaborator

@rasolca rasolca left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

if and how to cache the try_compile result for detecting MKL

Not sure... maybe @albestro has an idea?

CMakeLists.txt Outdated Show resolved Hide resolved
@albestro
Copy link
Collaborator

if and how to cache the try_compile result for detecting MKL

Not sure... maybe @albestro has an idea?

I would say that it might be nicer to not cache, so that any change to LAPACK_LIBRARY/LAPACK_INCLUDE_DIR will be reflected also in the MKL_TRY_COMPILE result.

Otherwise, let's suppose we do this sequence

  • cmake -DLAPACK_LIBRARY=<working-config> ...
  • cmake -DLAPACK_LIBRARY=<non-working-config> ...

with the cached result for MKL we might end up getting BLAS/LAPACK error, because of the check_function_exists in the FindLAPACK that explicitly clear the cache, but at the same time we get that MKL is found correctly (because it is checked just at the first time).

So, I would suggest to do the same we do for LAPACK in FindLAPACK, i.e. unset(DLAF_WITH_MKL_TRY_COMPILE CACHE). And, trivially, in the future start using NO_CACHE where possible (apparently up to 3.29 it is possible for try_compile but not for check_function_exists).

@msimberg
Copy link
Collaborator Author

if and how to cache the try_compile result for detecting MKL

Not sure... maybe @albestro has an idea?

I would say that it might be nicer to not cache, so that any change to LAPACK_LIBRARY/LAPACK_INCLUDE_DIR will be reflected also in the MKL_TRY_COMPILE result.

Otherwise, let's suppose we do this sequence

* `cmake -DLAPACK_LIBRARY=<working-config> ...`

* `cmake -DLAPACK_LIBRARY=<non-working-config> ...`

with the cached result for MKL we might end up getting BLAS/LAPACK error, because of the check_function_exists in the FindLAPACK that explicitly clear the cache, but at the same time we get that MKL is found correctly (because it is checked just at the first time).

So, I would suggest to do the same we do for LAPACK in FindLAPACK, i.e. unset(DLAF_WITH_MKL_TRY_COMPILE CACHE). And, trivially, in the future start using NO_CACHE where possible (apparently up to 3.29 it is possible for try_compile but not for check_function_exists).

Currently on this branch, the try_compile result is used as the default value for DLAF_WITH_MKL. This is done with the idea that DLAF_WITH_MKL can be set explicitly by a user to signal that they expect MKL to be used or not used (leading to compilation failures if MKL isn't actually set up correctly). In this case it doesn't make much difference if DLAF_WITH_MKL_TRY_COMPILE is a cache variable or not, and if it's reset or not because it'll only be used once to set the default.

I think if we reset and redetect MKL every time then DLAF_WITH_MKL_TRY_COMPILE should just be DLAF_WITH_MKL. I haven't tried it yet, but in this case I think we can simply let try_compile create the variable. If the user has already set it the try_compile should be skipped (but as I said, not verified). Then a user can -UDLAF_WITH_MKL to redetect it if they don't want to set it explicitly.

As a last option, we try_compile into DLAF_WITH_MKL and don't cache the variable. This, however, means that the user can't override the option (as far as I can think of...).

@rasolca
Copy link
Collaborator

rasolca commented May 29, 2024

Not sure what to say... just realized that auto-detection might have false positive if the mkl include directory is present (e.g. if using mkl fftw).
So the possibility for the user should be available.
I think it should work like this:

DLAF_WITH_MKL: is a cached value set by user

DLAF_WITH_MKL try_compile -DDLAF_WITH_MKL
ON not run added
OFF not run skipped
Not defined OK added
Not defined FAIL skipped

If a variable is also needed in the rest of the cmake scripts I would use a different name.

@msimberg
Copy link
Collaborator Author

cscs-ci run

@msimberg
Copy link
Collaborator Author

I think it should work like this:

DLAF_WITH_MKL: is a cached value set by user
DLAF_WITH_MKL try_compile -DDLAF_WITH_MKL
ON not run added
OFF not run skipped
Not defined OK added
Not defined FAIL skipped

If a variable is also needed in the rest of the cmake scripts I would use a different name.

This sounds pretty reasonable. I made an attempt at implementing this logic in d5a7ae4.

jst realized that auto-detection might have false positive if the mkl include directory is present (e.g. if using mkl fftw).

Yeah, I was worried about something like this as well. Without intending to I think this now covers that quite reasonably:

if(DLAF_WITH_MKL)
# When using MKL there is no need to set the number of threads with
# omp_set_num_threads; it's sufficient to use MKL's own mechanisms.
set(DLAF_WITH_OPENMP OFF CACHE BOOL "${DLAF_WITH_OPENMP_DESCRIPTION}" FORCE)
. When DLAF_WITH_MKL is explicitly enabled we can safely disable OpenMP. If DLAF_WITH_MKL is undefined that first branch is always false so we'll keep OpenMP enabled (if it wasn't changed by the user). In the worst case we end up calling both omp_set_num_threads and mkl_set_num_threads, but we won't end up in the situation where we think we're using MKL, but we're not setting omp_set_num_threads.

In the spack package we set DLAF_WITH_MKL explicitly so there we only end up with one or the other, never both, enabled.

At least this is how I think it's working now and I think it seems reasonable, but not 100% sure.

I think warning/status messages can still be tweaked, but I'll leave that up to you to comment about.

@msimberg
Copy link
Collaborator Author

cscs-ci run

@msimberg msimberg mentioned this pull request May 30, 2024
CMakeLists.txt Show resolved Hide resolved
Copy link
Member

@RMeli RMeli left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, thanks!

ci/common-ci.yml Outdated Show resolved Hide resolved
cmake/FindLAPACK.cmake Outdated Show resolved Hide resolved
CMakeLists.txt Outdated Show resolved Hide resolved
cmake/template/DLAFConfig.cmake.in Show resolved Hide resolved
@msimberg
Copy link
Collaborator Author

cscs-ci run

@rasolca rasolca requested a review from albestro May 31, 2024 12:08
@rasolca rasolca merged commit 62718b9 into eth-cscs:master May 31, 2024
5 checks passed
@msimberg msimberg deleted the mkl-spack-cmake branch May 31, 2024 12:19
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Priority:Urgent Type:Bug Something isn't working
Projects
Archived in project
Development

Successfully merging this pull request may close these issues.

4 participants