MKL, CUDA and CPU affinity

When the USE_CUDA flag is set to true for compiling and running XTP, the library distributes some matrix multiplications across the NVIDIA GPU and the available CPUs.

If MKL was not enabled at compile time, you may notice that performance degrades as more OpenMP threads are used. Pinning each OpenMP thread to a CPU core solves this. If you are using GCC, you can pin the threads with the GOMP_CPU_AFFINITY environment variable. If you are using the Intel compiler, have a look at the KMP_AFFINITY environment variable.
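As a rough illustration of the pinning described above, the following Python sketch builds an environment with the affinity variables set. The helper name `pinned_env`, the choice of cores `0..n-1`, and the `granularity=fine,compact` value for KMP_AFFINITY are assumptions made for this example and are not taken from XTP itself; adjust them to your machine's topology.

```python
import os


def pinned_env(n_threads, compiler="gcc", base_env=None):
    """Return a copy of the environment with OpenMP thread pinning set.

    Hypothetical helper: the core list and affinity values are example
    choices, not settings prescribed by XTP.
    """
    if n_threads < 1:
        raise ValueError("n_threads must be at least 1")

    # Copy so the caller's environment is left untouched.
    env = dict(os.environ if base_env is None else base_env)
    env["OMP_NUM_THREADS"] = str(n_threads)

    if compiler == "gcc":
        # libgomp reads GOMP_CPU_AFFINITY as a list of CPU ids or ranges;
        # here thread i is bound to core i (assumed layout).
        cores = "0" if n_threads == 1 else f"0-{n_threads - 1}"
        env["GOMP_CPU_AFFINITY"] = cores
    elif compiler == "intel":
        # One commonly used Intel OpenMP runtime setting (assumed choice).
        env["KMP_AFFINITY"] = "granularity=fine,compact"
    else:
        raise ValueError(f"unknown compiler: {compiler!r}")
    return env


if __name__ == "__main__":
    env = pinned_env(4)
    print("OMP_NUM_THREADS =", env["OMP_NUM_THREADS"])
    print("GOMP_CPU_AFFINITY =", env["GOMP_CPU_AFFINITY"])
```

The returned dictionary can be passed as the `env` argument of `subprocess.run` when launching your XTP command, so the pinning applies only to that process rather than to the whole shell session.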
