ocl-misc: adjusted some defaults and updated documentation (#445)
* Operate on a unique list of triplets (tune_multiply.sh). Adjusted documentation.
* Made ACC_OPENCL_DEVSPLIT=1 the default (disable with ACC_OPENCL_DEVSPLIT=0).
* Code/format cleanup, updated LIBXSMM (Daint-CI).
hfp authored Apr 26, 2021
1 parent fb3c5de commit 8a30155
Showing 5 changed files with 7 additions and 4 deletions.
2 changes: 1 addition & 1 deletion .ci/daint.cscs.ch/ocl.build.sh
@@ -25,7 +25,7 @@ if [ ! -d "${HOME}/libxsmm" ]; then
fi
cd "${HOME}/libxsmm"
git fetch
-git checkout 66bab63ee6c7565b725e1fb70f66cdb2fa507250
+git checkout af0dcb9018ed1f92173d60bc04dcf00a4e1854f1
make -j
cd ..

2 changes: 1 addition & 1 deletion src/acc/opencl/acc_opencl.c
@@ -193,7 +193,7 @@ int c_dbcsr_acc_init(void)
cl_uint j = 0, n;
for (; j < ndevices; ++j) {
#if defined(CL_VERSION_1_2)
-if ( (NULL == env_device_split || '0' == *env_device_split)
+if ( (NULL != env_device_split && '0' == *env_device_split)
|| (c_dbcsr_acc_opencl_ndevices + 1) == ACC_OPENCL_DEVICES_MAXCOUNT
|| (CL_SUCCESS != clCreateSubDevices(devices[j], properties, 0, NULL, &n)))
#endif
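
The environment part of the changed condition can be sketched in shell to show how the default flips. This is only an illustration, not code from the repository: `devsplit_requested` and `old_devsplit_requested` are hypothetical helpers, and the sketch assumes the `if` branch above is the path that skips sub-device creation. The other two operands of the condition (device count limit, `clCreateSubDevices` result) are not modeled.

```shell
#!/bin/sh
# Hypothetical helpers (not part of DBCSR): they model only the
# ACC_OPENCL_DEVSPLIT test, returning 0 when splitting would be attempted.

# New behavior: splitting is skipped only if the variable is set and
# its first character is '0'.
devsplit_requested() {
  case "${ACC_OPENCL_DEVSPLIT-}" in
    0*) return 1 ;;  # set, leading '0': splitting disabled
    *)  return 0 ;;  # unset, empty, or any other value: splitting enabled
  esac
}

# Old behavior: an unset variable also disabled splitting.
old_devsplit_requested() {
  if [ -z "${ACC_OPENCL_DEVSPLIT+x}" ]; then return 1; fi
  case "${ACC_OPENCL_DEVSPLIT}" in
    0*) return 1 ;;
    *)  return 0 ;;
  esac
}

# Compare both variants for the three interesting settings.
for value in unset 0 1; do
  if [ "${value}" = unset ]; then
    unset ACC_OPENCL_DEVSPLIT
  else
    ACC_OPENCL_DEVSPLIT=${value}
  fi
  if old_devsplit_requested; then old=split; else old=nosplit; fi
  if devsplit_requested; then new=split; else new=nosplit; fi
  echo "ACC_OPENCL_DEVSPLIT=${value}: before=${old} after=${new}"
done
```

Only the unset case differs between the two variants, which matches the commit message: splitting is now the default and `ACC_OPENCL_DEVSPLIT=0` turns it off.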
2 changes: 1 addition & 1 deletion src/acc/opencl/smm/README.md
@@ -130,4 +130,4 @@ cd src/acc/opencl/smm
./tune_multiply.sh 300 8 1 4 10 15, 6 7 8, 23
```

-The script `tune_multiply.sh` is tuning 1444 kernels by default (`./acc_bench_smm 300 8 1` taking approximately 15 hours per part). If the process is interrupted earlier (per SIGINT or Ctrl-C), the execution terminates for all requested kernels (triplet specification) unless an environment variable `CONTINUE=1` is set (proceeds to the next kernel).
+The script `tune_multiply.sh` is tuning 1266 kernels by default (`./tune_multiply.sh 300 8 1` taking approximately 13 hours per part). If the process is interrupted earlier (per SIGINT or Ctrl-C), the execution terminates for all requested kernels (triplet specification) unless an environment variable `CONTINUE=1` is set (proceeds to the next kernel).
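
The hour figures in both versions of the README sentence can be reproduced with a short calculation. The sketch below rests on an assumption the text does not state: that in `300 8 1` the first argument is a time budget in seconds per kernel and the second is the number of parts. `estimate_hours` is a hypothetical helper, not part of the script.

```shell
#!/bin/sh
# Hypothetical helper: whole hours needed to tune one part.
# Assumes (not confirmed by the README excerpt) that each kernel gets
# a fixed budget in seconds and that kernels are split evenly into parts.
estimate_hours() {
  nkernels=$1
  seconds=$2
  nparts=$3
  # Kernels per part, rounded up (same ceiling division as PARTSIZE).
  partsize=$(( (nkernels + nparts - 1) / nparts ))
  echo $(( partsize * seconds / 3600 ))
}

estimate_hours 1266 300 8   # prints 13
estimate_hours 1444 300 8   # prints 15
```

Under that assumption, 1266 kernels in 8 parts give 159 kernels per part, or about 13 hours at 300 seconds each, and the previous 1444 kernels give about 15 hours.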
2 changes: 1 addition & 1 deletion src/acc/opencl/smm/opencl_libsmm.c
@@ -973,7 +973,7 @@ int libsmm_acc_process(const int* host_param_stack, const int* dev_param_stack,
if (EXIT_SUCCESS == result) {
const double gflops = (2.0 * m_max * n_max * k_max * stack_size) / duration;
# if LIBXSMM_VERSION3(1, 16, 1) <= LIBXSMM_VERSION3(LIBXSMM_VERSION_MAJOR, \
-LIBXSMM_VERSION_MINOR, LIBXSMM_VERSION_UPDATE) && 1159 <= LIBXSMM_VERSION_PATCH
+LIBXSMM_VERSION_MINOR, LIBXSMM_VERSION_UPDATE) && 1159 <= LIBXSMM_VERSION_PATCH
const size_t size = sizeof(config->size) / sizeof(*config->size); assert(2 <= size);
libxsmm_kahan_sum(log(gflops), &config->gflops_sumlog, &config->gflops_comp);
if (size <= config->nexec) {
3 changes: 3 additions & 0 deletions src/acc/opencl/smm/tune_multiply.sh
@@ -73,6 +73,9 @@ if [ "${SED}" ] && [ "${LS}" ] && [ "${RM}" ] && [ "${WC}" ]; then
SPEC=$(echo "${SPECS}" | ${SED} -e "s/^x//g" -e "s/x$//g" -e "s/x/,/g")
MNKS="${MNKS} $(eval printf "%s" "{${SPEC}}x{${SPEC}}x{${SPEC}}\" \"" | ${SED} -e "s/{//g" -e "s/}//g")"
done
+if [ "$(command -v sort)" ] && [ "$(command -v xargs)" ]; then
+MNKS=$(echo "${MNKS}" | xargs -n1 | sort -u | xargs)
+fi
NTRIPLETS=$(echo "${MNKS}" | wc -w)
PARTSIZE=$(((NTRIPLETS+NPARTS-1)/NPARTS))
PARTOFFS=$(((PART-1)*PARTSIZE))
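
The effect of the added de-duplication on the partition arithmetic can be tried in isolation. The following is a minimal sketch with made-up triplets. `unique_triplets` and `part_size` are hypothetical wrappers around the same pipeline and ceiling division shown in the hunk.

```shell
#!/bin/sh
# Hypothetical wrapper: one triplet per line, sort and drop duplicates,
# then join back into a single space-separated line.
unique_triplets() {
  echo "$1" | xargs -n1 | sort -u | xargs
}

# Hypothetical wrapper: ceiling division, as used for PARTSIZE.
part_size() {
  echo $(( ($1 + $2 - 1) / $2 ))
}

# Made-up input with repeated triplets.
MNKS="4x4x4 4x4x8 4x4x4 8x8x8 4x4x8"
MNKS=$(unique_triplets "${MNKS}")
NTRIPLETS=$(echo "${MNKS}" | wc -w)

NPARTS=2
PART=2
PARTSIZE=$(part_size "${NTRIPLETS}" "${NPARTS}")
PARTOFFS=$(( (PART - 1) * PARTSIZE ))

echo "${MNKS}"
echo "part size ${PARTSIZE}, offset ${PARTOFFS}"
```

Without the `sort -u` step the five input entries would be counted as five triplets, so duplicated kernels would inflate `NTRIPLETS` and be tuned more than once.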
