oneDPL 2022.3.0 release

@ValentinaKats ValentinaKats released this 22 Nov 12:26
· 383 commits to main since this release
180f18a

New Features

  • Added an experimental feature to dynamically select an execution context, e.g., a SYCL queue.
    The feature provides selection functions such as select, submit and submit_and_wait,
    and several selection policies: fixed_resource_policy, round_robin_policy,
    dynamic_load_policy, and auto_tune_policy.
  • unseq and par_unseq policies now also enable vectorization for the Intel® oneAPI DPC++/C++ Compiler.
  • Added support for passing zip iterators as segment value data in reduce_by_segment,
    exclusive_scan_by_segment, and inclusive_scan_by_segment.
  • Improved performance of the merge, sort, stable_sort, sort_by_key,
    reduce, min_element, max_element, minmax_element, is_partitioned, and
    lexicographical_compare algorithms with DPC++ execution policies.
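
The round-robin idea behind one of the selection policies listed above can be sketched in plain C++. This is a minimal illustration only: RoundRobinSketch and its select and submit_and_wait members are hypothetical names chosen to mirror the terms in these notes. They are not the oneDPL API, and the resources here are arbitrary objects rather than SYCL queues.

```cpp
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

// Hypothetical stand-in for a round-robin selection policy; NOT the oneDPL API.
template <typename Resource>
class RoundRobinSketch {
public:
    explicit RoundRobinSketch(std::vector<Resource> resources)
        : resources_(std::move(resources)) {
        assert(!resources_.empty() && "at least one resource is required");
    }

    // Return the next resource in cyclic order.
    Resource& select() {
        Resource& chosen = resources_[next_];
        next_ = (next_ + 1) % resources_.size();
        return chosen;
    }

    // Select a resource, run f on it synchronously, and return f's result.
    template <typename F>
    auto submit_and_wait(F&& f) {
        return std::forward<F>(f)(select());
    }

private:
    std::vector<Resource> resources_;
    std::size_t next_ = 0;
};
```

With three resources, successive calls to select return the first, second, and third resource and then wrap around to the first. A real policy would also have to deal with asynchronous submission and thread safety, which this sketch leaves out.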

Fixed Issues

  • Fixed the reduce_async function to not ignore the provided binary operation.

New Known Issues and Limitations

  • When compiled with the -fsycl-pstl-offload option of the Intel® oneAPI DPC++/C++ Compiler and with
    libstdc++ version 8 or libc++, oneapi::dpl::execution::par_unseq offloads
    standard parallel algorithms to the SYCL device similarly to std::execution::par_unseq
    in accordance with the -fsycl-pstl-offload option value.
  • When the dpl modulefile is used to initialize the user's environment and the code is compiled with the
    -fsycl-pstl-offload option of the Intel® oneAPI DPC++/C++ Compiler, a linking issue or program crash may occur
    because the directory containing libpstloffload.so is not included in the search path. To avoid the issue,
    use env/vars.sh to configure the working environment.
  • Compilation issues may be encountered when passing zip iterators to exclusive_scan_by_segment on Windows.
  • Incorrect results may be produced by set_intersection with a DPC++ execution policy,
    where elements are copied from the second input range rather than the first input range.
  • For transform_exclusive_scan and exclusive_scan to run in-place (that is, with the same data
    used for both input and destination) and with an execution policy of unseq or par_unseq,
    it is required that the provided input and destination iterators are equality comparable.
    Furthermore, the equality comparison of the input and destination iterator must evaluate to true.
    If these conditions are not met, the result of these algorithm calls is undefined.
  • The sort, stable_sort, sort_by_key, and partial_sort_copy algorithms may work incorrectly or cause
    a segmentation fault when used with a DPC++ execution policy for a CPU device and built
    on Linux with the Intel® oneAPI DPC++/C++ Compiler and the -O0 -g compiler options.
    To avoid the issue, pass the -fsycl-device-code-split=per_kernel option to the compiler.
  • Incorrect results may be produced by exclusive_scan, inclusive_scan, transform_exclusive_scan,
    transform_inclusive_scan, exclusive_scan_by_segment, inclusive_scan_by_segment, and reduce_by_segment
    with the unseq or par_unseq policy when compiled by the Intel® oneAPI DPC++/C++ Compiler
    with the -fiopenmp, -fiopenmp-simd, -qopenmp, or -qopenmp-simd option on Linux.
    To avoid the issue, pass the -fopenmp or -fopenmp-simd option instead.
  • Incorrect results may be produced by reduce and transform_reduce with 64-bit types and the std::multiplies
    or sycl::multiplies operation when compiled by Intel® C++ Compiler 2021.3 or newer and executed on GPU devices.
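
The set_intersection item above matters only when equivalent elements are distinguishable, for example when a comparator looks at a key and ignores the rest of the element. The sketch below uses the serial std::set_intersection from the C++ standard library, which copies matching elements from the first range, to show the expected behavior. The Entry alias and the intersect_by_key helper are hypothetical names introduced for this illustration and are not part of oneDPL.

```cpp
#include <algorithm>
#include <cassert>
#include <iterator>
#include <utility>
#include <vector>

// (key, tag) pair; the tag records which input range an element came from.
using Entry = std::pair<int, char>;

// Intersect two key-sorted ranges, comparing keys only.
// std::set_intersection copies the matching elements from the first range.
std::vector<Entry> intersect_by_key(const std::vector<Entry>& first,
                                    const std::vector<Entry>& second) {
    auto key_less = [](const Entry& a, const Entry& b) { return a.first < b.first; };
    assert(std::is_sorted(first.begin(), first.end(), key_less));
    assert(std::is_sorted(second.begin(), second.end(), key_less));

    std::vector<Entry> out;
    std::set_intersection(first.begin(), first.end(),
                          second.begin(), second.end(),
                          std::back_inserter(out), key_less);
    return out;
}
```

For inputs with keys {1, 2, 4} tagged 'a' and keys {2, 3, 4} tagged 'b', the result holds keys 2 and 4, both tagged 'a'. Comparing against such a serial reference is one way to check whether a program is affected by the issue.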