-
Notifications
You must be signed in to change notification settings - Fork 114
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Specify access mode in __result_and_scratch_storage
methods
#1909
Specify access mode in __result_and_scratch_storage
methods
#1909
Conversation
…add template parameter sycl::access_mode into __usm_or_buffer_accessor Signed-off-by: Sergey Kopienko <[email protected]>
…add template parameter sycl::access_mode into __result_and_scratch_storage::__get_result_acc Signed-off-by: Sergey Kopienko <[email protected]>
…add template parameter sycl::access_mode into __result_and_scratch_storage::__get_scratch_acc Signed-off-by: Sergey Kopienko <[email protected]>
Signed-off-by: Sergey Kopienko <[email protected]>
Signed-off-by: Sergey Kopienko <[email protected]>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please instigate these yourself as well but i tried to assess each, mostly looks good with structure. we will need to confirm with some testing of course.
include/oneapi/dpl/pstl/hetero/dpcpp/parallel_backend_sycl_utils.h
Outdated
Show resolved
Hide resolved
include/oneapi/dpl/pstl/hetero/dpcpp/parallel_backend_sycl_utils.h
Outdated
Show resolved
Hide resolved
include/oneapi/dpl/pstl/hetero/dpcpp/parallel_backend_sycl_reduce_then_scan.h
Outdated
Show resolved
Hide resolved
include/oneapi/dpl/pstl/hetero/dpcpp/parallel_backend_sycl_reduce_then_scan.h
Outdated
Show resolved
Hide resolved
include/oneapi/dpl/pstl/hetero/dpcpp/parallel_backend_sycl_reduce.h
Outdated
Show resolved
Hide resolved
include/oneapi/dpl/pstl/hetero/dpcpp/parallel_backend_sycl_reduce.h
Outdated
Show resolved
Hide resolved
…cess mode for __get_result_acc in the __parallel_scan_submitter::operator() - __final_event Signed-off-by: Sergey Kopienko <[email protected]>
…cess mode for __get_scratch_acc in the __parallel_scan_submitter::operator() - __submit_event Signed-off-by: Sergey Kopienko <[email protected]>
…cess mode for __get_scratch_acc in the __parallel_scan_submitter::operator() - __submit_event Signed-off-by: Sergey Kopienko <[email protected]>
… fix review comment: fix extra access mode specification in __parallel_transform_reduce_impl::submit Signed-off-by: Sergey Kopienko <[email protected]>
… fix access mode for __get_result_acc in the __parallel_transform_reduce_impl::submit - __reduce_event Signed-off-by: Sergey Kopienko <[email protected]>
… fix access mode for __get_scratch_acc in the __parallel_transform_reduce_impl::submit - __reduce_event Signed-off-by: Sergey Kopienko <[email protected]>
… fix access mode for __get_scratch_acc in the __parallel_transform_reduce_work_group_kernel_submitter::operator() Signed-off-by: Sergey Kopienko <[email protected]>
…remove default value for template parameter _AccessMode at struct __usm_or_buffer_accessor Signed-off-by: Sergey Kopienko <[email protected]>
…n_scan.h - fix review comment: fix extra access mode specification in __parallel_reduce_then_scan_reduce_submitter::operator() Signed-off-by: Sergey Kopienko <[email protected]>
…n_scan.h - fix access mode for __get_scratch_acc in the __parallel_reduce_then_scan_scan_submitter::operator() Signed-off-by: Sergey Kopienko <[email protected]>
…declare __usm_or_buffer_accessor::__get_access_mode_tag() as constexpr Signed-off-by: Sergey Kopienko <[email protected]>
…fix error in the __get_result_acc implementation Signed-off-by: Sergey Kopienko <[email protected]>
…fix error in the __get_scratch_acc implementation Signed-off-by: Sergey Kopienko <[email protected]>
…_or_buffer_accessor
… fix access mode for __get_scratch_acc in the __parallel_transform_reduce_work_group_kernel_submitter::operator() Signed-off-by: Sergey Kopienko <[email protected]>
Signed-off-by: Sergey Kopienko <[email protected]>
…fix error in the __usm_or_buffer_accessor::__get_access_mode_tag() implementation Signed-off-by: Sergey Kopienko <[email protected]>
…acc calls Signed-off-by: Sergey Kopienko <[email protected]>
…_acc calls Signed-off-by: Sergey Kopienko <[email protected]>
…ed at all due we shouldn't construct sycl::asseccor twice in the __usm_or_buffer_accessor::__usm_or_buffer_accessor Signed-off-by: Sergey Kopienko <[email protected]>
Signed-off-by: Sergey Kopienko <[email protected]>
include/oneapi/dpl/pstl/hetero/dpcpp/parallel_backend_sycl_utils.h
Outdated
Show resolved
Hide resolved
I think in reality most of the target cases we are targeting for performance have access to USM and this will therefore be no change to performance. We could test it on some devices which don't support USM or force it into non-USM mode for the purposes of benchmarking but I'm unsure how useful that is. |
I do think that this can indirectly help with performance in our target cases by getting us closer to #1906. Switching the existing temporary storage that we have in buffers over to usm where it is available generally does provide a performance improvement. This PR improves the infrastructure to better enable using it elsewhere. |
…fix error: the usages of __dpl_sycl::__no_init{} not quite correct Signed-off-by: Sergey Kopienko <[email protected]>
…y __dpl_sycl::__no_init{} for sycl::access_mode::write Signed-off-by: Sergey Kopienko <[email protected]>
… specify __dpl_sycl::__no_init{} for sycl::access_mode::write Signed-off-by: Sergey Kopienko <[email protected]>
…n_scan.h - specify __dpl_sycl::__no_init{} for sycl::access_mode::write Signed-off-by: Sergey Kopienko <[email protected]>
Signed-off-by: Sergey Kopienko <[email protected]>
include/oneapi/dpl/pstl/hetero/dpcpp/parallel_backend_sycl_reduce_then_scan.h
Outdated
Show resolved
Hide resolved
…n_scan.h - fix review comment: I think this shouldn't be no_init, we read before write. Signed-off-by: Sergey Kopienko <[email protected]>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I noticed a couple of compilation errors in the buffer fallback path. I think it would be good to manually test the following scenarios by temporarily modifying the macros / function returns:
- The most common case:
_ONEDPL_SYCL_UNIFIED_USM_BUFFER_PRESENT=1
,__use_USM_host_allocations=true
- USM device allocations but no host memory:
_ONEDPL_SYCL_UNIFIED_USM_BUFFER_PRESENT=1
,__use_USM_host_allocations=false
. Testing on a CUDA / HIP device would also take this path. - The buffer fallback case:
_ONEDPL_SYCL_UNIFIED_USM_BUFFER_PRESENT=0
. This is where we would probably find all issues related to access modes.
include/oneapi/dpl/pstl/hetero/dpcpp/parallel_backend_sycl_utils.h
Outdated
Show resolved
Hide resolved
include/oneapi/dpl/pstl/hetero/dpcpp/parallel_backend_sycl_utils.h
Outdated
Show resolved
Hide resolved
…fix compile error in the __result_and_scratch_storage for case when _ONEDPL_SYCL_UNIFIED_USM_BUFFER_PRESENT is undefined Signed-off-by: Sergey Kopienko <[email protected]>
…fix compile error in the __result_and_scratch_storage for case when _ONEDPL_SYCL_UNIFIED_USM_BUFFER_PRESENT is undefined Signed-off-by: Sergey Kopienko <[email protected]>
Signed-off-by: Sergey Kopienko <[email protected]>
ac67c14
to
b634f44
Compare
…parametrize struct __usm_or_buffer_accessor only by accessor Signed-off-by: Sergey Kopienko <[email protected]>
@danhoeflinger, @mmichel11 Could you please take a look again? |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
In this PR we specify access mode in the two
__result_and_scratch_storage
methods:__result_and_scratch_storage::__get_result_acc
__result_and_scratch_storage::__get_scratch_acc
The main idea - to write
<sycl::access_mode::write>
is more correct then<sycl::access_mode::read_write>
when we only write some data and to write<sycl::access_mode::read>
is more correct then<sycl::access_mode::read_write>
when we only read some data.