Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Specify access mode in __result_and_scratch_storage methods #1909

Conversation

SergeyKopienko
Copy link
Contributor

@SergeyKopienko SergeyKopienko commented Oct 17, 2024

In this PR we specify access mode in the two __result_and_scratch_storage methods:

  • __result_and_scratch_storage::__get_result_acc
  • __result_and_scratch_storage::__get_scratch_acc

The main idea - to write <sycl::access_mode::write> is more correct then <sycl::access_mode::read_write> when we only write some data and to write <sycl::access_mode::read> is more correct then <sycl::access_mode::read_write> when we only read some data.

…add template parameter sycl::access_mode into __usm_or_buffer_accessor

Signed-off-by: Sergey Kopienko <[email protected]>
…add template parameter sycl::access_mode into __result_and_scratch_storage::__get_result_acc

Signed-off-by: Sergey Kopienko <[email protected]>
…add template parameter sycl::access_mode into __result_and_scratch_storage::__get_scratch_acc

Signed-off-by: Sergey Kopienko <[email protected]>
Copy link
Contributor

@danhoeflinger danhoeflinger left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please instigate these yourself as well but i tried to assess each, mostly looks good with structure. we will need to confirm with some testing of course.

…cess mode for __get_result_acc in the __parallel_scan_submitter::operator() - __final_event

Signed-off-by: Sergey Kopienko <[email protected]>
…cess mode for __get_scratch_acc in the __parallel_scan_submitter::operator() - __submit_event

Signed-off-by: Sergey Kopienko <[email protected]>
…cess mode for __get_scratch_acc in the __parallel_scan_submitter::operator() - __submit_event

Signed-off-by: Sergey Kopienko <[email protected]>
… fix review comment: fix extra access mode specification in __parallel_transform_reduce_impl::submit

Signed-off-by: Sergey Kopienko <[email protected]>
… fix access mode for __get_result_acc in the __parallel_transform_reduce_impl::submit - __reduce_event

Signed-off-by: Sergey Kopienko <[email protected]>
… fix access mode for __get_scratch_acc in the __parallel_transform_reduce_impl::submit - __reduce_event

Signed-off-by: Sergey Kopienko <[email protected]>
… fix access mode for __get_scratch_acc in the __parallel_transform_reduce_work_group_kernel_submitter::operator()

Signed-off-by: Sergey Kopienko <[email protected]>
…remove default value for template parameter _AccessMode at struct __usm_or_buffer_accessor

Signed-off-by: Sergey Kopienko <[email protected]>
…n_scan.h - fix review comment: fix extra access mode specification in __parallel_reduce_then_scan_reduce_submitter::operator()

Signed-off-by: Sergey Kopienko <[email protected]>
…n_scan.h - fix access mode for __get_scratch_acc in the __parallel_reduce_then_scan_scan_submitter::operator()

Signed-off-by: Sergey Kopienko <[email protected]>
…declare __usm_or_buffer_accessor::__get_access_mode_tag() as constexpr

Signed-off-by: Sergey Kopienko <[email protected]>
…fix error in the __get_result_acc implementation

Signed-off-by: Sergey Kopienko <[email protected]>
…fix error in the __get_scratch_acc implementation

Signed-off-by: Sergey Kopienko <[email protected]>
… fix access mode for __get_scratch_acc in the __parallel_transform_reduce_work_group_kernel_submitter::operator()

Signed-off-by: Sergey Kopienko <[email protected]>
Signed-off-by: Sergey Kopienko <[email protected]>
…fix error in the __usm_or_buffer_accessor::__get_access_mode_tag() implementation

Signed-off-by: Sergey Kopienko <[email protected]>
…ed at all due we shouldn't construct sycl::asseccor twice in the __usm_or_buffer_accessor::__usm_or_buffer_accessor

Signed-off-by: Sergey Kopienko <[email protected]>
Signed-off-by: Sergey Kopienko <[email protected]>
@danhoeflinger
Copy link
Contributor

@danhoeflinger, @dmitriy-sobolev So in this PR we have as result two places with <sycl::access_mode::read> specification:

  • the first in the __parallel_scan_submitter::operator() - inside kernel for __final_event
  • the second in the __parallel_transform_reduce_work_group_kernel_submitter::operator()

I believe we may have some performance boost at these places. Could somebody check this point?

I think in reality most of the target cases we are targeting for performance have access to USM and this will therefore be no change to performance. We could test it on some devices which don't support USM or force it into non-USM mode for the purposes of benchmarking but I'm unsure how useful that is.

@danhoeflinger
Copy link
Contributor

I do think that this can indirectly help with performance in our target cases by getting us closer to #1906. Switching the existing temporary storage that we have in buffers over to usm where it is available generally does provide a performance improvement. This PR improves the infrastructure to better enable using it elsewhere.

…fix error: the usages of __dpl_sycl::__no_init{} not quite correct

Signed-off-by: Sergey Kopienko <[email protected]>
…y __dpl_sycl::__no_init{} for sycl::access_mode::write

Signed-off-by: Sergey Kopienko <[email protected]>
… specify __dpl_sycl::__no_init{} for sycl::access_mode::write

Signed-off-by: Sergey Kopienko <[email protected]>
…n_scan.h - specify __dpl_sycl::__no_init{} for sycl::access_mode::write

Signed-off-by: Sergey Kopienko <[email protected]>
…n_scan.h - fix review comment: I think this shouldn't be no_init, we read before write.

Signed-off-by: Sergey Kopienko <[email protected]>
Copy link
Contributor

@mmichel11 mmichel11 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I noticed a couple of compilation errors in the buffer fallback path. I think it would be good to manually test the following scenarios by temporarily modifying the macros / function returns:

  1. The most common case: _ONEDPL_SYCL_UNIFIED_USM_BUFFER_PRESENT=1, __use_USM_host_allocations=true
  2. USM device allocations but no host memory: _ONEDPL_SYCL_UNIFIED_USM_BUFFER_PRESENT=1, __use_USM_host_allocations=false. Testing on a CUDA / HIP device would also take this path.
  3. The buffer fallback case: _ONEDPL_SYCL_UNIFIED_USM_BUFFER_PRESENT=0. This is where we would probably find all issues related to access modes.

…fix compile error in the __result_and_scratch_storage for case when _ONEDPL_SYCL_UNIFIED_USM_BUFFER_PRESENT is undefined

Signed-off-by: Sergey Kopienko <[email protected]>
…fix compile error in the __result_and_scratch_storage for case when _ONEDPL_SYCL_UNIFIED_USM_BUFFER_PRESENT is undefined

Signed-off-by: Sergey Kopienko <[email protected]>
Signed-off-by: Sergey Kopienko <[email protected]>
@SergeyKopienko SergeyKopienko force-pushed the dev/skopienko/introduce_access_mode_into_usm_or_buffer_accessor branch from ac67c14 to b634f44 Compare October 22, 2024 08:22
…parametrize struct __usm_or_buffer_accessor only by accessor

Signed-off-by: Sergey Kopienko <[email protected]>
@SergeyKopienko
Copy link
Contributor Author

@danhoeflinger, @mmichel11 Could you please take a look again?

Copy link
Contributor

@danhoeflinger danhoeflinger left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

Copy link
Contributor

@MikeDvorskiy MikeDvorskiy left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

Copy link
Contributor

@mmichel11 mmichel11 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@SergeyKopienko SergeyKopienko merged commit beb9167 into main Oct 22, 2024
22 checks passed
@SergeyKopienko SergeyKopienko deleted the dev/skopienko/introduce_access_mode_into_usm_or_buffer_accessor branch October 22, 2024 13:27
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants