-
Notifications
You must be signed in to change notification settings - Fork 114
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Enable vectorized global loads for the reduction algorithms #1470
Merged
Merged
Changes from 1 commit
Commits
Show all changes
30 commits
Select commit
Hold shift + click to select a range
1f1ee5d
Dynamic number of items per work-item
julianmi eeb4120
Enable vectorization
julianmi 41fd20e
Code cleanup
julianmi 3630da8
Fix WG adjustment
julianmi c612f61
Restructure vectorized reduction
julianmi e029fd7
Add single item process path
julianmi f5e8271
Template vector width
julianmi 530e13d
Reduce branch divergence
julianmi d36acbc
Enable 32-bit addressing
julianmi 42c9aab
Fix merge issues
julianmi 556c184
Centralize tuning parameters
julianmi b8d067a
Cleanup diff
julianmi 5c9a43e
Fix CPU backend issue
julianmi 14c44df
Address review feedback
julianmi 6fc2581
Fix merge issue
julianmi 1db434d
Address review comments
julianmi ebd0da3
Remove another inline statement
julianmi f7ee7f1
Remove unintentional formatting changes
julianmi 2ef7d5b
Remove move statement and ::std
julianmi 17d5bce
Address review feedback
julianmi 291b1c6
Address review feedback
julianmi fdaf78d
Update is_device_copyable trait
julianmi 27a01f4
Update transform_reduce signature also in test
julianmi 70939d6
Add missing out-of-bounds check
julianmi 0245bbd
Improve bounds check based on review comments
julianmi 48bf347
Further bounds check improvements
julianmi 90c308b
Add check for shorter addressing support
julianmi 3d93a31
Use static assert instead
julianmi 3ebc344
Address review comments
julianmi 27fd437
Rename union storeage based on review discussion
julianmi File filter
Filter by extension
Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
There are no files selected for viewing
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Oops, something went wrong.
Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If we are going to lift this type definition out, we probably need to rename it as well. (trying to think of a good name...)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Additionally if we are going to lift this type definition out, we may cover the case when we have array of elements too.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not sure about the array of elements, perhaps that reaches too far beyond the scope of this PR, but maybe something like
__delayed_ctor_storage
?I think we need something which describes its purpose.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
__optional_ctor_storage
?__lazy_ctor_storage
?There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't know how far we want to go in the context of this PR, but this trick is also used
https://github.com/oneapi-src/oneDPL/blob/a9aabb2b94020634f8f9961471f8af0a2aeb60ea/include/oneapi/dpl/pstl/hetero/dpcpp/parallel_backend_sycl_radix_sort_one_wg.h#L169 and https://github.com/oneapi-src/oneDPL/blob/a9aabb2b94020634f8f9961471f8af0a2aeb60ea/include/oneapi/dpl/pstl/hetero/dpcpp/parallel_backend_sycl_radix_sort.h#L597
If we are lifting this, it would be great to unify all the use to a single type. Then future improvements can be had by all, and it will improve readability.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I suppose the first is the array case Sergey was referring to, I'd be fine with leaving that one out for now to limit the scope of the PR if it makes it significantly more complicated.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I propose to make additional changes with it in some separate PR.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sure, for this PR, lets just rename it, we can unify, etc. in a separate PR.
My vote is for
__lazy_ctor_storage
because I think optional advertises more functionality than is provided here.There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for this discussion. I agree that larger changes are outside the scope of this PR and change the naming to
__lazy_ctor_storage
.