-
Notifications
You must be signed in to change notification settings - Fork 113
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Implementation of tag dispatching on current codebase #1239
Conversation
bd679bd
to
3335f24
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please check that the code can be compiled with both dpcpp and tbb backends
daa96fa
to
1622733
Compare
include/oneapi/dpl/pstl/hetero/dpcpp/parallel_backend_sycl_fpga.h
Outdated
Show resolved
Hide resolved
ce40383
to
19a0f87
Compare
926c100
to
2cf29f9
Compare
7a43fb9
to
810ff85
Compare
…ing tag first is more logical
…eview comment: why do we add ranges namespace?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
After a hard work or everybody, it looks good! Thanks.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
86a59e6
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Some minor questions / feedback. Not yet a full review.
I tried to catch all the places where __parallel_tag<_IsVector>
is mixed with _ForwardIterator
though.
__histogram_general_registers_local_reduction(_ExecutionPolicy&& __exec, const sycl::event& __init_event, | ||
::std::uint16_t __work_group_size, _Range1&& __input, _Range2&& __bins, | ||
const _BinHashMgr& __binhash_manager) | ||
__histogram_general_registers_local_reduction(oneapi::dpl::__internal::__device_backend_tag, _ExecutionPolicy&& __exec, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What is the purpose of adding the tag to the signature of internal implementation details beyond all the "decisions" which the tag participates? (general question, not specific to histogram)
For histogram in particular, it seems that any function past __parallel_histogram
has already made all the decisions it will make based on tag, and there is no alternative to oneapi::dpl::__internal::__device_backend_tag
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Or is there some decision that the tag extends as far as the policy extends?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I have added oneapi::dpl::__internal::__device_backend_tag
here to have guaranties that this code will work only for device
backend. It's something like "common practic" in this PR. But I agree with you, all decisions about tags has been made earlier in code.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
About alternatives - we also have
#if _ONEDPL_FPGA_DEVICE
struct __fpga_backend_tag : __device_backend_tag
{
};
#endif
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, I suppose there is a chance in the future we may want to specialize individual details based on FPGA vs general device tags. This is not a blocker for me, but in practice it does add an extra parameter to already long signature.
One could argue that we could omit this beyond current existing "decisions" the tag makes, and if we decide to specialize for FPGA in the future, we could then easily add it to the necessary calls to provide the tag information to the specialization point. Just raising the point, but I'll leave it up to you.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I raised the point offline, but I do see some risk of "silent" miss-dispatching from this PR.
With a huge PR, and potential for small errors in function signatures for specialized tags, we could fall back silently to less performant implementations. The result would still build and pass correctness testing, but would be less performant. static_asserts
have been added to combat this and attempt to detect if these cases arise with the wrong tag appearing in the general code.
This isn't a blocker for me, but I wanted raise this in the PR to see if there are any other ideas to mitigate this risk.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
With limited time, and an approval from a couple others, I haven't gone through every line in incredible detail, but I have looked through the whole PR. All my feedback has been addressed, explained, or an issue created.
With a merge of #1455, this generally looks good to me.
UPD: |
…random_access_iterator_t as unused anymore
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In the context of the other approvals from @rarutyun @MikeDvorskiy, with my
review, and inspection of the changes since their approval, this LGTM.
As mentioned offline, I think we should merge #1456 or something similar in the future, including some comments documenting the expectations for usage of the tag dispatching system. However, I don't see these as blockers for this PR, as they are not fixing existing bugs, but rather preventing potential pitfalls of future extension.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
This PR based on prototype from @rarutyun
Source prototype has been placed here:
!!! ATTENTION !!!
in this PR we disable
for_loop
staff for hetero policies:for_loop
;for_loop_n
;for_loop_strided
;for_loop_n_strided
.Tag dispatching mechanism:
select_backend
function)__serial_backend_tag
,__tbb_backend_tag
,__omp_backend_tag
,__device_backend_tag
,__fpga_backend_tag
,is_vector
.Overall schema of tag dispatching:
Algorithm level - first level:
const auto __dispatch_tag = oneapi::dpl::__internal::__select_backend(__exec, ...);
on algorithm level;__dispatch_tag
into patterns;Pattern level - second level:
_BackendTag
from the fist__Tag
parameter type;_BackendTag{}
(instance of the backend tag into backend implementation.Host backend tags:
Hetero backend tags:
Types of dispatching tags (host tags):
Types of dispatching tags (hetero tags):
How we define
__par_backend_tag
:Typical changes in the code
Changes in pattern calls:
Functions with
enable_if_..._policy<...>
and/*parallel=*/::std::false_type
before:
after:
_IsVector
type astypename _Tag::__is_vector
Functions with
__enable_if_host_execution_policy_conditional
,__is_random_access_iterator_v
and/*parallel=*/::std::true_type
before:
after:
Functions with
enable_if_..._policy<...>
and/*parallel=*/::std::true_type
before:
after:
in these functions we move
class _IsVector
to first template param place.Functions with
__enable_if_device_execution_policy
before:
after:
Changes in the
oneDPL
host policy classesAs result of this work now we have more clear host policy classes:
All functions like
__allow_unsequenced
,__allow_vector
and__allow_parallel
has been removed from these classes as not required anymore.__select_backend()
functions__select_backend()
functions for host policies and iterators__select_backend()
functions for hetero policies and iterators__select_backend()
functions for hetero policies and ranges (in the namespace__ranges
)