[NFC] Transfer host variable to sycl kernel to avoid using unsupported memory capabilities. #930
…d memory capabilities. The test checks in the host code whether the memory capabilities are supported, and returns early at run time for unsupported ones. When the const host variable MemoryScope is used directly in the SYCL kernel, AOT compilation would fail at compile time if MemoryScope is not supported. So this patch transfers the host variable to the SYCL kernel to avoid using unsupported memory capabilities.
@@ -226,11 +227,32 @@ class run_atomic_fence {
sycl::buffer<bool> res_buf(&res, sycl::range<1>(1));
sycl::buffer<int> sync_buffer(&sync, sycl::range<1>(1));
sycl::buffer<int> data_buffer(&data, sycl::range<1>(1));
// Using the const host variable MemoryScope in the kernel directly
// may cause compile fail for AOT build. We transfer MemoryScope to
Is this a SYCL limitation or a compiler bug?
In https://registry.khronos.org/SYCL/specs/sycl-2020/html/sycl-2020.html#sec:language.restrictions.kernels:
Variables with static storage duration that are odr-used inside a device function, must be either const or constexpr, and must also be either zero-initialized or constant-initialized.
So, why not use MemoryOrder directly instead of the intermediate order_write and so on?
Hi @keryell, this is neither a SYCL limitation nor a compiler bug. The early return in the host code happens at run time. With AOT compilation, using MemoryOrder directly may cause a compile-time failure if MemoryOrder is not supported by the device. So the test may fail at the build stage and never produce a binary to run.
Is the problem that the test tries to use an atomic fence order or scope that is not supported by the device? If that is the case, the test should use info::device::atomic_fence_order_capabilities and info::device::atomic_fence_scope_capabilities to determine the supported order and scope values, and it should avoid submitting the kernel to the device if the device doesn't support that fence order or scope.
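The check suggested here can be sketched on the host side as follows. This is a minimal model in plain C++, not the CTS code: memory_order below is a stand-in enum rather than sycl::memory_order, and is_supported and submit_if_supported are hypothetical helpers. In real SYCL code the capability list would come from device.get_info<sycl::info::device::atomic_fence_order_capabilities>(), which returns a std::vector<sycl::memory_order>.

```cpp
#include <algorithm>
#include <vector>

// Stand-in for sycl::memory_order (assumption: for illustration only).
enum class memory_order { relaxed, acquire, release, acq_rel, seq_cst };

// True if the wanted value appears in the capability list reported by the device.
template <typename T>
bool is_supported(const std::vector<T>& capabilities, T wanted) {
  return std::find(capabilities.begin(), capabilities.end(), wanted) !=
         capabilities.end();
}

// Hypothetical helper: calls submit() only when the order is supported.
// Returns false when the kernel submission was skipped.
template <typename Submit>
bool submit_if_supported(const std::vector<memory_order>& capabilities,
                         memory_order wanted, Submit submit) {
  if (!is_supported(capabilities, wanted)) return false;
  submit();
  return true;
}
```

The same pattern would apply to the scope, using the list from info::device::atomic_fence_scope_capabilities. Note that this only guards submission at run time. It does not by itself stop an AOT build from compiling the kernel body.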
@gmlueck, yes, the original test already has an early return (https://github.com/KhronosGroup/SYCL-CTS/blob/SYCL-2020/tests/atomic_fence/atomic_fence.cpp#L197). That check works at run time, but with AOT compilation the device build stage may fail, so the test may fail to build the final binary.
Co-authored-by: Ronan Keryell <[email protected]>
Thanks!
There are still some formatting issues to fix.
Thanks.
@haonanya you can see the formatting changes that are needed at https://github.com/KhronosGroup/SYCL-CTS/actions/runs/10674975860?pr=930
@keryell, thanks for your patience! We didn't plan to submit the PR according to https://registry.khronos.org/SYCL/specs/sycl-2020/html/sycl-2020.html#sec:optional-kernel-features:
I see.
The test checks in the host code whether the memory capabilities are supported, and returns early at run time for unsupported ones. When the const host variable MemoryScope is used directly in the SYCL kernel, AOT compilation will fail at compile time if MemoryScope is not supported. So this patch transfers the host variable to the SYCL kernel to avoid using unsupported memory capabilities.
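The difference between using the constant directly and transferring it can be modeled in plain C++. This is a host-only sketch under stated assumptions, not the patch itself: memory_scope is a stand-in enum rather than sycl::memory_scope, target_supports is a hypothetical model of a target that cannot compile memory_scope::system, and the static_assert stands in for the AOT device build failure. In the real test the support check is a run-time device query rather than a constexpr function.

```cpp
#include <stdexcept>

// Stand-in for sycl::memory_scope (assumption: for illustration only).
enum class memory_scope { work_item, sub_group, work_group, device, system };

// Hypothetical model of a target that cannot compile code for system scope.
constexpr bool target_supports(memory_scope s) {
  return s != memory_scope::system;
}

// Direct use: the scope is a compile-time constant inside the "kernel", so an
// unsupported scope is rejected while building, before any run-time check runs.
template <memory_scope Scope>
void kernel_with_constant_scope() {
  static_assert(target_supports(Scope), "scope not supported by this target");
}

// Transferred use: the "kernel" receives the scope as a run-time value, so the
// build succeeds and the host-side check decides whether it is ever launched.
inline void kernel_with_runtime_scope(memory_scope scope) {
  if (!target_supports(scope)) throw std::runtime_error("unsupported scope");
}

// Hypothetical test driver: returns true if the kernel ran, false if skipped.
template <memory_scope Scope>
bool run_test() {
  if (!target_supports(Scope)) return false;  // early return at run time
  memory_scope scope = Scope;                 // transfer the host constant
  kernel_with_runtime_scope(scope);
  return true;
}
```

In this model run_test<memory_scope::system>() compiles and returns false, whereas instantiating kernel_with_constant_scope<memory_scope::system>() would stop the build. That mirrors the situation described above, where the run-time early return is never reached because the binary cannot be built.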