Atomic Reference Reorganization, main branch (2024.08.08.) #291
Merged: krasznaa merged 5 commits into acts-project:main from krasznaa:AtomicRefReorg-main-20240805 on Sep 3, 2024
krasznaa force-pushed the AtomicRefReorg-main-20240805 branch 2 times, most recently from 1636292 to 076a570 (August 9, 2024 13:03)
krasznaa commented on Aug 9, 2024:
I wanted to point out a few things for you Stephen, since I'm aware of a few weaknesses myself... 🤔
Each new class is meant to be used on just one "platform", and vecmem::device_atomic_ref is a typedef to whichever one is appropriate in a given situation. The selection procedure is not correct yet: the "POSIX" version of the code never gets selected at the moment.
It is the same as vecmem::cuda::device_atomic_ref, with an additional include needed by HIP.
Not yet in a way that would allow the C++ compiler to be re-evaluated in client projects.
The SYCL implementation never needs __host__ or __device__. The "POSIX" implementation will only ever work in host code. The CUDA and HIP implementations are now included more generally, using the __CUDACC__ and __HIPCC__ macros. As a result, the vecmem::cuda::device_atomic_ref implementation had to be tweaked to use __threadfence() only when building device code.
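The last point can be illustrated with a small sketch. It is not the code from this PR: the macro SKETCH_HOST_AND_DEVICE and the function sketch_fence_if_device are hypothetical names made up for illustration. The sketch also assumes that __CUDA_ARCH__ (for nvcc) and __HIP_DEVICE_COMPILE__ (for the HIP compiler) are the macros that mark a device compilation pass.

```cpp
// Hypothetical qualifier macro (not vecmem's actual name): it expands to
// __host__ __device__ only when a CUDA or HIP compiler processes the file,
// and to nothing for a plain host compiler.
#if defined(__CUDACC__) || defined(__HIPCC__)
#define SKETCH_HOST_AND_DEVICE __host__ __device__
#else
#define SKETCH_HOST_AND_DEVICE
#endif

// Issues a device memory fence only in a device compilation pass.
// Returns true if a fence was issued, false in host code.
// Assumption: __CUDA_ARCH__ / __HIP_DEVICE_COMPILE__ identify device passes.
SKETCH_HOST_AND_DEVICE inline bool sketch_fence_if_device() {
#if defined(__CUDA_ARCH__) || defined(__HIP_DEVICE_COMPILE__)
    __threadfence();
    return true;
#else
    // Host pass: __threadfence() does not exist here, so it is skipped.
    return false;
#endif
}
```

With a plain host compiler the qualifier macro expands to nothing and the function simply reports that no fence was issued, which is why the same header can be included from both host-only and CUDA / HIP translation units.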
krasznaa force-pushed the AtomicRefReorg-main-20240805 branch from 076a570 to 4e05698 (August 31, 2024 11:25)
stephenswat reviewed on Sep 2, 2024:
Looks good, minor comments and questions.
Since the main declaration of vecmem::device_atomic_ref needs that enumeration as well.
krasznaa force-pushed the AtomicRefReorg-main-20240805 branch from 43f585a to e57da85 (September 3, 2024 09:39)
stephenswat approved these changes on Sep 3, 2024
This is a proposal for fixing the MSVC+CUDA compilation issue described in #288. As discussed already in #275, I have wanted to split the implementation of vecmem::device_atomic_ref into multiple classes for a while. #288 was the trigger to make it happen.

This PR introduces the following new classes:
- vecmem::dummy_device_atomic_ref: A fallback implementation, for host code that's not covered by something better. Does not actually do atomic operations.
- vecmem::posix_device_atomic_ref: An implementation using GCC / Clang built-ins for atomic operation support in host code.
- vecmem::cuda::device_atomic_ref: An implementation using the basic CUDA C atomic operations.
- vecmem::hip::device_atomic_ref: The same as vecmem::cuda::device_atomic_ref (is a typedef of the latter), with an additional include statement needed by HIP.
- vecmem::sycl::builtin_device_atomic_ref: A typedef using cl::sycl::atomic_ref.
- vecmem::sycl::custom_device_atomic_ref: A full implementation of an atomic reference, using "low level" SYCL atomic operations.

The vecmem/memory/device_atomic_ref.hpp file now just chooses between these different implementations, based on various criteria, and sets up vecmem::device_atomic_ref as a typedef to one of the above types, or to std::atomic_ref with host code when using C++20.

As we discussed with @stephenswat, in the end the VECMEM_SUPPORT_POSIX_ATOMIC_REF test, which currently runs only during the build, will need to become something that vecmem-config.cmake would re-run for the host compiler in the context of the client project's build. But since some other flags will need to be moved to behave like that as well, I decided to do that in a future PR. And in this one I just figure out whether the host compiler supports vecmem::posix_device_atomic_ref, during the build of vecmem itself.

I am still working on the code, but wanted to expose it to @stephenswat at this point.
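For readers unfamiliar with the layout described above, the host-side part of the selection might look roughly like the following sketch. It is not the code from this PR. Everything in the sketch namespace, as well as the SKETCH_HAVE_POSIX_ATOMICS macro, is a hypothetical stand-in: in vecmem the corresponding decision comes from the VECMEM_SUPPORT_POSIX_ATOMIC_REF build-time test, not from compiler macros. The priority order of the branches is an assumption, the CUDA, HIP and SYCL branches are left out, and only three member functions are shown.

```cpp
#include <atomic>

namespace sketch {

// Fallback with the same interface, but without any atomicity guarantee.
template <typename T>
class dummy_atomic_ref {
public:
    explicit dummy_atomic_ref(T& ref) : m_ptr(&ref) {}
    T load() const { return *m_ptr; }
    void store(T value) const { *m_ptr = value; }
    T fetch_add(T value) const {
        const T old = *m_ptr;
        *m_ptr = old + value;
        return old;
    }

private:
    T* m_ptr;
};

// Assumption: compiler macros stand in for the real build-time feature test.
#if defined(__GNUC__) || defined(__clang__)
#define SKETCH_HAVE_POSIX_ATOMICS 1

// Host implementation built on the GCC / Clang __atomic built-ins.
template <typename T>
class posix_atomic_ref {
public:
    explicit posix_atomic_ref(T& ref) : m_ptr(&ref) {}
    T load() const { return __atomic_load_n(m_ptr, __ATOMIC_SEQ_CST); }
    void store(T value) const {
        __atomic_store_n(m_ptr, value, __ATOMIC_SEQ_CST);
    }
    T fetch_add(T value) const {
        return __atomic_fetch_add(m_ptr, value, __ATOMIC_SEQ_CST);
    }

private:
    T* m_ptr;
};
#endif

// The "chooser": exactly one alias is defined (host branches only).
#if defined(__cpp_lib_atomic_ref)
template <typename T>
using device_atomic_ref = std::atomic_ref<T>;  // C++20 standard library
#elif defined(SKETCH_HAVE_POSIX_ATOMICS)
template <typename T>
using device_atomic_ref = posix_atomic_ref<T>;
#else
template <typename T>
using device_atomic_ref = dummy_atomic_ref<T>;
#endif

}  // namespace sketch
```

Client code only ever names sketch::device_atomic_ref<T>, so it stays unchanged regardless of which branch the preprocessor picks. This is also why the feature test has to reflect the compiler that actually builds the client code, which is the point of the planned vecmem-config.cmake change.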