[WIP] Device tasks with coroutines #253
Merged: evaleev merged 83 commits into TESSEorg:master from devreal:ttg-device-support-master-coro on Feb 27, 2024
evaleev force-pushed the ttg-device-support-master-coro branch from cf9ed17 to 9bad08b (April 20, 2023 19:27)
devreal force-pushed the ttg-device-support-master-coro branch from db6be87 to 91e13e8 (June 9, 2023 17:41)
Signed-off-by: Joseph Schuchart <[email protected]>
The views on the device will be handled by the backend, not in ttg::make_view
Signed-off-by: Joseph Schuchart <[email protected]>
[ttg|std]::span is not a good fit because it carries a size template argument that we don't care about but makes handling more difficult. Plus, we need to encode the scope for each span. Three scopes are available:
1) SyncIn: copy host data to device before invoking the kernel callable.
2) SyncOut: copy the device data back to the host before invoking the output callable.
3) Allocate: allocate but do not synchronize in or out.
Both SyncIn and SyncOut will allocate sufficient memory on the device.
Signed-off-by: Joseph Schuchart <[email protected]>
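To make the three scopes concrete, here is a minimal host-only sketch. The names `ViewScope`, `ViewSpan`, `stage_in`, and `stage_out` are hypothetical and are not taken from TTG; the "device" buffer is simulated with a `std::vector` in host memory. It only illustrates a span with a runtime size (no extent template argument) plus a per-span scope that decides which copies happen.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical names for illustration only; not TTG's actual API.
enum class ViewScope { SyncIn, SyncOut, Allocate };

// A span with a runtime size (no extent template parameter) and a scope.
template <typename T>
struct ViewSpan {
  T* data = nullptr;
  std::size_t size = 0;
  ViewScope scope = ViewScope::Allocate;
};

// Before the kernel: every scope allocates, only SyncIn copies host -> device.
// The "device" buffer is simulated by a std::vector (zero-initialized here;
// real device memory would be uninitialized).
template <typename T>
std::vector<T> stage_in(const ViewSpan<T>& span) {
  std::vector<T> device(span.size);
  if (span.scope == ViewScope::SyncIn) {
    std::copy(span.data, span.data + span.size, device.begin());
  }
  return device;
}

// Before the output callable: only SyncOut copies device -> host.
template <typename T>
void stage_out(const ViewSpan<T>& span, const std::vector<T>& device) {
  assert(device.size() == span.size);
  if (span.scope == ViewScope::SyncOut) {
    std::copy(device.begin(), device.end(), span.data);
  }
}
```

In this sketch, a `SyncIn` span passed to `stage_in` yields a buffer holding the host values, while an `Allocate` span yields a buffer of the right size whose contents are never copied in either direction.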
- TTG_HAVE_CUDA enables CUDA support: synchronizing the stream between the kernel call and the output call.
- TTG_USE_CUDA_PREFETCH enables support for prefetching based on the view scope.
Signed-off-by: Joseph Schuchart <[email protected]>
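A rough sketch of how these two macros could gate a kernel/output sequence is shown below. The function `run_two_phase` and its trace are hypothetical and not part of TTG, and treating the two macros as independent switches is an assumption. The only CUDA calls used are `cudaStreamCreate`, `cudaStreamSynchronize`, and `cudaStreamDestroy`, and they are compiled only when `TTG_HAVE_CUDA` is defined; the prefetch step is left as a placeholder.

```cpp
#include <string>
#include <vector>
#if defined(TTG_HAVE_CUDA)
#include <cuda_runtime.h>
#endif

// Hypothetical driver for illustration; not TTG code.
// Runs the kernel callable, then the output callable, and records each step.
template <typename Kernel, typename Output>
std::vector<std::string> run_two_phase(Kernel&& kernel, Output&& output) {
  std::vector<std::string> trace;
#if defined(TTG_HAVE_CUDA)
  cudaStream_t stream;
  cudaStreamCreate(&stream);
#endif
#if defined(TTG_USE_CUDA_PREFETCH)
  // Placeholder: a real implementation would prefetch according to each
  // view's scope here.
  trace.push_back("prefetch");
#endif
  kernel();
  trace.push_back("kernel");
#if defined(TTG_HAVE_CUDA)
  // Wait for the kernel's stream before the output callable runs.
  cudaStreamSynchronize(stream);
  trace.push_back("sync");
  cudaStreamDestroy(stream);
#endif
  output();
  trace.push_back("output");
  return trace;
}
```

Built without either macro, the sketch degrades to calling the kernel callable followed directly by the output callable.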
…ake outterm as an argument
…paration for suspendable tasks
… the key in the profiling info under the parsec driver
… std::atomic_ref in libc++)
…ressions to avoid pissing CUDA off
devreal force-pushed the ttg-device-support-master-coro branch from e0735a7 to 8362742 (July 6, 2023 20:20)
An early implementation of device tasks with coroutines, still WIP. Single-process runs should work; distributed runs need a patch in PaRSEC. Implemented only in the PaRSEC backend. See the included unit tests for usage examples.