Add DEVICEPTR annotations to data region in driver loop #145
Documentation for this branch can be viewed at https://sites.ecmwf.int/docs/loki/145/index.html
Codecov Report
@@           Coverage Diff            @@
##             main    #145    +/-   ##
========================================
  Coverage   92.09%  92.09%
========================================
  Files          89      90      +1
  Lines       16537   16664    +127
========================================
+ Hits        15229   15347    +118
- Misses       1308    1317      +9
Force-pushed from 10d3945 to 30b440f.
Very nicely done and useful addition, many thanks!
My only real remark is related to naming: I'd prefer a marginally more descriptive name for that option all the way through, from CMake to the transformation constructor. My suggestion would be something like assume_deviceptr, but better variants probably exist.
@@ -27,13 +27,17 @@ class DataOffloadTransformation(Transformation):
    ----------
    remove_openmp : bool
        Remove any existing OpenMP pragmas inside the marked region.
    deviceptr : bool
It's a little hard to judge from the brief option name what it implies. I would propose following the typical "start with a verb" naming scheme, e.g., by calling this assume_deviceptr? That, imho, also conveys the requirement this imposes on the calling code to handle the offload.
Seconded, either assume_deviceptr or use_deviceptr or mark_as_deviceptr (or similar). And ideally propagate the change up to the CMake layer (explicit is better than implicit 😉).
    if self.deviceptr:
        deviceptr = ''
        if inargs+outargs+inoutargs:
            deviceptr = f'deviceptr({", ".join(inargs+outargs+inoutargs)})'
        pragma = Pragma(keyword='acc', content=f'data {deviceptr}')
-    if self.deviceptr:
-        deviceptr = ''
-        if inargs+outargs+inoutargs:
-            deviceptr = f'deviceptr({", ".join(inargs+outargs+inoutargs)})'
-        pragma = Pragma(keyword='acc', content=f'data {deviceptr}')
+    if self.deviceptr:
+        offload_args = inargs + outargs + inoutargs
+        if offload_args:
+            deviceptr = f' deviceptr({", ".join(offload_args)})'
+        else:
+            deviceptr = ''
+        pragma = Pragma(keyword='acc', content=f'data{deviceptr}')
Purely a matter of taste, but I find this a little hard to read with the repeated contracted "sum".
Seconded (but yes, matter of taste), spaces around operators used to be a pylint rule (for a reason).
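For reference, the string handling in the suggestion above can be tried in isolation. The following is only a minimal sketch: `data_pragma_content` is a hypothetical helper name that does not exist in Loki, and it returns the plain content string instead of constructing a `Pragma` node.

```python
def data_pragma_content(inargs, outargs, inoutargs):
    """Build the content string for an ``acc data`` pragma.

    Hypothetical stand-alone helper for illustration only; it mirrors the
    string logic of the suggested change, not the actual Loki implementation.
    """
    # Collect all offloaded arguments once instead of repeating the sum
    offload_args = tuple(inargs) + tuple(outargs) + tuple(inoutargs)
    if offload_args:
        # The leading space lives in the clause, so an empty argument list
        # does not leave a trailing space after "data"
        deviceptr = f' deviceptr({", ".join(offload_args)})'
    else:
        deviceptr = ''
    return f'data{deviceptr}'


print(data_pragma_content(['a'], ['b'], ['c']))  # data deviceptr(a, b, c)
print(data_pragma_content([], [], []))           # data
```

Moving the separating space into the clause is what distinguishes the suggestion from the original snippet, which would emit `data ` with a trailing space when there are no arguments.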
I agree with all of the previous suggestions, but otherwise GTG. I've not tested this explicitly with any regression tests, but might try CLOUDSC later myself. This should not stop this from being merged though, so GTG from me.
Thanks for the feedback @mlange05 and @reuterbal! I'll implement your recommendations later today. @mlange05 re trying this on CLOUDSC, you can see an example of this in action here: https://github.com/ecmwf-ifs/ecwam/tree/naan-deviceptr.
Force-pushed from c968f52 to b972d49.
Force-pushed from b972d49 to 219a470.
@@ -27,13 +27,17 @@ class DataOffloadTransformation(Transformation):
    ----------
    remove_openmp : bool
        Remove any existing OpenMP pragmas inside the marked region.
    assume_deviceptr : bool
Docstring and actual argument are out of sync here now. Can we call the argument assume_deviceptr?
Another silly oversight. Thanks for spotting this!
scripts/loki_transform.py (outdated)
@@ -92,6 +92,8 @@ def cli(debug):
              help='Run transformation to insert custom data offload regions.')
@click.option('--remove-openmp', is_flag=True, default=False,
              help='Removes existing OpenMP pragmas in "!$loki data" regions.')
@click.option('--assume_deviceptr', is_flag=True, default=False,
I think so far we have used hyphens instead of underscores for options on the CLI.
Nice catch! Corrected 👍
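The hyphen convention does not affect the Python-side name. As a dependency-free sketch, the example below uses the standard-library `argparse` rather than `click` (which `scripts/loki_transform.py` actually uses); both map a hyphenated option such as `--assume-deviceptr` to an underscore name like `assume_deviceptr`. The program name and the help text for `--assume-deviceptr` are assumptions made for illustration.

```python
import argparse

# Hypothetical stand-in for the real click-based CLI in scripts/loki_transform.py
parser = argparse.ArgumentParser(prog='loki_transform_sketch')
parser.add_argument('--remove-openmp', action='store_true', default=False,
                    help='Removes existing OpenMP pragmas in "!$loki data" regions.')
parser.add_argument('--assume-deviceptr', action='store_true', default=False,
                    help='Assumed help text: mark offloaded arrays as device pointers.')

# Hyphens on the command line become underscores in the attribute name
args = parser.parse_args(['--assume-deviceptr'])
print(args.assume_deviceptr, args.remove_openmp)  # True False
```

Keeping hyphens on the command line and underscores in Python means the constructor argument `assume_deviceptr` and the CLI flag stay recognisably the same name across layers.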
Force-pushed from 22d6e7a to a41f639.
Force-pushed from a41f639 to 9bc5142.
Waiting for CI to confirm but looks good to me now. Many thanks for applying the renaming and apologies that this has turned into such a rat's tail of driving the renaming through the layers...
A small update to the DataOffload transformation to mark kernel array arguments as true device-pointers and skip address translation via the OpenACC map. This is needed when using the CUDA backend in the new FIELD_API. The new functionality has been tested here: https://github.com/ecmwf-ifs/ecwam/tree/naan-deviceptr
NB: This is stacked on top of #142.