-
Notifications
You must be signed in to change notification settings - Fork 12
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Skip driver routine in GlobalVariableAnalysis
#265
Conversation
Documentation for this branch can be viewed at https://sites.ecmwf.int/docs/loki/265/index.html |
@@ -435,7 +435,7 @@ def test_global_variable_analysis(frontend, key, config, global_variable_analysi | |||
|
|||
assert set(scheduler.items) == set(expected_trafo_data) | {'global_var_analysis_data_mod#some_type'} | |||
for item in scheduler.items: | |||
if item == 'global_var_analysis_data_mod#some_type': | |||
if item == 'global_var_analysis_data_mod#some_type' or item.config['role'] == 'driver': |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
So we could remove '#driver'
in expected_trafo_data
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, sorry, that's a much cleaner way of doing it, thanks!
@@ -274,6 +274,9 @@ def transform_subroutine(self, routine, **kwargs): | |||
if 'successors' not in kwargs: | |||
raise RuntimeError('Cannot apply GlobalVariableAnalysis without successors to store offload analysis data') | |||
|
|||
if kwargs['role'] == 'driver': |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe I misunderstood something, but why shouldn't we apply the GlobalVariableAnalysis
to the driver?
I can see that we don't use this information right now for anything, but I don't understand why we should discard this information rather than not using this information in the subsequent transformation pass(es)?!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
A global variable that is imported only at the driver layer would have to be passed down to the kernel as a subroutine argument. In this case, from the point of view of the kernel, it stops being a global variable and behaves like any other argument, and we wouldn't need to add !$acc declare create
clauses. Such variables should therefore be handled by the DataOffloadTransformation
instead.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sure, but from the point of view of the driver it is a global variable and I think the Analysis
part should be as general as possible for potential future Transformation
s that possibly need to know global variables from the point of view of the driver?!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
So on the driver we could imagine such a scenario:
subroutine driver()
use some_mod, only : some_var
!$acc data
some_var = ...
!$acc end data
end subroutine driver
The only "device" code in a driver routine would be inside an !$openacc/openmp
data region, and we would have to apply data movement directives to some_var
for that data region, just like we would for any other variable passed in to the data region. Module imports at the driver layer really do behave like any other variable passed in to the openacc data region/ device kernel. If a future Transformation
were to use driver layer module imports in a way that is not covered by the DataOffloadTransformation
, then that should be functionality we explicitly add.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In principle I agree with Michael's comment: An analysis pass shouldn't make unnecessary assumptions about the intended use of the data it collects.
Unfortunately, the current analysis implementation performs a reduction of the data collected by successors, and therefore the information what has been used in the driver and what stems from kernels called in the driver is no longer separable directly in the trafo_data
. However, the data offload transformation already works only on the successors
data, therefore I'm currently not seeing how this change actually has any practical implications?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
TheGlobalVariableAnalysis
on ecwam with scc-hoist fails at the driver layer in quite an obscure way.
I understood the GlobalVariableAnalysis
to be targeted at variables imported from within device code. Variables imported only at the driver layer don't really fit this model, as they would be variables imported in host code that might potentially be used in device code.
That's why I think driver layer only imports fit the conceptual model of the DataOffloadTransformation
much better than the GlobalVarAnalysis/Transformation
, but of course there's no harm in keeping the analysis at the driver layer.
I'll investigate the ecwam issues further and report back once I have a fix/workaround. We can then potentially close this PR.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Update: the ecwam scc-hoist problem is partially solved in the bug fix PR #279. For the remaining issue, let's please discuss offline when I back on Thursday.
This reverts commit 4026945.
As discussed offline, I have moved the propagation of the offload requirements from the |
Codecov ReportAll modified and coverable lines are covered by tests ✅
Additional details and impacted files@@ Coverage Diff @@
## main #265 +/- ##
==========================================
+ Coverage 92.78% 92.88% +0.09%
==========================================
Files 96 102 +6
Lines 17965 18253 +288
==========================================
+ Hits 16669 16954 +285
- Misses 1296 1299 +3
Flags with carried forward coverage won't be shown. Click here to find out more. ☔ View full report in Codecov by Sentry. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks, very happy with this solution!
Global variables imported at the driver layer passed down to the compute kernel as a subroutine argument should not be included in the
GlobalVariableAnalysis
. Rather, they should be handled via theDataOffloadTransformation
.