Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Skip driver routine in GlobalVariableAnalysis #265

Merged
merged 3 commits into from
Apr 11, 2024
Merged

Conversation

awnawab
Copy link
Contributor

@awnawab awnawab commented Apr 2, 2024

Global variables imported at the driver layer passed down to the compute kernel as a subroutine argument should not be included in the GlobalVariableAnalysis. Rather, they should be handled via the DataOffloadTransformation.

Copy link

github-actions bot commented Apr 2, 2024

Documentation for this branch can be viewed at https://sites.ecmwf.int/docs/loki/265/index.html

@@ -435,7 +435,7 @@ def test_global_variable_analysis(frontend, key, config, global_variable_analysi

assert set(scheduler.items) == set(expected_trafo_data) | {'global_var_analysis_data_mod#some_type'}
for item in scheduler.items:
if item == 'global_var_analysis_data_mod#some_type':
if item == 'global_var_analysis_data_mod#some_type' or item.config['role'] == 'driver':
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So we could remove '#driver' in expected_trafo_data?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, sorry, that's a much cleaner way of doing it, thanks!

@@ -274,6 +274,9 @@ def transform_subroutine(self, routine, **kwargs):
if 'successors' not in kwargs:
raise RuntimeError('Cannot apply GlobalVariableAnalysis without successors to store offload analysis data')

if kwargs['role'] == 'driver':
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe I misunderstood something, but why shouldn't we apply the GlobalVariableAnalysis to the driver?

I can see that we don't use this information right now for anything, but I don't understand why we should discard this information rather than not using this information in the subsequent transformation pass(es)?!

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A global variable that is imported only at the driver layer would have to be passed down to the kernel as a subroutine argument. In this case, from the point of view of the kernel, it stops being a global variable and behaves like any other argument, and we wouldn't need to add !$acc declare create clauses. Such variables should therefore be handled by the DataOffloadTransformation instead.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sure, but from the point of view of the driver it is a global variable and I think the Analysis part should be as general as possible for potential future Transformation s that possibly need to know global variables from the point of view of the driver?!

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So on the driver we could imagine such a scenario:

subroutine driver()
use some_mod, only : some_var

!$acc data
some_var = ...
!$acc end data

end subroutine driver

The only "device" code in a driver routine would be inside an !$openacc/openmp data region, and we would have to apply data movement directives to some_var for that data region, just like we would for any other variable passed in to the data region. Module imports at the driver layer really do behave like any other variable passed in to the openacc data region/ device kernel. If a future Transformation were to use driver layer module imports in a way that is not covered by the DataOffloadTransformation, then that should be functionality we explicitly add.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In principle I agree with Michael's comment: An analysis pass shouldn't make unnecessary assumptions about the intended use of the data it collects.

Unfortunately, the current analysis implementation performs a reduction of the data collected by successors, and therefore the information what has been used in the driver and what stems from kernels called in the driver is no longer separable directly in the trafo_data. However, the data offload transformation already works only on the successors data, therefore I'm currently not seeing how this change actually has any practical implications?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

TheGlobalVariableAnalysis on ecwam with scc-hoist fails at the driver layer in quite an obscure way.

I understood the GlobalVariableAnalysis to be targeted at variables imported from within device code. Variables imported only at the driver layer don't really fit this model, as they would be variables imported in host code that might potentially be used in device code.

That's why I think driver layer only imports fit the conceptual model of the DataOffloadTransformation much better than the GlobalVarAnalysis/Transformation, but of course there's no harm in keeping the analysis at the driver layer.

I'll investigate the ecwam issues further and report back once I have a fix/workaround. We can then potentially close this PR.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Update: the ecwam scc-hoist problem is partially solved in the bug fix PR #279. For the remaining issue, let's please discuss offline when I back on Thursday.

@awnawab
Copy link
Contributor Author

awnawab commented Apr 11, 2024

As discussed offline, I have moved the propagation of the offload requirements from the GlobalVarAnalysis to the GlobalVarOffloadTransformation. This combined with PR #279 works on ecWAM.

Copy link

codecov bot commented Apr 11, 2024

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 92.88%. Comparing base (338a042) to head (f720c11).
Report is 106 commits behind head on main.

Additional details and impacted files
@@            Coverage Diff             @@
##             main     #265      +/-   ##
==========================================
+ Coverage   92.78%   92.88%   +0.09%     
==========================================
  Files          96      102       +6     
  Lines       17965    18253     +288     
==========================================
+ Hits        16669    16954     +285     
- Misses       1296     1299       +3     
Flag Coverage Δ
lint_rules 96.39% <ø> (ø)
loki 92.85% <ø> (+0.07%) ⬆️
transformations 92.22% <100.00%> (+0.28%) ⬆️

Flags with carried forward coverage won't be shown. Click here to find out more.

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

Copy link
Collaborator

@reuterbal reuterbal left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks, very happy with this solution!

@reuterbal reuterbal added the ready for merge This PR has been approved and is ready to be merged label Apr 11, 2024
@reuterbal reuterbal merged commit 9aa9a37 into main Apr 11, 2024
12 checks passed
@reuterbal reuterbal deleted the naan-globalvar branch April 11, 2024 13:28
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
ready for merge This PR has been approved and is ready to be merged
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants