Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix view_reduction to work with output variable coming from either private or shared memory #256

Merged
merged 3 commits into from
Oct 13, 2022

Conversation

bartgol
Copy link
Contributor

@bartgol bartgol commented Oct 12, 2022

Motivation

In case result came from shared memory (e.g., an entry of a view), race conditions caused wrong results. This PR does the same thing that was done in impl::parallel_reduce, namely uses an automatic local var, then copies back in the output var.

Related Issues

E3SM Stakeholder Feedback

Might help scream, even though there's already a fix there.

test_view_reduction<Real,false,true,16,3> (1.0/3.0,2,11);
test_view_reduction<Real, true,false,16,3> (1.0/3.0,2,11);
test_view_reduction<Real,false,false,16,3> (1.0/3.0,2,11);
test_view_reduction<Real, true,true,18,4> (1.0/3.0,2,11);
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have no idea how 3 was not generating a compiler error. We don't allow packs of non-power-of-2 size. I could have looked into why this does not throw an error, but I figured I fixed it and call it a day.

@bartgol bartgol self-assigned this Oct 12, 2022
@bartgol bartgol added bugfix AT: AUTOMERGE Inform the autotester (AT) that it can merge this PR if reviewers approved, and tests pass kokkos labels Oct 12, 2022
@E3SM-Autotester
Copy link
Collaborator

Status Flag 'Pre-Test Inspection' - Auto Inspected - Inspection Is Not Necessary for this Pull Request.

@E3SM-Autotester
Copy link
Collaborator

Status Flag 'Pull Request AutoTester' - Testing Jenkins Projects:

Pull Request Auto Testing STARTING (click to expand)

Build Information

Test Name: EKAT_PullRequest_Autotester_Mappy

  • Build Num: 350
  • Status: STARTED

Jenkins Parameters

Parameter Name Value
EKAT_SOURCE_BRANCH bartgol/fix-view-reduction
EKAT_SOURCE_REPO https://github.com/E3SM-Project/EKAT
EKAT_SOURCE_SHA 4c8d649
EKAT_TARGET_BRANCH master
EKAT_TARGET_REPO https://github.com/E3SM-Project/EKAT
EKAT_TARGET_SHA dbe3d1c
PR_LABELS bugfix;AT: AUTOMERGE;kokkos
PULLREQUESTNUM 256
TEST_REPO_ALIAS EKAT

Build Information

Test Name: EKAT_PullRequest_Autotester_Weaver

  • Build Num: 447
  • Status: STARTED

Jenkins Parameters

Parameter Name Value
EKAT_SOURCE_BRANCH bartgol/fix-view-reduction
EKAT_SOURCE_REPO https://github.com/E3SM-Project/EKAT
EKAT_SOURCE_SHA 4c8d649
EKAT_TARGET_BRANCH master
EKAT_TARGET_REPO https://github.com/E3SM-Project/EKAT
EKAT_TARGET_SHA dbe3d1c
PR_LABELS bugfix;AT: AUTOMERGE;kokkos
PULLREQUESTNUM 256
TEST_REPO_ALIAS EKAT

Build Information

Test Name: EKAT_PullRequest_Autotester_Blake

  • Build Num: 464
  • Status: STARTED

Jenkins Parameters

Parameter Name Value
EKAT_SOURCE_BRANCH bartgol/fix-view-reduction
EKAT_SOURCE_REPO https://github.com/E3SM-Project/EKAT
EKAT_SOURCE_SHA 4c8d649
EKAT_TARGET_BRANCH master
EKAT_TARGET_REPO https://github.com/E3SM-Project/EKAT
EKAT_TARGET_SHA dbe3d1c
PR_LABELS bugfix;AT: AUTOMERGE;kokkos
PULLREQUESTNUM 256
TEST_REPO_ALIAS EKAT

Using Repos:

Repo: EKAT (E3SM-Project/EKAT)
  • Branch: bartgol/fix-view-reduction
  • SHA: 4c8d649
  • Mode: TEST_REPO

Pull Request Author: bartgol

@E3SM-Autotester
Copy link
Collaborator

Status Flag 'Pull Request AutoTester' - Jenkins Testing: all Jobs PASSED

Pull Request Auto Testing has PASSED (click to expand)

Build Information

Test Name: EKAT_PullRequest_Autotester_Mappy

  • Build Num: 350
  • Status: PASSED

Jenkins Parameters

Parameter Name Value
EKAT_SOURCE_BRANCH bartgol/fix-view-reduction
EKAT_SOURCE_REPO https://github.com/E3SM-Project/EKAT
EKAT_SOURCE_SHA 4c8d649
EKAT_TARGET_BRANCH master
EKAT_TARGET_REPO https://github.com/E3SM-Project/EKAT
EKAT_TARGET_SHA dbe3d1c
PR_LABELS bugfix;AT: AUTOMERGE;kokkos
PULLREQUESTNUM 256
TEST_REPO_ALIAS EKAT

Build Information

Test Name: EKAT_PullRequest_Autotester_Weaver

  • Build Num: 447
  • Status: PASSED

Jenkins Parameters

Parameter Name Value
EKAT_SOURCE_BRANCH bartgol/fix-view-reduction
EKAT_SOURCE_REPO https://github.com/E3SM-Project/EKAT
EKAT_SOURCE_SHA 4c8d649
EKAT_TARGET_BRANCH master
EKAT_TARGET_REPO https://github.com/E3SM-Project/EKAT
EKAT_TARGET_SHA dbe3d1c
PR_LABELS bugfix;AT: AUTOMERGE;kokkos
PULLREQUESTNUM 256
TEST_REPO_ALIAS EKAT

Build Information

Test Name: EKAT_PullRequest_Autotester_Blake

  • Build Num: 464
  • Status: PASSED

Jenkins Parameters

Parameter Name Value
EKAT_SOURCE_BRANCH bartgol/fix-view-reduction
EKAT_SOURCE_REPO https://github.com/E3SM-Project/EKAT
EKAT_SOURCE_SHA 4c8d649
EKAT_TARGET_BRANCH master
EKAT_TARGET_REPO https://github.com/E3SM-Project/EKAT
EKAT_TARGET_SHA dbe3d1c
PR_LABELS bugfix;AT: AUTOMERGE;kokkos
PULLREQUESTNUM 256
TEST_REPO_ALIAS EKAT

@E3SM-Autotester
Copy link
Collaborator

Status Flag 'Pre-Merge Inspection' - - This Pull Request Requires Inspection... The code must be inspected by a member of the Team before Testing/Merging
THE LAST COMMIT TO THIS PULL REQUEST HAS BEEN REVIEWED, BUT NOT ACCEPTED OR REQUIRES CHANGES

@E3SM-Autotester
Copy link
Collaborator

All Jobs Finished; status = PASSED, However Inspection must be performed before merge can occur...

@jgfouca
Copy link
Member

jgfouca commented Oct 13, 2022

@bartgol , note that I also needed to use local variables when using ExeSpaceUtils::parallel_reduce

Copy link
Contributor

@tcclevenger tcclevenger left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good! Should be able to add back to SCREAM now.

@E3SM-Autotester
Copy link
Collaborator

Status Flag 'Pre-Merge Inspection' - SUCCESS: The last commit to this Pull Request has been INSPECTED AND APPROVED by [ tcclevenger ]!

@E3SM-Autotester
Copy link
Collaborator

Status Flag 'Pull Request AutoTester' - Pull Request will be Automerged

@E3SM-Autotester E3SM-Autotester merged commit 1052776 into master Oct 13, 2022
@E3SM-Autotester E3SM-Autotester deleted the bartgol/fix-view-reduction branch October 13, 2022 17:09
@E3SM-Autotester
Copy link
Collaborator

Merge on Pull Request# 256: IS A SUCCESS - Pull Request successfully merged

@E3SM-Autotester E3SM-Autotester removed the AT: AUTOMERGE Inform the autotester (AT) that it can merge this PR if reviewers approved, and tests pass label Oct 13, 2022
@bartgol
Copy link
Contributor Author

bartgol commented Oct 13, 2022

@bartgol , note that I also needed to use local variables when using ExeSpaceUtils::parallel_reduce

Ah, it might be that impl::parallel_reduce needs a barrier, to avoid the race condition Andrew pointed out in the shoc small kernel PR in scream:

auto local_var = resullt
// perform reduction into local_var
result = local_var // thread A might execute this line *before* thread B executes the first line

I'll add a fix for that too.

@bartgol
Copy link
Contributor Author

bartgol commented Oct 13, 2022

Shoot, the AT beat me. I'll do a super quick separate PR

@ambrad
Copy link
Member

ambrad commented Oct 13, 2022

I didn't get a chance to review before this was merged, but I agree with Jim that this PR is not complete. parallel_reduce needs to be reworked, too.

@bartgol
Copy link
Contributor Author

bartgol commented Oct 13, 2022

Yeah, I'm working on it.

@jgfouca
Copy link
Member

jgfouca commented Oct 13, 2022

@bartgol , another thought, do you think it would be worth adding a test that would have exposed the prior deficiency?

@bartgol
Copy link
Contributor Author

bartgol commented Oct 13, 2022

@bartgol , another thought, do you think it would be worth adding a test that would have exposed the prior deficiency?

That's something I was thinking about, yes. I will add a manifactured test, and verify it would have failed before.

@ambrad
Copy link
Member

ambrad commented Oct 13, 2022

@bartgol @jgfouca another thing I wanted to discuss was my opinion that there should be a version of view_reduction that does not take result as an input, thus negating this issue of automatic variable vs team-global. One could argue that the design flaw was permitting result to contain a valid value on input to be accumulated. That flaw is now leading to team_barrier calls that are strictly needed only when result is a team-global variable.

@oksanaguba
Copy link

oksanaguba commented Oct 13, 2022

@bartgol what is this PR for? Or, what i really want to know, will there be EKAT changes to scream master soon? Asking because i then would need to pull them into my branch, do not want to miss something. thanks!

@bartgol
Copy link
Contributor Author

bartgol commented Oct 13, 2022

This PR requires no change in scream. The follow up one will require a small change in shoc_energy_integrals_impl.hpp. I'll let you know when that happens.

@oksanaguba
Copy link

Not sure I was clear -- please, let me know when you change head commit of EKAT submodule in scream master. thanks!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Potential problem with ExeSpaceUtils view_reduction and parallel_reduce
6 participants