Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add the missing live locals info to the new blocks in tree lowering #10976

Closed
wants to merge 1 commit into from

Conversation

a7ehuo
Copy link
Contributor

@a7ehuo a7ehuo commented Oct 23, 2020

When the tree is transformed into new blocks in fastpathAcmpHelper or lowerArrayStoreCHK, the live locals info is not passed on from the previous block to the newly created blocks. The missing live locals causes missing GC map for the objects.

Many thanks to @liqunl for finding the cause of the missing GC map.

Depends on:

Fixes #10869

@a7ehuo
Copy link
Contributor Author

a7ehuo commented Oct 23, 2020

@liqunl @andrewcraik Ready for review. Thanks!

@Leonardo2718
Copy link
Contributor

Leonardo2718 commented Oct 23, 2020

It's possible this PR also fixes #10365

@liqunl
Copy link
Contributor

liqunl commented Oct 23, 2020

This will only fix gc map issue caused by ValueType lowering. Any opt that adds new blocks to the CFG after GRA (jprofiling is another example) will have the same problem. The reason is that GRA runs liveness analysis, and setLiveLocals on codegen to indicate there is liveness info on blocks, such that codegen can use it to compute gc map. As more and more complicated opts are added to post-GRA, this problem may become more relevant. And it's hard for a developer to tell the invisible dependency between GRA and gc map computation.

Maybe we should run liveness analysis in codegen to get a correct and most-recent liveness info before computing gc map. @andrewcraik What do you think?

@andrewcraik
Copy link
Contributor

@liqunl I am not sure if liveness analysis will work once the optimizer is shutdown and the trees have started to be transformed ready for evaluation. You could experiment with running it since it seems reasonable, but I am not certain it will work and we may have to fix up the liveness. We need an issue to note all the places that need to be fixed.

@a7ehuo
Copy link
Contributor Author

a7ehuo commented Oct 24, 2020

@liqunl @andrewcraik For my learning development, I'm curious of how the current liveness analysis is done. Could you point me to where the code (which file or function) is? If liveness analysis is moved to codegen, would this PR patching up fastpathAcmpHelper and lowerArrayStoreCHK still be required?

@liqunl liqunl mentioned this pull request Oct 24, 2020
@liqunl
Copy link
Contributor

liqunl commented Oct 26, 2020

GRA computes the liveness here https://github.com/eclipse/omr/blob/master/compiler/optimizer/GlobalRegisterAllocator.cpp#L398-L447
I tried liveness in codegen by copying code from GRA (liqunl@24ca976) , but got a failure in my pbuild. I think it might be due to block liveness info not cleaned up. Could you try cleaning up liveness info before running liveness analysis in codegen, see if that works?

@andrewcraik andrewcraik added comp:jit project:MH Used to track Method Handles related work project:valhalla Used to track Project Valhalla related work bug labels Oct 26, 2020
@andrewcraik andrewcraik self-assigned this Oct 26, 2020
@a7ehuo
Copy link
Contributor Author

a7ehuo commented Oct 27, 2020

I tried setLiveLocals(NULL) for blocks and codegen along with the code liqunl/openj9@24ca976 that moves the liveness analysis to codegen. I ran into issues as below.

comp->getFlowGraph()->getStructure() could be NULL in codegen. It is required by the liveness analysis. Recomputing structure [1] in codegen doesn’t work. An assert in GC occurs [2]. If I skip the the liveness analysis when structure is NULL, there is another assert in GC [3], more or less the same assert.

[1]

   if (!comp->getFlowGraph()->getStructure())
      {
      comp->getFlowGraph()->setStructure(TR_RegionAnalysis::getRegions(comp));
      }
   // liveness analysis as GRA

[2]

18:10:32.046 0x19a00    j9mm.479    *   ** ASSERTION FAILED ** at /root/home/ahuo/src/openj9-openjdk-jdk11/openj9/runtime/gc_glue_java/ScavengerRootScanner.hpp:108: ((MM_StackSlotValidator(MM_StackSlotValidator::NOT_ON_HEAP, *slotPtr, stackLocation, walkState).validate(_env)))

[3]

15:57:24.890 0x19a00    j9mm.479    *   ** ASSERTION FAILED ** at /root/home/ahuo/src/openj9-openjdk-jdk11/openj9/runtime/gc_glue_java/ConcurrentMarkingDelegate.cpp:61: ((MM_StackSlotValidator(MM_StackSlotValidator::NOT_ON_HEAP, object, stackLocation, walkState).validate(data->env)))

@liqunl
Copy link
Contributor

liqunl commented Oct 27, 2020

@a7ehuo Thanks for the experiments. It turned out there are more problems than we thought in doing liveness info in codegen, I feel like it should be doable because GlobalLiveVariablesForGC will lower trees before doing liveness analysis, so the tree shape should be similar to the one in codegen. But we can look at this approach later when manually fixing liveness becomes too tedious or impossible.

@a7ehuo
Copy link
Contributor Author

a7ehuo commented Oct 27, 2020

Thanks for the example @liqunl. I added the following two changes during liveness analysis from LiveVariablesForGC.cpp in a7ehuo/openj9@0eab767 on top of liqunl/openj9@24ca976. I no longer see the above asserts. I'm doing more tests and see how it goes.

https://github.com/eclipse/openj9/blob/c18472815f59d96865211b54583153db9bceb351/runtime/compiler/optimizer/LiveVariablesForGC.cpp#L208-L215
https://github.com/eclipse/openj9/blob/c18472815f59d96865211b54583153db9bceb351/runtime/compiler/optimizer/LiveVariablesForGC.cpp#L269-L285

@a7ehuo
Copy link
Contributor Author

a7ehuo commented Oct 28, 2020

Looks like adding the above two changes does not solve the issue. Ran sanity.functional test with a7ehuo/openj9@0eab767, thej9mm.479 asserts from GC still happens intermittently.

@a7ehuo
Copy link
Contributor Author

a7ehuo commented Oct 28, 2020

@andrewcraik With the test/experiment that's done so far on moving the liveness analysis to codegen, there are more things to consider to make it work. Could you review this PR if we should take this manual fix in tree lowering for now? Thanks!

@andrewcraik
Copy link
Contributor

I am not sure if running/rerunning that analysis is the right fix. The tactical fix seems more appropriate for the time being.

@a7ehuo a7ehuo force-pushed the fix-missing-livelocals branch from aa425ff to c9b7b38 Compare November 5, 2020 15:24
@andrewcraik andrewcraik added the depends:omr Pull request is dependent on a corresponding change in OMR label Nov 5, 2020
When the tree is transformed into new blocks in fastpathAcmpHelper
or lowerArrayStoreCHK, the live locals info is not passed on
from the previous block to the newly created blocks.
The missing live locals causes missing GC map for the objects.

Depends on eclipse-omr/omr#5639

Fixes eclipse-openj9#10869

Signed-off-by: Annabelle Huo <[email protected]>
@a7ehuo a7ehuo force-pushed the fix-missing-livelocals branch from c9b7b38 to eb542cc Compare November 11, 2020 17:06
@andrewcraik andrewcraik removed their assignment Dec 2, 2020
@a7ehuo
Copy link
Contributor Author

a7ehuo commented Dec 9, 2020

Close this PR. The issue is addressed by #11214

@a7ehuo a7ehuo closed this Dec 9, 2020
@a7ehuo a7ehuo deleted the fix-missing-livelocals branch June 22, 2021 20:43
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug comp:jit depends:omr Pull request is dependent on a corresponding change in OMR project:MH Used to track Method Handles related work project:valhalla Used to track Project Valhalla related work
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Intermittent crashes or hangs in JIT compiled code using ConcurrentHashMap with -XX:+EnableValhalla
4 participants