Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Remove GT_ADDR nodes #11057

Closed
mikedn opened this issue Sep 8, 2018 · 26 comments
Closed

Remove GT_ADDR nodes #11057

mikedn opened this issue Sep 8, 2018 · 26 comments
Assignees
Labels
area-CodeGen-coreclr CLR JIT compiler in src/coreclr/src/jit and related components such as SuperPMI enhancement Product code improvement that does NOT require public API changes/additions hard-problem
Milestone

Comments

@mikedn
Copy link
Contributor

mikedn commented Sep 8, 2018

Similar to #10873 but about GT_ADDR nodes. Like GT_ASG nodes, GT_ADDR nodes have one fundamental issue: they change the semantics of their operand nodes.

A GT_LCL_VAR that is used by GT_ADDR is no longer a node that produces a value, it's a node that represents a location. It also affects GT_FIELD so you can have a chain like GT_FIELD(GT_ADDR(GT_FIELD(GT_ADDR(GT_FIELD(…))))) where the outer GT_FIELD is an indirection that produces a value while the inner ones just represent locations.

This approach results in various code complications, inefficiencies and bugs.

category:implementation
theme:ir
skill-level:expert
cost:extra-large

@mikedn
Copy link
Contributor Author

mikedn commented Sep 8, 2018

The JIT already uses GT_LCL_VAR_ADDR and GT_LCL_FLD_ADDR nodes after rationalization, the importer should produce these directly from IL ldloca & co. Hopefully that's not too complicated.

The question is what to do with GT_FIELD. Add GT_FIELD_ADDR? Make GT_FIELD always take and return addresses and have the importer add an extra GT_IND instead of waiting until morph?

@mikedn
Copy link
Contributor Author

mikedn commented Jan 24, 2019

@RussKeldorph I'm not quite sure what up-for-grabs really means but if it implies that the issue is perhaps more approachable then it might be misleading in this case.

Removing GT_ADDR completely tends to run into the same issues as GT_ASG removal. In particular, we need to be able to have struct typed GT_LCL_FLD nodes in the IR. And then, getting that strays into 1st class structs territory so it's double the fun and not exactly trivial.

BTW, you can also assign it to me since I have every intention of getting rid of GT_ASG/GT_ADDR/GT_LIST and whatever other legacy IR crap I run into. Though at the speed reviews progress that will probably take a year, if not more 😁

@RussKeldorph
Copy link
Contributor

RussKeldorph commented Jan 24, 2019

@mikedn How about now?

Maybe I'm being idealistic but I'd like to think we can offer problems of all difficulty levels under up-for-grabs, but if that isn't consistent with the accepted meaning of that label, I can stop.

@mikedn
Copy link
Contributor Author

mikedn commented Jan 24, 2019

Maybe I'm being idealistic but I'd like to think we can offer problems of all difficulty levels under up-for-grabs, but if that isn't consistent with the accepted meaning of that label, I can stop.

I don't know, as I already said, I'm not sure what the meaning of the label truly is. I guess that some kind of "hard" is perfectly fine for up-for-grabs. But when the work spans more than half of the JIT, interferes with other issues, requires multiple PRs to get it done then it's probably a bit too much.

Sometimes I wonder how I ended up doing this. I guess that after a couple of years of looking at the JIT codebase, pieces finally fell into place and this looks feasible. Otherwise I probably wouldn't have opened all these issues to begin with.

@GSPP
Copy link

GSPP commented Jan 24, 2019

@mikedn I wish you the best of luck in cleaning up these disgusting design mistakes! Really need to get rid of stuff like this to move the JIT forward.

@mikedn
Copy link
Contributor Author

mikedn commented Jan 24, 2019

@GSPP Thanks!

@mikedn
Copy link
Contributor Author

mikedn commented Jan 26, 2019

In particular, we need to be able to have struct typed GT_LCL_FLD nodes in the IR.

I should mention that preliminary work to support this is on going in dotnet/coreclr#21705. Just so it doesn't look as if this issue (or the GT_ASG one) is stagnating :)

@mikedn
Copy link
Contributor Author

mikedn commented Nov 19, 2019

A (hopefully complete) list of ADDR uses and some thoughts about how they can be eliminated:

  • Taking the address of a local variable - ADDR(LCL_VAR) (or ADDR(LCL_FLD) for fields of local variables). The obvious thing to do is to replace these with LCL_VAR_ADDR and LCL_FLD_ADDR. Potential problems in doing this:
    • Do we still want to have address taken variables that are not address exposed and thus tracked variables? That would require setting SSA numbers on these local address nodes but that's rather strange since the value they produce has nothing to do with SSA. IMO all address taken variables should be address exposed and all indirect local accesses recognized by IsLocalAddrExpr & co. should be converted to LCL_FLDs.
    • LCL_VAR nodes that reference address exposed variables have GTF_GLOB_REF set on them, even if they're used by an ADDR node.
      For a IND(ADDR(LCL_VAR)) tree, normal side effect propagation pushes GTF_GLOB_REF up the tree to the indir, which is what we want. But if there's no indir then the same side effect propagation poisons the tree with GTF_GLOB_REF even if the address of a variable can't really be aliased. This does result in CQ issues - one case I ran into is that call args that are address of locals can't be late args due the bogus side effect they carry.
      So local address nodes should never have GTF_GLOB_REF but then we need to ensure that indirections of such address do somehow get GTF_GLOB_REF if the variable is address exposed. Or as proposed above - such indirect local access trees should always be converted to LCL_FLDs.
  • Field address access - unfortunately ldlflda & co. are imported as ADDR(FIELD). The only thing that can be done is to introduce FIELD_ADDR. Given the short lifespan of FIELD nodes this shouldn't be a big problem, though some care may be needed around fgUnwrapProxy (is this still needed in .NET Core?) and R2R affected fields.
  • Array element address access - similar to fields, ldelema is imported as ADDR(INDEX). And of course, the solution is similar - introduce INDEX_ADDR. Except that one already exists (which should not be a big problem) and that array access is actually a bit more complicated, as we'll see below.
  • Array value access, especially when combined with struct types, is quite a problem, perhaps the worst problem associated with ADDR. INDEX gets expanded to IND(element address tree) which may seem rather innocuous. It's not because:
    • The IND node has an ArrayInfo out-of-band annotation attached to it. This makes changing/eliminating this node problematic as the annotation has to be preserved somehow.
    • Even if the element type is struct you still get an IND, not an OBJ. Likely because of difficulties caused by the above mentioned annotation, JIT code that needs an OBJ does not change the IND but instead piles an OBJ on top of it by means of ADDR - OBJ(ADDR(IND(element address)). To get rid of ADDR there's no choice but to change the IND but then we need to ensure that the annotation is preserved properly. Or just get rid of it, best I can tell this annotation can be replaced by a special field sequence.
    • And just to complicate things even more, the IND node is actually wrapped in a COMMA. That may be less of an issue as far as ADDR is concerned but it is a problem for ASG elimination. So we need to keep that in mind when making changes around array element access. It's probably best to sink the COMMA below the IND so we get IND(COMMA(ARR_BOUNDS_CHK, element address)).
  • (Hopefully) the last and most exotic ADDR use is to attach struct type information to nodes that lack it, by wrapping them in OBJ(ADDR(node)), even if the node doesn't even represent an addressable location. This happens with SIMD typed nodes that are used as call arguments - call morphing needs to know the exact struct type for ABI decisions and SIMD node do not carry that information. To get rid of ADDR we have the following options:
    • Add struct type information to SIMD nodes - probably overkill since the this information is only required when these nodes are used as call args
    • Add a new node that serves the same purpose as OBJ(ADDR(x)) - to carry struct type info. Might work but I have reservations about extra nodes that act like wrappers for annotation purposes, they require special care to avoid them becoming detached from the child node they're supposed to wrap (e.g. GTF_NO_CSE).
    • Include the necessary struct type info in GenTreeCall::Use so it does not need to be recovered from the argument node. This makes the most sense to me, its information that's provided by the call signature and as such it should be stored with the call node and not its arguments. The only drawback would be that it would make GenTreeCall::Use bigger by adding a pointer to it.

@xiangzhai
Copy link
Contributor

:mips-interest

@xiangzhai
Copy link
Contributor

xiangzhai commented Nov 25, 2019

Hi @mikedn

Testcase: JIT/HardwareIntrinsics/X86/Sse41/ConvertToVector128_r/ConvertToVector128_r.exe

Assertion failed to work for MIPS:

noway_assert(((objRef != nullptr) && (objRef->IsLocalAddrExpr() != nullptr)) ||
                 ((tree->gtFlags & GTF_GLOB_REF) != 0));

Workaround:

diff --git a/src/jit/morph.cpp b/src/jit/morph.cpp
index 1fc966a..dddaa6a 100644
--- a/src/jit/morph.cpp
+++ b/src/jit/morph.cpp
@@ -6244,7 +6244,7 @@ GenTree* Compiler::fgMorphField(GenTree* tree, MorphAddrContext* mac)
     }
 
     noway_assert(((objRef != nullptr) && (objRef->IsLocalAddrExpr() != nullptr)) ||
-                 ((tree->gtFlags & GTF_GLOB_REF) != 0));
+                 ((tree->gtFlags & GTF_GLOB_REF) != 0) MIPS64_ONLY(|| objRef->gtOper == GT_LCL_VAR));
 
     if (tree->gtField.gtFldMayOverlap)
     {

Request for comments.

Thanks,
Leslie Zhai

@mikedn
Copy link
Contributor Author

mikedn commented Nov 25, 2019

@xiangzhai If adding the LCL_VAR makes the assert pass it means that you have something like FIELD(LCL_VAR) and then FIELD should really have GTF_GLOB_REF set because there's generally no way to tell where the pointer stored in the local variable points to. So the real problem here is likely the missing GTF_GLOB_REF and changing the assert won't fix that.

You should check Compiler::gtNewFieldRef, that one sets GTF_GLOB_REF in most cases. The only exception involves trees like FIELD(ADDR(LCL_VAR)) so it's not clear why GTF_GLOB_REF would be missing. Can you provide a JIT dump?

Since it's MIPS specific I would guess that it's caused by other MIPS change you have done, that's possibly related to struct call parameters handling (does MIPS pass structs by reference or by value?)
Or, since this is a x86 intrinsics test, maybe something is not handled correctly in the intrinsic import code.

And this isn't probably directly related to this issue but yes, the removal of ADDR should make things simpler and safer. But it will take a while until this is done.

@xiangzhai
Copy link
Contributor

xiangzhai commented Nov 25, 2019

Hi @mikedn

Thanks for your teaching!

So the real problem here is likely the missing GTF_GLOB_REF

Thanks for pointing out the root cause!

Can you provide a JIT dump?

Sorry! There are lots of debugging output messages in JitFunctionTrace.log.

struct call parameters handling (does MIPS pass structs by reference or by value?)

By value:
pass-structs

Or, since this is a x86 intrinsics test ...

Yes, Sse2.IsSupported, for example, should directly return false for MIPS without trigging assertion failure.

\cc @QiaoVanke

Thanks,
Leslie Zhai

@mikedn
Copy link
Contributor Author

mikedn commented Nov 25, 2019

Sorry! There are lots of debugging output messages in JitDisasm.log.

I was referring to a JIT dump obtained by setting the environment variable COMPlus_JitDump to the method name that asserts. That would show the IR that is generating the assert. But the log also helps a bit - we can see that the method that is being compiled is <>c:<JoinInternal>b__37_0(struct,struct):this. This is a lambda expression used in one of the Path.JoinInternal methods - https://github.com/dotnet/coreclr/blob/a9f3fc16483eecfc47fb79c362811d870be02249/src/System.Private.CoreLib/shared/System/IO/Path.cs#L659

This means that:

  • the problem is unlike to be related to intrinsics as this method does not use intrinsics. At least not directly.
  • the method has struct parameters so it is very likely that the problem has to do with the handling of method parameters

Without the dump, I can only guess that the local variable is perhaps one of the struct parameters and that FIELD(LCL_VAR) should have been FIELD(ADDR(LCL_VAR)). But I don't know how ADDR could have disappeared along the way (or not being generated in the first place).

It might be useful to highlight a discrepancy between IL instructions and JIT's IR: in IL you have

IL_0000: ldarg.2
IL_0001: ldfld !0 valuetype System.ValueTuple`5<native int, int32, native int, int32, bool>::Item1

But in the JIT IR the FIELD node expects an address so the above IL is imported as FIELD(ADDR(LCL_VAR)).

@xiangzhai
Copy link
Contributor

I was referring to a JIT dump obtained by setting the environment variable COMPlus_JitDump to the method name that asserts.

Sorry! I uploaded wrong log ...

Here JitDump.log is.

Thanks,
Leslie Zhai

@mikedn
Copy link
Contributor Author

mikedn commented Nov 26, 2019

Thanks, the dump says:

Morphing args for 7.CALL:
Replacing address of implicit by ref struct parameter with byref:
               [000004] ------------              *  LCL_VAR   byref  V02 arg2         

Assert failure(PID 13339 [0x0000341b], Thread: 13339 [0x341b]): Assertion failed '((objRef != nullptr) && (objRef->IsLocalAddrExpr() != nullptr)) || ((tree->gtFlags & GTF_GLOB_REF) != 0)' in '<>c:<JoinInternal>b__37_0(struct,struct):this' (IL size 120)

It seems that you have "implicit by ref" parameter morphing enabled, even though you say that on MIPS struct args are passed by value. Perhaps you need to adjust the ifdef in fgMorphImplicitByRefArgs?

bool Compiler::fgMorphImplicitByRefArgs(GenTree* tree)
{
#if (!defined(_TARGET_AMD64_) || defined(UNIX_AMD64_ABI)) && !defined(_TARGET_ARM64_)

    return false;

#else  // (_TARGET_AMD64_ && !UNIX_AMD64_ABI) || _TARGET_ARM64_

    bool changed = false;

Though as it is it will already reject a non x64/arm64 target so presumably you already changed this in some way.

@xiangzhai
Copy link
Contributor

Though as it is it will already reject a non x64/arm64 target so presumably you already changed this in some way.

Thanks for pointing out my fault:

diff --git a/src/jit/morph.cpp b/src/jit/morph.cpp
index 1fc966a..2b8a8ee 100644
--- a/src/jit/morph.cpp
+++ b/src/jit/morph.cpp
@@ -17911,7 +17911,7 @@ void Compiler::fgMarkDemotedImplicitByRefArgs()
  */
 bool Compiler::fgMorphImplicitByRefArgs(GenTree* tree)
 {
-#if (!defined(_TARGET_AMD64_) || defined(UNIX_AMD64_ABI)) && !defined(_TARGET_ARM64_) && !defined(_TARGET_MIPS64_)
+#if (!defined(_TARGET_AMD64_) || defined(UNIX_AMD64_ABI)) && !defined(_TARGET_ARM64_)
 
     return false;
 

Thanks,
Leslie Zhai

@xiangzhai
Copy link
Contributor

struct call parameters handling (does MIPS pass structs by reference or by value?)

By value:
pass-structs

Sorry! @QiaoVanke pointed out my fault: It is a special case for O32, but not mips64r2 N64. Short word: By Reference!

Thanks,
Leslie Zhai

@xiangzhai
Copy link
Contributor

You should check Compiler::gtNewFieldRef, that one sets GTF_GLOB_REF in most cases. The only exception involves trees like FIELD(ADDR(LCL_VAR)) so it's not clear why GTF_GLOB_REF would be missing.

@QiaoVanke fixed it:

diff --git a/src/jit/compiler.hpp b/src/jit/compiler.hpp
index 7c8660f..99176db 100644
--- a/src/jit/compiler.hpp
+++ b/src/jit/compiler.hpp
@@ -1192,7 +1192,7 @@ inline GenTree* Compiler::gtNewFieldRef(var_types typ, CORINFO_FIELD_HANDLE fldH
     {
         unsigned lclNum                  = obj->gtOp.gtOp1->gtLclVarCommon.gtLclNum;
         lvaTable[lclNum].lvFieldAccessed = 1;
-#if defined(_TARGET_AMD64_) || defined(_TARGET_ARM64_)
+#if defined(_TARGET_AMD64_) || defined(_TARGET_ARM64_) || defined(_TARGET_MIPS64_)
         // These structs are passed by reference; we should probably be able to treat these
         // as non-global refs, but downstream logic expects these to be marked this way.
         if (lvaTable[lclNum].lvIsParam)
@@ -1780,10 +1780,10 @@ inline void LclVarDsc::incRefCnts(BasicBlock::weight_t weight, Compiler* comp, R
 
             bool doubleWeight = lvIsTemp;
 
-#if defined(_TARGET_AMD64_) || defined(_TARGET_ARM64_)
+#if defined(_TARGET_AMD64_) || defined(_TARGET_ARM64_) || defined(_TARGET_MIPS64_)
             // and, for the time being, implict byref params
             doubleWeight |= lvIsImplicitByRef;
-#endif // defined(_TARGET_AMD64_) || defined(_TARGET_ARM64_)
+#endif // defined(_TARGET_AMD64_) || defined(_TARGET_ARM64_) || defined(_TARGET_MIPS64_)
 
             if (doubleWeight && (weight * 2 > weight))
             {

Thanks,
Leslie Zhai

@SingleAccretion
Copy link
Contributor

SingleAccretion commented Feb 2, 2022

Seeing as I will be working on this issue (and the ASG elimination that is actually much more straightforward and at this point is just a mechanical refactoring), it would be useful to detail the work that needs to be done and ways in which I am planning to approach it. As detailed above, we have a few uses of ADDR, and some are easier to get replace with equivalents than others.

  1. FIELD nodes: come in a few flavors.

  2. INDEX and array access: surprisingly simple, comes in two distinct flavors: before morph (GT_INDEX) and after morph.

    • INDEX, as noted above, can simply be replaced by OBJ/IND(INDEX_ADDR).
    • After morph, array address access tends to be represented as ADDR(IND(actual address expression)), where the IND has an important out-of-band ArrayInfo annotation which is used by VN to deduce aliasing properties. The proposed replacement for ADDR(IND(...)) is a new ARR_ADDR "annotation" node (like BOX/RUNTIMELOOKUP) that will carry what VN needs. We can also add a field sequence to this node, and be able to fold ADD(ARR_ADDR(arrAddr), CONST [FldSeq]) into ARR_ADDR(arrAddr + CONST) [FldSeq] (that today cannot be folded in a sane manner due to the fact there is an intermediate ADDR(IND))), but that is an optional optimization. Notably, there is code today in the compiler which tries to support exactly this type of folding (the ConstantIndex pseudo-field annotation) that is dead because of this ADDR(IND) wrapping.
    • This part of the work is now complete.
  3. OBJ(ADDR(SIMD))) otherwise known as impNormStructVal and Remove the OBJ(ADDR) wrapping created for calls by the importer #51569. After the great refactoring of call arguments by Jakob (JIT: Refactor call argument representation #67238, Remove GT_PUTARG_TYPE #68748), this simply requires:

  4. Other indirect access to locals: IND/OBJ/BLK. The end goal is to have LocalAddressVisitor transforming all such access into simple local nodes.

  5. "Calls with return buffers" use local address nodes to act as stores. These will not be simple to untangle, some preliminary work, based on the idea of keeping the calls STRUCT-typed in the frontend, can be seen here: https://github.com/dotnet/runtime/compare/main...SingleAccretion:runtime:Ret-Buf-Calls?expand=1.

  6. ADDR(LCL) as R-values are mostly already turned into LCL_ADDRs by local morph, there are only a few cases (I know of one in args morphing at least) where they're still used. Once all accesses to tracked locals are replaces with LCL_FLD equivalents, this will be a simple "search and replace".

  7. Delete GT_ADDR #84147.

@mikedn
Copy link
Contributor Author

mikedn commented Feb 2, 2022

I hope you're not just copying all the work I did in my fork, it sounds way too similar to that...

@SingleAccretion
Copy link
Contributor

I was very careful to only look at the issues and changes in this (and coreclr) repository. I think most (all?) of the above is just a restatement of what has been said already, in different threads.

@shushanhf
Copy link
Contributor

Hi, @SingleAccretion @BruceForstall @jkotas @jakobbotsch
I can supply several LoongArch64's cloud Linux accounts for you, If you need please email to [email protected] to get the account .

The CoreRoot for runtime will be updated every weekday here

@BruceForstall
Copy link
Member

@SingleAccretion We've been doing JIT work planning for .NET 8. Since you continue to contribute so much (thank you!) it would be useful to know if you have a plan for a set of significant work for the next year. Perhaps the description above (#11057 (comment)) covers it? (Is that description up-to-date?) Or, perhaps it would be useful to create a "meta-issue" (like #74873) to cover it?

@SingleAccretion
Copy link
Contributor

Since you continue to contribute so much (thank you!) it would be useful to know if you have a plan for a set of significant work for the next year.

I will continue working on this, and #10873 after that. So, I suppose nothing new at this point :).

Perhaps the description above (#11057 (comment)) covers it? (Is that description up-to-date?)

More or less. I have updated it to include the latest developments.

@BruceForstall
Copy link
Member

@SingleAccretion Thanks for the update.

So, I suppose nothing new at this point :).

I'm sure we can find something for you if you don't find something yourself first :)

@GSPP
Copy link

GSPP commented Oct 2, 2022

@SingleAccretion I also want to thank you. I follow this issue tracker and I frequently see your excellent work. 🚀

@teo-tsirpanis teo-tsirpanis removed the JitUntriaged CLR JIT issues needing additional triage label Apr 1, 2023
@teo-tsirpanis teo-tsirpanis modified the milestones: Future, 8.0.0 Apr 1, 2023
@ghost ghost locked as resolved and limited conversation to collaborators May 2, 2023
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
area-CodeGen-coreclr CLR JIT compiler in src/coreclr/src/jit and related components such as SuperPMI enhancement Product code improvement that does NOT require public API changes/additions hard-problem
Projects
None yet
Development

No branches or pull requests

9 participants