Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[AMDGPU] Optionally Use GCNRPTrackers during scheduling #93090

Open
wants to merge 27 commits into
base: main
Choose a base branch
from

Conversation

jrbyrnes
Copy link
Contributor

@jrbyrnes jrbyrnes commented May 22, 2024

This is part of a series of PRs which enable using the AMDGPU/GCN RPTrackers during scheduling. I've split them up to (hopefully) make reviewing easier. For context see #88797 . Since this is the final PR in the series, any high level comments should go here.

This PR adds the scheduling changes to: maintain the GCNRPTrackers during scheduling, and use them when making per-instruction scheduling decisions.

This PR is for commits beginning at fbd0e54 and depends on #93088 and #93089

Copy link

github-actions bot commented May 22, 2024

✅ With the latest revision this PR passed the C/C++ code formatter.

@jrbyrnes
Copy link
Contributor Author

Address review comments + rebase on top of d890dda

llvm/lib/Target/AMDGPU/GCNSchedStrategy.cpp Outdated Show resolved Hide resolved
llvm/lib/Target/AMDGPU/GCNSchedStrategy.cpp Outdated Show resolved Hide resolved
llvm/lib/Target/AMDGPU/GCNSchedStrategy.cpp Outdated Show resolved Hide resolved
@jrbyrnes
Copy link
Contributor Author

Latest adds RP speculation by using the new interface in the GCNTrackers -- the review for this implementation, which has a more detailed git history for review purposes, is #93088

Rebased for 825dbbb

@jrbyrnes jrbyrnes marked this pull request as ready for review August 13, 2024 21:30
llvm/lib/Target/AMDGPU/GCNRegPressure.cpp Outdated Show resolved Hide resolved
llvm/lib/Target/AMDGPU/GCNRegPressure.cpp Outdated Show resolved Hide resolved
MaxPressure = max(MaxPressure, CurPressure);
for (const RegisterMaskPair &P : DeadDefs) {
Register Reg = P.RegUnit;
if (!Reg.isVirtual())
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

dead defs increase pressure of physregs too?

Copy link
Contributor Author

@jrbyrnes jrbyrnes Aug 20, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Currently, the functions used to maintain the current state of pressure / live regs (i.e. recede, and advance) ignore PhysRegs. So, if we try to track PhysRegs in our speculation queries, we will run into inconsistency issues. I think enabling PhysReg tracking should be a separate PR, perhaps required to turn on the trackers by default?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I currently have a patch stuck for over a year due to the pressure tracker brokenly handling dead defs

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this #76416 ?

Comment on lines +340 to +339
if (MO.isUndef())
continue;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No isUndef check, it is redundant with the readsReg case for uses and broken of the def of subregister case

llvm/lib/Target/AMDGPU/GCNRegPressure.cpp Outdated Show resolved Hide resolved
llvm/lib/Target/AMDGPU/GCNRegPressure.cpp Outdated Show resolved Hide resolved
llvm/lib/Target/AMDGPU/GCNRegPressure.h Outdated Show resolved Hide resolved
@@ -414,6 +508,64 @@ void GCNUpwardRPTracker::recede(const MachineInstr &MI) {
assert(CurPressure == getRegPressure(*MRI, LiveRegs));
}

void GCNUpwardRPTracker::bumpUpwardPressure(const MachineInstr *MI,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's not really clear what this method does to the RP tracker object - modifies CurPressure and MaxPressure? What is the state of the object after the method call? From your code I only see one call against a temp object which may indicate that the state of a temp becomes undefined after a call. If so why don't make a non-method function returning max pressure?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's not really clear what this method does to the RP tracker object - modifies CurPressure and MaxPressure? What is the state of the object after the method call?

This method modifies CurPressure and MaxPressure w.r.t the uses/defs in MI. It is meant to be used for RP Speculation, so MI is not required to be the next instruction in the LIS. Due to this requirement, we can't use recede which uses the implied program order in LIS
to calculate live lane masks.

From your code I only see one call against a temp object which may indicate that the state of a temp becomes undefined after a call. If so why don't make a non-method function returning max pressure?

Since we are calculting the impact on RP for a SchedCandidate, we don't want to commit the new pressure state to the tracker (hence why we use the temp object). However, calculating RP w.r.t some instruction feels like the responsibility of the tracker, and keeping it as a method is consistent with the generic implementation.

CurPressure.inc(Reg, LiveAfter, LiveBefore, *MRI);
}
MaxPressure = max(MaxPressure, CurPressure);
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Overall algorithm is really cumbersome. Why do you need findUseBetween?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why do you need findUseBetween?

For bumpUpwardPressure, we actually don't. Thanks for pointing this out. I used in an attempt to correct for a subreg liveness resulting in copying the generic implementation. This was caused by RegisterOperands::adjustLaneLiveness. This method sets the LaneMask of a use to the live lanes of the register at the position of the MI in the LIS. This adds lanes that are live at the MI, but not necessarily used by the MI. findUseBetween was meant to correct this, but this was not obvious and possibly incorrect. I've removed this and the call to and replaced adjustLaneLiveness with adjustDefLaneLiveness (for def lane handling).

@jrbyrnes
Copy link
Contributor Author

jrbyrnes commented Sep 9, 2024

I spent a while looking RP speculation in the face of dead lanes in superreg defs. In the current code (as propagated from the generic trackers), we basically just ignore these lanes when calcuting pressure impacts for speculative RP. However, this is inaccurate. For example, if we have an instruction which defines a vreg_128 and but only subregs 1 and 2 are used, then bumpDownwardPressure will speculate that this vgpr def increase VGPR pressure by 2. Since those dead regs are not needed after the instruction, this RP change is consistent with RAs view of conflicts for positions after the instruction. However, during the instruction (before the dead slot), we temporarily increase VGPR pressure by 4 to provide registers for those dead lanes. However, this may mislead the scheduler, particularly for bottom up scheduling in which defs reduce RP. Encoding this into RPTracker's CurPressure doesn't seem to make much sense, as this is used to model the pressure after the instruction (whether top-down or bottom-up direction). So, I think we should probably introduce a separate GCNRegPressure field in the trackers to model any temporary increases to RP, and the scheduler can use this to more accurately set heuristics. That said, this should probably be addressed in separate PR.

@jrbyrnes
Copy link
Contributor Author

Ping

@llvmbot
Copy link
Collaborator

llvmbot commented Sep 24, 2024

@llvm/pr-subscribers-backend-amdgpu

Author: Jeffrey Byrnes (jrbyrnes)

Changes

This is part of a series of PRs which enable using the AMDGPU/GCN RPTrackers during scheduling. I've split them up to (hopefully) make reviewing easier. For context see #88797 . Since this is the final PR in the series, any high level comments should go here.

This PR adds the scheduling changes to: maintain the GCNRPTrackers during scheduling, and use them when making per-instruction scheduling decisions.

This PR is for commits beginning at fbd0e54 and depends on #93088 and #93089


Patch is 38.66 KiB, truncated to 20.00 KiB below, full version: https://github.com/llvm/llvm-project/pull/93090.diff

5 Files Affected:

  • (modified) llvm/lib/Target/AMDGPU/GCNIterativeScheduler.cpp (+1-1)
  • (modified) llvm/lib/Target/AMDGPU/GCNRegPressure.cpp (+275-24)
  • (modified) llvm/lib/Target/AMDGPU/GCNRegPressure.h (+80-25)
  • (modified) llvm/lib/Target/AMDGPU/GCNSchedStrategy.cpp (+131-23)
  • (modified) llvm/lib/Target/AMDGPU/GCNSchedStrategy.h (+58-2)
diff --git a/llvm/lib/Target/AMDGPU/GCNIterativeScheduler.cpp b/llvm/lib/Target/AMDGPU/GCNIterativeScheduler.cpp
index 061b0515031b1b..085eb8e37e3cd2 100644
--- a/llvm/lib/Target/AMDGPU/GCNIterativeScheduler.cpp
+++ b/llvm/lib/Target/AMDGPU/GCNIterativeScheduler.cpp
@@ -480,7 +480,7 @@ void GCNIterativeScheduler::scheduleLegacyMaxOccupancy(
   LLVM_DEBUG(dbgs() << "Scheduling using default scheduler, "
                        "target occupancy = "
                     << TgtOcc << '\n');
-  GCNMaxOccupancySchedStrategy LStrgy(Context);
+  GCNMaxOccupancySchedStrategy LStrgy(Context, /*IsLegacyScheduler=*/true);
   unsigned FinalOccupancy = std::min(Occ, MFI->getOccupancy());
 
   for (int I = 0; I < NumPasses; ++I) {
diff --git a/llvm/lib/Target/AMDGPU/GCNRegPressure.cpp b/llvm/lib/Target/AMDGPU/GCNRegPressure.cpp
index c83af729f501fe..46bb3365a32337 100644
--- a/llvm/lib/Target/AMDGPU/GCNRegPressure.cpp
+++ b/llvm/lib/Target/AMDGPU/GCNRegPressure.cpp
@@ -288,6 +288,102 @@ collectVirtualRegUses(SmallVectorImpl<RegisterMaskPair> &RegMaskPairs,
   }
 }
 
+/// Mostly copy/paste from CodeGen/RegisterPressure.cpp
+static LaneBitmask getRegLanes(ArrayRef<RegisterMaskPair> RegUnits,
+                               Register RegUnit) {
+  auto I = llvm::find_if(RegUnits, [RegUnit](const RegisterMaskPair Other) {
+    return Other.RegUnit == RegUnit;
+  });
+  if (I == RegUnits.end())
+    return LaneBitmask::getNone();
+  return I->LaneMask;
+}
+
+/// Mostly copy/paste from CodeGen/RegisterPressure.cpp
+static LaneBitmask getLanesWithProperty(
+    const LiveIntervals &LIS, const MachineRegisterInfo &MRI,
+    bool TrackLaneMasks, Register RegUnit, SlotIndex Pos,
+    LaneBitmask SafeDefault,
+    function_ref<bool(const LiveRange &LR, SlotIndex Pos)> Property) {
+  if (RegUnit.isVirtual()) {
+    const LiveInterval &LI = LIS.getInterval(RegUnit);
+    LaneBitmask Result;
+    if (TrackLaneMasks && LI.hasSubRanges()) {
+      for (const LiveInterval::SubRange &SR : LI.subranges()) {
+        if (Property(SR, Pos))
+          Result |= SR.LaneMask;
+      }
+    } else if (Property(LI, Pos)) {
+      Result = TrackLaneMasks ? MRI.getMaxLaneMaskForVReg(RegUnit)
+                              : LaneBitmask::getAll();
+    }
+
+    return Result;
+  }
+
+  const LiveRange *LR = LIS.getCachedRegUnit(RegUnit);
+  if (LR == nullptr)
+    return SafeDefault;
+  return Property(*LR, Pos) ? LaneBitmask::getAll() : LaneBitmask::getNone();
+}
+
+/// Mostly copy/paste from CodeGen/RegisterPressure.cpp
+/// Helper to find a vreg use between two indices [PriorUseIdx, NextUseIdx).
+/// The query starts with a lane bitmask which gets lanes/bits removed for every
+/// use we find.
+static LaneBitmask findUseBetween(unsigned Reg, LaneBitmask LastUseMask,
+                                  SlotIndex PriorUseIdx, SlotIndex NextUseIdx,
+                                  const MachineRegisterInfo &MRI,
+                                  const SIRegisterInfo *TRI,
+                                  const LiveIntervals *LIS,
+                                  bool Upward = false) {
+  for (const MachineOperand &MO : MRI.use_nodbg_operands(Reg)) {
+    if (MO.isUndef())
+      continue;
+    const MachineInstr *MI = MO.getParent();
+    SlotIndex InstSlot = LIS->getInstructionIndex(*MI).getRegSlot();
+    bool InRange = Upward ? (InstSlot > PriorUseIdx && InstSlot <= NextUseIdx)
+                          : (InstSlot >= PriorUseIdx && InstSlot < NextUseIdx);
+    if (InRange) {
+      unsigned SubRegIdx = MO.getSubReg();
+      LaneBitmask UseMask = TRI->getSubRegIndexLaneMask(SubRegIdx);
+      LastUseMask &= ~UseMask;
+      if (LastUseMask.none())
+        return LaneBitmask::getNone();
+    }
+  }
+  return LastUseMask;
+}
+
+/// Mostly copy/paste from CodeGen/RegisterPressure.cpp
+static LaneBitmask getLiveLanesAt(const LiveIntervals &LIS,
+                                  const MachineRegisterInfo &MRI,
+                                  bool TrackLaneMasks, Register RegUnit,
+                                  SlotIndex Pos) {
+  return getLanesWithProperty(
+      LIS, MRI, TrackLaneMasks, RegUnit, Pos, LaneBitmask::getAll(),
+      [](const LiveRange &LR, SlotIndex Pos) { return LR.liveAt(Pos); });
+}
+
+// Copy/paste from RegisterPressure.cpp (RegisterOperands::adjustLaneLiveness)
+static void adjustDefLaneLiveness(SmallVectorImpl<RegisterMaskPair> &Defs,
+                                  SlotIndex &Pos, const LiveIntervals &LIS,
+                                  const MachineRegisterInfo &MRI) {
+  for (auto *I = Defs.begin(); I != Defs.end();) {
+    LaneBitmask LiveAfter =
+        getLiveLanesAt(LIS, MRI, true, I->RegUnit, Pos.getDeadSlot());
+    // If the def is all that is live after the instruction, then in case
+    // of a subregister def we need a read-undef flag.
+    LaneBitmask ActualDef = I->LaneMask & LiveAfter;
+    if (ActualDef.none()) {
+      I = Defs.erase(I);
+    } else {
+      I->LaneMask = ActualDef;
+      ++I;
+    }
+  }
+}
+
 ///////////////////////////////////////////////////////////////////////////////
 // GCNRPTracker
 
@@ -343,17 +439,41 @@ void GCNRPTracker::reset(const MachineInstr &MI,
   MaxPressure = CurPressure = getRegPressure(*MRI, LiveRegs);
 }
 
-////////////////////////////////////////////////////////////////////////////////
-// GCNUpwardRPTracker
-
-void GCNUpwardRPTracker::reset(const MachineRegisterInfo &MRI_,
-                               const LiveRegSet &LiveRegs_) {
+void GCNRPTracker::reset(const MachineRegisterInfo &MRI_,
+                         const LiveRegSet &LiveRegs_) {
   MRI = &MRI_;
   LiveRegs = LiveRegs_;
   LastTrackedMI = nullptr;
   MaxPressure = CurPressure = getRegPressure(MRI_, LiveRegs_);
 }
 
+void GCNRPTracker::bumpDeadDefs(ArrayRef<RegisterMaskPair> DeadDefs) {
+  GCNRegPressure TempPressure = CurPressure;
+  for (const RegisterMaskPair &P : DeadDefs) {
+    Register Reg = P.RegUnit;
+    if (!Reg.isVirtual())
+      continue;
+    LaneBitmask LiveMask = LiveRegs[Reg];
+    LaneBitmask BumpedMask = LiveMask | P.LaneMask;
+    CurPressure.inc(Reg, LiveMask, BumpedMask, *MRI);
+  }
+  MaxPressure = max(MaxPressure, CurPressure);
+  CurPressure = TempPressure;
+}
+/// Mostly copy/paste from CodeGen/RegisterPressure.cpp
+LaneBitmask GCNRPTracker::getLastUsedLanes(Register RegUnit,
+                                           SlotIndex Pos) const {
+  return getLanesWithProperty(
+      LIS, *MRI, true, RegUnit, Pos.getBaseIndex(), LaneBitmask::getNone(),
+      [](const LiveRange &LR, SlotIndex Pos) {
+        const LiveRange::Segment *S = LR.getSegmentContaining(Pos);
+        return S != nullptr && S->end == Pos.getRegSlot();
+      });
+}
+
+////////////////////////////////////////////////////////////////////////////////
+// GCNUpwardRPTracker
+
 void GCNUpwardRPTracker::recede(const MachineInstr &MI) {
   assert(MRI && "call reset first");
 
@@ -414,6 +534,49 @@ void GCNUpwardRPTracker::recede(const MachineInstr &MI) {
   assert(CurPressure == getRegPressure(*MRI, LiveRegs));
 }
 
+void GCNUpwardRPTracker::bumpUpwardPressure(const MachineInstr *MI,
+                                            const SIRegisterInfo *TRI) {
+  assert(!MI->isDebugOrPseudoInstr() && "Expect a nondebug instruction.");
+
+  SlotIndex SlotIdx = LIS.getInstructionIndex(*MI).getRegSlot();
+
+  // Account for register pressure similar to RegPressureTracker::recede().
+  RegisterOperands RegOpers;
+
+  RegOpers.collect(*MI, *TRI, *MRI, true, /*IgnoreDead=*/true);
+  assert(RegOpers.DeadDefs.empty());
+  adjustDefLaneLiveness(RegOpers.Defs, SlotIdx, LIS, *MRI);
+  RegOpers.detectDeadDefs(*MI, LIS);
+
+  // Boost max pressure for all dead defs together.
+  // Since CurrSetPressure and MaxSetPressure
+  bumpDeadDefs(RegOpers.DeadDefs);
+
+  // Kill liveness at live defs.
+  for (const RegisterMaskPair &P : RegOpers.Defs) {
+    Register Reg = P.RegUnit;
+    if (!Reg.isVirtual())
+      continue;
+    LaneBitmask LiveAfter = LiveRegs[Reg];
+    LaneBitmask UseLanes = getRegLanes(RegOpers.Uses, Reg);
+    LaneBitmask DefLanes = P.LaneMask;
+    LaneBitmask LiveBefore = (LiveAfter & ~DefLanes) | UseLanes;
+
+    CurPressure.inc(Reg, LiveAfter, LiveAfter & LiveBefore, *MRI);
+    MaxPressure = max(MaxPressure, CurPressure);
+  }
+  // Generate liveness for uses.
+  for (const RegisterMaskPair &P : RegOpers.Uses) {
+    Register Reg = P.RegUnit;
+    if (!Reg.isVirtual())
+      continue;
+    LaneBitmask LiveAfter = LiveRegs[Reg];
+    LaneBitmask LiveBefore = LiveAfter | P.LaneMask;
+    CurPressure.inc(Reg, LiveAfter, LiveBefore, *MRI);
+  }
+  MaxPressure = max(MaxPressure, CurPressure);
+}
+
 ////////////////////////////////////////////////////////////////////////////////
 // GCNDownwardRPTracker
 
@@ -430,28 +593,44 @@ bool GCNDownwardRPTracker::reset(const MachineInstr &MI,
   return true;
 }
 
-bool GCNDownwardRPTracker::advanceBeforeNext() {
+bool GCNDownwardRPTracker::advanceBeforeNext(MachineInstr *MI,
+                                             bool UseInternalIterator,
+                                             LiveIntervals *TheLIS) {
   assert(MRI && "call reset first");
-  if (!LastTrackedMI)
-    return NextMI == MBBEnd;
-
-  assert(NextMI == MBBEnd || !NextMI->isDebugInstr());
+  SlotIndex SI;
+  const LiveIntervals *CurrLIS;
+  const MachineInstr *CurrMI;
+  if (UseInternalIterator) {
+    if (!LastTrackedMI)
+      return NextMI == MBBEnd;
+
+    assert(NextMI == MBBEnd || !NextMI->isDebugInstr());
+    CurrLIS = &LIS;
+    CurrMI = LastTrackedMI;
+
+    SI = NextMI == MBBEnd
+             ? CurrLIS->getInstructionIndex(*LastTrackedMI).getDeadSlot()
+             : CurrLIS->getInstructionIndex(*NextMI).getBaseIndex();
+  } else { //! UseInternalIterator
+    CurrLIS = TheLIS;
+    SI = CurrLIS->getInstructionIndex(*MI).getBaseIndex();
+    CurrMI = MI;
+  }
 
-  SlotIndex SI = NextMI == MBBEnd
-                     ? LIS.getInstructionIndex(*LastTrackedMI).getDeadSlot()
-                     : LIS.getInstructionIndex(*NextMI).getBaseIndex();
   assert(SI.isValid());
 
   // Remove dead registers or mask bits.
   SmallSet<Register, 8> SeenRegs;
-  for (auto &MO : LastTrackedMI->operands()) {
+  for (auto &MO : CurrMI->operands()) {
     if (!MO.isReg() || !MO.getReg().isVirtual())
       continue;
     if (MO.isUse() && !MO.readsReg())
       continue;
+    if (!UseInternalIterator && MO.isDef())
+      continue;
     if (!SeenRegs.insert(MO.getReg()).second)
       continue;
-    const LiveInterval &LI = LIS.getInterval(MO.getReg());
+    const LiveInterval &LI = CurrLIS->getInterval(MO.getReg());
     if (LI.hasSubRanges()) {
       auto It = LiveRegs.end();
       for (const auto &S : LI.subranges()) {
@@ -481,15 +660,22 @@ bool GCNDownwardRPTracker::advanceBeforeNext() {
 
   LastTrackedMI = nullptr;
 
-  return NextMI == MBBEnd;
+  return UseInternalIterator && (NextMI == MBBEnd);
 }
 
-void GCNDownwardRPTracker::advanceToNext() {
-  LastTrackedMI = &*NextMI++;
-  NextMI = skipDebugInstructionsForward(NextMI, MBBEnd);
+void GCNDownwardRPTracker::advanceToNext(MachineInstr *MI,
+                                         bool UseInternalIterator) {
+  if (UseInternalIterator) {
+    LastTrackedMI = &*NextMI++;
+    NextMI = skipDebugInstructionsForward(NextMI, MBBEnd);
+  } else {
+    LastTrackedMI = MI;
+  }
+
+  const MachineInstr *CurrMI = LastTrackedMI;
 
   // Add new registers or mask bits.
-  for (const auto &MO : LastTrackedMI->all_defs()) {
+  for (const auto &MO : CurrMI->all_defs()) {
     Register Reg = MO.getReg();
     if (!Reg.isVirtual())
       continue;
@@ -502,11 +688,17 @@ void GCNDownwardRPTracker::advanceToNext() {
   MaxPressure = max(MaxPressure, CurPressure);
 }
 
-bool GCNDownwardRPTracker::advance() {
-  if (NextMI == MBBEnd)
+bool GCNDownwardRPTracker::advance(MachineInstr *MI, bool UseInternalIterator,
+                                   LiveIntervals *TheLIS) {
+  if (UseInternalIterator && NextMI == MBBEnd)
     return false;
-  advanceBeforeNext();
-  advanceToNext();
+
+  advanceBeforeNext(MI, UseInternalIterator, TheLIS);
+  advanceToNext(MI, UseInternalIterator);
+  if (!UseInternalIterator) {
+    // We must remove any dead def lanes from the current RP
+    advanceBeforeNext(MI, true, TheLIS);
+  }
   return true;
 }
 
@@ -548,6 +740,65 @@ Printable llvm::reportMismatch(const GCNRPTracker::LiveRegSet &LISLR,
   });
 }
 
+void GCNDownwardRPTracker::bumpDownwardPressure(const MachineInstr *MI,
+                                                const SIRegisterInfo *TRI) {
+  assert(!MI->isDebugOrPseudoInstr() && "Expect a nondebug instruction.");
+
+  SlotIndex SlotIdx;
+  SlotIdx = LIS.getInstructionIndex(*MI).getRegSlot();
+
+  // Account for register pressure similar to RegPressureTracker::recede().
+  RegisterOperands RegOpers;
+  RegOpers.collect(*MI, *TRI, *MRI, true, /*IgnoreDead=*/false);
+  RegOpers.adjustLaneLiveness(LIS, *MRI, SlotIdx);
+
+  for (const RegisterMaskPair &Use : RegOpers.Uses) {
+    Register Reg = Use.RegUnit;
+    if (!Reg.isVirtual())
+      continue;
+    LaneBitmask LastUseMask = getLastUsedLanes(Reg, SlotIdx);
+    if (LastUseMask.none())
+      continue;
+    // The LastUseMask is queried from the liveness information of instruction
+    // which may be further down the schedule. Some lanes may actually not be
+    // last uses for the current position.
+    // FIXME: allow the caller to pass in the list of vreg uses that remain
+    // to be bottom-scheduled to avoid searching uses at each query.
+    SlotIndex CurrIdx;
+    const MachineBasicBlock *MBB = MI->getParent();
+    MachineBasicBlock::const_iterator IdxPos = skipDebugInstructionsForward(
+        LastTrackedMI ? LastTrackedMI : MBB->begin(), MBB->end());
+    if (IdxPos == MBB->end()) {
+      CurrIdx = LIS.getMBBEndIdx(MBB);
+    } else {
+      CurrIdx = LIS.getInstructionIndex(*IdxPos).getRegSlot();
+    }
+
+    LastUseMask =
+        findUseBetween(Reg, LastUseMask, CurrIdx, SlotIdx, *MRI, TRI, &LIS);
+    if (LastUseMask.none())
+      continue;
+
+    LaneBitmask LiveMask = LiveRegs[Reg];
+    LaneBitmask NewMask = LiveMask & ~LastUseMask;
+    CurPressure.inc(Reg, LiveMask, NewMask, *MRI);
+  }
+
+  // Generate liveness for defs.
+  for (const RegisterMaskPair &Def : RegOpers.Defs) {
+    Register Reg = Def.RegUnit;
+    if (!Reg.isVirtual())
+      continue;
+    LaneBitmask LiveMask = LiveRegs[Reg];
+    LaneBitmask NewMask = LiveMask | Def.LaneMask;
+    CurPressure.inc(Reg, LiveMask, NewMask, *MRI);
+  }
+  MaxPressure = max(MaxPressure, CurPressure);
+
+  // Boost pressure for all dead defs together.
+  bumpDeadDefs(RegOpers.DeadDefs);
+}
+
 bool GCNUpwardRPTracker::isValid() const {
   const auto &SI = LIS.getInstructionIndex(*LastTrackedMI).getBaseIndex();
   const auto LISLR = llvm::getLiveRegs(SI, LIS, *MRI);
diff --git a/llvm/lib/Target/AMDGPU/GCNRegPressure.h b/llvm/lib/Target/AMDGPU/GCNRegPressure.h
index 54dc1972d27619..463da472bb69ff 100644
--- a/llvm/lib/Target/AMDGPU/GCNRegPressure.h
+++ b/llvm/lib/Target/AMDGPU/GCNRegPressure.h
@@ -19,6 +19,7 @@
 
 #include "GCNSubtarget.h"
 #include "llvm/CodeGen/LiveIntervals.h"
+#include "llvm/CodeGen/RegisterPressure.h"
 #include <algorithm>
 
 namespace llvm {
@@ -149,6 +150,9 @@ inline GCNRegPressure operator-(const GCNRegPressure &P1,
   return Diff;
 }
 
+///////////////////////////////////////////////////////////////////////////////
+// GCNRPTracker
+
 class GCNRPTracker {
 public:
   using LiveRegSet = DenseMap<unsigned, LaneBitmask>;
@@ -165,7 +169,14 @@ class GCNRPTracker {
   void reset(const MachineInstr &MI, const LiveRegSet *LiveRegsCopy,
              bool After);
 
+  /// Mostly copy/paste from CodeGen/RegisterPressure.cpp
+  void bumpDeadDefs(ArrayRef<RegisterMaskPair> DeadDefs);
+
+  LaneBitmask getLastUsedLanes(Register RegUnit, SlotIndex Pos) const;
+
 public:
+  // reset tracker and set live register set to the specified value.
+  void reset(const MachineRegisterInfo &MRI_, const LiveRegSet &LiveRegs_);
   // live regs for the current state
   const decltype(LiveRegs) &getLiveRegs() const { return LiveRegs; }
   const MachineInstr *getLastTrackedMI() const { return LastTrackedMI; }
@@ -182,34 +193,45 @@ class GCNRPTracker {
 GCNRPTracker::LiveRegSet getLiveRegs(SlotIndex SI, const LiveIntervals &LIS,
                                      const MachineRegisterInfo &MRI);
 
+////////////////////////////////////////////////////////////////////////////////
+// GCNUpwardRPTracker
+
 class GCNUpwardRPTracker : public GCNRPTracker {
 public:
   GCNUpwardRPTracker(const LiveIntervals &LIS_) : GCNRPTracker(LIS_) {}
 
-  // reset tracker and set live register set to the specified value.
-  void reset(const MachineRegisterInfo &MRI_, const LiveRegSet &LiveRegs_);
+  using GCNRPTracker::reset;
 
-  // reset tracker at the specified slot index.
+  /// reset tracker at the specified slot index \p SI.
   void reset(const MachineRegisterInfo &MRI, SlotIndex SI) {
-    reset(MRI, llvm::getLiveRegs(SI, LIS, MRI));
+    GCNRPTracker::reset(MRI, llvm::getLiveRegs(SI, LIS, MRI));
   }
 
-  // reset tracker to the end of the MBB.
+  /// reset tracker to the end of the \p MBB.
   void reset(const MachineBasicBlock &MBB) {
     reset(MBB.getParent()->getRegInfo(),
           LIS.getSlotIndexes()->getMBBEndIdx(&MBB));
   }
 
-  // reset tracker to the point just after MI (in program order).
+  /// reset tracker to the point just after \p MI (in program order).
   void reset(const MachineInstr &MI) {
     reset(MI.getMF()->getRegInfo(), LIS.getInstructionIndex(MI).getDeadSlot());
   }
 
-  // move to the state just before the MI (in program order).
+  /// Move to the state of RP just before the \p MI . If \p UseInternalIterator
+  /// is set, also update the internal iterators. Setting \p UseInternalIterator
+  /// to false allows for an externally managed iterator / program order.
   void recede(const MachineInstr &MI);
 
-  // checks whether the tracker's state after receding MI corresponds
-  // to reported by LIS.
+  /// Mostly copy/paste from CodeGen/RegisterPressure.cpp
+  /// Calculate the impact \p MI will have on CurPressure and MaxPressure. This
+  /// does not rely on the implicit program ordering in the LiveIntervals to
+  /// support RP Speculation. It leaves the state of pressure inconsistent with
+  /// the current position
+  void bumpUpwardPressure(const MachineInstr *MI, const SIRegisterInfo *TRI);
+
+  /// \p returns whether the tracker's state after receding MI corresponds
+  /// to reported by LIS.
   bool isValid() const;
 
   const GCNRegPressure &getMaxPressure() const { return MaxPressure; }
@@ -223,6 +245,9 @@ class GCNUpwardRPTracker : public GCNRPTracker {
   }
 };
 
+////////////////////////////////////////////////////////////////////////////////
+// GCNDownwardRPTracker
+
 class GCNDownwardRPTracker : public GCNRPTracker {
   // Last position of reset or advanceBeforeNext
   MachineBasicBlock::const_iterator NextMI;
@@ -232,37 +257,67 @@ class GCNDownwardRPTracker : public GCNRPTracker {
 public:
   GCNDownwardRPTracker(const LiveIntervals &LIS_) : GCNRPTracker(LIS_) {}
 
+  using GCNRPTracker::reset;
+
   MachineBasicBlock::const_iterator getNext() const { return NextMI; }
 
-  // Return MaxPressure and clear it.
+  /// \p return MaxPressure and clear it.
   GCNRegPressure moveMaxPressure() {
     auto Res = MaxPressure;
     MaxPressure.clear();
     return Res;
   }
 
-  // Reset tracker to the point before the MI
-  // filling live regs upon this point using LIS.
-  // Returns false if block is empty except debug values.
+  /// Reset tracker to the point before the \p MI
+  /// filling \p LiveRegs upon this point using LIS.
+  /// \p returns false if block is empty except debug values.
   bool reset(const MachineInstr &MI, const LiveRegSet *LiveRegs = nullptr);
 
-  // Move to the state right before the next MI or after the end of MBB.
-  // Returns false if reached end of the block.
-  bool advanceBeforeNext();
-
-  // Move to the state at the MI, advanceBeforeNext has to be called first.
-  void advanceToNext();
-
-  // Move to the state at the next MI. Returns false if reached end of block.
-  bool advance();
-
-  // Advance instructions until before End.
+  /// Move to the state right before the next MI or after the end of MBB.
+  /// \p returns false if reached end of the block.
+  /// If \p UseInternalIterator is true...
[truncated]

Change-Id: I79b54722e6e858961486248d94766c3f3c161160
Change-Id: I6ae56149c1eb49ea85362267174cc6274c416330
Change-Id: I1cb0a88e94f4156da6118fcd3724556939351c6d
Change-Id: I198925f5ed91b0a49ac265e19fdbe2208139f09a
Change-Id: Ifa69110bf0a239ea14d25c0bad03215d1b018656
Change-Id: I9f0275a0cede9e77dfd29262124f2a856f436c8c
Change-Id: I74c19a2cf20d2325178933f81e0e8716d7c62f17
Change-Id: I09f9ca74c07b516daed0e93a85937df8b9aa922b
Change-Id: I5effce973fa2d945076e89b4453a844f0fc85fc9
Change-Id: I6217c03f56d34f584e5b23cf7c4462842bc7173b
Change-Id: Ibeaba6cab034636472b20c36adfadabbbc2c19ef
Change-Id: I96c99f12c59ef0eea86f7fbf134913ecc47dd6f2
Change-Id: I3e893ca2ffcf1032fe157b537c9563565215b123
Change-Id: I2de464b32d3c6ed9a77cbbc669d735dde63c2e47
…ressure + Use correct previous VGPR pressure
Change-Id: I286c9ed1ae91a68da881c6fa27f5f391102d0a9c
Change-Id: Ib7b21b2ab4cc44abc61fb8ad8880fb78f831619a
Change-Id: I3d0aae74f20927722cd6844b1d586ae7accab86e
Change-Id: I228916bf04add1de7615294d1e58ee4213f0bbde
Change-Id: I228916bf04add1de7615294d1e58ee4213f0bbde
Change-Id: I9ebe0cf7252068dcee90d419945085efae75547d
Change-Id: Ie204904f04dc9d2f53d586795c886a3f8c6b1268
Change-Id: I74c8ed0076ff8557d9a23a7ec7b1c9c00290be01
Change-Id: I9d6ba224894947f6e94df9ece3faf3f73d4e700f
@jrbyrnes
Copy link
Contributor Author

jrbyrnes commented Oct 8, 2024

Force push to rebase for 5cb6b15

llvm/lib/Target/AMDGPU/GCNSchedStrategy.cpp Outdated Show resolved Hide resolved
llvm/lib/Target/AMDGPU/GCNRegPressure.h Outdated Show resolved Hide resolved
llvm/lib/Target/AMDGPU/GCNRegPressure.cpp Outdated Show resolved Hide resolved
Change-Id: I841ac8dd48b520397f96452f1d167613edf94895
@jrbyrnes
Copy link
Contributor Author

jrbyrnes commented Oct 8, 2024

Latest passes psdb with GCNTrackers enabled

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants