Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[SandboxVec][DAG] Build actual dependencies #111094

Merged
merged 1 commit into from
Oct 8, 2024
Merged
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
8 changes: 8 additions & 0 deletions llvm/include/llvm/SandboxIR/Utils.h
Original file line number Diff line number Diff line change
Expand Up @@ -12,6 +12,7 @@
#ifndef LLVM_SANDBOXIR_UTILS_H
#define LLVM_SANDBOXIR_UTILS_H

#include "llvm/Analysis/AliasAnalysis.h"
#include "llvm/Analysis/LoopAccessAnalysis.h"
#include "llvm/Analysis/MemoryLocation.h"
#include "llvm/Analysis/ScalarEvolution.h"
Expand Down Expand Up @@ -99,6 +100,13 @@ class Utils {
return false;
return *Diff > 0;
}

/// Equivalent to BatchAA::getModRefInfo().
static ModRefInfo
aliasAnalysisGetModRefInfo(BatchAAResults &BatchAA, const Instruction *I,
const std::optional<MemoryLocation> &OptLoc) {
return BatchAA.getModRefInfo(cast<llvm::Instruction>(I->Val), OptLoc);
}
};

} // namespace llvm::sandboxir
Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -24,6 +24,7 @@

#include "llvm/ADT/DenseMap.h"
#include "llvm/ADT/iterator_range.h"
#include "llvm/Analysis/AliasAnalysis.h"
#include "llvm/SandboxIR/Instruction.h"
#include "llvm/SandboxIR/IntrinsicInst.h"
#include "llvm/Transforms/Vectorize/SandboxVectorizer/Interval.h"
Expand All @@ -47,6 +48,7 @@ class DGNode {
// TODO: Use a PointerIntPair for SubclassID and I.
/// For isa/dyn_cast etc.
DGNodeID SubclassID;
// TODO: Move MemPreds to MemDGNode.
/// Memory predecessors.
DenseSet<MemDGNode *> MemPreds;

Expand Down Expand Up @@ -86,13 +88,20 @@ class DGNode {
(!(II = dyn_cast<IntrinsicInst>(I)) || isMemIntrinsic(II));
}

/// \Returns true if \p I is fence like. It excludes non-mem intrinsics.
static bool isFenceLike(Instruction *I) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why is this not in SandboxIR/Utils.h?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I moved these functions here after introducing sandboxir::IntrinsicInst. These functions previously required accesss to LLVM IR, which was why they were in Utils.

IntrinsicInst *II;
return I->isFenceLike() &&
(!(II = dyn_cast<IntrinsicInst>(I)) || isMemIntrinsic(II));
}

/// \Returns true if \p I is a memory dependency candidate instruction.
static bool isMemDepNodeCandidate(Instruction *I) {
AllocaInst *Alloca;
return isMemDepCandidate(I) ||
((Alloca = dyn_cast<AllocaInst>(I)) &&
Alloca->isUsedWithInAlloca()) ||
isStackSaveOrRestoreIntrinsic(I);
isStackSaveOrRestoreIntrinsic(I) || isFenceLike(I);
tmsri marked this conversation as resolved.
Show resolved Hide resolved
}

Instruction *getInstruction() const { return I; }
Expand Down Expand Up @@ -159,8 +168,37 @@ class DependencyGraph {
/// The DAG spans across all instructions in this interval.
Interval<Instruction> DAGInterval;

std::unique_ptr<BatchAAResults> BatchAA;

enum class DependencyType {
RAW, ///> Read After Write
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

read after write is more annoying to process than write before read, can we use WBR (or something similar) as opposed to RAW?

also maybe we should spell these out instead of using abbreviations? I mostly ask for consistency with other enum classes that are camel case instead of all uppercase (e.g. None instead of NONE)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The reason I am using RAW and not WBR is because it's standard dependency terminology. I agree that WBR is easier to process, but let's stick to what's expected.

Yeah, these uppercase ones need to go. I will update them.

WAW, ///> Write After Write
RAR, ///> Read After Read
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can we group RAR into None?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We could, but None has a different meaning, that the instructions are not accessing memory.

WAR, ///> Write After Read
CTRL, ///> Control-related dependencies, like with PHIs/Terminators
OTHER, ///> Currently used for stack related instrs
NONE, ///> No memory/other dependency
};
/// \Returns the dependency type depending on whether instructions may
/// read/write memory or whether they are some specific opcode-related
/// restrictions.
/// Note: It does not check whether a memory dependency is actually correct,
/// as it won't call AA. Therefore it returns the worst-case dep type.
static DependencyType getRoughDepType(Instruction *FromI, Instruction *ToI);

// TODO: Implement AABudget.
/// \Returns true if there is a memory/other dependency \p SrcI->DstI.
bool alias(Instruction *SrcI, Instruction *DstI, DependencyType DepType);

bool hasDep(sandboxir::Instruction *SrcI, sandboxir::Instruction *DstI);

/// Go through all mem nodes in \p SrcScanRange and try to add dependencies to
/// \p DstN.
void scanAndAddDeps(DGNode &DstN, const Interval<MemDGNode> &SrcScanRange);

public:
DependencyGraph() {}
DependencyGraph(AAResults &AA)
: BatchAA(std::make_unique<BatchAAResults>(AA)) {}

DGNode *getNode(Instruction *I) const {
auto It = InstrToNodeMap.find(I);
Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -57,7 +57,7 @@ template <typename T, typename IntervalType> class IntervalIterator {
}
IntervalIterator &operator--() {
// `I` is nullptr for end() when To is the BB terminator.
I = I != nullptr ? I->getPrevNode() : R.To;
I = I != nullptr ? I->getPrevNode() : R.bottom();
return *this;
}
IntervalIterator operator--(int) {
Expand Down Expand Up @@ -109,14 +109,16 @@ template <typename T> class Interval {
T *bottom() const { return To; }

using iterator = IntervalIterator<T, Interval>;
using const_iterator = IntervalIterator<const T, const Interval>;
iterator begin() { return iterator(From, *this); }
iterator end() {
return iterator(To != nullptr ? To->getNextNode() : nullptr, *this);
}
const_iterator begin() const { return const_iterator(From, *this); }
const_iterator end() const {
return const_iterator(To != nullptr ? To->getNextNode() : nullptr, *this);
iterator begin() const {
return iterator(From, const_cast<Interval &>(*this));
}
iterator end() const {
return iterator(To != nullptr ? To->getNextNode() : nullptr,
const_cast<Interval &>(*this));
}
/// Equality.
bool operator==(const Interval &Other) const {
Expand Down
133 changes: 121 additions & 12 deletions llvm/lib/Transforms/Vectorize/SandboxVectorizer/DependencyGraph.cpp
Original file line number Diff line number Diff line change
Expand Up @@ -8,8 +8,9 @@

#include "llvm/Transforms/Vectorize/SandboxVectorizer/DependencyGraph.h"
#include "llvm/ADT/ArrayRef.h"
#include "llvm/SandboxIR/Utils.h"

using namespace llvm::sandboxir;
namespace llvm::sandboxir {

#ifndef NDEBUG
void DGNode::print(raw_ostream &OS, bool PrintDeps) const {
Expand Down Expand Up @@ -50,29 +51,135 @@ MemDGNodeIntervalBuilder::make(const Interval<Instruction> &Instrs,
cast<MemDGNode>(DAG.getNode(MemBotI)));
}

DependencyGraph::DependencyType
DependencyGraph::getRoughDepType(Instruction *FromI, Instruction *ToI) {
// TODO: Perhaps compile-time improvement by skipping if neither is mem?
if (FromI->mayWriteToMemory()) {
if (ToI->mayReadFromMemory())
return DependencyType::RAW;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

is RAW more conservative than WAW?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

They are about the same in the sense that you have to force ordering for all RAW, WAW or WAR.

if (ToI->mayWriteToMemory())
return DependencyType::WAW;
} else if (FromI->mayReadFromMemory()) {
if (ToI->mayWriteToMemory())
return DependencyType::WAR;
if (ToI->mayReadFromMemory())
return DependencyType::RAR;
}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why not else-if in the second outer if for FromI?

Why is ToI checked first for Read in the first-if and for write in the second, any implicit ordering? If ToI is not mem read/write this code will still go through all the checks? Can it both read and write memory, if so can we make it simpler, something along these lines where write takes precedence?

enum  AccessType {
   NONE,
   READ,
   WRITE,
}  FromMemType = NONE;

// If it could both read and write, write takes precedence
if (FromI->mayWriteToMemory()) 
  FromMemType = AccessType::WRITE;
else if  (FromI->mayReadFromMemory())
  FromMemType = AccessType::READ;

if (FromMemType  != NONE) {
   // Check Write First
    if (ToI->mayWriteToMemory())
       return FromMemType == AccessType::READ ? WAR : WAW;
    if (ToI->mayReadFromMemory())
       return FromMemType == AccessType::READ ? RAR : RAW;  
}

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why not else-if in the second outer if for FromI?

else-if should work. DONE.

Why is ToI checked first for Read in the first-if and for write in the second, any implicit ordering?

The order matters in the second outer if because one of them is a RAR (no dependency) and the other one is a WAR which is a dependency. The order in the first outer if shouldn't matter.

If ToI is not mem read/write this code will still go through all the checks?

Currently yes, we could speed it up perhaps by checking mayReadOrWriteMemory() for both and skip some of the code if neither is memory. I will add a TODO for it to do this change after we have enough tests and some benchmarks for the DAG.

Can it both read and write memory,

Yes

if so can we make it simpler, something along these lines where write takes precedence?

Yeah, this should also work, but it seems to be doing the same number of checks while being more verbose (because of the enum declaration). I don't feel too strongly about keeping the original one though.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is fine, a comment that the order matters in the second case would resolve it.

if (isa<sandboxir::PHINode>(FromI) || isa<sandboxir::PHINode>(ToI))
return DependencyType::CTRL;
if (ToI->isTerminator())
return DependencyType::CTRL;
if (DGNode::isStackSaveOrRestoreIntrinsic(FromI) ||
DGNode::isStackSaveOrRestoreIntrinsic(ToI))
return DependencyType::OTHER;
return DependencyType::NONE;
}

static bool isOrdered(Instruction *I) {
auto IsOrdered = [](Instruction *I) {
if (auto *LI = dyn_cast<LoadInst>(I))
return !LI->isUnordered();
if (auto *SI = dyn_cast<StoreInst>(I))
return !SI->isUnordered();
if (DGNode::isFenceLike(I))
return true;
return false;
};
bool Is = IsOrdered(I);
assert((!Is || DGNode::isMemDepCandidate(I)) &&
"An ordered instruction must be a MemDepCandidate!");
return Is;
}

bool DependencyGraph::alias(Instruction *SrcI, Instruction *DstI,
DependencyType DepType) {
std::optional<MemoryLocation> DstLocOpt =
Utils::memoryLocationGetOrNone(DstI);
if (!DstLocOpt)
return true;
// Check aliasing.
assert((SrcI->mayReadFromMemory() || SrcI->mayWriteToMemory()) &&
"Expected a mem instr");
// TODO: Check AABudget
ModRefInfo SrcModRef =
isOrdered(SrcI)
? ModRefInfo::Mod
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

be conservative with ModRef?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmm what do you mean ?

: Utils::aliasAnalysisGetModRefInfo(*BatchAA, SrcI, *DstLocOpt);
switch (DepType) {
case DependencyType::RAW:
case DependencyType::WAW:
return isModSet(SrcModRef);
case DependencyType::WAR:
return isRefSet(SrcModRef);
default:
llvm_unreachable("Expected only RAW, WAW and WAR!");
}
}

bool DependencyGraph::hasDep(Instruction *SrcI, Instruction *DstI) {
DependencyType RoughDepType = getRoughDepType(SrcI, DstI);
switch (RoughDepType) {
case DependencyType::RAR:
return false;
case DependencyType::RAW:
case DependencyType::WAW:
case DependencyType::WAR:
return alias(SrcI, DstI, RoughDepType);
case DependencyType::CTRL:
// Adding actual dep edges from PHIs/to terminator would just create too
// many edges, which would be bad for compile-time.
// So we ignore them in the DAG formation but handle them in the
// scheduler, while sorting the ready list.
return false;
case DependencyType::OTHER:
return true;
case DependencyType::NONE:
return false;
}
}

void DependencyGraph::scanAndAddDeps(DGNode &DstN,
const Interval<MemDGNode> &SrcScanRange) {
assert(isa<MemDGNode>(DstN) &&
"DstN is the mem dep destination, so it must be mem");
Instruction *DstI = DstN.getInstruction();
// Walk up the instruction chain from ScanRange bottom to top, looking for
// memory instrs that may alias.
for (MemDGNode &SrcN : reverse(SrcScanRange)) {
Instruction *SrcI = SrcN.getInstruction();
if (hasDep(SrcI, DstI))
DstN.addMemPred(&SrcN);
}
}

Interval<Instruction> DependencyGraph::extend(ArrayRef<Instruction *> Instrs) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

do we want to assert that the new instructions are not in the DAG?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No, Instrs can overlap with the existing DAG. This will become clear in a follow-up patch.

if (Instrs.empty())
return {};
// TODO: For now create a chain of dependencies.
Interval<Instruction> Interval(Instrs);
auto *TopI = Interval.top();
auto *BotI = Interval.bottom();
DGNode *LastN = getOrCreateNode(TopI);

Interval<Instruction> InstrInterval(Instrs);

DGNode *LastN = getOrCreateNode(InstrInterval.top());
// Create DGNodes for all instrs in Interval to avoid future Instruction to
// DGNode lookups.
MemDGNode *LastMemN = dyn_cast<MemDGNode>(LastN);
for (Instruction *I = TopI->getNextNode(), *E = BotI->getNextNode(); I != E;
I = I->getNextNode()) {
auto *N = getOrCreateNode(I);
N->addMemPred(LastMemN);
for (Instruction &I : drop_begin(InstrInterval)) {
auto *N = getOrCreateNode(&I);
// Build the Mem node chain.
if (auto *MemN = dyn_cast<MemDGNode>(N)) {
MemN->setPrevNode(LastMemN);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Move this into the if condition that follows.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmm yeah that would make it less confusing. I will do that in a separate patch, since we are not touching this code in this patch.

if (LastMemN != nullptr)
LastMemN->setNextNode(MemN);
LastMemN = MemN;
}
LastN = N;
}
return Interval;
// Create the dependencies.
auto DstRange = MemDGNodeIntervalBuilder::make(InstrInterval, *this);
for (MemDGNode &DstN : drop_begin(DstRange)) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can DstRange be empty?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, I think it could be.

auto SrcRange = Interval<MemDGNode>(DstRange.top(), DstN.getPrevNode());
scanAndAddDeps(DstN, SrcRange);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why is this not just :
scanAndAddDeps(DstN, Interval(DstN->getNextNode(), DstRange.bottom());

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

SrcRange spans the MemNodes from DstRange.top() to the one before DstN, which is [DstRange.top(), DstN->getPrevNode()]. So yeah we could just replace this with:

scanAndAddDeps(Dst, Interval(DstRange.top(), DstN->getPrevNode())`;

Let me make this change.

}

return InstrInterval;
}

#ifndef NDEBUG
Expand All @@ -95,3 +202,5 @@ void DependencyGraph::dump() const {
dbgs() << "\n";
}
#endif // NDEBUG

} // namespace llvm::sandboxir
Loading
Loading