Tage #443
base: dev
Changes from 57 commits
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -149,13 +149,13 @@ The Branch-Prediction section contains those options to parameterise the branch | |
The current options include: | ||
|
||
Type | ||
The type of branch predictor that is used, the options are ``Generic``, and ``Perceptron``. Both types of predictor use a branch target buffer with each entry containing a direction prediction mechanism and a target address. The direction predictor used in ``Generic`` is a saturating counter, and in ``Perceptron`` it is a perceptron. | ||
The type of branch predictor that is used, the options are ``Generic``, ``Perceptron``, and ``Tage``. Each of these types of predictor use prediction tables with each entry containing a direction prediction mechanism and a target address. The direction predictor used in ``Generic`` and ``TAGE`` is a saturating counter, and in ``Perceptron`` it is a perceptron. ``TAGE`` also uses a series of further, tagged prediction tables to provide predictions informed by greater branch histories. | ||
is there a good reason behind using
There is not. I've updated to
Seems like the creator uses all forms of capitalisation |
||
|
||
BTB-Tag-Bits | ||
The number of bits used to index the entries in the Branch Target Buffer (BTB). The number of entries in the BTB is obtained from the calculation: 1 << ``bits``. For example, a ``bits`` value of 12 would result in a BTB with 4096 entries. | ||
|
||
Saturating-Count-Bits | ||
Only needed for a ``Generic`` predictor. The number of bits used in the saturating counter value. | ||
Only needed for ``Generic`` and ``Tage`` predictors. The number of bits used in the saturating counter value. | ||
|
||
Global-History-Length | ||
The number of bits used to record the global history of branch directions. Each bit represents one branch direction. For ``PerceptronPredictor``, this dictates the size of the perceptrons (with each perceptron having Global-History-Length + 1 weights). | ||
|
@@ -164,7 +164,16 @@ RAS-entries | |
The number of entries in the Return Address Stack (RAS). | ||
|
||
Fallback-Static-Predictor | ||
Only needed for a ``Generic`` predictor. The static predictor used when no dynamic prediction is available. The options are either ``"Always-Taken"`` or ``"Always-Not-Taken"``. | ||
Only needed for ``Generic`` and ``Tage`` predictors. The static predictor used when no dynamic prediction is available. The options are either ``"Always-Taken"`` or ``"Always-Not-Taken"``. | ||
|
||
Tage-Table-Bits | ||
Only needed for a ``Tage`` predictor. The number of bits used to index entries in the tagged tables. The number of entries in each of the tagged tables is obtained from the calculation: 1 << ``bits``. For example, a ``bits`` value of 12 would result in tagged tables with 4096 entries. | ||
|
||
Num-Tage-Tables | ||
Only needed for a ``Tage`` predictor. The number of tagged tables used by the predictor, in addition to a default prediction table (i.e., the BTB). Therefore, a value of 3 for ``Num-Tage-Tables`` would result in four total prediction tables: one BTB and three tagged tables. If no tagged tables are desired, it is recommended to use the ``GenericPredictor`` instead. | ||
|
||
Tage-Length | ||
Only needed for a ``Tage`` predictor. The number of bits used to tage the entries of the tagged tables. | ||
FinnWilkinson marked this conversation as resolved.
Is the "tage" in the latter sentence meant to be that, or rather "tag"? |
||
|
||
.. _l1dcnf: | ||
|
||
|
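The sizing rules stated in the documentation diff above (entries = 1 << ``bits``, and total prediction tables = ``Num-Tage-Tables`` + 1) can be checked with a short sketch. The struct and function names below are hypothetical and are not part of SimEng. Only the option names and the arithmetic are taken from the documentation text.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical holder for three of the Branch-Prediction options described
// above. The fields mirror the YAML option names; this is NOT SimEng's API.
struct TageConfigSketch {
  uint8_t btbTagBits;     // BTB-Tag-Bits
  uint8_t tageTableBits;  // Tage-Table-Bits
  uint8_t numTageTables;  // Num-Tage-Tables
};

// Number of BTB entries: 1 << BTB-Tag-Bits.
constexpr uint64_t btbEntries(const TageConfigSketch& c) {
  return 1ull << c.btbTagBits;
}

// Number of entries in each tagged table: 1 << Tage-Table-Bits.
constexpr uint64_t taggedTableEntries(const TageConfigSketch& c) {
  return 1ull << c.tageTableBits;
}

// Total prediction tables: the tagged tables plus the default table (the BTB).
constexpr uint32_t totalTables(const TageConfigSketch& c) {
  return static_cast<uint32_t>(c.numTageTables) + 1u;
}

// Reproduces the worked examples from the documentation text.
inline void checkDocumentedExamples() {
  const TageConfigSketch c{12, 12, 3};
  assert(btbEntries(c) == 4096);          // 12 bits -> 4096 BTB entries
  assert(taggedTableEntries(c) == 4096);  // 12 bits -> 4096 entries per table
  assert(totalTables(c) == 4);            // one BTB + three tagged tables
}
```

Calling `checkDocumentedExamples()` confirms the two examples quoted in the docs: 12 bits gives 4096 entries, and three tagged tables give four prediction tables in total.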
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,115 @@ | ||
#pragma once | ||
|
||
#include <cstdint> | ||
|
||
namespace simeng { | ||
/** A class for storing a branch history. Needed for cases where a branch | ||
* history of more than 64 bits is required. This class makes it easier to | ||
* access and manipulate large branch histories, as are needed in | ||
* sophisticated branch predictors. | ||
* | ||
* The bits of the branch history are stored in a vector of uint64_t values, | ||
"vector" should be "array" |
||
* and their access/manipulation is facilitated by the public functions. */ | ||
|
||
class BranchHistory { | ||
public: | ||
BranchHistory(uint64_t size) : size_(size) { | ||
history_ = {0}; | ||
FinnWilkinson marked this conversation as resolved.
|
||
for (uint8_t i = 0; i < (size_ / 64); i++) { | ||
history_.push_back(0); | ||
} | ||
} | ||
~BranchHistory() {}; | ||
FinnWilkinson marked this conversation as resolved.
|
||
|
||
/** Returns the 'numBits' most recent bits of the branch history. Maximum | ||
* number of bits returnable is 64 to allow it to be provided in a 64-bit | ||
* integer. */ | ||
uint64_t getHistory(uint8_t numBits) { | ||
assert(numBits <= 64 && "Cannot get more than 64 bits without rolling"); | ||
assert(numBits <= size_ && | ||
"Cannot get more bits of branch history than " | ||
"the size of the history"); | ||
return (history_[0] & ((1ull << numBits) - 1)); | ||
} | ||
|
||
/** Returns 'numBits' of the global history folded over on itself to get a | ||
* value of size 'length'. The global history is folded by taking an | ||
* XOR hash with the overflowing bits to get an output of 'length' bits. */ | ||
uint64_t getFolded(uint8_t numBits, uint8_t length) { | ||
assert(numBits <= size_ && | ||
"Cannot get more bits of branch history than " | ||
"the size of the history"); | ||
uint64_t output = 0; | ||
|
||
uint64_t startIndex = 0; | ||
uint64_t endIndex = numBits - 1; | ||
|
||
while (startIndex <= numBits) { | ||
output ^= ((history_[startIndex / 64] >> startIndex) & | ||
((1ull << (numBits - startIndex)) - 1)); | ||
|
||
// Check to see if a second uint64_t value will need to be accessed | ||
if ((startIndex / 64) == (endIndex / 64)) { | ||
uint8_t leftOverBits = endIndex % 64; | ||
output ^= (history_[endIndex / 64] << (numBits - leftOverBits)); | ||
} | ||
startIndex += length; | ||
endIndex += length; | ||
} | ||
|
||
// Trim the output to the desired size | ||
output &= (1 << length) - 1; | ||
return output; | ||
} | ||
|
||
/** Adds a branch outcome ('isTaken') to the global history */ | ||
void addHistory(bool isTaken) { | ||
for (int8_t i = size_ / 64; i >= 0; i--) { | ||
history_[i] <<= 1; | ||
if (i == 0) { | ||
history_[i] |= ((isTaken) ? 1 : 0); | ||
} else { | ||
history_[i] |= (((history_[i - 1] & (1ull << 63)) > 0) ? 1 : 0); | ||
Does this need the conditional statement? After doing the AND you could shift right by 63 to get your 0 or 1. That would be slightly fewer cycles and, in my eyes, more understandable/readable (you may disagree).
I think the conditional is needed here. What's being loaded into each uint64 depends on where it is in the vector. All but the least-significant uint64s get the MSB of the next uint64 added as the LSB, whereas the least-significant uint64 gets isTaken added as the LSB. However, if I'm misunderstanding your question, let me know. |
||
} | ||
} | ||
} | ||
|
||
/** Updates the state of a branch that has already been added to the global | ||
* history at 'position', where 'position' is 0-indexed and starts from the | ||
* most recent history. I.e., to update the most recently added branch | ||
* outcome, 'position' would be 0. | ||
* */ | ||
void updateHistory(bool isTaken, uint64_t position) { | ||
if (position < size_) { | ||
Should we assert position being < size_ as above, or are there cases where this could "validly" be greater? For instance, if you are trying to update an entry that has been lost from the history because there have been too many branches in the meantime?
Exactly as you say. I don't think that this should be an assert, as the core may validly try to update a history that is no longer being tracked. We should allow this so that the pipeline does not need to know the size of the branch history. We're already ensuring that this doesn't cause problems with our if statement on line 82. |
||
uint8_t vectIndex = position / 64; | ||
uint8_t bitIndex = position % 64; | ||
bool currentlyTaken = ((history_[vectIndex] & (1ull << bitIndex)) != 0); | ||
if (currentlyTaken != isTaken) { | ||
history_[vectIndex] ^= (1ull << bitIndex); | ||
} | ||
} | ||
} | ||
|
||
/** Removes the most recently added branch from the history */ | ||
void rollBack() { | ||
for (uint8_t i = 0; i <= (size_ / 64); i++) { | ||
history_[i] >>= 1; | ||
if (i < (size_ / 64)) { | ||
history_[i] |= (((history_[i + 1] & 1) > 0) ? (1ull << 63) : 0); | ||
} | ||
} | ||
} | ||
|
||
private: | ||
/** The number of bits of branch history stored in this branch history */ | ||
uint64_t size_; | ||
|
||
/** A vector containing the bits of the branch history. The bits are | ||
* arranged such that the most recent branches are stored in uint64_t at | ||
* index 0 of the vector, then the next most recent at index 1 and so forth. | ||
* Within each uint64_t, the most recent branches are recorded in the | ||
* least-significant bits. */ | ||
std::vector<uint64_t> history_; | ||
FinnWilkinson marked this conversation as resolved.
|
||
}; | ||
|
||
} // namespace simeng |
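As an illustration of the points raised in the review threads above, here is a minimal, self-contained sketch of a multi-word branch history. It is not the code in this PR: the class name `BranchHistorySketch` and its method names are hypothetical. It carries bits between words with a plain shift by 63 (the reviewer's suggestion), silently ignores updates to positions that are no longer tracked (as agreed in the `updateHistory` thread), and folds the history one bit at a time, which is slow but easy to check by hand. It also includes `<cassert>` and `<vector>`, which code using `assert` and `std::vector` needs.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical sketch, not the PR's BranchHistory class.
// Word 0 holds the most recent 64 outcomes; within a word the most recent
// outcome is the least-significant bit.
class BranchHistorySketch {
 public:
  explicit BranchHistorySketch(uint64_t size)
      : size_(size), words_(size / 64 + 1, 0) {}

  // Returns the outcome recorded 'position' branches ago (0 = most recent).
  bool bit(uint64_t position) const {
    assert(position < size_ && "position is outside the tracked history");
    return (words_[position / 64] >> (position % 64)) & 1ull;
  }

  // Shifts in a new outcome. The carry into each higher word is the top bit
  // of the word below it, obtained with a shift rather than a conditional.
  void add(bool isTaken) {
    for (std::size_t i = words_.size() - 1; i > 0; --i) {
      words_[i] = (words_[i] << 1) | (words_[i - 1] >> 63);
    }
    words_[0] = (words_[0] << 1) | (isTaken ? 1ull : 0ull);
  }

  // Removes the most recently added outcome.
  void rollBack() {
    for (std::size_t i = 0; i < words_.size(); ++i) {
      words_[i] >>= 1;
      if (i + 1 < words_.size()) {
        // words_[i + 1] has not been shifted yet, so its bit 0 is the carry.
        words_[i] |= (words_[i + 1] & 1ull) << 63;
      }
    }
  }

  // Overwrites an existing outcome. Positions that have already fallen out
  // of the history are ignored instead of asserted on.
  void update(bool isTaken, uint64_t position) {
    if (position >= size_) return;
    const uint64_t mask = 1ull << (position % 64);
    if (isTaken) {
      words_[position / 64] |= mask;
    } else {
      words_[position / 64] &= ~mask;
    }
  }

  // XOR-folds the 'numBits' most recent outcomes into 'length' bits:
  // history bit p contributes to output bit (p % length).
  uint64_t folded(uint64_t numBits, uint8_t length) const {
    assert(numBits <= size_ && "cannot fold more bits than are stored");
    assert(length > 0 && length <= 64 && "output must fit in a uint64_t");
    uint64_t output = 0;
    for (uint64_t p = 0; p < numBits; ++p) {
      output ^= static_cast<uint64_t>(bit(p)) << (p % length);
    }
    return output;
  }

 private:
  uint64_t size_;                // Number of outcomes tracked
  std::vector<uint64_t> words_;  // Packed outcomes, most recent in word 0
};
```

For example, with a 70-bit history, adding one taken branch followed by 64 not-taken branches moves the taken bit into the second word (position 64), and a single `rollBack()` brings it back to position 63 of the first word. A production version would likely fold a word at a time. The bitwise loop is used here only because every shift amount stays below 64, which keeps the sketch free of out-of-range shifts.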
Some TX2 diagrams note its use of a multi-history branch predictor. I assume this is TAGE-like so maybe apply this config update to the TX2 YAML as well?
Yes, that sounds like it would be. I've updated the TX2 config as well.