Full SME(1) instruction support and STREAMING Groups #415

Open
wants to merge 48 commits into
base: dev
48 commits
1a00d24
Added STREAMING versions of relevant aarch64 instruction groups.
FinnWilkinson May 24, 2024
ec8b486
Removed un-used macros from AArch64 Instruction decode.
FinnWilkinson May 28, 2024
f5f348b
Moved aarch64 getGroup logic to instruction_decode.
FinnWilkinson May 28, 2024
ffacdc9
Moved riscv getGroup logic to instruction_decode.
FinnWilkinson May 28, 2024
966c0d7
Updated unit tests after changing getGroup logic.
FinnWilkinson May 28, 2024
5ba3677
Added new AArch64 groups to model config and updated integration test.
FinnWilkinson May 28, 2024
b6abd5f
Added streaming mode enabled helper functions.
FinnWilkinson May 28, 2024
a0078d9
Added STREAMING group logic to instruction_decode, and logic to chang…
FinnWilkinson May 29, 2024
b95d973
Fixed minor issues with new streaming groups and updated SME example …
FinnWilkinson May 30, 2024
a41225f
Re-wrote checkStreamingGroup function.
FinnWilkinson May 30, 2024
e1781d0
Added unit tests for new AArch64 STREAMING groups functionality.
FinnWilkinson May 31, 2024
54ebf7c
Updated aarch64 groups diagram in docs.
FinnWilkinson May 31, 2024
f3318e2
Added SME instruction FMOPS (S and D) support and regression tests.
FinnWilkinson Aug 13, 2024
81231dd
Added SME instruction SMOPA (S and D) support and regression tests.
FinnWilkinson Aug 13, 2024
7325805
Added SME instruction SMOPS (S and D) support and regression tests.
FinnWilkinson Aug 13, 2024
5708c56
Added SME instructions UMOPA and UMOPS (S and D) support and regressi…
FinnWilkinson Aug 13, 2024
6344315
Fix jenkins build error.
FinnWilkinson Aug 14, 2024
3724d50
Added SME instructions SUMOPA and SUMOPS (S and D) support and regres…
FinnWilkinson Aug 14, 2024
3626b37
Updated SUMOPA and SUMOPS tests.
FinnWilkinson Aug 14, 2024
2227a55
Added SME instructions USMOPA and USMOPS (S and D) support and regres…
FinnWilkinson Aug 14, 2024
565cef4
Fix jenkins build error pt2.
FinnWilkinson Aug 14, 2024
dd7ffe1
Implemented SME STR instruction and regression test.
FinnWilkinson Aug 14, 2024
40228df
Fixed execution logic for vertical ST1D and ST1W SME stores.
FinnWilkinson Aug 14, 2024
a55e45d
Implemented SME ST1B and ST1H (H and V) instruction logic.
FinnWilkinson Aug 14, 2024
b73ca9e
Implemented SME LD1B and LD1H (H and V) instruction logic.
FinnWilkinson Aug 15, 2024
9461680
Added SME LD1B and LD1H regression tests.
FinnWilkinson Aug 15, 2024
a3ba507
Updated ST1D and ST1W SME regression tests.
FinnWilkinson Aug 15, 2024
e9d4cf2
Added SME ST1B and ST1H regression tests.
FinnWilkinson Aug 15, 2024
faf54a7
Implemented SME MOVA (Tile to Vec, horizontal) instructions and regre…
FinnWilkinson Aug 15, 2024
594a5b8
Implemented SME MOVA (Tile to Vec, vertical) instructions and regress…
FinnWilkinson Aug 15, 2024
c3aed6d
Implemented SME MOV (Tile to Vec, vertical and horizontal) instructio…
FinnWilkinson Aug 15, 2024
a927b37
Implemented SME MOVA/MOV (Vec to Tile, vertical and horizontal) instr…
FinnWilkinson Aug 16, 2024
0869be6
Implemented SME LDR instruction and regression tests.
FinnWilkinson Aug 16, 2024
dca22ea
Implemented SME ADDHA and ADDVA (S and D) instructions and regression…
FinnWilkinson Aug 19, 2024
b585701
Updated ADDHA test to make more specific.
FinnWilkinson Aug 20, 2024
53959cf
Corrected ADDVA execution logic.
FinnWilkinson Aug 20, 2024
a6b61e7
Updated ADDVA test to make more specific.
FinnWilkinson Aug 20, 2024
857cd9b
Added SME MOVA (tile to vec, vec to tile) Quad-word instructions and …
FinnWilkinson Aug 20, 2024
882ce0a
Implemented SME ST1Q and LD1Q (V and H) instructions and regression t…
FinnWilkinson Aug 28, 2024
d33d1c1
Removed werror.
FinnWilkinson Sep 2, 2024
b5c4cda
NEON instruction logic fixes.
FinnWilkinson Oct 14, 2024
7b74b34
Attended PR comments.
FinnWilkinson Oct 29, 2024
790f3df
Switched order of concatonation for NEON UMAXP instruction to match H…
FinnWilkinson Nov 4, 2024
e1ab10c
Fixed LD1W (into ZA, 32-bit) buffer overflow error.
FinnWilkinson Nov 8, 2024
a39dd23
Removed STREAMING_SVE and STREAMING_PREDICATE groups and associated l…
FinnWilkinson Dec 5, 2024
611d607
Reverted docs aarch64 instruction groups image.
FinnWilkinson Dec 5, 2024
f3088d5
Fixed order of vector concat for NEON uminp.
FinnWilkinson Dec 16, 2024
1232bcc
Post rebase fixes.
FinnWilkinson Dec 20, 2024
5 changes: 3 additions & 2 deletions configs/a64fx_SME.yaml
@@ -80,15 +80,15 @@ Ports:
- INT_DIV_OR_SQRT
5:
Portname: EAGA
Instruction-Support:
Instruction-Group-Support:
- LOAD
- STORE_ADDRESS
- INT_SIMPLE_ARTH_NOSHIFT
- INT_SIMPLE_LOGICAL_NOSHIFT
- INT_SIMPLE_CMP
6:
Portname: EAGB
Instruction-Support:
Instruction-Group-Support:
- LOAD
- STORE_ADDRESS
- INT_SIMPLE_ARTH_NOSHIFT
@@ -98,6 +98,7 @@ Ports:
Portname: BR
Instruction-Group-Support:
- BRANCH
# Define example SME unit
8:
Portname: SME
Instruction-Group-Support:
Binary file removed docs/sphinx/assets/instruction_groups.png
Binary file not shown.
1 change: 0 additions & 1 deletion src/include/simeng/Register.hh
@@ -1,6 +1,5 @@
#pragma once
#include <cstdint>
#include <iostream>

namespace simeng {

6 changes: 6 additions & 0 deletions src/include/simeng/arch/aarch64/Architecture.hh
@@ -70,6 +70,12 @@ class Architecture : public arch::Architecture {
/** Returns the current value of SVCRval_. */
uint64_t getSVCRval() const;

/** Returns whether SVE Streaming Mode is enabled. */
bool isStreamingModeEnabled() const;

/** Returns whether the SME ZA Register is enabled. */
bool isZARegisterEnabled() const;

/** Update the value of SVCRval_. */
void setSVCRval(const uint64_t newVal) const;

30 changes: 28 additions & 2 deletions src/include/simeng/arch/aarch64/InstructionGroups.hh
@@ -4,7 +4,33 @@ namespace simeng {
namespace arch {
namespace aarch64 {

/** The IDs of the instruction groups for AArch64 instructions. */
/** The IDs of the instruction groups for AArch64 instructions.
* Each new group must contain 14 entries to ensure correct group assignment and
* general functionality.
* Their order must be as follows:
* - BASE
* - BASE_SIMPLE
* - BASE_SIMPLE_ARTH
* - BASE_SIMPLE_ARTH_NOSHIFT
* - BASE_SIMPLE_LOGICAL
* - BASE_SIMPLE_LOGICAL_NOSHIFT
* - BASE_SIMPLE_CMP
* - BASE_SIMPLE_CVT
* - BASE_MUL
* - BASE_DIV_OR_SQRT
* - LOAD_BASE
* - STORE_ADDRESS_BASE
* - STORE_DATA_BASE
* - STORE_BASE
*
* An exception to the above is "Parent" groups which do not require the LOAD_*
* or STORE_* groups.
* "Parent" groups allow for easier grouping of similar groups that may have
* identical execution latencies, ports, etc. For example, FP is the parent
* group of SCALAR and VECTOR.
* In simulation, an instruction's allocated group will never be a "Parent"
* group; they are only used to simplify config file creation and management.
*/
namespace InstructionGroups {
const uint16_t INT = 0;
const uint16_t INT_SIMPLE = 1;
@@ -102,7 +128,7 @@ static constexpr uint8_t NUM_GROUPS = 88;
const std::unordered_map<uint16_t, std::vector<uint16_t>> groupInheritance_ = {
{InstructionGroups::ALL,
{InstructionGroups::INT, InstructionGroups::FP, InstructionGroups::SVE,
InstructionGroups::PREDICATE, InstructionGroups::SME,
InstructionGroups::SME, InstructionGroups::PREDICATE,
InstructionGroups::LOAD, InstructionGroups::STORE,
InstructionGroups::BRANCH}},
{InstructionGroups::INT,
Expand Down
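The doc comment above requires every non-parent group to contain the same 14 entries in a fixed order. One consequence of such a layout, sketched below, is that a specific group ID can be derived as a block base plus a fixed offset. This is only an illustration: `GroupOffset`, `ENTRIES_PER_GROUP`, and `groupId` are hypothetical names that do not appear in the diff, and the sketch assumes every block is complete (it ignores "Parent" groups, which omit the LOAD_* and STORE_* entries).

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical offsets mirroring the 14-entry ordering from the doc comment.
enum GroupOffset : uint16_t {
  BASE = 0,
  SIMPLE = 1,
  SIMPLE_ARTH = 2,
  SIMPLE_ARTH_NOSHIFT = 3,
  SIMPLE_LOGICAL = 4,
  SIMPLE_LOGICAL_NOSHIFT = 5,
  SIMPLE_CMP = 6,
  SIMPLE_CVT = 7,
  MUL = 8,
  DIV_OR_SQRT = 9,
  LOAD = 10,
  STORE_ADDRESS = 11,
  STORE_DATA = 12,
  STORE = 13
};

// Assumption: each block holds exactly 14 consecutive IDs.
constexpr uint16_t ENTRIES_PER_GROUP = 14;

// Returns the ID of the entry `offset` within the block at `blockIndex`.
constexpr uint16_t groupId(uint16_t blockIndex, GroupOffset offset) {
  return static_cast<uint16_t>(blockIndex * ENTRIES_PER_GROUP + offset);
}

// Block 0 reproduces the first two IDs visible in the diff (INT = 0,
// INT_SIMPLE = 1).
static_assert(groupId(0, BASE) == 0, "first entry of block 0");
static_assert(groupId(0, SIMPLE) == 1, "second entry of block 0");

// Small runtime check for a later block under the same assumption.
inline void exampleGroupIds() {
  assert(groupId(1, LOAD) == 24);   // 14 + 10
  assert(groupId(2, STORE) == 41);  // 28 + 13
}
```

If the ordering inside a block changed, every offset-based lookup of this kind would silently select the wrong group, which is presumably why the comment insists on a fixed order.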
14 changes: 12 additions & 2 deletions src/include/simeng/arch/aarch64/helpers/neon.hh
@@ -568,9 +568,14 @@ RegisterValue vecUMaxP(srcValContainer& sourceValues) {
const T* n = sourceValues[0].getAsVector<T>();
const T* m = sourceValues[1].getAsVector<T>();

// Concatenate the vectors
T temp[2 * I];
memcpy(temp, n, sizeof(T) * I);
memcpy(temp + I, m, sizeof(T) * I);
// Compare each adjacent pair of elements
T out[I];
for (int i = 0; i < I; i++) {
out[i] = std::max(n[i], m[i]);
out[i] = std::max(temp[2 * i], temp[2 * i + 1]);
}
return {out, 256};
}
@@ -585,9 +590,14 @@ RegisterValue vecUMinP(srcValContainer& sourceValues) {
const T* n = sourceValues[0].getAsVector<T>();
const T* m = sourceValues[1].getAsVector<T>();

// Concatenate the vectors
T temp[2 * I];
memcpy(temp, n, sizeof(T) * I);
memcpy(temp + I, m, sizeof(T) * I);

T out[I];
for (int i = 0; i < I; i++) {
out[i] = std::min(n[i], m[i]);
out[i] = std::min(temp[2 * i], temp[2 * i + 1]);
}
return {out, 256};
}
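The pairwise semantics introduced by these two hunks (concatenate `n` then `m`, then reduce each adjacent pair) can be checked in isolation. The sketch below is a standalone illustration, not SimEng code: `pairwiseMax` is a hypothetical helper that uses `std::array` in place of `RegisterValue` and `srcValContainer`. Note that the second copy is offset by `I` elements, because arithmetic on a `T*` advances in units of `T` rather than bytes.

```cpp
#include <algorithm>
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>

// Hypothetical standalone helper: concatenates n and m, then takes the
// maximum of each adjacent pair of the concatenated vector.
template <typename T, std::size_t I>
std::array<T, I> pairwiseMax(const std::array<T, I>& n,
                             const std::array<T, I>& m) {
  std::array<T, 2 * I> temp{};
  std::memcpy(temp.data(), n.data(), sizeof(T) * I);
  // Offset is I elements; the byte count is still sizeof(T) * I.
  std::memcpy(temp.data() + I, m.data(), sizeof(T) * I);

  std::array<T, I> out{};
  for (std::size_t i = 0; i < I; i++) {
    out[i] = std::max(temp[2 * i], temp[2 * i + 1]);
  }
  return out;
}

// Worked example: concat = {1,5,3,2,9,4,7,8}, pairs give {5,3,9,8}.
inline void examplePairwiseMax() {
  const std::array<uint32_t, 4> n = {1, 5, 3, 2};
  const std::array<uint32_t, 4> m = {9, 4, 7, 8};
  const std::array<uint32_t, 4> expected = {5, 3, 9, 8};
  assert(pairwiseMax(n, m) == expected);
}
```

The element-wise form on the removed lines, `std::max(n[i], m[i])`, would give `{9, 5, 7, 8}` for the same inputs, so the two behaviours are easy to tell apart in a regression test.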
6 changes: 6 additions & 0 deletions src/lib/arch/aarch64/Architecture.cc
@@ -284,6 +284,12 @@ void Architecture::setSVCRval(const uint64_t newVal) const {
SVCRval_ = newVal;
}

// 0th bit of SVCR register determines if streaming-mode is enabled.
bool Architecture::isStreamingModeEnabled() const { return SVCRval_ & 1; }

// 1st bit of SVCR register determines if ZA register is enabled.
bool Architecture::isZARegisterEnabled() const { return SVCRval_ & 2; }

} // namespace aarch64
} // namespace arch
} // namespace simeng
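The two helpers added in this file each test a single bit of the SVCR value. A minimal standalone sketch of the same bit tests follows; the constant and function names are hypothetical and are not part of the diff, and only the bit positions (bit 0 for streaming mode, bit 1 for ZA) are taken from its comments.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical mask names; bit positions follow the comments in the diff.
constexpr uint64_t SVCR_SM_MASK = 1ull << 0;  // streaming mode
constexpr uint64_t SVCR_ZA_MASK = 1ull << 1;  // ZA register

constexpr bool streamingModeEnabled(uint64_t svcr) {
  return (svcr & SVCR_SM_MASK) != 0;
}

constexpr bool zaRegisterEnabled(uint64_t svcr) {
  return (svcr & SVCR_ZA_MASK) != 0;
}

// The two bits are independent, so all four combinations are valid.
inline void exampleSvcrBits() {
  assert(!streamingModeEnabled(0) && !zaRegisterEnabled(0));
  assert(streamingModeEnabled(1) && !zaRegisterEnabled(1));
  assert(!streamingModeEnabled(2) && zaRegisterEnabled(2));
  assert(streamingModeEnabled(3) && zaRegisterEnabled(3));
}
```

Comparing against zero explicitly is equivalent to the implicit integer-to-`bool` conversion in the diff (`SVCRval_ & 2`); it only makes the intent more visible.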
35 changes: 35 additions & 0 deletions src/lib/arch/aarch64/InstructionMetadata.cc
@@ -232,6 +232,41 @@ InstructionMetadata::InstructionMetadata(const cs_insn& insn)
operands[2].access = CS_AC_READ;
operands[3].access = CS_AC_READ;
break;

case Opcode::AArch64_INSERT_MXIPZ_H_B:
[[fallthrough]];
case Opcode::AArch64_INSERT_MXIPZ_H_D:
[[fallthrough]];
case Opcode::AArch64_INSERT_MXIPZ_H_H:
[[fallthrough]];
case Opcode::AArch64_INSERT_MXIPZ_H_Q:
[[fallthrough]];
case Opcode::AArch64_INSERT_MXIPZ_H_S:
[[fallthrough]];
case Opcode::AArch64_INSERT_MXIPZ_V_B:
[[fallthrough]];
case Opcode::AArch64_INSERT_MXIPZ_V_D:
[[fallthrough]];
case Opcode::AArch64_INSERT_MXIPZ_V_H:
[[fallthrough]];
case Opcode::AArch64_INSERT_MXIPZ_V_Q:
[[fallthrough]];
case Opcode::AArch64_INSERT_MXIPZ_V_S:
// Need to add access specifiers.
// Although operands[0] should be READ | WRITE, the decode logic for SME
// tile destinations adds the register as both a source and a destination
// when it has only WRITE access.
operands[0].access = CS_AC_WRITE;
operands[1].access = CS_AC_READ;
operands[2].access = CS_AC_READ;
break;
case Opcode::AArch64_LDR_ZA:
// Need to add access specifier.
// Although operands[0] should be READ | WRITE, the decode logic for SME
// tile destinations adds the register as both a source and a destination
// when it has only WRITE access.
operands[0].access = CS_AC_WRITE;
break;
case Opcode::AArch64_ZERO_M: {
// Incorrect access type: All are READ but should all be WRITE
for (int i = 0; i < operandCount; i++) {