Full SME(1) instruction support and STREAMING Groups #415
base: dev
#rerun tests
531ebd0
to
7974237
Compare
I've left some comments, and I agree with several of Alex's. I think it would be good to add the ARM SME/SVE loops to our functional verification checks to help test these new instructions. I assume that would have to be done somewhere private, though (I'm not sure whether the upcoming CI/CD pipelines already give us that guarantee)?
I haven't finished the review, but I'm posting these comments now to prevent overlaps.
LOTS of new instructions, well done for grinding through them. Bring on SAIL
4ad3b6e
to
aa40d88
Compare
91c4336
to
5945bae
Compare
Now outdated, as the STREAMING groups logic, which was the only cause of the slowdown, has been removed.
0af7abc
Looks good, just need to finish resolving the existing conversations for the sake of clarity.
@@ -585,9 +590,14 @@ RegisterValue vecUMinP(srcValContainer& sourceValues) {
  const T* n = sourceValues[0].getAsVector<T>();
  const T* m = sourceValues[1].getAsVector<T>();

  // Concatenate the vectors
  T temp[2 * I];
  memcpy(temp, m, sizeof(T) * I);
Shouldn't m and n be switched here?
Good spot, only updated maxP to be in-line with HW...
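For reference, the operand order discussed in this thread can be sketched as a stand-alone helper. This is a minimal illustration, not the SimEng implementation: `pairwiseMin` and its signature are hypothetical, and it assumes (as the review comment suggests) that the first source `n` fills the low half of the concatenated buffer and `m` the high half.

```cpp
#include <algorithm>
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>

// Hypothetical stand-alone helper, not the SimEng vecUMinP signature.
// T is the element type, I the number of elements in each source vector.
template <typename T, std::size_t I>
std::array<T, I> pairwiseMin(const T* n, const T* m) {
  assert(n != nullptr && m != nullptr);

  // Concatenate the vectors: n fills the low half, m the high half
  // (assumed order, following the review comment).
  T temp[2 * I];
  std::memcpy(temp, n, sizeof(T) * I);
  std::memcpy(temp + I, m, sizeof(T) * I);

  // Each result element is the minimum of one adjacent pair.
  std::array<T, I> out{};
  for (std::size_t i = 0; i < I; i++) {
    out[i] = std::min(temp[2 * i], temp[2 * i + 1]);
  }
  return out;
}
```

With this ordering, the low half of the result comes from pairs in `n` and the high half from pairs in `m`; copying `m` first would swap the two halves of the result.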
31a3e6b
to
c0b2316
Compare
…ssion test (B, H, S, D)
…ion test (B, H, S, D)
…n alias and regression tests (B, H, S, D)
…uctions and aliases and regression tests (B, H, S, D)
…regression tests.
1232bcc
c0b2316
to
1232bcc
Compare
This PR implements all available SME (version 1) instructions that are contained within LLVM 14.0.5. Specifically, this is Version 2021-06 of the Armv9-A A64 ISA.
FP16 and BF16 instructions are not supported because C++17 lacks native types for them. All Quad-Word instruction variants are emulated using 64-bit data types.
In addition to this, new STREAMING_SVE and STREAMING_PREDICATE groups have been introduced (along with corresponding decode logic) so that these instructions can be given a different pipeline and latency configuration when SVE Streaming Mode (the context mode in which SME instructions execute) is enabled. This allows a co-processor-style implementation of SME to be modelled within SimEng: additional latency or reduced throughput can be configured to mimic an offload penalty, and the co-processor can be given different execution or load/store hardware from the main core.
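The group selection described above could look roughly like the sketch below. It is only an illustration of the idea: the `InsnGroup` enumeration and the `selectGroup` helper are hypothetical names, and the real SimEng group list and decode logic are larger and may be structured differently.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical group identifiers; only the two STREAMING_* names come from
// the PR description.
enum class InsnGroup : uint8_t {
  SVE,
  PREDICATE,
  STREAMING_SVE,
  STREAMING_PREDICATE,
  OTHER
};

// Hypothetical helper: maps a base group to its streaming counterpart when
// SVE Streaming Mode is enabled, and leaves every other group unchanged.
inline InsnGroup selectGroup(InsnGroup base, bool streamingModeEnabled) {
  if (!streamingModeEnabled) return base;
  switch (base) {
    case InsnGroup::SVE:
      return InsnGroup::STREAMING_SVE;
    case InsnGroup::PREDICATE:
      return InsnGroup::STREAMING_PREDICATE;
    default:
      return base;
  }
}
```

Because the streaming variants are separate groups, a configuration can assign them their own latency, throughput, and port mapping without affecting the non-streaming SVE and predicate groups.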