[HW] A series of fixes #353

mp-17 · 2024-08-26T12:30:05Z

Description of PR that completes issue here...

Changelog

Fixed

Description of changes

Added

Description of changes

Changed

Description of changes

Checklist

Automated tests pass
Changelog updated
Code style guideline is observed

Please check our contributing guidelines before opening a Pull Request.

If LMUL_X has X > 1, Ara injects one reshuffle at a time for each register between Vn and V(n+X-1) that has an EEW mismatch. Each of these reshuffles targets a different register Vm with LMUL_1, but from the point of view of the hazard checks on the next instruction that depends on Vn with LMUL_X, they all target the same register (Vn with LMUL_X). We cannot simply inject one macro reshuffle, since the registers between Vn and V(n+X-1) can have different encodings. So we need finer-grained reshuffles, and these mess up the dependency tracking.

For example, vst @, v0 (LMUL_8) uses the registers from v0 to v7. If they are all reshuffled, we end up with 8 reshuffle instructions that get IDs from 0 to 7. The store then sees a dependency only on the reshuffle ID that targets v0. This is wrong: if the store opreq is faster than the slide opreq once the v0 reshuffle is over, the store violates the RAW dependency.

To avoid this, the safest but least efficient fix is to just wait in WAIT_IDLE after a reshuffle with LMUL > 1.

There are many possible optimizations to this:
1) Check whether, when LMUL > 1, we reshuffled more than one register. If we reshuffled only one register, we can skip the WAIT_IDLE.
2) Check whether all X registers need to be reshuffled (the common case). If so, inject a single large reshuffle with LMUL_X and skip the WAIT_IDLE.
3) Instead of waiting in WAIT_IDLE until idle, inject the reshuffles starting from V(n+X-1) instead of Vn. This automatically adjusts the dependency check and speeds up the whole operation a bit.

Signed-off-by: Moritz Imfeld <[email protected]>
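
The decision described in the commit message above can be sketched as a small behavioral model. This is only an illustration in Python, not the dispatcher RTL: `plan_reshuffles`, `ReshufflePlan`, the `eew_of` map and the `skip_single` flag are hypothetical names introduced here, and the model covers only the conservative WAIT_IDLE fix and optimization 1.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class ReshufflePlan:
    regs: List[int]   # registers that each get one LMUL_1 reshuffle, in injection order
    wait_idle: bool   # whether the (modeled) dispatcher stalls in WAIT_IDLE afterwards


def plan_reshuffles(vn: int, lmul: int, eew_of: Dict[int, int],
                    target_eew: int, skip_single: bool = False) -> ReshufflePlan:
    """Hypothetical model of the reshuffle decision for a register group.

    vn          base register of the group (Vn)
    lmul        X in LMUL_X, so the group spans Vn .. V(n+X-1)
    eew_of      assumed map from register index to its current EEW encoding
    target_eew  EEW required by the instruction that uses the group
    skip_single if True, model optimization 1 (no stall when only one
                register was reshuffled)
    """
    # One fine-grained reshuffle per register whose encoding mismatches.
    regs = [v for v in range(vn, vn + lmul) if eew_of[v] != target_eew]

    # Conservative fix: stall after any reshuffle of a group with LMUL > 1.
    wait_idle = lmul > 1 and len(regs) > 0

    # Optimization 1: a single reshuffle is tracked correctly, so no stall.
    if skip_single and len(regs) == 1:
        wait_idle = False

    return ReshufflePlan(regs, wait_idle)


# The vst example: v0..v7 all hold EEW 8, the store needs EEW 32.
plan = plan_reshuffles(0, 8, {v: 8 for v in range(8)}, 32)
print(plan.regs, plan.wait_idle)
```

In this model the `vst` example yields eight reshuffles followed by a stall, while a group with LMUL_1 or with no mismatching register never stalls.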

MaistoV and others added 19 commits June 27, 2024 13:40

Rename CSRs in ara dispatcher

8d4fd75

Stall Ara upon operations on vector CSRs

9591d0d

Change "errors" to "exceptions"

893956d

Extend and fix Ara exception reporting from VLSU

594ae68

Set vstart=0 for successful vector instructions

46883c8

[hardware] Fix vstart handling in dispatcher

61cf0ab

[hardware] 🐛 Fix reshuffling bug in dispatcher

2c751e0

[hardware] 🐛 Fix eew_q update during reshuffle

be2248d

[hardware] 🐛 Fix reshuffle

880b60d

[hardware] 🐛 Consider LMUL when deciding if to reshuffle vd

7f95d00

[hardware] Bump CVA6

5995015

Add time-multiplexing for VCPOP and VFIRST

5bf20c0

Signed-off-by: Moritz Imfeld <[email protected]>

Mask Unit clean-up

eb35e0b

Signed-off-by: Moritz Imfeld <[email protected]>

[hardware] 🐛 Fix NP2 slide with unbalanced packets

0766174

[hardware] Check for LMUL legality only if corresponding vreg is used

c7f7621

[hardware] VMSBF does not go through VALU

edb3549

[apps] Add random insn generator for verification

60293d1

[hardware] 🐛 Fix legality checks for vmadc/vmsbc

e99e7b6

mp-17 mentioned this pull request Aug 26, 2024

Check whether we can access vs1 and vs2 in VMADC/VMSBC #120

Closed

[hardware] Add constraint on VLENB

765714c

mp-17 mentioned this pull request Aug 26, 2024

VRF address error #286

Closed
