[SW] Initial support for compilation in Linux environment #312

Closed
wants to merge 35 commits into from

Commits on Jun 19, 2024

  1. Rename CSRs in ara dispatcher

    MaistoV authored and mp-17 committed Jun 19, 2024
    665b6ae
  2. Stall Ara upon operations on vector CSRs

    MaistoV authored and mp-17 committed Jun 19, 2024
    193f385
  3. Change "errors" to "exceptions"

    MaistoV authored and mp-17 committed Jun 19, 2024
    d5e2fb2
  4. e4c2466
  5. 2c9d5ba
  6. 26af2ec
  7. 7bcfcbf
  8. 7b0191a
  9. [hardware] 🐛 Suboptimal fix to reshuffle with LMUL > 1

    If LMUL_X has X > 1, Ara injects one reshuffle at a time for each register
    between Vn and V(n+X-1) that has an EEW mismatch.
    Each of these reshuffles targets a different Vm with LMUL_1, but they all
    target the same register (Vn with LMUL_X) from the point of view of the hazard
    checks on the next instruction that has a dependency on Vn with LMUL_X.

    We cannot just inject one macro reshuffle since the registers between
    Vn and V(n+X-1) can have different encodings. So, we need finer-grained
    reshuffles, which mess up the dependency tracking.
    
    For example,
    vst @, v0 (LMUL_8)
    will use the registers from v0 to v7. If they are all reshuffled, we
    will end up with 8 reshuffle instructions that will get IDs from
    0 to 7. The store will then see a dependency on the reshuffle ID that
    targets v0 only. This is wrong, since if the store opreq is faster than
    the slide opreq once the v0-reshuffle is over, it will violate the RAW
    dependency.
    
    To avoid this, the safest (and least efficient) fix is to just
    wait in WAIT_IDLE after a reshuffle with LMUL > 1.

    There are many possible optimizations to this:
     1) Check if, when LMUL > 1, we reshuffled more than 1 register.
    If we reshuffled 1 register only, we can skip the WAIT_IDLE.
     2) Check if all the X registers need to be reshuffled (common case).
    If this is the case, inject a single large reshuffle with LMUL_X only and
    skip WAIT_IDLE.
     3) Instead of waiting until idle in WAIT_IDLE, we can inject the
    reshuffles starting from V(n+X-1) rather than from Vn. This will automatically
    adjust the dependency check and will speed up the whole operation a bit.
    mp-17 committed Jun 19, 2024
    7d6da86
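
The hazard described in the commit message above can be illustrated with a small software model. This is only a sketch under stated assumptions: the function names (`group_registers`, `inject_reshuffles`, `needs_wait_idle`) and the dictionary-based EEW bookkeeping are hypothetical and are not taken from the Ara RTL. It shows why a dependent instruction that tracks only the reshuffle of v0 misses the other reshuffles of an LMUL_8 group, and it models the conservative WAIT_IDLE rule.

```python
# Illustrative model only: every name here is hypothetical, not Ara RTL.

def group_registers(vn: int, lmul: int) -> list[int]:
    """Registers covered by the group that starts at vn with LMUL = lmul."""
    return list(range(vn, vn + lmul))


def inject_reshuffles(vn: int, lmul: int, eew_of: dict[int, int],
                      target_eew: int) -> dict[int, int]:
    """Inject one LMUL_1 reshuffle per register with an EEW mismatch.

    Returns a map from register number to the ID given to its reshuffle.
    """
    ids: dict[int, int] = {}
    for reg in group_registers(vn, lmul):
        if eew_of[reg] != target_eew:
            ids[reg] = len(ids)  # IDs are handed out in injection order
    return ids


def needs_wait_idle(lmul: int, n_reshuffles: int) -> bool:
    """Conservative rule: stall until idle after any reshuffle with LMUL > 1."""
    return lmul > 1 and n_reshuffles > 0


# Example: v0..v7 are all encoded with EEW=32 and a store needs EEW=64.
eew_of = {reg: 32 for reg in range(32)}
ids = inject_reshuffles(0, 8, eew_of, 64)

# A hazard check that only looks at Vn sees a single reshuffle ID...
tracked = {ids[0]}
# ...and silently ignores the reshuffles of the other registers in the group.
untracked = set(ids.values()) - tracked
```

In this model, `untracked` holds the reshuffle IDs that the dependent store would not wait for, which is the RAW violation the commit guards against by waiting in WAIT_IDLE.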
  10. [hardware] 🐛 Fix reshuffle

    mp-17 committed Jun 19, 2024
    4733f20
  11. a8426f3
  12. [hardware] Bump CVA6

    mp-17 committed Jun 19, 2024
    2fed184

Commits on Jun 25, 2024

  1. Refactoring addrgen

    MaistoV authored and mp-17 committed Jun 25, 2024
    dd0047b
  2. Extensions and bug fixes

    * Add MMU interface (just mock)
    
    * Refactoring
    MaistoV authored and mp-17 committed Jun 25, 2024
    b6363d2
  3. Update submodules

    * Switch from pulp-platform/cva6 to MaistoV/cva6_fork
    
    * Bump axi to v0.39.0
    MaistoV authored and mp-17 committed Jun 25, 2024
    fc8dd42
  4. Supporting vstart CSR for operand read, VALU, VLSU

    * vstart support for vector unit-stride loads and stores
    
    * vstart support for vector strided loads and stores
    
    * vstart support for valu operations, mask operations not tested
    
    * Preliminary work on vstart support for vector indexed loads and stores
    
    * Minor fixes
    
    * Refactoring
    
    * Explanatory comments
    MaistoV authored and mp-17 committed Jun 25, 2024
    310c4da
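
As background for the vstart support listed above: in the RISC-V vector extension, a memory instruction resumed with vstart > 0 skips the elements below vstart and processes elements vstart to vl-1. The sketch below computes the byte addresses touched by a resumed unit-stride or strided access. The helper names are hypothetical and the code is a software illustration, not the addrgen implementation.

```python
# Software illustration only: helper names are hypothetical.

def unit_stride_addrs(base: int, eew_bits: int, vstart: int, vl: int) -> list[int]:
    """Addresses of elements vstart..vl-1 for a unit-stride access."""
    eew_bytes = eew_bits // 8
    return [base + i * eew_bytes for i in range(vstart, vl)]


def strided_addrs(base: int, stride_bytes: int, vstart: int, vl: int) -> list[int]:
    """Addresses of elements vstart..vl-1 for a constant-stride access."""
    return [base + i * stride_bytes for i in range(vstart, vl)]


# Example: a 32-bit unit-stride access with vl=4, resumed at element 2.
resumed = unit_stride_addrs(0x1000, 32, vstart=2, vl=4)
```

With vstart = 0 the same helpers return the addresses of the full access, so the resumed case is simply a suffix of the original address sequence.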
  5. tmp commit Adding MMU logic

    MaistoV authored and mp-17 committed Jun 25, 2024
    9dd870f
  6. tmp commit (MMU stub)

    MaistoV authored and mp-17 committed Jun 25, 2024
    d0a026a
  7. 37faafa
  8. a3d46d2
  9. [hardware] Support VLD and VST with vstart > 0

    - Restrict mem bus to EW if vstore, vstart > 0, and EW < 64-bit
    
    If vstart > 0 and EW < 64, the situation is similar to when the memory addr
    is misaligned wrt the memory bus. Because of the VRF Byte Layout and
    since the granularity of each lane's payload to the store unit is 64 bit,
    all the packets can contain valid data while we have not completed the
    beat. So, either we calculate in the addrgen the effective length of a
    burst with unequal beats, or we add a buffer and aligner in the store
    unit, or we handle the ready signals at a byte level, or we simply reduce
    the effective memory bus to the element width (worst case).
    We do the last. It's low performance, but a vstore with vstart > 0 happens
    after an exception, so the throughput drop should be acceptable.
    
    - Data packets from VRF to STU
    
    Operand requesters now send balanced payloads from all the lanes
    if vstart > 0. The store unit will identify the good ones by itself,
    and will only have to handshake balanced payloads.
    mp-17 committed Jun 25, 2024
    09b927f
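
The bus-width restriction chosen in the commit message above can be sketched as a simple rule. This is a minimal model, assuming a hypothetical `effective_bus_bits` helper and ignoring address misalignment; the beat count is an idealized estimate meant only to show the throughput cost, not a description of the actual addrgen logic.

```python
# Minimal model: names are hypothetical and misalignment is ignored.
import math


def effective_bus_bits(axi_bits: int, ew_bits: int, is_store: bool, vstart: int) -> int:
    """Narrow the bus to the element width for stores with vstart > 0 and EW < 64."""
    if is_store and vstart > 0 and ew_bits < 64:
        return ew_bits
    return axi_bits


def beats(n_elems: int, ew_bits: int, bus_bits: int) -> int:
    """Idealized number of beats needed to move n_elems elements."""
    return math.ceil(n_elems * ew_bits / bus_bits)


# Example: 128-bit bus, 32-bit elements, vl=16, store resumed at vstart=3.
remaining = 16 - 3
narrow = beats(remaining, 32, effective_bus_bits(128, 32, True, 3))
full = beats(remaining, 32, effective_bus_bits(128, 32, True, 0))
```

Comparing `narrow` with `full` gives a rough idea of the slowdown accepted for stores that resume after an exception.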
  10. [hardware] 🐛 Flush st-opqueue and reset st-requester upon exception

    - Time the STU exception flush with the opqueues
    mp-17 committed Jun 25, 2024
    a9411ea
  11. 929bcac
  12. 22031a1
  13. 62f6d5a
  14. 0532ffb
  15. 9be56e9
  16. f12be75
  17. [hardware] 🐛 Don't use vstart to drop elements for slides

    The vstart signal within the lanes is not the architectural
    vstart. For all the instructions, it corresponds to
    the architectural vstart manipulated to reflect the "vstart"
    in every lane for VRF fetch address calculation purposes.
    Memory instructions, which support arch vstart > 0, can use
    that vstart signal to resize the number of elements to fetch
    from the VRF. Slide instructions, instead, further modify
    the vstart only for addressing purposes, and should not use
    the vstart signal to resize the number of elements to fetch.
    mp-17 committed Jun 25, 2024
    699a2c9
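
To make the distinction between the architectural vstart and the per-lane "vstart" more concrete, here is a simplified sketch. It assumes that element i is held by lane i mod NrLanes, which is a simplification of the real VRF byte layout, and the function names are hypothetical. Under that assumption, each lane's vstart is the number of elements below the architectural vstart that fall in that lane.

```python
# Simplified model: assumes element i lives in lane (i % nr_lanes).
# Function names are hypothetical, not Ara signal names.

def lane_of(elem: int, nr_lanes: int) -> int:
    """Lane holding a given element under the assumed round-robin mapping."""
    return elem % nr_lanes


def lane_vstart(arch_vstart: int, lane: int, nr_lanes: int) -> int:
    """Number of elements below arch_vstart that are held by this lane."""
    extra = 1 if lane < arch_vstart % nr_lanes else 0
    return arch_vstart // nr_lanes + extra


# Example: architectural vstart = 5 on a 4-lane configuration.
per_lane = [lane_vstart(5, lane, 4) for lane in range(4)]
```

The per-lane values always add up to the architectural vstart, which is why a memory instruction can use them to resize its VRF fetch, while a slide that further modifies the value for addressing cannot.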
  18. 097049f
  19. [hardware] Fix verilator lint

    mp-17 committed Jun 25, 2024
    404ce2f
  20. 9067762
  21. [hardware] Fix rebase errors

    mp-17 committed Jun 25, 2024
    11be9ed
  22. [Bender] Bump CVA6

    mp-17 committed Jun 25, 2024
    e385500
  23. [apps] Extended sw build for Linux

    * Added LINUX switch, default LINUX=0
    MaistoV authored and mp-17 committed Jun 25, 2024
    0a994f9