mlx5: Introduce data direct placement (DDP) over the DV API #1494

Merged: 4 commits, Nov 10, 2024

Commits on Nov 10, 2024

  1. Update kernel headers

    To commit: 8b36f7c3c661 ("RDMA/mlx5: Support OOO RX WQE consumption").
    
    Signed-off-by: Yishai Hadas <[email protected]>
    Yishai Hadas committed Nov 10, 2024
    Commit 1065b35
  2. mlx5: Handle OOO WQE consumption in CQE generation

    Based on the IB specification, the current code assumes that WQE buffers
    are consumed and CQEs are generated in order.
    In-order WQE consumption is not guaranteed when the HW has to handle
    out-of-order (OOO) packets: the HW may consume buffers OOO while still
    generating CQEs in order.
    
    When scatter2cqe is enabled, we must scatter the data to the correct WQE
    buffer. This also applies to WR IDs. Assuming incremental WQE indexes
    leads to incorrect WR IDs being returned to users.
    
    Therefore, use the WQE index field from the CQE to access the WQE,
    instead of assuming that the WQE order matches the CQE order and
    stepping through the WQ incrementally.
    
    This is a preparatory patch for the WQE OOO mode that is introduced by
    the next patches in the series.
    
    Signed-off-by: Edward Srouji <[email protected]>
    Signed-off-by: Yishai Hadas <[email protected]>
    EdwardSro authored and Yishai Hadas committed Nov 10, 2024
    Commit 7e8db6f
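
The WR ID lookup described in commit 2 can be sketched as follows. This is a minimal, self-contained illustration and not the provider's code: `struct sketch_rq`, `struct sketch_cqe`, `wrid_by_tail()` and `wrid_by_cqe_index()` are hypothetical names, and the queue is reduced to a four-slot array of WR IDs.

```c
#include <assert.h>
#include <stdint.h>

#define SKETCH_WQE_CNT 4u /* assumed power of two so an index can be masked */

/* Hypothetical, simplified stand-in for a receive queue. */
struct sketch_rq {
	uint64_t wrid[SKETCH_WQE_CNT]; /* WR ID saved per WQE slot at post time */
	unsigned int tail;             /* software's in-order consumer counter */
};

/* Hypothetical, simplified stand-in for a completion entry. */
struct sketch_cqe {
	uint16_t wqe_index;            /* slot the HW actually consumed */
};

/* Old assumption: the n-th CQE always belongs to the n-th posted WQE. */
static uint64_t wrid_by_tail(struct sketch_rq *rq)
{
	assert((SKETCH_WQE_CNT & (SKETCH_WQE_CNT - 1)) == 0);
	return rq->wrid[rq->tail++ & (SKETCH_WQE_CNT - 1)];
}

/* OOO-safe lookup: use the WQE index reported in the CQE. */
static uint64_t wrid_by_cqe_index(const struct sketch_rq *rq,
				  const struct sketch_cqe *cqe)
{
	assert((SKETCH_WQE_CNT & (SKETCH_WQE_CNT - 1)) == 0);
	return rq->wrid[cqe->wqe_index & (SKETCH_WQE_CNT - 1)];
}
```

If the slots hold WR IDs 100 to 103 and the first CQE reports WQE index 2, the index-based lookup returns 102, while the tail-based lookup returns 100, which is the wrong WR ID for that completion.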
  3. mlx5: Remove max_post assignment in create_qp()

    In create_qp(), the RQ max_post value was redundantly overwritten after
    mlx5_calc_rq_size() had already calculated it.
    
    The assumption that max_post equals wqe_cnt does not hold in all cases,
    and upcoming patches that introduce OOO RX support will break it.
    
    Remove the assignment so that the value calculated by
    mlx5_calc_rq_size() is no longer overwritten.
    
    Signed-off-by: Edward Srouji <[email protected]>
    Signed-off-by: Yishai Hadas <[email protected]>
    EdwardSro authored and Yishai Hadas committed Nov 10, 2024
    Commit e8eb3ef
  4. mlx5: Support OOO RX WQE consumption

    Add a new MLX5DV_QP_CREATE_OOO_DP flag, which allows WRs on the receiver
    side of a QP to be consumed out-of-order (OOO).
    Additionally, it permits the sender side to transmit messages without
    guaranteeing arrival order on the receiver side.
    
    When enabled, the flag ensures that the completion ordering of WRs
    remains unchanged, regardless of the consumption order of Receive WRs.
    RDMA Read and RDMA Atomic operations on the responder side continue to
    be executed in order, while the ordering of data placement for RDMA
    Write and Send operations is not guaranteed.
    
    The MLX5DV_QP_CREATE_OOO_DP flag must be set on both the sender and
    receiver sides of a QP, such as DCT and DCI, to allow the sender side to
    transmit messages without guaranteeing any arrival ordering on the
    receiver side.
    
    The feature is optional; the application must query its availability
    by calling mlx5dv_query_device() with the newly added
    MLX5DV_CONTEXT_MASK_OOO_RECV_WRS flag.
    
    Although enabling OOO on the QP only becomes relevant at the Init to RTR
    modification stage, the user passes the flag at QP creation.
    This is required because, when the feature is enabled on a QP whose RQ
    is implemented as a cyclic buffer (e.g. RC, UC, UD), the RQ buffer is
    internally allocated at double the user-requested size. The extra space
    prevents WQEs from being overwritten when they are consumed OOO.
    
    If the kernel or device does not support this feature, creating the QP
    with this flag will fail.
    
    Signed-off-by: Edward Srouji <[email protected]>
    Signed-off-by: Yishai Hadas <[email protected]>
    EdwardSro authored and Yishai Hadas committed Nov 10, 2024
    Commit cfcfc7b
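
The sizing rule from commits 3 and 4 can be illustrated with a small sketch. It is not the real mlx5_calc_rq_size(): `struct sketch_rq_size`, `sketch_roundup_pow2()` and `sketch_calc_rq_size()` are hypothetical, and both the power-of-two rounding and the choice to keep max_post at the (rounded) requested depth are assumptions made for illustration. The only facts taken from the commit messages are that a cyclic RQ is doubled when OOO is enabled and that max_post is then no longer equal to wqe_cnt.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical result of an RQ sizing calculation. */
struct sketch_rq_size {
	uint32_t wqe_cnt;  /* slots allocated in the cyclic buffer */
	uint32_t max_post; /* WRs the user may have posted at once */
};

/* Round up to the next power of two (assumed rounding policy). */
static uint32_t sketch_roundup_pow2(uint32_t v)
{
	uint32_t p = 1;

	assert(v > 0 && v <= (1u << 30)); /* keeps the doubling below in range */
	while (p < v)
		p <<= 1;
	return p;
}

/*
 * Assumed rule: max_post follows the user request, and the buffer gets
 * twice as many slots when OOO consumption is enabled.
 */
static struct sketch_rq_size sketch_calc_rq_size(uint32_t max_recv_wr, bool ooo)
{
	struct sketch_rq_size sz;

	sz.max_post = sketch_roundup_pow2(max_recv_wr);
	sz.wqe_cnt = ooo ? sz.max_post * 2 : sz.max_post;
	return sz;
}
```

Under these assumptions, a request for 100 receive WRs gives wqe_cnt = 128 and max_post = 128 without OOO, and wqe_cnt = 256 with max_post = 128 when OOO is enabled. Overwriting max_post with wqe_cnt after the calculation, as create_qp() used to do, would discard that distinction.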