Updates made as part of the LINKx PR series #10040
Replies: 1 comment 1 reply
-
@amirshehataornl Sorry for the late reply. Got busy. Let's unpack this one by one:
@shijin-aws Do you have thoughts on these? I would like to have a meeting with you both (and anyone else you think that should be a part of it of course) about all of these items. |
Beta Was this translation helpful? Give feedback.
-
As part of the LINKx development I have made some API changes. I would like to discuss these changes with the concerned parties to get alignment and move the changes along.
NOTE: I kept each of the API changes in its own PR so that they can be dealt with independently and can potentially land independently.
Address matching callback: #10039
Some providers (including CXI) do not always have the fi_addr_t to pass to the get_msg()/get_tag() calls. In order for them to do that then they will need to do a reverse lookup which is not efficient. I do not think the peer infrastructure should introduce this requirement. The general design of libfabric made it a point not to add this requirement as apparent by the need for FI_SOURCE. From the man page:
FI_SOURCE
Requests that the endpoint return source addressing data as part of its completion data. This capability only applies to connectionless endpoints. Note that returning source address information may require that the provider perform address translation and/or look-up based on data available in the underlying protocol in order to provide the requested data, which may adversely affect performance. The performance impact may be greater for address vectors of type FI_AV_TABLE.
Therefore, I propose the addition of an address matching callback which the parent provider would call. The idea here is that when the parent provider attempts to do address matching it can just use the native provider address to do the matching.
Add fi_peer_match to pass to the get_msg()/get_tag() callbacks: #10028
now that we need to pass more information in the get_msg()/get_tag() callbacks in order to be able to do proper address matching, it makes more sense to add a structure to pass to these callbacks. Currently I just add all the fields in the structure, but we can potentially unionize the structure in order to make it clearer what fields are being used. This pull request adds the fi_peer_match structure and makes the necessary updates in the code to accommodate this change.
Add memory registration callback to fi_ops_srx_peer: #10029
The CXI provider (and potentially other providers as well) register the receive buffers with the NIC on the receive path before data can be copied into these buffers. They then add the mapped addresses to the cache in order not to repeatedly do the expensive pinning/unpinning operations. This logic happens when the receive request is submitted. With the peer infrastructure this registration of the receive buffers is delayed until a message is processed by the provider and get_msg()/get_tag() is called. Through experimentation it has been observed that delaying the registration causes a significant performance hit. I suspect this is likely due to the delay incurred for pinning the memory in the NIC before the transfer into the buffers can occur. As opposed to having the buffers already pinned before the data arrives. Basically, from the perspective of the pinning operation, all messages are now dealt with as though they are unexpected.
This PR introduces a memory registration callback which the parent provider can use to register the memory with the core provider on the receive path, if such a callback is provided.
The caveat here is that this can only be done if the peer to receive from is known. So this approach is not done when the source address is FI_ADDR_UNSPEC.
Return the SRX peer callbacks to the parent provider: #10031
There are two reasons I think this change makes sense:
ADD FI_PEER as a capability bit: #10030
LINKx can only link providers which support the peer infrastructure. I added the FI_PEER bit as a capability, this way when the LINKx provider attempts to link providers it is able to check whether they support the peer infrastructure
Beta Was this translation helpful? Give feedback.
All reactions