Opaque Handle to Eliminate Lookup on TX #57

Open
BorisPis opened this issue Mar 21, 2023 · 9 comments
Labels
QEO Related to QUIC encryption/decryption offload

Comments

@BorisPis

To simplify hardware offload on transmit, it will be useful to use an opaque NIC driver generated handle (e.g., 4 bytes) for each connection ID. This handle should be provided alongside transmitted packets. Hardware can use this handle to reduce the overhead of looking up the state for this packet, and possibly also to skip parsing the QUIC header if software can guarantee a specific format.

Also, can you explain how the ConnectionIdLength would be useful on the NDIS_QUIC_ENCRYPTION_NET_BUFFER_LIST_INFO associated with every packet? Is it really necessary to pass this data, which doesn't change, with every packet of a flow?

@nibanks nibanks added the QEO Related to QUIC encryption/decryption offload label Mar 22, 2023
@nibanks
Member

nibanks commented Mar 22, 2023

use an opaque NIC driver generated handle

So, are you suggesting an output uint32_t be added to NDIS_QUIC_CONNECTION, which is then also included in NDIS_QUIC_ENCRYPTION_NET_BUFFER_LIST_INFO on the send path? I think that would be doable. The OS already has to do a first-level lookup to see if a packet needs to be processed in SW or HW. If we determine it's done in HW, then we pass the uint32_t down.
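To make the proposal concrete, here is a minimal sketch of what the two structures could look like. The field and type names below are illustrative only, not the real NDIS definitions: the control path fills `OffloadHandle` as an output when the connection is plumbed, and the send path echoes it per packet.

```c
#include <stdint.h>

/* Hypothetical control-path structure: inputs describe the connection,
 * and the NIC driver returns an opaque handle as an output. */
typedef struct {
    uint8_t  ConnectionId[20];   /* input: CID bytes                 */
    uint8_t  ConnectionIdLength; /* input: CID length                */
    uint32_t OffloadHandle;      /* output: NIC-generated handle     */
} QUIC_CONNECTION_OFFLOAD;

/* Hypothetical per-packet out-of-band info on the send path. */
typedef struct {
    uint32_t OffloadHandle;      /* copied from the connection state */
} QUIC_ENCRYPTION_PACKET_INFO;

/* Send path: if the connection was offloaded to HW, pass the handle
 * down so the NIC can skip the CID lookup (and possibly the parse). */
static void fill_packet_info(const QUIC_CONNECTION_OFFLOAD *conn,
                             QUIC_ENCRYPTION_PACKET_INFO *info) {
    info->OffloadHandle = conn->OffloadHandle;
}
```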

can you explain how the ConnectionIdLength would be useful

This is necessary if we don't have the global requirement that all CIDs be the same length, which was proposed on another issue.

P.S. Let's keep separate questions to separate issues. Thanks!

@BorisPis
Author

So, are you suggesting an output uint32_t be added to NDIS_QUIC_CONNECTION, which is then also included in NDIS_QUIC_ENCRYPTION_NET_BUFFER_LIST_INFO on the send path?

Yes, exactly.

I think that would be doable. The OS already has to do a first-level lookup to see if a packet needs to be processed in SW or HW. If we determine it's done in HW, then we pass the uint32_t down.

Makes sense.

@nibanks nibanks changed the title opaque handle representing the connection identifier Opaque Handle to Eliminate Lookup on TX Mar 22, 2023
@nibanks
Member

nibanks commented Mar 22, 2023

I've updated this issue to generally track the (good idea) from both @BorisPis and @rawsocket (but at different layers) to have the control path of adding a new connection ID offload return some kind of opaque handle that can then be included in the datapath to eliminate the lookup costs. This could be done at both the UM to KM boundary and SW to HW boundary.

One issue this might cause is synchronization around the lifetime of the offload info between the control path and the datapath. We will have to be very clear about what restrictions (if any) we'd put on the datapath. Ideally, I'd like to handle the complexity in the control path and have it gracefully handle a send that races with a handle that is being deleted.
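One way the control path could tolerate a send racing with handle deletion is reference counting: delete only marks the entry dead, and the last in-flight send actually releases it. This is a hypothetical sketch, not anything specified in the offload interface; the names are illustrative, and a real implementation would use atomics or a lock.

```c
#include <stdbool.h>
#include <stdint.h>

typedef struct {
    uint32_t refcount; /* 1 for the control path + 1 per in-flight send */
    bool     deleted;  /* set by the control path on delete             */
} offload_entry;

/* Datapath: try to pin the entry for the duration of a send. */
static bool entry_acquire(offload_entry *e) {
    if (e->deleted) return false; /* raced with delete: fall back to SW */
    e->refcount++;                /* real code would use atomics        */
    return true;
}

static void entry_release(offload_entry *e) {
    e->refcount--;
    /* when refcount drops to 0 the entry can actually be freed */
}

/* Control path delete: mark dead, drop the control-path reference. */
static void entry_delete(offload_entry *e) {
    e->deleted = true;
    entry_release(e);
}
```

A raced send that already acquired the entry completes normally; any send arriving after the delete sees `deleted` and falls back to software, so the datapath never dereferences a freed handle.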

@mtfriesen
Contributor

mtfriesen commented May 25, 2023

We want to minimize the number of lookups required, so on the TX path we're thinking of providing the NIC with two IDs per TX:

  • A protection/isolation domain ID (e.g., a process handle), which is trusted.
  • A connection offload ID within that protection/isolation domain, which may be untrusted.

The idea is that since the NIC [driver/FW/HW] has to look up some state for each connection anyway, we may be able to perform the connection lookup at the same time as enforcing isolation, minimizing duplicated state and duplicated validation.

It would also be useful for the NIC to provide the protection ID associated with each RX packet, especially if offload connection IDs are constrained to a small range.
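The two-ID TX lookup above can be sketched as follows. This is an assumed model, not a specified interface: the trusted domain ID selects a per-domain table, and the untrusted connection offload ID is bounds-checked against that table, so isolation enforcement and connection lookup happen in one step.

```c
#include <stddef.h>
#include <stdint.h>

typedef struct { uint32_t key_index; } conn_state; /* illustrative state */

typedef struct {
    conn_state *conns;
    size_t      count; /* connection IDs valid in this domain: [0, count) */
} protection_domain;

/* Returns NULL when the untrusted connection ID is out of range,
 * i.e. the caller tried to reach outside its isolation domain. */
static conn_state *tx_lookup(protection_domain *domains, size_t ndomains,
                             uint32_t domain_id, uint32_t conn_id) {
    if (domain_id >= ndomains) return NULL;
    protection_domain *pd = &domains[domain_id];
    if (conn_id >= pd->count) return NULL;
    return &pd->conns[conn_id];
}
```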

@BorisPis
Author

A connection offload ID sounds to me like a global identifier from the device's perspective rather than an identifier within some protection domain. In general, from the device's perspective, it is best to have IDs that are as short as possible and unique to that device; otherwise there is bound to be some overhead. For example, to match long identifiers (e.g., protection domain ID + connection offload ID) the device may need multiple match operations, but if the device generated these IDs, it would choose IDs that fit in one match operation.

@nibanks
Member

nibanks commented May 26, 2023

We need to provide security boundaries across different applications/processes. One process must not be able to delete the CID offloaded by another. So, something has to do this enforcement.

@rawsocket
Collaborator

The security boundary would need to be maintained and protected by a trusted root. On Linux with the sockets API, that is the kernel ULP, which could provide simple virtualization between the NIC's global set of crypto key IDs and individual per-socket pools. This will not add any new lookups, only one extra indirection.

The Tx path would only require a subset of parameters to register a key: the CID length, the algorithm, and the keys. The result would be an OK or a NAK.

On the Rx path, isolation would also be necessary to prevent arbitrary deletes by rogue processes. The Rx connection crypto install request will be different, as it needs L3/L4 information to later match the flow. In theory, only one empty CID could exist for a given source and destination, but nothing actually stops multiple connections with empty CIDs, coming from different clients, from arriving at the same server port. Hence, more than just the destination IP/port might be needed to correctly match the flow.

Again, similarly to Tx, the Rx control plane would translate the kernel-to-device global crypto index into a socket-local equivalent and keep the mapping. A single redirection array with a linked list of free elements would make all operations O(1) (also known as a hash-linked-list in the C++ world).
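The redirection array described above can be sketched as follows: a single array maps socket-local IDs to device-global crypto IDs, with free slots threaded through the same array as a linked list, giving O(1) allocate, translate, and free. This is an illustrative sketch of the data structure only; all names and the fixed table size are assumptions.

```c
#include <stdint.h>

#define TABLE_SIZE 8
#define NIL ((int32_t)-1)

typedef struct {
    /* allocated slot: holds the device-global crypto ID;
     * free slot: holds the index of the next free slot.  */
    int32_t slot[TABLE_SIZE];
    int32_t free_head;
} id_table;

static void table_init(id_table *t) {
    for (int32_t i = 0; i < TABLE_SIZE; i++)
        t->slot[i] = (i + 1 < TABLE_SIZE) ? i + 1 : NIL;
    t->free_head = 0;
}

/* O(1): pop a free slot and record the device-global ID there;
 * returns the socket-local ID, or NIL if the table is full.    */
static int32_t table_alloc(id_table *t, int32_t global_id) {
    int32_t local = t->free_head;
    if (local == NIL) return NIL;
    t->free_head = t->slot[local];
    t->slot[local] = global_id;
    return local;
}

/* O(1): translate a socket-local ID back to the device-global ID. */
static int32_t table_lookup(const id_table *t, int32_t local) {
    return t->slot[local];
}

/* O(1): return the slot to the free list. */
static void table_free(id_table *t, int32_t local) {
    t->slot[local] = t->free_head;
    t->free_head = local;
}
```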

@rawsocket
Collaborator

A connection offload ID sounds to me like a global identifier from the device's perspective rather than an identifier within some protection domain. In general, from the device's perspective, it is best to have IDs that are as short as possible and unique to that device; otherwise there is bound to be some overhead. For example, to match long identifiers (e.g., protection domain ID + connection offload ID) the device may need multiple match operations, but if the device generated these IDs, it would choose IDs that fit in one match operation.

It might be worth leaving the device unaware of the protection domain to keep this similar across vendors; the kernel should do that work. For the XDP use case, the kernel must also be involved in the control path in some way to provide separation, using the process ID in conjunction with the absolute start time to maintain integrity across PID reuse.

@nibanks
Member

nibanks commented Jun 1, 2023

This will not add any new lookups

Perhaps in the world where you are applying the offload to a socket (one that has a particular tuple bound/listening), but for something more generic like XDP you don't have such an object to align/verify the offload against to prevent independent apps from conflicting with or attacking each other. In that case, you do require an additional lookup if you want to provide any protections in this space.

It might be worth to leave device unaware of protection domain to make this similar across multiple vendors. The kernel should do the work. For XDP use case, the kernel must be involved in some way too for control path to provide separation and use process ID in conjunction with absolute start time to maintain integrity over PID reuse.

I do agree that doing it in the kernel allows for a single place (well, maybe two: one for sockets and one for XDP) for this logic to live, without requiring all vendors to implement it. But my line of thinking is: "How complicated is this really for the vendor to implement?" and "Can they do this more efficiently than the kernel?"
