Describe the twin-socket feature in the spec #775

Merged on Sep 15, 2023 (8 commits)
86 changes: 62 additions & 24 deletions docs/vfio-user.rst
@@ -204,12 +204,32 @@ A server can serve:
1) one or more clients, and/or
2) one or more virtual devices, belonging to one or more clients.

The current protocol specification requires a dedicated socket per
client/server connection. It is a server-side implementation detail whether a
single server handles multiple virtual devices from the same or multiple
clients. The location of the socket is implementation-specific. Multiplexing
clients, devices, and servers over the same socket is not supported in this
version of the protocol.
The current protocol specification requires dedicated sockets per
client/server connection. Commands in the client-to-server direction are
handled on the main communication socket which the client connects to, and
replies to these commands are passed on the same socket. Commands sent in the
other direction from the server to the client as well as their corresponding
replies can optionally be passed across a separate socket, which is set up
during negotiation (AF_UNIX servers just pass the file descriptor).

Using separate sockets for each command channel avoids introducing an
artificial point of synchronization between the channels. This simplifies
implementations since it obviates the need to demultiplex incoming messages
into commands and replies and interleave command handling and reply processing.
Note that it is still illegal for implementations to stall command or reply
processing indefinitely while waiting for replies on the other channel, as this
may lead to deadlocks. However, since incoming commands and replies arrive on
different sockets, it's possible to meet this requirement e.g. by running two
independent request processing threads that can internally operate
synchronously. It is expected that this is simpler to implement than fully
asynchronous message handling code. Implementations may still choose a fully
asynchronous, event-based design for other reasons, and the protocol fully
supports it.
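
The two-thread arrangement described above can be sketched as follows. This is
a minimal illustration, not the reference implementation: the socket pairs
stand in for the negotiated channels, and the length-prefixed framing and the
``send_msg``, ``recv_msg`` and ``serve`` helpers are hypothetical names
introduced only for this sketch (they are not the vfio-user message header).

```python
import socket
import struct
import threading

# Hypothetical framing for this sketch only: a 4-byte big-endian length prefix.
# It is NOT the vfio-user message header.
def send_msg(sock, payload):
    sock.sendall(struct.pack("!I", len(payload)) + payload)

def recv_exact(sock, count):
    buf = b""
    while len(buf) < count:
        chunk = sock.recv(count - len(buf))
        if not chunk:
            raise EOFError("peer closed the channel")
        buf += chunk
    return buf

def recv_msg(sock):
    (length,) = struct.unpack("!I", recv_exact(sock, 4))
    return recv_exact(sock, length)

def serve(sock, handle):
    """Handle commands arriving on one channel, strictly one at a time."""
    try:
        while True:
            send_msg(sock, handle(recv_msg(sock)))
    except (EOFError, OSError):
        pass  # channel closed; stop serving

# Socket pairs stand in for the two channels.
main_client, main_server = socket.socketpair()  # client-to-server commands
twin_client, twin_server = socket.socketpair()  # server-to-client commands

# Each side runs one thread that only ever reads commands on "its" channel.
threading.Thread(target=serve, args=(main_server, lambda m: b"ok:" + m),
                 daemon=True).start()
threading.Thread(target=serve, args=(twin_client, lambda m: b"ok:" + m),
                 daemon=True).start()

# The client sends a command on the main socket and blocks for its reply.
send_msg(main_client, b"client-command")
client_got = recv_msg(main_client)

# The server does the same on the separate server-to-client socket.
send_msg(twin_server, b"server-command")
server_got = recv_msg(twin_server)
```

Because each socket carries commands in one direction only, neither reader has
to distinguish commands from replies, and each thread can block on its own
channel without stalling the other.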

It is a server-side implementation detail whether a single server handles
multiple virtual devices from the same or multiple clients. The location of the
socket is implementation-specific. Multiplexing clients, devices, and servers
over the same socket is not supported in this version of the protocol.

Authentication
--------------
@@ -488,21 +508,33 @@ format:

Capabilities:

+--------------------+--------+------------------------------------------------+
| Name               | Type   | Description                                    |
+====================+========+================================================+
| max_msg_fds        | number | Maximum number of file descriptors that can be |
|                    |        | received by the sender in one message.         |
|                    |        | Optional. If not specified then the receiver   |
|                    |        | must assume a value of ``1``.                  |
+--------------------+--------+------------------------------------------------+
| max_data_xfer_size | number | Maximum ``count`` for data transfer messages;  |
|                    |        | see `Read and Write Operations`_. Optional,    |
|                    |        | with a default value of 1048576 bytes.         |
+--------------------+--------+------------------------------------------------+
| migration          | object | Migration capability parameters. If missing    |
|                    |        | then migration is not supported by the sender. |
+--------------------+--------+------------------------------------------------+
+--------------------+---------+-----------------------------------------------+
| Name               | Type    | Description                                   |
+====================+=========+===============================================+
| max_msg_fds        | number  | Maximum number of file descriptors that can   |
|                    |         | be received by the sender in one message.     |
|                    |         | Optional. If not specified then the receiver  |
|                    |         | must assume a value of ``1``.                 |
+--------------------+---------+-----------------------------------------------+
| max_data_xfer_size | number  | Maximum ``count`` for data transfer messages; |
|                    |         | see `Read and Write Operations`_. Optional,   |
|                    |         | with a default value of 1048576 bytes.        |
+--------------------+---------+-----------------------------------------------+
| migration          | object  | Migration capability parameters. If missing   |
|                    |         | then migration is not supported by the        |
|                    |         | sender.                                       |
+--------------------+---------+-----------------------------------------------+
| twin_socket        | boolean | Indicates whether the client wants to use a   |
|                    |         | separate channel for server-to-client         |
|                    |         | commands. If specified and the server         |
|                    |         | supports it, it will include the file         |
|                    |         | descriptor for the client end of a separate   |
|                    |         | socket pair along with its reply. Some server |
|                    |         | implementations may not support this, but it  |
|                    |         | is strongly recommended for servers which do  |
|                    |         | send server-to-client commands to implement   |
|                    |         | twin-socket support.                          |
+--------------------+---------+-----------------------------------------------+
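
As an illustration of the defaults in the table above, a receiver might
normalize the peer's capabilities as sketched below. The ``parse_capabilities``
helper is a hypothetical name, and two points are assumptions of this sketch
rather than statements of the spec: that the capabilities sit under a
top-level ``capabilities`` key of the version JSON, and that an absent
``twin_socket`` is treated as false.

```python
import json

# Defaults taken from the capabilities table.
DEFAULT_MAX_MSG_FDS = 1
DEFAULT_MAX_DATA_XFER_SIZE = 1048576

def parse_capabilities(version_json):
    """Return the peer's capabilities with documented defaults filled in."""
    # Assumption: capabilities live under a top-level "capabilities" key.
    caps = json.loads(version_json).get("capabilities", {})
    return {
        "max_msg_fds": caps.get("max_msg_fds", DEFAULT_MAX_MSG_FDS),
        "max_data_xfer_size": caps.get("max_data_xfer_size",
                                       DEFAULT_MAX_DATA_XFER_SIZE),
        # None means the sender does not support migration.
        "migration": caps.get("migration"),
        # Assumption: a missing twin_socket means the client did not ask for it.
        "twin_socket": bool(caps.get("twin_socket", False)),
    }

client_caps = parse_capabilities('{"capabilities": {"twin_socket": true}}')
```

Here ``client_caps`` requests the separate channel and falls back to the
default values for every capability the client left out.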

The migration capability contains the following name/value pairs:

@@ -517,7 +549,11 @@ Reply
^^^^^

The same message format is used in the server's reply with the semantics
described above.
described above. If the client set ``twin_socket`` to true in its
capabilities, the server may include a file descriptor to use for the
server-to-client command channel in the reply. The index of the file descriptor
in the ancillary data of the reply is given by the ``twin_socket`` capability
field in the reply.
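
A rough sketch of this file descriptor handover over ``AF_UNIX`` sockets is
shown below. It uses Python's ``socket.send_fds`` and ``socket.recv_fds``
(Python 3.9 or later, Unix only). The JSON layout of the reply is an
assumption of this sketch: the capabilities table types ``twin_socket`` as a
boolean, while the paragraph above says the reply field gives the index, so
the reply here is assumed to carry an integer index. Message headers are
omitted entirely.

```python
import json
import socket

# Main communication socket; a socket pair stands in for a real connection.
main_client, main_server = socket.socketpair(socket.AF_UNIX, socket.SOCK_STREAM)

# --- server side ---
# Create the socket pair for server-to-client commands.
twin_client_end, twin_server_end = socket.socketpair(socket.AF_UNIX,
                                                     socket.SOCK_STREAM)
# Assumption: the reply's twin_socket field holds the index of the file
# descriptor within the ancillary data (0 here, since only one fd is sent).
reply = json.dumps({"capabilities": {"twin_socket": 0}}).encode()
socket.send_fds(main_server, [reply], [twin_client_end.fileno()])
twin_client_end.close()  # the server keeps only its own end

# --- client side ---
data, fds, _flags, _addr = socket.recv_fds(main_client, 4096, 8)
fd_index = json.loads(data)["capabilities"]["twin_socket"]
twin_sock = socket.socket(fileno=fds[fd_index])

# The server can now reach the client over the separate channel.
twin_server_end.sendall(b"ping")
received = twin_sock.recv(4)
```

Closing the server's copy of the client end after sending is safe because the
descriptor passed in the ancillary data remains open on the receiving side.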

``VFIO_USER_DMA_MAP``
---------------------
@@ -1399,7 +1435,8 @@ Reply
-----------------------

If the client has not shared mappable memory, the server can use this message to
read from guest memory.
read from guest memory. This message and its reply are passed over the separate
server-to-client socket if negotiated at connection setup.

Request
^^^^^^^
@@ -1437,7 +1474,8 @@ Reply
-----------------------

If the client has not shared mappable memory, the server can use this message to
write to guest memory.
write to guest memory. This message and its reply are passed over the separate
server-to-client socket if negotiated at connection setup.

Request
^^^^^^^