add initial fabric erisc data mover (EDM) impl
Note: only line topologies are supported. Fabric mcast is currently untested and is a work in progress. In the meantime, for functional bringup of fabric EDM users, replace mcast with looped unicasts.

The fabric Erisc Data Mover (EDM) is a component that can be used to build *very* simple linear topology fabrics. One of these EDMs can be instantiated on each Ethernet link. It is built from 3 "channels" (though the definition of channel here is a little loose: two of the 3 merge traffic, so this setup could be interpreted as a two-channel setup). This EDM implements packet-based communication only; concepts like sockets are not supported.

!! EDM Structure

There are two sender channels and one receiver channel. "Sender" and "receiver" are relative to the Ethernet link, not the chip: the sender sends over the link and the receiver receives from the link.

Each sender channel serves a different purpose:
- Sender channel 0: accepts packets from a worker on the local chip
- Sender channel 1: accepts packets from an upstream EDM (i.e. an upstream EDM receiver channel on the same chip but a different core)

The receiver channel accepts packets from the Ethernet link and can do one (or both) of:
- Write the packet to the local chip if it is the intended destination (unicast or mcast)
- Forward the packet to the next chip in the line if:
  - Unicast and not the target chip
  - Multicast and this chip is in the multicast target range

Sender channels will merge traffic into the remote EDM's receiver channel.

!! Building a "Fabric"

At present, only linear topologies are supported, with one EDM per Ethernet link along a given line. Below shows the intended connectivity of EDMs across chips in a hypothetical 3-chip fabric. For longer lines, the pattern would be extended.

!! Connecting Workers to Channels

Only one worker can push to a given EDM sender channel at a time. In order to send to an EDM sender channel, the worker must establish a connection.
The connection protocol is as follows and is started by the worker (the EDM is a slave in this protocol).

*NOTE*: If multiple workers try to connect to the same EDM sender channel at the same time, the behavior is undefined.
*NOTE*: Additionally, if a worker pushes packets to a channel it isn't connected to, the behavior is undefined.
*NOTE*: Undefined == likely hang

The `WorkerToFabricEdmSender` from `ttnn/cpp/ttnn/operations/ccl/kernels/edm_fabric/edm_fabric_worker_adapters.hpp` provides an implementation of the connection protocol. It also acts as a wrapper around that protocol, so workers can simply call `open()` to execute the connection protocol without having to manually reimplement it for each kernel.

!!! Protocol

Worker:
- Read from EDM sender channel buffer_index address
  - Required so that the worker knows where to write its first packet (since the channel may already contain packets from a previous connection)
- Write worker core X/Y (NOC 0 based)
- Write worker flow control semaphore L1 address

EDM Sender Channel:
- Check local connection valid semaphore for new established connection
  - When the connection semaphore indicates an active connection, the channel assumes all other relevant fields were correctly populated by the worker:
    - Worker core_x (on NOC 0)
    - Worker core_y (on NOC 0)
    - Worker flow control semaphore L1 address

!! Tearing Down Connections

Every worker is required to explicitly tear down its connection with the EDM before terminating. To do this, the worker simply writes a `0` to the EDM sender channel's connection semaphore address. As long as the worker has sent all of its packets to the EDM before this, the EDM guarantees that the messages are forwarded correctly.

At this point, it is safe for another kernel to establish a connection.

!! Packet Structure

Workers are responsible for populating packet headers before sending to the EDM.
The packet header structure is defined in `ttnn/cpp/ttnn/operations/ccl/kernels/edm_fabric/fabric_edm_packet_header.hpp`.

!! Channel structure

Each EDM channel is built from one or more buffers. Each buffer is the same size and can hold at most one packet. Neighboring packets occupy neighboring buffers, with the exception of the last buffer index: the next packet after a write into the last buffer index wraps around to the first buffer index. Even if packets do not occupy the full buffer, subsequent packets will always be written into the next logical buffer. A gap will exist in memory, but the EDM will not send that padded data (unless it is more performant, which is possible in some special cases).

A detail of the channel structure is omitted from the above description, namely the EDM <-> EDM flow control region for each buffer. Each buffer really looks something like this:

          &header->  |----------------| channel_base_address
                     |     header     |
         &payload->  |----------------|
                     |                |
                     |    payload     |
                     |                |
    &channel_sync->  |----------------|
                     |  channel_sync  |  // This is new
                     ------------------

The "channel_sync" is an `eth_channel_sync_t`. It is internal to the EDM implementation and is used to indicate packet transmission state between sender and receiver EDMs.

The protocol for its use is:
1) Sender updates the fields to indicate new data:
   - set `bytes_sent` to a non-zero value indicating new data
   - clear `receiver_ack` to 0
   - set `src_id` to the sender channel id so the receiver knows who the sender was (and where the ack should go)
2) Sender sends this channel sync to the corresponding location in the receiver channel (either in the same transmission as the packet or separately)
3) Receiver sees that `bytes_sent` is non-zero, indicating a new packet.
It sends back an acknowledgement (first level):
   - set `receiver_ack` to non-zero
   *NOTE* IMPORTANT: To avoid a race, the receiver must be sure to send its `channel_sync_t` from a different address than the one it uses for the second level acknowledgement
3b) When the sender receives an ack, it understands it can overwrite its local copy of the packet with new data
4) After the receiver properly writes out its packet, it sends a second level acknowledgement, indicating it can receive new data into this specific buffer index:
   - clear the `bytes_sent` and `receiver_ack` fields and send back the `channel_sync` to the sender

!! Sending Packets

Sending a packet is done as follows:
1) Worker waits for flow control semaphore increment from EDM sender channel
   - Indicates there is space at the next buffer index for a packet
2) Worker performs a noc write of its packet to the EDM sender channel at the buffer index

*NOTE*: !!!ALL PACKETS MUST CONTAIN DESTINATION NOC X/Y AS NOC 0 COORDINATES, REGARDLESS OF THE `noc_index` OF THE SENDER!!!

For more diagrams, see `fabric_erisc_datamover.cpp`
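The buffer wrap-around and the two-level `channel_sync` acknowledgement described under "Channel structure" can be sketched as a small host-side model. This is only an illustration, not the EDM implementation: `ChannelSync` is a hypothetical stand-in for `eth_channel_sync_t` that models just the three fields named in the text, the field widths are assumptions, every function name is invented for this sketch, and plain struct copies stand in for transfers over the Ethernet link.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical stand-in for eth_channel_sync_t. Only the fields named in the
// text are modeled; their widths are an assumption.
struct ChannelSync {
    std::uint32_t bytes_sent = 0;    // non-zero: the buffer holds a new packet
    std::uint32_t receiver_ack = 0;  // non-zero: first level ack from the receiver
    std::uint32_t src_id = 0;        // sender channel id (where the acks go)
};

// Buffer index for the next packet: the last buffer wraps to the first.
std::uint32_t next_buffer_index(std::uint32_t index, std::uint32_t num_buffers) {
    assert(num_buffers > 0 && index < num_buffers);
    return (index + 1 == num_buffers) ? 0 : index + 1;
}

// Step 1: the sender marks new data in its local channel_sync.
void sender_mark_new_data(ChannelSync& sync, std::uint32_t packet_bytes,
                          std::uint32_t sender_channel_id) {
    assert(packet_bytes != 0);
    sync.bytes_sent = packet_bytes;
    sync.receiver_ack = 0;
    sync.src_id = sender_channel_id;
}

// Step 3: the receiver detects a new packet via a non-zero bytes_sent.
bool receiver_sees_new_packet(const ChannelSync& sync) {
    return sync.bytes_sent != 0;
}

// Step 3: the first level ack is built in a separate object, mirroring the
// note that it must be sent from a different address than the second level ack.
ChannelSync make_first_level_ack(const ChannelSync& received) {
    ChannelSync ack = received;
    ack.receiver_ack = 1;
    return ack;
}

// Step 3b: the sender may now overwrite its local copy of the packet.
bool first_level_ack_received(const ChannelSync& sync) {
    return sync.receiver_ack != 0;
}

// Step 4: after writing out the packet, the receiver clears both fields in
// the buffer's channel_sync and returns the value to send back.
ChannelSync make_second_level_ack(ChannelSync& buffer_sync) {
    buffer_sync.bytes_sent = 0;
    buffer_sync.receiver_ack = 0;
    return buffer_sync;
}

// The sender knows this buffer index at the receiver can take new data.
bool second_level_ack_received(const ChannelSync& sync) {
    return sync.bytes_sent == 0 && sync.receiver_ack == 0;
}
```

In this model, one round trip is: `sender_mark_new_data` on the sender's copy, a struct copy to the receiver, `make_first_level_ack` copied back to the sender, then `make_second_level_ack` copied back once the receiver has written out the packet. Keeping the first level ack in its own object is the model's analogue of the race-avoidance note in step 3.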