Query: (1) why throughput higher than theoretical, (2) why lower throughput with 128B payload. #81
Comments
Hi,
In the throughput notebooks the overhead is shown:
[overhead table from the notebook not captured in this export]
To double check, I can run the vanilla, unmodified VNx notebook to see if the difference is still there. Since the difference is in the TX throughput itself, I would expect the same difference to still be there, as TX should not be affected much by the DUT. The increase in RX throughput could perhaps come from packet duplication (though I have no reason to believe that packets would be duplicated).
How do you get the 174B in here? Each of the individual IPs that compose the network layer needs one or two extra clock cycles to process each packet/segment. I suppose that for 128-byte payloads the extra cycles stack up, impacting the throughput. This is something I haven't profiled, because for bulk data transfer you will not use a small packet size. The low performance for small packets is known; given the current design, it is unavoidable.
Mario
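As a rough illustration of this effect, here is a toy model (not from the VNx sources): it assumes a 64B (512-bit) AXI4-Stream data path, an assumed ~322 MHz CMAC user clock, and a hypothetical `extra_cycles` of per-packet overhead, then takes the minimum of the pipeline-limited and wire-limited rates. It shows that a fixed per-packet cost hits short frames hardest, although it does not by itself reproduce the specific 128B dip discussed below.

```python
# Toy model only -- data width, clock frequency and per-packet overhead
# are assumptions, not profiled VNx numbers.
import math

FLIT_BYTES = 64               # 512-bit AXI4-Stream beat
CLOCK_HZ = 322.266e6          # assumed CMAC user-clock frequency
HDRS = 8 + 20 + 14 + 4        # UDP + IP + Ethernet + FCS
IFG_PREAMBLE = 12 + 8         # inter-frame gap + preamble on the wire

def modelled_payload_gbps(payload: int, extra_cycles: int) -> float:
    frame = payload + HDRS
    flits = math.ceil(frame / FLIT_BYTES)
    # pipeline-limited: one packet every (flits + extra_cycles) cycles
    pipeline = payload * 8 * CLOCK_HZ / (flits + extra_cycles)
    # wire-limited: payload share of the 100G line
    wire = 100e9 * payload / (frame + IFG_PREAMBLE)
    return min(pipeline, wire) / 1e9

for payload in (64, 128, 256, 1472):
    print(payload, [round(modelled_payload_gbps(payload, e), 1) for e in (0, 2, 4)])
# Small payloads lose the most as extra_cycles grows; large payloads stay
# wire-limited (e.g. 1472 B remains ~95.7 Gbps for all three cases).
```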
Sorry, typo there. For a 64B payload (64 + 46 = 110B frame).
I understand that performance will be low for small packets. What I don't understand is why performance for a 64B payload is better than performance for a 128B payload; I get better throughput with 64B payloads. The throughput I measure at payload level using the provided notebook is [table not captured in this export]. Even in the provided notebook, the throughput with 128B payloads is 5 Gbps lower than theoretical, while for 64B payloads it is close to theoretical, i.e., the efficiency for 64B payloads (the smaller payload) is better than that for 128B payloads (the larger payload).
Also, any thoughts on why TX throughput might be higher than theoretical?
This is how I compute these throughputs:

```python
# Header/overhead sizes in bytes
udp = 8
ip = 20
eth = 14
fcs = 4
ifg = 12        # inter-frame gap
pp_amble = 8    # preamble + start-of-frame delimiter

def thr(payload_size: int):
    total_bytes = payload_size + udp + ip + eth + fcs + ifg + pp_amble
    payload_thr = payload_size / total_bytes
    frame_thr = (payload_size + ip + udp + eth) / total_bytes
    # x100 gives percent of line rate, which equals Gbps on a 100G link
    return payload_thr * 100.0, frame_thr * 100.0
```

So, I think your theoretical equation does not look right: as the payload (segment) size increases, the overhead decreases (the efficiency increases). Correct me if I am missing something.
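For reference, evaluating thr() for the payload sizes discussed in this thread (these are just the outputs of the formula above, interpreted as Gbps at a 100G line rate):

```python
# Uses thr() from the snippet above.
for p in (64, 128, 1472):
    payload_gbps, frame_gbps = thr(p)
    print(f"{p:5d} B payload: payload-level {payload_gbps:4.1f} Gbps, "
          f"frame-level {frame_gbps:4.1f} Gbps")
# 64 B   -> 49.2 / 81.5
# 128 B  -> 66.0 / 87.6
# 1472 B -> 95.7 / 98.4
```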
Someone already asked about this here; I suppose your theoretical throughput is what this person calls "naked CMAC", but it still does not match the numbers I showed above. However, I believe the Python snippet is the correct way to compute this.
My defs are the same except: […]

In the above table, the measured throughput at payload level is directly reported by the notebook, i.e., from […]. In this table, there are 2 weird things: […]
How many packets are you sending for each payload size?
For the above table, I sent 1 billion packets (basically the PRODUCER is configured as in the notebook). Results are similar for 1 million packets as well.
I am not sure what the best place to ask this question is, hence I am asking here as an issue. Please let me know if some other place is preferred.
Q1: I get cases where the observed throughput is higher than the theoretical maximum, e.g., 97 Gbps throughput with a 1472B payload, where the theoretical is around 95 Gbps. 2 Gbps is a large enough difference that I can't explain it by a measurement mismatch of a few cycles. I noticed the recently changed Jupyter notebook also shows this, but the mismatch there is only about 0.0005 Gbps. What might be the reasons for a higher-than-theoretical throughput measurement? If we under-measure the time to receive packets by even 100 cycles, that may explain a 0.0005 Gbps mismatch, but not a 2 Gbps mismatch.
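As a quick sanity check of that last point, here is a sketch of how a fixed cycle-count error would inflate the reported number. It assumes the notebook derives throughput as bits divided by (measured cycles × clock period), and the ~322 MHz clock is an assumption:

```python
# Sketch: effect of under-measuring the receive window by a fixed number
# of cycles.  CLOCK_HZ and the measurement formula are assumptions.
CLOCK_HZ = 322.266e6

def apparent_gbps(payload_bytes: int, n_packets: int,
                  true_gbps: float, cycle_error: int) -> float:
    bits = payload_bytes * 8 * n_packets
    true_cycles = bits / (true_gbps * 1e9) * CLOCK_HZ
    # throughput computed over a window that is cycle_error cycles too short
    return bits * CLOCK_HZ / (true_cycles - cycle_error) / 1e9

print(apparent_gbps(1472, 10**9, 95.7, 100))  # ~95.7000002 (negligible)
print(apparent_gbps(1472, 10**6, 95.7, 100))  # ~95.70024   (nowhere near +2 Gbps)
```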
Q2: I observe relatively lower throughput with a 128B payload. Why might that be? Is it to do with the number of bytes transmitted per flit? If so, a 64B payload should have a similar issue (55 bytes per flit). But we see that only the 128B payload has a ~5 Gbps gap to the theoretical maximum, while other payload sizes don't have much of a gap.
Calculation:
A flit is one AXI transfer of the data width (64B). For a 128B payload, frame size = 128 + UDP (8) + IP (20) + Ethernet (14) + FCS (4) = 174B, which requires 3 flits (3 × 64B ≥ 174B), i.e., 174 / 3 = 58 bytes sent per flit.
Similarly, for a 64B payload, bytes per flit is 55 (110B frame over 2 flits). For other payload sizes, bytes per flit is ≥ 60.
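A small sketch of the same arithmetic, assuming a 64B (512-bit) AXI data width and the UDP/IP/Ethernet/FCS header sizes used above:

```python
import math

FLIT_BYTES = 64               # assumed 512-bit AXI4-Stream data width
HDRS = 8 + 20 + 14 + 4        # UDP + IP + Ethernet + FCS

for payload in (64, 128, 256, 512, 1024, 1472):
    frame = payload + HDRS
    flits = math.ceil(frame / FLIT_BYTES)
    print(f"{payload:5d} B payload -> {frame:4d} B frame, "
          f"{flits:2d} flits, {frame / flits:6.2f} B/flit")
# 64 B  ->  110 B frame,  2 flits, 55.00 B/flit
# 128 B ->  174 B frame,  3 flits, 58.00 B/flit
# 256 B and above stay at >= 60 B/flit (1472 B -> 24 flits, 63.25 B/flit)
```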