Improve ssh responsiveness on slow Wi-Fi networks #222
…re at 4 Mbps and below. Previously, 300 Kbps meant no ssh packets from the device indefinitely.
Confusingly, the WiFi driver treats the TOS packet octet as a DSCP (DiffServ), and
To go from the full TOS packet octet to DSCP you do: (tos >> 2) & 0x3f
Just in case the pre-DSCP TOS bits matter to
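Code to generate this:
# dscp_map is the driver's DSCP lookup table; it is not defined in this snippet.
for i in range(128):
    tos = i << 1  # step through the even TOS octet values 0..254
    has_min_delay = bool(tos & (1 << 4))  # legacy TOS min-delay bit (0x10)
    dscp = (tos >> 2) & 0x3f  # DSCP is the upper six bits of the octet
    print(tos, hex(tos), dscp, dscp_map[dscp], f'{has_min_delay=}')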
Nice way to plot custom data live. This is for queued ssh packets:
Problem
Our traffic control routing consists of a parent mq qdisc that encompasses 5 hardware tx queues, with a pfifo_fast attached to each. pfifo_fast is supposed to prioritize interactive traffic using the ToS/DSCP field, but it is unclear why it does not seem to prevent serious ssh stability issues when the device is uploading a file on a slow network.
I am guessing that we are saturating the one hardware queue we are currently using, making the pfifo_fast useless, as it can't stuff any more interactive ssh packets into the single hardware queue while it waits for the file packets to send.
Running a speed test, both kinds of traffic go out on one hardware queue (parent :3):
Attempted solution
I tried adding a custom prio qdisc with 5 bands to replace the default mq one, but with just the priomap alone, it still kept everything on the same band. This is when I learned that the WiFi driver was resetting every Linux packet priority to 0 before it even got to the prio qdisc, because it assumes you won't mess with traffic control. It wants to keep all of the queue prioritization inside the driver:
https://github.com/commaai/agnos-kernel-sdm845/blob/737a024adaf5aedf2a50500939a13c6a2e7283d2/drivers/staging/qcacld-3.0/core/hdd/src/wlan_hdd_wmm.c#L1592
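For reference, here is a rough sketch of how a priomap-based qdisc (pfifo_fast or prio) would normally pick a band from the TOS octet. The two tables are my recollection of the mainline Linux defaults (the ip_tos2prio table and the default priomap); they are assumptions and may not match this kernel exactly.

```python
# ASSUMPTION: values below follow mainline Linux defaults, not verified
# against the agnos kernel tree.
TC_PRIO_BESTEFFORT = 0
TC_PRIO_BULK = 2
TC_PRIO_INTERACTIVE_BULK = 4
TC_PRIO_INTERACTIVE = 6

# Packet priority indexed by the four legacy TOS bits: (tos & 0x1e) >> 1
IP_TOS2PRIO = (
    [TC_PRIO_BESTEFFORT] * 4
    + [TC_PRIO_BULK] * 4
    + [TC_PRIO_INTERACTIVE] * 4
    + [TC_PRIO_INTERACTIVE_BULK] * 4
)

# Default priomap: packet priority -> band (band 0 is dequeued first)
DEFAULT_PRIOMAP = [1, 2, 2, 2, 1, 2, 0, 0, 1, 1, 1, 1, 1, 1, 1, 1]


def tos_to_priority(tos):
    """Map a TOS octet to a Linux packet priority."""
    return IP_TOS2PRIO[(tos & 0x1E) >> 1]


def priority_to_band(priority, priomap=DEFAULT_PRIOMAP):
    """Map a packet priority to a qdisc band via the priomap."""
    return priomap[priority & 0xF]


def tos_to_band(tos):
    """Band a packet would land in if its priority were left untouched."""
    return priority_to_band(tos_to_priority(tos))


if __name__ == "__main__":
    for tos in (0x00, 0x08, 0x10):
        print(hex(tos), "-> band", tos_to_band(tos))
    # What happens when the priority is forced to 0 regardless of TOS:
    print("priority 0 -> band", priority_to_band(0))
```

Under these assumptions, a packet whose priority has been reset to 0 always lands in band 1 no matter what its TOS octet says, which would be consistent with everything staying on the same band.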
BTW the DSCP map is literally just:
Since the ssh default TOS value is 0x10 (the min-delay bit, i.e. DSCP 4), that lumps it in with normal high-bandwidth traffic at 0x0:
https://github.com/commaai/agnos-kernel-sdm845/blob/737a024adaf5aedf2a50500939a13c6a2e7283d2/drivers/staging/qcacld-3.0/core/hdd/src/wlan_hdd_wmm.c#L1512-L1513
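To make the lumping concrete, here is a small sketch of the TOS to DSCP to WMM user priority chain. The dscp_to_up function is an assumption: it uses the top three (precedence) bits of the DSCP, which I believe is the usual default in this driver family, but it is not copied from the linked lines. The user priority to access category table is the standard WMM one. The 0x80 value is only an illustration, not the value chosen in this PR.

```python
def tos_to_dscp(tos):
    """DSCP is the upper six bits of the TOS octet."""
    return (tos >> 2) & 0x3F


def dscp_to_up(dscp):
    """ASSUMPTION: user priority = DSCP precedence bits (dscp >> 3)."""
    return dscp >> 3


# Standard WMM mapping from user priority to access category
UP_TO_AC = {
    0: "BE", 3: "BE",  # best effort
    1: "BK", 2: "BK",  # background
    4: "VI", 5: "VI",  # video
    6: "VO", 7: "VO",  # voice
}


def classify(tos):
    """Return (dscp, user_priority, access_category) for a TOS octet."""
    dscp = tos_to_dscp(tos)
    up = dscp_to_up(dscp)
    return dscp, up, UP_TO_AC[up]


if __name__ == "__main__":
    # 0x00: default bulk traffic, 0x10: ssh default, 0x80: hypothetical raised value
    for tos in (0x00, 0x10, 0x80):
        print(hex(tos), classify(tos))
```

With this mapping, 0x00 and 0x10 both end up at user priority 0, while a value with higher precedence bits set lands in a different class.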
Solution
Once I moved ssh to a different hardware queue by raising its QoS field, the connection remained much more responsive under heavily degraded AP conditions (200-300 Kbps) while uploading.
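As a generic illustration of what raising the QoS field means at the socket level, the sketch below marks a TCP socket with a chosen DSCP using the IP_TOS socket option. OpenSSH exposes the same field through its IPQoS option. The helper names and the DSCP value 32 are hypothetical and only for demonstration; the mechanism and value used by this PR are not shown here.

```python
import socket


def dscp_to_tos(dscp):
    """Place a 6-bit DSCP into the upper bits of the TOS octet."""
    if not 0 <= dscp <= 0x3F:
        raise ValueError("DSCP must fit in 6 bits")
    return dscp << 2


def open_marked_socket(dscp):
    """Create an IPv4 TCP socket whose outgoing packets carry the given DSCP."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    sock.setsockopt(socket.IPPROTO_IP, socket.IP_TOS, dscp_to_tos(dscp))
    return sock


if __name__ == "__main__":
    sock = open_marked_socket(32)  # hypothetical value, not the one from this PR
    try:
        tos = sock.getsockopt(socket.IPPROTO_IP, socket.IP_TOS)
        print("TOS octet now set to", hex(tos))
    finally:
        sock.close()
```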