Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

USB device erratum for aborted control transfers? #385

Open
tlyu opened this issue Feb 25, 2024 · 5 comments
Open

USB device erratum for aborted control transfers? #385

tlyu opened this issue Feb 25, 2024 · 5 comments
Labels
rp2040 Concerning the RP2040 chip

Comments

@tlyu
Copy link

tlyu commented Feb 25, 2024

It seems that macOS sometimes rapidly issues multiple CDC control requests back-to-back, aborting some of them, when a process closes a CDC serial device. This causes a hard fault in TinyUSB due to buffer control state being retained when a new SETUP packet arrives before the previous control transfer completed.

The datasheet says that the STALL bits get reset by an incoming SETUP packet, but not what happens to the EP0 buffer controls. The USB peripherals on some other MCUs (e.g., STM32) do reset the endpoint state for EP0 when a SETUP packet arrives, regardless of whether the previous control transfer completed.

I don't have detailed traces of what's happening at the hardware level, because I don't currently have a RP2040 board with accessible SWD traces.

I've submitted a workaround to TinyUSB, but this seems like a hardware erratum, because there's no guarantee that the buffer control won't cause an unwanted transaction to occur before the ISR can service the interrupt for the new SETUP and adjust the buffer controls.

See also:

adafruit/circuitpython#8824 (analysis and traces)
hathach/tinyusb#2492 (workaround patch)

@aallan aallan added the rp2040 Concerning the RP2040 chip label Feb 25, 2024
@aallan
Copy link
Contributor

aallan commented Feb 25, 2024

Sounds like one for @liamfraser ?

@liamfraser
Copy link

The SETUP packet does not clear EP0's stall bits directly. This would have been tricky to implement because the stall bits come from a memory rather than a register. Instead, the bits in the EP_STALL_ARM Register are cleared. To actually send a stall, you need both the stall bit in the buffer control memory to be set, along with the ARM bits. So if only the memory bit is set but the register is cleared then you can tell the stall is no longer valid.

@tlyu
Copy link
Author

tlyu commented Feb 26, 2024

@liamfraser thank you for the details. They are very helpful.

From your reply, I can conclude that with current hardware, race conditions regarding EP0 buffer control state are unavoidable (except for inhibiting/clearing STALL). Also, fixing them might be tricky, because endpoint ACK/NAK/STALL configuration is in memory instead of registers. Do you agree?

My current workaround uses the ABORT register, which probably doesn't work on B1 hardware or earlier (due to RP2040-E2). I'm guessing that the hardware fix might be to force the ABORT bits for EP0 upon receiving a SETUP packet? I'm assuming that the same hardware limitations that led to using additional register bits to mask the effect of the STALL bits in the EP0 buffer controls would apply for the AVAILABLE bit.

@tlyu
Copy link
Author

tlyu commented Mar 5, 2024

Some tracing added by @eightycc seems to confirm that the USB peripheral doesn't reset anything in the EP0 buffer controls when receiving a SETUP packet.

@danielstuart14
Copy link

danielstuart14 commented Apr 26, 2024

I think this is something that I'm seeing on my board, even with the latest tinyusb (after patch). (I also tested with Embassy-USB, same thing).
We have a board that has an USB2422 HUB that connects both an RP2040 (B2 Revision) and an external USB port.

Sometimes, when connecting a new device to the external port, the RP2040 will stop communicating with the O.S. (we are using CDC, device is still recognized, just won't accept any messages - ERR 13 on windows):

image

USB2422 Errata says that there is an issue that can happen with split transactions, we checked the RP2040 USB signals and it seems like one packet sometimes looses some of its data:
image

The problem is that the RP2040 isn't able to recover from this issue, and just stops acknowledging the setup packets from the HOST:
image

If we try to send any new data through the CDC interface, this is what happens in the bus:
image

1 - The host sends the data, but no ACK/NACK is received
image

2 - The host tries to receive some data (44us later):
image

3 - Device responds an ACK (9us later):
image

4 - Nothing is sent for 30us and then the HOST retries...

The only things that we found that would make it work again are:

  • Reseting the RP2040
  • Reseting the HUB (by its reset pin)
  • Disabling the RP2040 Composite Device in device manager and re-enabling it again.

For the two latter cases, we found out that the USB line reset is what makes it work again.
image

@tlyu is this something that your patch should fix? By reading both the description here and in your commit I thought it was similar.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
rp2040 Concerning the RP2040 chip
Projects
None yet
Development

No branches or pull requests

4 participants