Add input stream skipping for oversized payloads to prevent TCP RST. #1313

Open
wants to merge 1 commit into base: master
Conversation

halotukozak
Member

Modify the SttpRestCallTest to reproduce the bug

class SttpRestCallTest extends SttpClientRestTest with RestApiTestScenarios {
  test("too large binary request") {
    // Loop indefinitely: the TCP RST failure is flaky and shows up only once
    // in a few thousand attempts, so keep hammering the endpoint until it does.
    var i = 0
    while (true) {
      println("i: " + i)
      i += 1
      // Send a payload one byte over the limit and expect a clean 413 error.
      val future = proxy.binaryEcho(Array.fill[Byte](maxPayloadSize + 1)(5))
      val exception: HttpErrorException = future.failed.futureValue.asInstanceOf[HttpErrorException]
      assert(exception == HttpErrorException.plain(413, "Payload is larger than maximum 1048576 bytes (1048577)"))
    }
  }
}

// Jetty sees that you won't stream the body, and marks the "read" end of the connection as needing to close.
// Once the response is written, Jetty closes the connection. Since there is data that hasn't been read in the kernel buffer,
// that will trigger sending a TCP RST.
request.getInputStream.skipNBytes(length)
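
For context, here's a minimal sketch of where this drain would sit in a servlet-style handler. The class name, the maxPayloadSize parameter, and the wiring are assumptions for illustration only (it also assumes javax.servlet; swap to jakarta.servlet if that's what the project is on), not code from this PR:

import javax.servlet.http.{HttpServlet, HttpServletRequest, HttpServletResponse}

class PayloadLimitingServlet(maxPayloadSize: Long) extends HttpServlet {
  override def doPost(request: HttpServletRequest, response: HttpServletResponse): Unit = {
    val length = request.getContentLengthLong
    if (length > maxPayloadSize) {
      // Drain the rejected body so Jetty doesn't close the connection with
      // unread data still sitting in the kernel buffer (which triggers the RST).
      request.getInputStream.skipNBytes(length)
      response.sendError(413, s"Payload is larger than maximum $maxPayloadSize bytes ($length)")
    } else {
      // ... normal handling of the request body ...
    }
  }
}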


Consuming the whole content is not ideal, since it bypasses the protection against uploading too much data, but I don't have a better idea other than changing the test to accept an RST as an acceptable answer.

Comment on lines +122 to +123
// Once the response is written, Jetty closes the connection. Since there is data that hasn't been read in the kernel buffer,
// that will trigger sending a TCP RST.
Contributor

I feel that if this were the root cause, it'd always fail instead of just sometimes, but it's probably pretty close to the truth.

Please show your work (add tcpdumps of the communication?), since I think it might be something more along the lines of:

  1. the client sending the request body over TCP, chunk after chunk
  2. the server beginning to handle the request, with the full body not yet sent
  3. the server immediately dropping the request due to the content being too long
  4. (!) a race condition in the client - it either handles the error response, or tries sending another chunk of the request and gets a TCP RST because the connection has already been closed by Jetty

which is fixed here, because you wait for the entire message to be sent by the client before responding.

In that case, it might require a fix on the client side.

The concern raised by freddie is very real - it's basically useless to set maxPayloadSize if we wait to read it all anyway; it opens us up to a very simple DoS.
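
A possible middle ground - purely a hedged sketch, not something settled in this thread - would be to drain only bodies below some cap and otherwise accept the hard close. The helper name and maxDrainBytes are made up for illustration, and it assumes javax.servlet:

import javax.servlet.http.{HttpServletRequest, HttpServletResponse}

// Drain at most `maxDrainBytes` of a rejected body before responding; anything
// larger is left unread, so a malicious client can't force unbounded reads.
def drainOrClose(request: HttpServletRequest, response: HttpServletResponse, maxDrainBytes: Long): Unit = {
  val length = request.getContentLengthLong
  if (length >= 0 && length <= maxDrainBytes) {
    // Small enough to drain: consuming the unread body avoids the TCP RST.
    request.getInputStream.skipNBytes(length)
  } else {
    // Too large (or chunked with unknown length): don't drain, and signal that
    // the connection must not be reused.
    response.setHeader("Connection", "close")
  }
}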

Member

The client side is also ours, so the fix may be possible within the same repo.

Contributor

netty/netty#6818 (comment) <- this seems related: a TCP RST can cause a flush of the kernel buffer on the receiving side, which would also make this flaky and cause the 'early EOF' exceptions, making the client unable to read the response. Not sure what implications this has for the solution, though (they seem to have settled on not closing the TCP socket until all data is read?).

Member Author

  1. My solution is mainly based on this issue: "When body is unused, large data causes broken pipe" hyperium/hyper#2384 (comment).
  2. Konrad, I understand it the same way.
  3. You'll have to wait for the tcpdumps, because it's hard to observe something that fails once in 3000 attempts xd
  4. "Reading" the whole body is not the solution I want. It's the solution that works.
  5. I'll start researching how to fix it on the client side.
