
Problem with "Connection reset" for DynamoDB client #40

Open
draxly opened this issue Feb 23, 2020 · 7 comments

Comments

@draxly

draxly commented Feb 23, 2020

I'm using VertxSdkClient.withVertx to create my non-blocking DynamoDB client, and it works. However, while running my application I see occasional "Connection reset" errors (`java.net.SocketException: Connection reset`; the full stack trace is given below).

To try and get some more feedback, I added an exceptionHandler to VertxNioAsyncHttpClient like this:
```java
private HttpClient createVertxHttpClient(Vertx vertx) {
    HttpClientOptions options = new HttpClientOptions()
            .setSsl(true)
            .setKeepAlive(true);

    return vertx.createHttpClient(options).connectionHandler(con -> {
        con.exceptionHandler(err -> {
            logger.error("VertxNioAsyncHttpClient connectionHandler.exceptionHandler: " + err.getMessage(), err);
        });
    });
}
```

This exceptionHandler is getting called.
Any ideas what causes these "Connection reset" errors, or what I can do to avoid them?

The full stack trace as logged:
```
VertxNioAsyncHttpClient connectionHandler.exceptionHandler: Connection reset
java.net.SocketException: Connection reset
	at java.base/sun.nio.ch.SocketChannelImpl.throwConnectionReset(SocketChannelImpl.java:345)
	at java.base/sun.nio.ch.SocketChannelImpl.read(SocketChannelImpl.java:376)
	at io.netty.buffer.PooledByteBuf.setBytes(PooledByteBuf.java:247)
	at io.netty.buffer.AbstractByteBuf.writeBytes(AbstractByteBuf.java:1147)
	at io.netty.channel.socket.nio.NioSocketChannel.doReadBytes(NioSocketChannel.java:347)
	at io.netty.channel.nio.AbstractNioByteChannel$NioByteUnsafe.read(AbstractNioByteChannel.java:148)
	at io.netty.channel.nio.NioEventLoop.processSelectedKey(NioEventLoop.java:700)
	at io.netty.channel.nio.NioEventLoop.processSelectedKeysOptimized(NioEventLoop.java:635)
	at io.netty.channel.nio.NioEventLoop.processSelectedKeys(NioEventLoop.java:552)
	at io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:514)
	at io.netty.util.concurrent.SingleThreadEventExecutor$6.run(SingleThreadEventExecutor.java:1044)
	at io.netty.util.internal.ThreadExecutorMap$2.run(ThreadExecutorMap.java:74)
	at io.netty.util.concurrent.FastThreadLocalRunnable.run(FastThreadLocalRunnable.java:30)
	at java.base/java.lang.Thread.run(Thread.java:830)
```

@aesteve
Collaborator

aesteve commented Feb 25, 2020

Hello, unfortunately no, I have no idea what is going on, and the stack trace doesn't help me that much.

I'd guess some "proxy" or network hardware may be shutting down the connection, but that's a hard one to debug without involving tcpdump or something like that.

I'll leave the issue open, in case someone else has an idea or wants to report the same bug with more details or a more in-depth analysis.

Sorry I couldn't help more on this :\

@draxly
Author

draxly commented Mar 3, 2020

Thanks aesteve, I appreciate the answer! I will continue digging into it on my end.

@Alars-ALIT

Reducing the keep-alive timeout of the HttpClient in VertxNioAsyncHttpClient seems to solve this issue.

```java
HttpClientOptions options = new HttpClientOptions()
        .setSsl(true)
        .setKeepAlive(true)
        .setKeepAliveTimeout(30);
```

The timeout defaults to 60s.

- With the timeout set to 70s, I get more "Connection reset" errors.
- With the timeout set to 50s, I got a few resets.
- With the timeout set to 30s, I get no resets.

So I assume AWS may close connections that have been idle for somewhere between 30s and 50s. I can't find any documentation about this, though.
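
For reference, here is a minimal sketch of how that reduced timeout could be folded into the createVertxHttpClient helper from the first comment. The class name below is made up for the example and the logging is simplified; the 30s value is simply the one reported to work here.

```java
import io.vertx.core.Vertx;
import io.vertx.core.http.HttpClient;
import io.vertx.core.http.HttpClientOptions;

public class KeepAliveTunedClient {

    // Sketch only: the helper from the first comment, plus the reduced
    // keep-alive timeout suggested above (the Vert.x default is 60s).
    public static HttpClient createVertxHttpClient(Vertx vertx) {
        HttpClientOptions options = new HttpClientOptions()
                .setSsl(true)
                .setKeepAlive(true)
                // Close idle connections before the remote side does.
                .setKeepAliveTimeout(30);

        return vertx.createHttpClient(options).connectionHandler(con ->
                // Surface connection-level errors instead of losing them silently.
                con.exceptionHandler(err ->
                        System.err.println("HTTP connection error: " + err.getMessage())));
    }
}
```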

@aesteve
Collaborator

aesteve commented Mar 30, 2020

Are you sure it's AWS though?

Couldn't it be an intermediate networking element (a load balancer, or something like that)?

@draxly
Author

draxly commented Mar 31, 2020

If the DynamoDB service is reached through a "VPC Endpoint for DynamoDB" and the DynamoDB client is configured without any special settings, can there still be some intermediate networking element in between?

@aesteve
Collaborator

aesteve commented Mar 31, 2020

Not sure, really, just trying to investigate out loud here :\

I have had keep-alive problems with Elastic Load Balancers in the past, for instance: the connection being shut down after 60s (when using Server-Sent Events, for example).

If just setting a keep-alive timeout on the HTTP client fixes it, that's already a good start.

Just trying to figure out whether this should be documented or whether it only happens in some specific use cases.

@wem
Contributor

wem commented Mar 6, 2021

We can confirm the observation of @Alars-ALIT ... with a keep-alive timeout of 30s the problem was solved.

Another problem is the retry policy. We get some connection closed exceptions, which looks like an AWS server-side issue. The AWS SDK holds a list of exception types it will retry on. Unfortunately, the SDK will not do any retry for io.vertx.core.VertxException, as that condition is missing.
I will create an issue and a PR next week, so that users have a proper retry policy available.
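
As a rough sketch of how such a retry condition could be wired up through the AWS SDK v2's public retry API (this only illustrates the general approach, not the actual PR; the class and method names below are made up for the example, and the real fix may take a different route):

```java
import java.util.Set;

import io.vertx.core.VertxException;
import software.amazon.awssdk.core.client.config.ClientOverrideConfiguration;
import software.amazon.awssdk.core.retry.RetryPolicy;
import software.amazon.awssdk.core.retry.conditions.OrRetryCondition;
import software.amazon.awssdk.core.retry.conditions.RetryCondition;
import software.amazon.awssdk.core.retry.conditions.RetryOnExceptionsCondition;
import software.amazon.awssdk.services.dynamodb.DynamoDbAsyncClient;

public class RetryOnVertxException {

    public static DynamoDbAsyncClient buildClient() {
        // Extend the SDK's default retry condition so VertxException is also retried.
        RetryPolicy retryPolicy = RetryPolicy.defaultRetryPolicy().toBuilder()
                .retryCondition(OrRetryCondition.create(
                        RetryCondition.defaultRetryCondition(),
                        RetryOnExceptionsCondition.create(Set.of(VertxException.class))))
                .build();

        return DynamoDbAsyncClient.builder()
                .overrideConfiguration(ClientOverrideConfiguration.builder()
                        .retryPolicy(retryPolicy)
                        .build())
                // In a real setup the Vert.x-based async HTTP client would also be
                // attached to this builder (e.g. via VertxSdkClient.withVertx(...)).
                .build();
    }
}
```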
