
Problem with "Connection reset" for DynamoDB client #40

Open
draxly opened this issue Feb 23, 2020 · 7 comments

Comments

@draxly

draxly commented Feb 23, 2020

I'm using VertxSdkClient.withVertx to create my non-blocking DynamoDB client, and it works. However, while running my application I see occasional "Connection reset" errors (`java.net.SocketException: Connection reset`; the full stack trace is given below).

To try and get some more feedback, I added an exceptionHandler to VertxNioAsyncHttpClient like this:
```java
private HttpClient createVertxHttpClient(Vertx vertx) {
    HttpClientOptions options = new HttpClientOptions()
            .setSsl(true)
            .setKeepAlive(true);

    return vertx.createHttpClient(options).connectionHandler(con -> {
        con.exceptionHandler(err -> {
            logger.error("VertxNioAsyncHttpClient connectionHandler.exceptionHandler: " + err.getMessage(), err);
        });
    });
}
```

This exceptionHandler is getting called.
Any ideas what causes these "Connection reset" errors, or what I can do to avoid them?

The full stack trace as logged:
```
VertxNioAsyncHttpClient connectionHandler.exceptionHandler: Connection reset
java.net.SocketException: Connection reset
	at java.base/sun.nio.ch.SocketChannelImpl.throwConnectionReset(SocketChannelImpl.java:345)
	at java.base/sun.nio.ch.SocketChannelImpl.read(SocketChannelImpl.java:376)
	at io.netty.buffer.PooledByteBuf.setBytes(PooledByteBuf.java:247)
	at io.netty.buffer.AbstractByteBuf.writeBytes(AbstractByteBuf.java:1147)
	at io.netty.channel.socket.nio.NioSocketChannel.doReadBytes(NioSocketChannel.java:347)
	at io.netty.channel.nio.AbstractNioByteChannel$NioByteUnsafe.read(AbstractNioByteChannel.java:148)
	at io.netty.channel.nio.NioEventLoop.processSelectedKey(NioEventLoop.java:700)
	at io.netty.channel.nio.NioEventLoop.processSelectedKeysOptimized(NioEventLoop.java:635)
	at io.netty.channel.nio.NioEventLoop.processSelectedKeys(NioEventLoop.java:552)
	at io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:514)
	at io.netty.util.concurrent.SingleThreadEventExecutor$6.run(SingleThreadEventExecutor.java:1044)
	at io.netty.util.internal.ThreadExecutorMap$2.run(ThreadExecutorMap.java:74)
	at io.netty.util.concurrent.FastThreadLocalRunnable.run(FastThreadLocalRunnable.java:30)
	at java.base/java.lang.Thread.run(Thread.java:830)
```

@aesteve
Collaborator

aesteve commented Feb 25, 2020

Hello, unfortunately no, I have no idea what is going on, and the stack trace doesn't help me that much.

I'd guess some "proxy" or network hardware may be shutting down the connection, but that's a hard one to debug without involving tcpdump or something like that.

I'll leave the issue open, in case someone else has an idea or wants to report the same bug with more details or a more in-depth analysis.

Sorry I couldn't help more on this :\

@draxly
Author

draxly commented Mar 3, 2020

Thanks aesteve, I appreciate the answer! I will continue digging into it on my end.

@Alars-ALIT

Reducing the keep-alive timeout of the HttpClient in VertxNioAsyncHttpClient seems to solve this issue.

```java
HttpClientOptions options = new HttpClientOptions()
        .setSsl(true)
        .setKeepAlive(true)
        .setKeepAliveTimeout(30);
```

The timeout defaults to 60s.

- With the timeout set to 70s, I get more "Connection reset" errors.
- With the timeout set to 50s, I got a few resets.
- With the timeout set to 30s, I get no resets.

So I assume AWS may close connections that have been idle for somewhere between 30s and 50s. I can't find any documentation about this, though.
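
For reference, here is a minimal sketch of how that reduced timeout could be folded into the createVertxHttpClient helper from the first comment. The class name below is made up for the example and the logging is simplified; the 30s value is simply the one reported to work here.

```java
import io.vertx.core.Vertx;
import io.vertx.core.http.HttpClient;
import io.vertx.core.http.HttpClientOptions;

public class KeepAliveTunedClient {

    // Sketch only: the helper from the first comment, plus the reduced
    // keep-alive timeout suggested above (the Vert.x default is 60s).
    public static HttpClient createVertxHttpClient(Vertx vertx) {
        HttpClientOptions options = new HttpClientOptions()
                .setSsl(true)
                .setKeepAlive(true)
                // Close idle connections before the remote side does.
                .setKeepAliveTimeout(30);

        return vertx.createHttpClient(options).connectionHandler(con ->
                // Surface connection-level errors instead of losing them silently.
                con.exceptionHandler(err ->
                        System.err.println("HTTP connection error: " + err.getMessage())));
    }
}
```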

@aesteve
Collaborator

aesteve commented Mar 30, 2020

Are you sure it's AWS though?

Couldn't it be an intermediate networking element (a load balancer, or something like that)?

@draxly
Author

draxly commented Mar 31, 2020

If the DynamoDB service is reached through a "VPC Endpoint for DynamoDB" and the DynamoDB client is configured without any special settings, can there still be some intermediate networking element in between?

@aesteve
Collaborator

aesteve commented Mar 31, 2020

Not sure, really, just trying to investigate out loud here :\

I have had keep-alive problems with Elastic Load Balancers in the past, for instance: the connection being shut down after 60s (when using Server-Sent Events, for example).

If just setting a keep-alive timeout on the HTTP client fixes it, that's already a good start.

Just trying to figure out whether this should be documented or whether it only happens in some specific use cases.

@wem
Contributor

wem commented Mar 6, 2021

We can confirm the observation of @Alars-ALIT ... with a keep-alive timeout of 30s the problem was solved.

Another problem is the retry policy. We get some connection closed exceptions, which looks like an AWS server-side issue. The AWS SDK holds a list of exception types it will retry on. Unfortunately, the SDK will not do any retry for io.vertx.core.VertxException, as that condition is missing.
I will create an issue and a PR next week, so that users have a proper retry policy available.
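
As a rough sketch of how such a retry condition could be wired up through the AWS SDK v2's public retry API (this only illustrates the general approach, not the actual PR; the class and method names below are made up for the example, and the real fix may take a different route):

```java
import java.util.Set;

import io.vertx.core.VertxException;
import software.amazon.awssdk.core.client.config.ClientOverrideConfiguration;
import software.amazon.awssdk.core.retry.RetryPolicy;
import software.amazon.awssdk.core.retry.conditions.OrRetryCondition;
import software.amazon.awssdk.core.retry.conditions.RetryCondition;
import software.amazon.awssdk.core.retry.conditions.RetryOnExceptionsCondition;
import software.amazon.awssdk.services.dynamodb.DynamoDbAsyncClient;

public class RetryOnVertxException {

    public static DynamoDbAsyncClient buildClient() {
        // Extend the SDK's default retry condition so VertxException is also retried.
        RetryPolicy retryPolicy = RetryPolicy.defaultRetryPolicy().toBuilder()
                .retryCondition(OrRetryCondition.create(
                        RetryCondition.defaultRetryCondition(),
                        RetryOnExceptionsCondition.create(Set.of(VertxException.class))))
                .build();

        return DynamoDbAsyncClient.builder()
                .overrideConfiguration(ClientOverrideConfiguration.builder()
                        .retryPolicy(retryPolicy)
                        .build())
                // In a real setup the Vert.x-based async HTTP client would also be
                // attached to this builder (e.g. via VertxSdkClient.withVertx(...)).
                .build();
    }
}
```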
