Connection recovery hangs after EOFException is thrown #53
On line 99 in ConnectionHandler, this if statement is never entered because its condition evaluates to false
If I print the stack trace just before that if statement I have this:
And the ShutdownSignalException 'in the middle' has _hardError=true, _initiatedByApplication=false and _reason=null. At this point I have no clue how to dig deeper, unfortunately, but if we can't trust this library in all error cases we might have to roll our own, which would be a shame :-( You might also want to consider adding a little more logging in this scenario. I also looked at #52, but I don't know whether it is caused by the same thing, so I created a new issue just in case. This is 100% reproducible on my machine and always seems to work on the Mac machine. If you want, we can help you set up the JUnit test environment so you can try to reproduce it yourself.
Hi @karlney. Interesting difference between platforms. One of the problems I've faced is deciding which errors should be considered recoverable and which shouldn't, since the exceptions you'll see for the same failure can vary by platform. If you have a reproducer for this that you could share, that would be great. In the meantime, note that you can get and modify the set of exceptions that Lyra will attempt to recover from, which should resolve this situation for you: http://jodah.net/lyra/javadoc/net/jodah/lyra/config/Config.html#getRecoverableExceptions-- As for what the appropriate solution should be... I'm not sure. Basically, we could add EOFException as one of the default exceptions to recover from. I'm just not sure how appropriate that is given the odd nature of this failure. Thoughts?
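To illustrate the idea of a configurable set of recoverable exceptions, here is a minimal standalone sketch. It is not Lyra's implementation and does not use Lyra's API: the class `RecoverableExceptions` and its methods are hypothetical names invented for this example, and the default set containing only `ConnectException` is an assumption. It only shows how adding `EOFException` to such a set would change whether a wrapped failure is treated as recoverable.

```java
import java.io.EOFException;
import java.net.ConnectException;
import java.util.HashSet;
import java.util.Set;

// Hypothetical helper, NOT part of Lyra: holds a mutable set of exception
// types that a recovery policy would treat as recoverable.
final class RecoverableExceptions {
    private final Set<Class<? extends Throwable>> recoverable = new HashSet<>();

    RecoverableExceptions() {
        // Assumed default for this sketch: only "connection refused" style failures.
        recoverable.add(ConnectException.class);
    }

    // Exposes the mutable set so callers can add or remove exception types.
    Set<Class<? extends Throwable>> getRecoverable() {
        return recoverable;
    }

    // Walks the cause chain and returns true if any exception in it
    // is an instance of one of the configured types.
    boolean isRecoverable(Throwable failure) {
        for (Throwable t = failure; t != null; t = t.getCause()) {
            for (Class<? extends Throwable> type : recoverable) {
                if (type.isInstance(t)) {
                    return true;
                }
            }
        }
        return false;
    }
}
```

With this sketch, a `RuntimeException` wrapping an `EOFException` is not recoverable until `EOFException.class` is added via `getRecoverable().add(...)`. The linked Javadoc for `Config.getRecoverableExceptions()` is the place to check how Lyra itself exposes its set.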
@jhalterman your remark about |
I can reproduce this. Use docker-compose to start a RabbitMQ server: docker-compose.yml
Lyra config:
Then do a docker restart of the RabbitMQ server.
Platform: Docker for Windows on Windows 10. Docker is running on a Hyper-V Linux instance.
Answered in #53 (comment). Lyra has a configurable list of exceptions to try. If there's interest in revisiting the default list, please file a separate issue (or submit a PR). |
Lyra version 0.5.2 (also tested 0.5.3-SNAPSHOT with the same result)
amqp-client 3.5.5
Broker version 3.5.6
We have an automated test suite for a Java RabbitMQ library built on top of Lyra and amqp-client.
We use Docker and have the JUnit tests start and stop a real RabbitMQ broker on the same machine that runs the tests.
One of the tests simulates a broker crash by running the docker kill command on the broker container while we are consuming from a queue on the broker.
We use persistent messages, so when the broker starts up again after a few seconds the messages are still in the queue, BUT it seems that Lyra does not reconnect to the broker in all cases.
On my colleague's machine, which is a Mac, Lyra correctly reconnects when the broker starts up again. But on my Linux machine only one reconnection attempt is made, and then everything freezes.
The difference we can see is that on the Mac we get a socket 'connection refused' exception, but on Linux a java.io.EOFException is thrown.
This is a snippet from the logs in the test suite we have: