Treat timeout error before pool shutting down error #359
@wandenberg Do you believe this permanently fixes the |
@dblock I believe so. A poorly handled exception marked the node as down and tried to close its connections whenever a connection could not be obtained from the pool. It even used the connection to check whether it was broken, which led to some wrong interpretations like the |
@arthurnn Bump? |
@arthurnn can you take a look? We just upgraded to Moped 2 in production last night and have been wrecked by this bug so far. |
@wandenberg I just applied this patch to prod, will let you know if we see the connection pool shutdown. We're still getting the |
@wandenberg unfortunately even with this patch, this just happened and didn't go away until we restarted services. I guess it's possible that there are other scenarios in which this would happen and your patch fixes a subset of them. |
+1, @arthurnn any update on this one? |
Give this branch a try. We upgraded from 1.5 to 2.0 about 3 weeks ago and have seen absolutely horrible failover handling with Moped 2.0. We finally now can do stepdowns in production without a single error and haven't seen this error anymore. I cherry-picked in various commits from other pulls (such as @wandenberg's) that address this and also added many commits of my own to handle different failure scenarios. https://github.com/jonhyman/moped/tree/feature/15988-and-logging It has some extra logging in there that I've been using as we've been doing failover testing, so feel free to fork and remove if you inspect your Moped logs. We've also tested |
@jonhyman Right now we have some issues in production across dozens of servers, websites, and an API used by many vendors, movie studios, and our own apps. If anything happens to a node in the replica set, we get |
Give my branch a try, see if it helps. |
@arthurnn Bump! |
@jonhyman Your branch seems to get rid of the pool shutdown error. Are you using it in production? |
Yeah, we are. And we've done numerous stepdowns in prod without issues with |
…ing a connection from pool
3e43409 to 49f65ef
…tion/authorization error
@jonhyman Hey, are the issues you mention in your comment fixed in 2.0.7 which contains #380? Or do you still use a fork? |
Yeah it should all be fixed in 2.0.7. We're still on my fork because we've stopped putting any resources behind Moped (even if it is just |
When the connection_pool gem cannot return a connection from the pool within the time configured by pool_timeout, it raises a Timeout::Error.
This error is not handled properly and results in an attempt to set the node as down.
That leaves the pool in an invalid state, which then surfaces as a ConnectionPool::PoolShuttingDownError exception.
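The failure mode described above can be illustrated with a minimal, self-contained sketch. It does not use Moped or the connection_pool gem: `SketchPool`, `SketchNode`, and `ShuttingDownError` are hypothetical stand-ins written for this example. The only point it tries to show is the ordering of the rescue clauses: a checkout timeout is handled before the generic failure path, so an exhausted pool does not mark the node as down or shut the pool down.

```ruby
require "timeout"

# Hypothetical stand-in for a connection pool (NOT the connection_pool gem).
class SketchPool
  class ShuttingDownError < StandardError; end

  def initialize(size:, timeout:)
    @timeout = timeout
    @available = Array.new(size) { |i| "conn-#{i}" }
    @mutex = Mutex.new
    @freed = ConditionVariable.new
    @shutdown = false
  end

  # Returns a connection, or raises Timeout::Error if none becomes free in time.
  def checkout
    deadline = now + @timeout
    @mutex.synchronize do
      loop do
        raise ShuttingDownError, "pool is shutting down" if @shutdown
        return @available.pop unless @available.empty?

        remaining = deadline - now
        raise Timeout::Error, "no free connection after #{@timeout}s" if remaining <= 0
        @freed.wait(@mutex, remaining)
      end
    end
  end

  def checkin(conn)
    @mutex.synchronize do
      @available.push(conn)
      @freed.signal
    end
  end

  def shutdown
    @mutex.synchronize do
      @shutdown = true
      @freed.broadcast
    end
  end

  private

  def now
    Process.clock_gettime(Process::CLOCK_MONOTONIC)
  end
end

# Hypothetical node wrapper (NOT Moped::Node).
class SketchNode
  def initialize(pool)
    @pool = pool
    @down = false
  end

  def down?
    @down
  end

  def with_connection
    conn = @pool.checkout
    begin
      yield conn
    ensure
      @pool.checkin(conn)
    end
  rescue Timeout::Error
    # Pool exhaustion says nothing about the node's health:
    # re-raise without marking the node down or touching the pool.
    raise
  rescue StandardError
    # Any other failure takes the "node is down" path.
    @down = true
    @pool.shutdown
    raise
  end
end

pool = SketchPool.new(size: 1, timeout: 0.05)
node = SketchNode.new(pool)
held = pool.checkout # exhaust the single-connection pool
begin
  node.with_connection { |conn| conn }
rescue Timeout::Error => e
  puts "checkout timed out (#{e.message}); node down? #{node.down?}"
end
pool.checkin(held)
```

Because `Timeout::Error` is a `StandardError`, swapping the two rescue clauses in this sketch would send the timeout down the node-down path and shut the pool, after which every later checkout would fail with the shutting-down error until the process restarted.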
This pull request was made using the script posted by @InvisibleMan in #353.
I also applied this commit to the operation_timeout branch.