Inform queue leaders of cluster node status changes #356
Comments
Yes, something like that. I think, however, that we can perhaps be somewhat more ambitious in the API. You are right that we should pass the current member configuration along with the call. In fact, we may even want to pass the replication state (i.e. last confirmed index) as well, so that we have some kind of "freshness" indicator. For example, we may not want to auto-grow if one of the members is substantially behind the others. The call should return a list of modifications, and the Ra leader will spawn a transient process to perform these changes in turn (for example, start a new Ra server and join it to the cluster, then wait for replication to catch up before continuing). While it is performing the modifications, this callback will not be called unless another node change is detected. The Ra leader can then ensure that any shared configuration is properly consistent across members (something we currently have to ensure manually ourselves).
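To make the proposed callback shape concrete, here is a minimal sketch of what the user-side decision function could look like. Everything in it is an assumption for illustration: the module name, the function name `handle_node_status/3`, the `up | down` status atoms, the `{add_member, Node}` / `{remove_member, Node}` modification tuples, and the `MAX_LAG` threshold are hypothetical and not part of Ra's API. It takes the members together with their last confirmed index and declines to grow the cluster when one member is substantially behind.

```erlang
-module(node_status_sketch).
-export([handle_node_status/3]).

%% Hypothetical threshold: how far (in log indexes) the slowest member may
%% lag behind the most advanced one before we refuse to auto-grow.
-define(MAX_LAG, 1000).

%% Node    :: node(), the node whose status changed
%% Status  :: up | down (assumed atoms, not an existing Ra type)
%% Members :: #{node() => non_neg_integer()}, current members mapped to
%%            their last confirmed index (the "freshness" indicator)
%% Returns a list of modifications for the leader to apply in order.
handle_node_status(Node, up, Members) ->
    case maps:is_key(Node, Members) orelse any_member_lagging(Members) of
        true -> [];                      % already a member, or cluster not fresh
        false -> [{add_member, Node}]
    end;
handle_node_status(Node, down, Members) ->
    case maps:is_key(Node, Members) of
        true -> [{remove_member, Node}];
        false -> []                      % not a member, nothing to do
    end.

%% True when the gap between the most and least advanced member
%% exceeds ?MAX_LAG. An empty member map is never considered lagging.
any_member_lagging(Members) when map_size(Members) =:= 0 ->
    false;
any_member_lagging(Members) ->
    Indexes = maps:values(Members),
    lists:max(Indexes) - lists:min(Indexes) > ?MAX_LAG.
```

Keeping the function pure (members and indexes in, modifications out) would let the leader own the sequencing of the changes, as described above, while the user code only makes the decision.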
Got it. So, similar to handle_aux, but more specific. We could perhaps re-use the PID monitoring logic too, to run one change at a time, in sequential order. So, perhaps add something like
@kjnilsson I have a simple prototype working that uses a gen_statem timeout to trigger the handle_status actions. However, I am wondering what a good way is to make sure timeouts fire on the right node, i.e. what happens if a leader schedules a delayed timeout and then, for some reason, becomes a follower before the timeout fires.
For now I am handling this timeout trigger in all states and moving the headache to the callback implementation.
Add a new optional callback, or similar, that gets called when a node joins or leaves the Erlang cluster.
The callback can take decisions on what to do with this information, such as adding or removing the node
as one of its members.
Suggestion:
Update the ra_server_proc:leader state, which already handles nodeup/nodedown, to call a new optional callback.
To cause a randomized delay, perhaps add an erlang:send_after with a new info message, something like (erlang:send_after(SOMERANDOMNUMBER, self(), {delayed_node_status_update, Node, Status}))
and a new clause to leader, something like
It would perhaps also be good to send along the members of the Raft cluster, so that the user code does not have to call ra:members().
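The suggestion above can be sketched as a small stand-alone gen_statem. This is not ra_server_proc and not the prototype mentioned in the comments; the module name, the `become/2` and `handled/1` helpers, and the delay constants are hypothetical and exist only to make the sketch runnable. It assumes nodeup/nodedown messages reach the process (for example after a call to `net_kernel:monitor_nodes/1`). The leader schedules a randomized `delayed_node_status_update` message with `erlang:send_after/3`, and every state has a clause for that message so a process that is no longer leader drops it instead of acting on it.

```erlang
-module(delayed_status_sketch).
-behaviour(gen_statem).

-export([start_link/1, become/2, handled/1]).
-export([init/1, callback_mode/0, leader/3, follower/3]).

%% Hypothetical delay bounds: fire between 21 and 50 ms after the event.
-define(MIN_DELAY_MS, 20).
-define(JITTER_MS, 30).

start_link(InitialState) ->
    gen_statem:start_link(?MODULE, InitialState, []).

%% Test helper: force a role change (stands in for a real election).
become(Pid, State) -> gen_statem:cast(Pid, {become, State}).

%% Test helper: list the updates that were acted on while leader.
handled(Pid) -> gen_statem:call(Pid, handled).

callback_mode() -> state_functions.

init(InitialState) when InitialState =:= leader; InitialState =:= follower ->
    {ok, InitialState, []}.

%% Leader: turn a node event into a delayed, randomized self-message.
leader(info, {Evt, Node}, _Data) when Evt =:= nodeup; Evt =:= nodedown ->
    Delay = ?MIN_DELAY_MS + rand:uniform(?JITTER_MS),
    _ = erlang:send_after(Delay, self(),
                          {delayed_node_status_update, Node, Evt}),
    keep_state_and_data;
%% Still leader when the delay expires: this is where the optional
%% callback would be invoked. Here we only record the update.
leader(info, {delayed_node_status_update, Node, Status}, Data) ->
    {keep_state, [{Node, Status} | Data]};
leader(Type, Event, Data) ->
    common(Type, Event, Data).

%% No longer leader: the delayed message is stale, so drop it.
follower(info, {delayed_node_status_update, _Node, _Status}, _Data) ->
    keep_state_and_data;
follower(Type, Event, Data) ->
    common(Type, Event, Data).

common(cast, {become, State}, Data) ->
    {next_state, State, Data};
common({call, From}, handled, Data) ->
    {keep_state_and_data, [{reply, From, lists:reverse(Data)}]};
common(_Type, _Event, _Data) ->
    keep_state_and_data.
```

One design note on the "right node" question raised in the comments: a message sent with `erlang:send_after/3` is delivered regardless of state changes, which is why each state needs a clause for it. A gen_statem `state_timeout` action, by contrast, is cancelled automatically when the state changes, so it may be an alternative if dropping the update on a role change is the desired behaviour.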