
refactor: Load balancer concurrency #4

Merged: 3 commits merged into main from nix/refactor/lb-concurrency on Sep 11, 2023

Conversation

@DavidNix (Contributor) commented Sep 11, 2023

When you mix several sync primitives (mutexes, wait groups, etc.), that's typically a code smell.

The goal of the load balancer is to choose the first available listener that can serve the request.

This removes the primitives in favor of a single channel that sort of acts like a semaphore.

It also has the advantage of preventing extra goroutines. Previously, each call to SendRequest fired up N goroutines, where N is the number of listeners. You can imagine that if SendRequest were called frequently, the goroutine count could explode.

Now there are 0 extra goroutines.
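
For reference, here is a minimal, runnable Go sketch of the pattern described above: a buffered channel holds the idle listeners and doubles as a semaphore, so acquiring a listener is a channel receive and releasing it is a send. The loadBalancer, listener, request, and response names below are illustrative stand-ins, not the actual horcrux types (the real code works with *SignerListenerEndpoint and privvalproto.Message); only the avail channel and the acquire/release lines mirror the diff.

    package main

    import "fmt"

    // Stand-ins for the real types: the actual code uses *SignerListenerEndpoint
    // and privvalproto.Message.
    type request struct{ payload string }
    type response struct{ payload string }

    type listener struct{ id int }

    func (l *listener) SendRequest(req request) (*response, error) {
    	return &response{payload: fmt.Sprintf("listener %d handled %q", l.id, req.payload)}, nil
    }

    // loadBalancer keeps every idle listener in a buffered channel that acts as a
    // semaphore: a receive acquires a listener, a send releases it.
    type loadBalancer struct {
    	avail chan *listener
    }

    func newLoadBalancer(listeners []*listener) *loadBalancer {
    	avail := make(chan *listener, len(listeners)) // buffer size = number of listeners
    	for _, l := range listeners {
    		avail <- l // every listener starts out available
    	}
    	return &loadBalancer{avail: avail}
    }

    // SendRequest blocks until a listener is free, uses it, and returns it to the
    // pool. No extra goroutines are spawned per call.
    func (lb *loadBalancer) SendRequest(req request) (*response, error) {
    	lis := <-lb.avail
    	defer func() { lb.avail <- lis }() // never blocks: the buffer has room for every listener
    	return lis.SendRequest(req)
    }

    func main() {
    	lb := newLoadBalancer([]*listener{{id: 1}, {id: 2}})
    	resp, err := lb.SendRequest(request{payload: "sign vote"})
    	if err != nil {
    		panic(err)
    	}
    	fmt.Println(resp.payload)
    }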

@DavidNix DavidNix changed the title refactor: Load balance concurrency refactor: Load balancer concurrency Sep 11, 2023
Comment on lines -111 to -115
return sl.SendRequestLocked(request)
}

// SendRequest ensures there is a connection, sends a request and waits for a response
func (sl *SignerListenerEndpoint) SendRequestLocked(request privvalproto.Message) (*privvalproto.Message, error) {
@DavidNix (Contributor, Author)

I removed the SendRequestLocked because I don't think the caller should ever need to use the underlying mutex. That is a recipe for deadlocks.

Member

A recipe for deadlocks from future changes made by accident; I agree.

@DavidNix (Contributor, Author)

Exactly, for changes in the future where we forget the intricacies.
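
As an illustration of the concern in this thread (a hedged sketch, not horcrux code; the endpoint type and the misuse function are assumptions): once a *Locked variant and its mutex are part of the surface a caller can reach, a later change that takes the lock and then calls the plain method deadlocks, because sync.Mutex is not reentrant.

    package main

    import "sync"

    // endpoint is an illustrative stand-in for a type that guards its I/O with a mutex.
    type endpoint struct {
    	mtx sync.Mutex
    }

    // SendRequest takes the lock, then delegates to the unexported locked variant.
    func (e *endpoint) SendRequest() {
    	e.mtx.Lock()
    	defer e.mtx.Unlock()
    	e.sendRequestLocked()
    }

    // sendRequestLocked assumes e.mtx is already held by the caller.
    func (e *endpoint) sendRequestLocked() { /* ... do the I/O ... */ }

    // misuse is the kind of future change being worried about: it holds the lock
    // and then calls the locking method, which blocks forever on the second Lock.
    func misuse(e *endpoint) {
    	e.mtx.Lock()
    	defer e.mtx.Unlock()
    	e.SendRequest() // deadlock: sync.Mutex is not reentrant
    }

    func main() {
    	// misuse(&endpoint{}) would hang, so it is only referenced here, not run.
    	_ = misuse
    }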

Comment on lines +31 to +32
lis := <-lb.avail
defer func() { lb.avail <- lis }()
@DavidNix (Contributor, Author)

This was the main change.

Pull a listener off the queue (i.e., the channel), then put it back when we're done with it. Line 31 will block if there are no available listeners. Line 32 will never block because the channel's buffer size equals the number of listeners.
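
To make the blocking behavior concrete, here is a small hedged demo built on the illustrative loadBalancer sketch shown under the PR description above (stand-in types, not the actual horcrux code). With the only listener checked out, a concurrent SendRequest blocks at the receive until the listener is sent back; the send side never blocks because the buffer always has room.

    // demoBlocking shows the acquire/release semantics: the goroutine's
    // SendRequest cannot proceed until the checked-out listener is returned.
    func demoBlocking() {
    	lb := newLoadBalancer([]*listener{{id: 1}})

    	// Simulate an in-flight request by checking out the only listener.
    	busy := <-lb.avail

    	done := make(chan struct{})
    	go func() {
    		// Blocks on <-lb.avail inside SendRequest until the listener is returned below.
    		if _, err := lb.SendRequest(request{payload: "sign proposal"}); err != nil {
    			fmt.Println("unexpected error:", err)
    		}
    		close(done)
    	}()

    	lb.avail <- busy // release: the goroutine can now acquire the listener
    	<-done
    	fmt.Println("second request completed only after the listener was released")
    }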

@DavidNix DavidNix marked this pull request as ready for review September 11, 2023 23:14
@agouin (Member) left a comment

This is much nicer to read. It has the drawback that if a connection ping is in progress, a listener might be chosen even though it's not done, but I don't think that would significantly impact signing uptime.

@DavidNix (Contributor, Author)

I'll have a follow-up PR to handle the case where the SignerListener is currently handling a request.

@DavidNix DavidNix merged commit 322b9bc into main Sep 11, 2023
3 checks passed
@DavidNix DavidNix deleted the nix/refactor/lb-concurrency branch September 11, 2023 23:38