fix: Data race in sentries #5

Merged 6 commits on Sep 13, 2023

DavidNix
Contributor

This includes a bit of code cleanup. Passing around a pointer to a shared struct (*appState) lends itself to data races, as it did in this case.

As such, this PR removes appState.

This PR also ensures that we properly close all connections when the process is terminated.

@DavidNix
Contributor Author

@agouin What's the best way to test this in the wild? Deploy a custom image to the sentry-testnet vcluster?

It would be ideal to have a dev vcluster one day.

done := make(chan struct{})
cometos.TrapSignal(a.logger, func() {
for _, s := range a.sentries {
Contributor Author

One part of the race is that we read from the map here.


if err := s.Start(); err != nil {
return fmt.Errorf("failed to start new remote signer(s): %w", err)
}
a.sentries[newSentry] = s
Contributor Author

The second part of the race is that we read and mutate the sentries map here, in a separate goroutine. There was no guarantee that this goroutine would have exited by the time the main() goroutine needed to read the map.

Comment on lines +106 to +107
close(w.stop)
<-w.done
Contributor Author

The synchronization here ensures that only one goroutine modifies w.sentries at a time.

DavidNix marked this pull request as ready for review September 13, 2023 16:53

return fmt.Errorf("failed to start listener(s): %w", err)
}
defer logIfErr(logger, loadBalancer.Stop)
Contributor Author

Previously loadBalancer would not have been stopped if watching sentries failed.

@agouin
Member

agouin commented Sep 13, 2023

> What's the best way to test this in the wild? Deploy a custom image to the sentry-testnet vcluster?

Yes, the sentry-testnet vcluster is good. No concerns if we have uptime issues there.

Member

@agouin agouin left a comment

looks good!

@DavidNix
Contributor Author

I'll test a prerelease on the testnet sentries

@DavidNix DavidNix merged commit 7c1fefd into main Sep 13, 2023
3 checks passed
@DavidNix DavidNix deleted the nix/fix/sentries-data-race branch September 13, 2023 19:56
2 participants