Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

CloseConnection considered harmful #641

Open
wants to merge 6 commits into
base: devel
Choose a base branch
from

Conversation

ghjm
Copy link
Contributor

@ghjm ghjm commented Jul 11, 2022

What's wrong with this code? (Other than the lack of error checking.)

n := netceptor.New(...)
conn, _ := n.Dial("node2", "mysvc", nil)
fmt.Fprintf(conn, "Hello, world!\n")
conn.Close()

Answer: It doesn't close the connection. A PacketConn and QUIC connection, with the associated dozen-or-so goroutines, are still running, and will remain running for the life of your program.

Somehow (and I'm well aware that this is code that I probably wrote), in Receptor-land it turns out that Close() has the meaning we would normally associate with CloseWrite() - it shuts down the write side of the connection and indicates an EOF to the other side, but leaves the connection open and ready to continue receiving data from the other side. If you want to actually close the Receptor connection, you have to call CloseConnection(), which has the effect we would normally associate with Close().

If you don't know about this and just call Close(), expecting Receptor to behave like any other networking library, you'll wind up leaking goroutines and memory like crazy. And given that CloseConnection() only appears in the Receptor code in a few places in Workceptor, I'm imagining that long-running Receptor instances probably do leak goroutines and memory like crazy.

This PR renames the functions to the sensible names, and updates Workceptor to call the right functions. If other parts of the code are relying on Close() not really closing the connection, then this will break them, but arguably they were already broken since they were leaking connections. Also, this is a breaking change for API users, so should be in a new minor release.

@ghjm ghjm requested review from fosterseth and shanemcd July 11, 2022 20:09
@ghjm ghjm force-pushed the sane_close_names branch 2 times, most recently from 214fbe4 to e4d3ff9 Compare July 12, 2022 01:52
@AaronH88
Copy link
Contributor

Hi @ghjm
I am reviewing all PRs to tidy up and I really like this one.
Are you ok to rebase?
No problem if not, I can take this as inspiration and create a new PR 😄

c.pc.cancel()
// Close closes the connection.
func (c *Conn) Close() error {
if !c.fromListener {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you explain why this packet conn only can be cancelled this way please?

I am struggling to understand when fromListener will be set to false

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If I remember right, fromListener is set based on whether a Conn object was created by the user opening it, or was created by a listener receiving a connection. If the Conn is standalone, then when we close it, we need to close the underlying PacketConn. But if the Conn came from a listener, the PacketConn is shared, and should only be closed when the listener is closed.

Handler: mux,
Addr: b.address,
Handler: mux,
ReadHeaderTimeout: 10 * time.Second,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

any other reason for this timeout other than tests?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, this is a mitigation for the Slowloris attack.

@ghjm
Copy link
Contributor Author

ghjm commented May 22, 2023

I can take a look at rebasing it later today, but I don't have time to fix non-trivial issues with it if it doesn't apply cleanly, or fails unit testing or what have you.

I vaguely remember there being some problem why this didn't get merged, but I don't remember what.

@AaronH88
Copy link
Contributor

I can take a look at rebasing it later today, but I don't have time to fix non-trivial issues with it if it doesn't apply cleanly, or fails unit testing or what have you.

I vaguely remember there being some problem why this didn't get merged, but I don't remember what.

Sounds great @ghjm

Let me know, happy to take over this PR if its more suitable as I think it adds a lot of value!
Thanks 😄

@ghjm
Copy link
Contributor Author

ghjm commented May 29, 2023

@AaronH88 if you've got time to work on this then maybe you'd better take it over. I'd love to mess with it but I'm just not finding the time.

@AaronH88
Copy link
Contributor

@AaronH88 if you've got time to work on this then maybe you'd better take it over. I'd love to mess with it but I'm just not finding the time.

No problem at all!
Thanks for getting back to me.

@sonarqubecloud
Copy link

SonarCloud Quality Gate failed.    Quality Gate failed

Bug A 0 Bugs
Vulnerability A 0 Vulnerabilities
Security Hotspot A 0 Security Hotspots
Code Smell A 1 Code Smell

No Coverage information No Coverage information
8.9% 8.9% Duplication

idea Catch issues before they fail your Quality Gate with our IDE extension sonarlint SonarLint

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants