Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Connection can't be closed when there is unread DataFrame in the stream #97

Open
bitoku opened this issue May 20, 2024 · 0 comments
Open

Comments

@bitoku
Copy link

bitoku commented May 20, 2024

When a stream gets a DataFrame and nothing reads data from the stream of the connection, it can't close the connection and leads to memory leak.

When I'm working on https://github.com/cri-o/cri-o memory leak issue, I found that in the situation above, spdy stream server causes memory leak.
The leak happens when it tries to close the connection, and it waits wg.Wait() forever.

spdystream/connection.go

Lines 392 to 394 in 57d1ca2

// wait for all frame handler workers to indicate they've drained their queues
// before handling the go away frame
wg.Wait()

The cause of the issue is that it can't finish frameHandler() because of the deadlock.
There are some workers calling frameHandler(), and wg.Add() when each worker starts and wg.Done() when each worker is done.

spdystream/connection.go

Lines 317 to 333 in 57d1ca2

for i := 0; i < FRAME_WORKERS; i++ {
frameQueues[i] = NewPriorityFrameQueue(QUEUE_SIZE)
// Ensure frame queue is drained when connection is closed
go func(frameQueue *PriorityFrameQueue) {
<-s.closeChan
frameQueue.Drain()
}(frameQueues[i])
wg.Add(1)
go func(frameQueue *PriorityFrameQueue) {
// let the WaitGroup know this worker is done
defer wg.Done()
s.frameHandler(frameQueue, newHandler)
}(frameQueues[i])
}

DataFrames queued in a PriorityFrameQueue are processed in frameHandler() and if it is a DataFrame, it is handled by dataFrameHandler() ( = handleDataFrame()).
In the function, it blocks until either the stream is closed or something reads the stream dataChan (e.g. something calls Read() or ReadData())
Otherwise the function can't finish, which means frameHandler() also can't finish, and it never calls wg.Done().

spdystream/connection.go

Lines 598 to 607 in 57d1ca2

if len(frame.Data) > 0 {
stream.dataLock.RLock()
select {
case <-stream.closeChan:
debugMessage("(%p) (%d) Data frame not sent (stream shut down)", stream, stream.streamId)
case stream.dataChan <- frame.Data:
debugMessage("(%p) (%d) Data frame sent", stream, stream.streamId)
}
stream.dataLock.RUnlock()
}

If nothing reads the data from the stream, the only way of blocking it is to close the stream.
However streams in a connection are closed after wg.Wait().

spdystream/connection.go

Lines 394 to 409 in 57d1ca2

wg.Wait()
if goAwayFrame != nil {
s.handleGoAwayFrame(goAwayFrame)
}
// now it's safe to close remote channels and empty s.streams
s.streamCond.L.Lock()
// notify streams that they're now closed, which will
// unblock any stream Read() calls
for _, stream := range s.streams {
stream.closeRemoteChannels()
}
s.streams = make(map[spdy.StreamId]*Stream)
s.streamCond.Broadcast()
s.streamCond.L.Unlock()

The simplest fix is to move stream.closeRemoteChannels() before wg.Wait(), but I'm unsure if it causes other race condition errors.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

1 participant