Fix metrics data race in the Engine test run finalization #1888

Merged
merged 1 commit on Mar 8, 2021
9 changes: 9 additions & 0 deletions core/engine.go
@@ -328,6 +328,15 @@ func (e *Engine) processMetrics(globalCtx context.Context, processMetricsAfterRun
		case <-ticker.C:
			processSamples()
		case <-processMetricsAfterRun:
		getCachedMetrics:
			for {
				select {
				case sc := <-e.Samples:
					sampleContainers = append(sampleContainers, sc)
				default:
					break getCachedMetrics
				}
			}
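The labeled break above drains every sample already buffered on `e.Samples` without ever blocking on the channel. A minimal standalone sketch of the same pattern, using a plain `int` channel instead of k6's sample containers:

```go
package main

import "fmt"

// drain performs non-blocking reads until the channel's buffer is empty,
// mirroring the getCachedMetrics loop in the diff: the default case fires
// as soon as no value is ready, and the labeled break exits the outer for.
func drain(ch chan int) []int {
	var out []int
getCached:
	for {
		select {
		case v := <-ch:
			out = append(out, v)
		default:
			break getCached // nothing buffered: stop instead of blocking
		}
	}
	return out
}

func main() {
	ch := make(chan int, 4)
	ch <- 1
	ch <- 2
	ch <- 3
	fmt.Println(drain(ch)) // [1 2 3]
}
```

Without the label, `break` would only exit the `select`, not the `for`, and the loop would spin forever.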
Comment on lines +331 to +339
Contributor
From the looks of things, this should be called after all the VUs have stopped, so at least it shouldn't be possible to get an endless loop from something continuously adding samples.

Obviously, this won't stop an extension or badly written internal code from just continuing to send metrics from a separate goroutine without checking the context, but I don't think that's such a big problem.

I would still somewhat prefer to have a timeout here, just in case we are wrong or something changes in the future. But I think that can wait.

Member Author

I am not convinced a timeout is a good idea here... It shouldn't be needed, and we don't have timeouts for other code that processes metrics and might be overwhelmed, so this probably shouldn't have one either. That said, I opened an issue mentioning the potential problem: #1889

But for now, unless we observe some issues stemming from this, I'm for not touching it 😅

			e.logger.Debug("Processing metrics and thresholds after the test run has ended...")
			processSamples()
			if !e.runtimeOptions.NoThresholds.Bool {