-
Notifications
You must be signed in to change notification settings - Fork 825
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Batch Metrics Handler for Mutable State Metrics #6603
base: main
Are you sure you want to change the base?
Conversation
|
@@ -128,6 +128,7 @@ type ( | |||
eventNotifier events.Notifier | |||
tokenSerializer common.TaskTokenSerializer | |||
metricsHandler metrics.Handler | |||
batchMetricsHandler metrics.BatchMetricsHandler |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's already pretty annoying to pass around a bunch of these "context objects" like metrics.Handler and log.Logger (sometimes two loggers), so I hate to add more. Is there some way we could do this in the existing metrics.Handler? I guess we don't want to break existing implementations, so maybe the interface extension pattern? (though I know that has problems)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for the quick look @dnr!
Sure, I agree plumbing this around the place is a pain. As you said, I don't want to break the existing interface so extending could work. Alternatively we could wrap the various observability handlers within some unified wrapper so we just have the one thing to pass around and we keep the batch handler distinct from the standard handler.
For the former I'd have BatchMetricsHandler extend metricsHandler and remove the basic metricsHandler from the context objects that were touched in this PR. Over time if and when other batch metrics are added in the codebase the batch handler can replace the old handler.
For the latter we'd have some kind of ObservabilityHandler
containing the 3 types which replaces the Log / Metrics / BatchMetrics handlers in the relevant contexts. This would ultimately mean less things to pass around, the downside being the coupling of Logging with Metrics (though in practice, I'm not sure how much of an issue this would be).
What do you think?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
When I discussed the approach with @njo originally, we had a different approach. metrics.Handler
had a StartBatch()
method that returns a BatchHandler
, the BatchHandler
is just a metrics.Handler
and an io.Closer
.
With that approach, you'd have:
batch := metricsHandler.StartBatch()
defer batch.Close()
// Record metrics on the batch
metrics.MutableStateSize.With(batch).Record(int64(stats.TotalSize))
metrics.ExecutionInfoSize.With(batch).Record(int64(stats.ExecutionInfoSize))
metrics.ExecutionStateSize.With(batch).Record(int64(stats.ExecutionStateSize))
The default implementation for a metrics handler would return itself with a nop closer.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That sounds reasonable @bergundy. I misunderstood when we talked that we'd be using the same metrics handler interface. To clarify, I'd be extending the existing interface here rather than creating a new Handler type that wraps the old one? So going through and updating our existing handler implementations and leaving folks with a custom handler to add the new batch method?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We have broken source-compatibility for users who use the server as a library before, but we try pretty hard not to. I feel like this case might be worth it, though, since it's very easy to fix. We should get a team consensus on that.
I like the ObservabilityHandler idea! We might need/want to do some unification of tags
TODOS:
What changed?
Added batch metrics which can be used to emit "wide events" if a suitable backend is available (otherwise fall back to sending individual events for each field). Modified the mutable state metrics to emit a wide event.
Why?
Wide events can reduce the total number of network events (mutable state metrics can now send 1 event rather than 22). The metrics / events can also be correlated and provide more context.
How did you test it?
Ran unit tests.
Potential risks
I'd expect the server wouldn't start at all or throw a NPE when a mutable state event is to be emitted in the worst case.
Documentation
Is hotfix candidate?
No