Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Batch Metrics Handler for Mutable State Metrics #6603

Open
wants to merge 2 commits into
base: main
Choose a base branch
from
Open

Conversation

njo
Copy link

@njo njo commented Oct 4, 2024

TODOS:

  • add unit test ensuring mutable state metrics are emitted with the fallback metrics handler
  • document change
  • add tags to a cloned batch handler instance like we do in the metrics handler through the codebase

What changed?

Added batch metrics which can be used to emit "wide events" if a suitable backend is available (otherwise fall back to sending individual events for each field). Modified the mutable state metrics to emit a wide event.

Why?

Wide events can reduce the total number of network events (mutable state metrics can now send 1 event rather than 22). The metrics / events can also be correlated and provide more context.

How did you test it?

Ran unit tests.

Potential risks

I'd expect the server wouldn't start at all or throw a NPE when a mutable state event is to be emitted in the worst case.

Documentation

Is hotfix candidate?

No

@njo njo requested a review from a team as a code owner October 4, 2024 18:22
@CLAassistant
Copy link

CLA assistant check
Thank you for your submission! We really appreciate it. Like many open source projects, we ask that you sign our Contributor License Agreement before we can accept your contribution.
You have signed the CLA already but the status is still pending? Let us recheck it.

@@ -128,6 +128,7 @@ type (
eventNotifier events.Notifier
tokenSerializer common.TaskTokenSerializer
metricsHandler metrics.Handler
batchMetricsHandler metrics.BatchMetricsHandler
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's already pretty annoying to pass around a bunch of these "context objects" like metrics.Handler and log.Logger (sometimes two loggers), so I hate to add more. Is there some way we could do this in the existing metrics.Handler? I guess we don't want to break existing implementations, so maybe the interface extension pattern? (though I know that has problems)

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the quick look @dnr!

Sure, I agree plumbing this around the place is a pain. As you said, I don't want to break the existing interface so extending could work. Alternatively we could wrap the various observability handlers within some unified wrapper so we just have the one thing to pass around and we keep the batch handler distinct from the standard handler.

For the former I'd have BatchMetricsHandler extend metricsHandler and remove the basic metricsHandler from the context objects that were touched in this PR. Over time if and when other batch metrics are added in the codebase the batch handler can replace the old handler.

For the latter we'd have some kind of ObservabilityHandler containing the 3 types which replaces the Log / Metrics / BatchMetrics handlers in the relevant contexts. This would ultimately mean less things to pass around, the downside being the coupling of Logging with Metrics (though in practice, I'm not sure how much of an issue this would be).

What do you think?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

When I discussed the approach with @njo originally, we had a different approach. metrics.Handler had a StartBatch() method that returns a BatchHandler, the BatchHandler is just a metrics.Handler and an io.Closer.

With that approach, you'd have:

batch := metricsHandler.StartBatch()
defer batch.Close()

// Record metrics on the batch
metrics.MutableStateSize.With(batch).Record(int64(stats.TotalSize))
metrics.ExecutionInfoSize.With(batch).Record(int64(stats.ExecutionInfoSize))
metrics.ExecutionStateSize.With(batch).Record(int64(stats.ExecutionStateSize))

The default implementation for a metrics handler would return itself with a nop closer.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That sounds reasonable @bergundy. I misunderstood when we talked that we'd be using the same metrics handler interface. To clarify, I'd be extending the existing interface here rather than creating a new Handler type that wraps the old one? So going through and updating our existing handler implementations and leaving folks with a custom handler to add the new batch method?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We have broken source-compatibility for users who use the server as a library before, but we try pretty hard not to. I feel like this case might be worth it, though, since it's very easy to fix. We should get a team consensus on that.

I like the ObservabilityHandler idea! We might need/want to do some unification of tags

@njo njo changed the title Mutable spans Batch Metrics Handler for Mutable Span Metrics Oct 4, 2024
@njo njo changed the title Batch Metrics Handler for Mutable Span Metrics Batch Metrics Handler for Mutable State Metrics Oct 4, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants