-
Notifications
You must be signed in to change notification settings - Fork 588
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Support Custom Metric Attributes Per Request #6281
base: main
Are you sure you want to change the base?
Conversation
370d0b2
to
737f700
Compare
See also #6092 (which is still a draft). |
Could we follow the same pattern as we're doing for otelhttp? |
737f700
to
e20e5cf
Compare
@dmathieu pushed up changes. Had problems getting the |
@dmathieu would the ability to access the We have a use case where we would only know what attributes to add to the metrics while we are in the handler after db / network calls. |
I don't know if we should make metrics so permissive. By their nature, metrics must be low cardinality. |
Yeah that's fair @dmathieu. How's this PR looking in general? |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'd recommend running the tests locally to ensure everything passes and help with reviews 😸
if f := c.MetricAttributesFn; f != nil { | ||
attrs := f(ctx, rs.Payload) | ||
|
||
gctx.metricAttrs = append(gctx.metricAttrs, attrs...) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It doesn't seem like we need to modify gctx
here?
This also causes a race problem.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
So the #6092 also modified this, otherwise the metric attributes won't be propagated on the *stats.End
call where the majority of the metrics are recorded, since I don't think *stats.End
has access to the Payload although I will check that.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
InPayload
is the only stats.RPCStats
implementation that has access to the request payload. So the this is only time we can call the MetricAttributesFn
to be able to append too gRPCContext.metricAttrs
, I can put a lock around metricAttrs
protect against races?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm very hesitant about updating a pointer to struct that is stored as a context value. That really looks like a smell.
Unfortunately, we can't do things cleanly (provide a new context with the updated value), as we can't provide a new context as return value.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ok, this doesn't seem like it's going to be possible then due to the limitations in how the stats.Handler
works.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@dmathieu any thoughts on if we should continue with this or not?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'd like to have other opinions.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ok, I'll leave it with you @dmathieu 👍
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Any updates on this?
I did, but I only had issues with one test as I mentioned here: #6281 (comment) |
Coming back at this, I don't think it's a good thing to add. Metrics should be low cardinality, and this really opens a big window for folks to shoot them in the foot with high cardinality metrics. Unless there are objections, I will close this PR in 24 hours. |
IMHO developers should decide if the increased cardinality is justified or not. Maybe I'm doing it wrong but in my experience it's much cheaper to add an attribute to a metric than trace every request if you want a comprehensive view of your requests. There's an issue and 2 PRs now trying to solve this so there is clearly demand for the capability. I've currently forked this statshandler in my code to add an extra attribute which is less than ideal. |
What attributes are you adding? |
These are domain/service specific attributes in the context. In the case of our monitoring platform (Datadog) we still has the ability to choose to exclude tags from being indexed, so the cardinality and cost increase can be managed quite easily. OTOH enabling tracing for every request to use trace metric is prohibitively expensive at scale, hence the preference to do this as a metric attribute. |
If there are domain/service specific attributes than maybe you can create domain/service specific metrics that would use these attributes instead asking to add custom attributes to instrumentation libraries which are supposed to follow the OpenTelemetry Semantic Conventions.
There may be other ways addressing your use case. |
IIUC you're saying that RPC metrics must never have any attributes other than those listed in the semantic conventions?
Yes recommendations are welcome, I believe these current set of PRs exist because the view was that this was an acceptable solution since it was also done for the http handler. |
I do understand that there are use cases where users may want to add custom attributes. E.g. see: open-telemetry/opentelemetry-specification#4298. However, I am not sure if these capabilities (to add explicitly add additional attributes) should be included to the instrumentation libraries. It may be an SDK capability. I would not say must never, but I would just avoid if possible. Nothing in OpenTelemetry Specification says that instrumentation libraries should offer such capability. Do you know if any other languages (e.g. Java, .NET, Python) allow adding custom metrics to the instrumentation libraries? As far as I remember it is not possible in .NET.
I am not sure if this is actually good that it was added. Notice that |
Appreciate the response @pellared We seem to already have a few ways to add metric attributes:
Some making a clear distinction as to why per-request metric attributes are a bridge to far vs resource-wide attributes would help the case for closing this PR and issue. |
I think it comes down to the purpose of these instrumentation libraries. Are they meant to be the one stop shop where we can come and get the officially maintained instrumentation from OTEL? Or is it mean to be the bare minimum instrumentation that is meant to be an example of how you could do HTTP/gRPC instrumentation. If the former then there is clearly a use case where attributes are not known until the request is handled and the SDK should provide a way to do that. Regardless of whether we shoot ourselves in the foot with cardinality. I don't think limitations in other languages should influence the feature set of another. If the latter then that should be it should be made clear in the documentation and it's up to Platform Engineers such as myself to use this as a template to provide the instrumentation to the teams I support. Both are fine approaches and I'm totally down for writing my own implementation to service the teams I support. Of course I would much rather use an officially maintained SDK, takes the weight off of me lol. |
This. I want to get a full understanding (and Go SIG as well as Spec SIG agreement) on what it the scope (and constrains) of instrumentation libraries. Thus I started a discussion in OTel Specification. I want to document what are the common expectations from instrumentation libraries and consistent usage experience across instrumentation libraries in OTel Go Contrib. It should also make contributions easier. |
Based on today's OTel Spec SIG meeting: |
// If a MetricAttributesFn is defined call this function and update the gRPCContext with the metric attributes | ||
// returned by ths function. | ||
if f := c.MetricAttributesFn; f != nil { | ||
attrs := f(ctx, rs.Payload) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What about stats.OutPayload
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Another PR similar to this one had different functions for each rs.(type)
, we could go down that road or one function where the caller would have to type assert themselves.
Does that mean this PR is good to continue progressing with? I'm on vacation atm but I can pick it back up next week if so. |
This
We discussed it during last Go SIG meeting. I am on PTO as well but once I back I plan to create an issue (or even project) about designing and standarizing an approach for instrumentation libraries' customization. Are you interested into contributing? We lack velocity at this moment to work on it more actively. If so please consider joining our next Go SIG meeting. |
Yeah I am totally up for contributing, when is the next SIG? |
Any news on this? |
What
Adds a new
WithMetricsAttributesFn
option which allows custom metric attributes to be added on a per request basis.Why
So standard metrics provided by the gRPC instrumentation package can be annotated on a per request basis. Our use case is to add attributes to the instruments depending on what the handler is doing with the request.