-
Notifications
You must be signed in to change notification settings - Fork 358
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Update metrics to support OpenTelemetry specification #5120
base: main
Are you sure you want to change the base?
Conversation
{ | ||
} | ||
|
||
public abstract bool SupportsDelta { get; } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This feels confusing that the type is named "MeterInstrumentDeltaMeasurementPayload" and yet it might have SupportsDelta return false. Does the "Delta" in the type name have a different meaning than the Delta in the property name?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I moved it on to MeterPayload
with a default of false
. For instruments where delta is supported they now return true
.
|
||
public abstract bool SupportsDelta { get; } | ||
|
||
public virtual void Combine(MeterInstrumentDeltaMeasurementPayload other) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Right now all these payload objects are simple immutable objects that represent data which came across the wire and it would be nice to preserve that immutability. Rather than modify the objects in place perhaps you could track the mutating state in a couple variables you define together with the for loop iterating over them? That seems simpler and clearer to me than trying to reuse the payload object also as an accumulator for different subsets of the properties.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I dropped the combine concept. I do like the immutability of these objects too. They can mutate DisplayName it seems but still 😄 Hard to know exactly how the aggregation will work. I took a look at my PoC. At the moment it passes a payload type. So it will have to allocate a new one with the final value. But that may not be the end of the world, or something else could be possible, or we could do more here. But figured OK to get the good pieces in so it can keep moving/evolving.
|
||
public Quantile[] Quantiles { get; private set; } | ||
|
||
public override bool SupportsDelta => true; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Does doing delta aggregation make sense given that the quantiles can't be combined? I see that you are only doing it for Count and Sum and those do work functionally, so this is more of a broader scenario question if it makes sense to offer this partial information for Histograms vs leaving them out for now and telling people they have to update to a newer version of .NET to make that part of the scenario work.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The whole combine concept is gone for now. I left count/sum on here though. Talking with @alanwest backends like NR can do nice things with just sum/count so I felt it was still worth exposing the data to interested listeners 🤷
This is for eventpipe consumers to be able to do a great job of exposing metrics via Otel, such as dotnet-monitor? |
@samsp-msft Yes. There's 3 PRs open, one for each signal. This is coming out of the PoC I did for dotnet-monitor to support OTel (OTLP) egress. Basically getting the pipelines into shape. |
@@ -381,11 +417,11 @@ private static void HandleUpDownCounterValue(TraceEvent traceEvent, CounterConfi | |||
} | |||
|
|||
CounterMetadata metadata = GetCounterMetadata(meterName, instrumentName, id); | |||
if (double.TryParse(valueText, NumberStyles.Number | NumberStyles.Float, CultureInfo.InvariantCulture, out double value)) | |||
if (double.TryParse(rateText, NumberStyles.Number | NumberStyles.Float, CultureInfo.InvariantCulture, out double rate) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is the rate value not being present also tied to counter ending events?
This PR expands the metrics artifacts to support OpenTelemetry.
To see how this will be used, I did a proof-of-concept. The final work will change somewhat, but this should explain the changes.
Listener is here: https://github.com/CodeBlanch/dotnet-monitor/blob/b84ba9229f5996d99f934ef862008c138d294b0e/src/Tools/dotnet-monitor/OpenTelemetry/OpenTelemetryService.cs#L374
MetricsStore gains an OTel-style snapshot which understands how to do proper delta aggregation using some of these changes: https://github.com/CodeBlanch/dotnet-monitor/blob/b84ba9229f5996d99f934ef862008c138d294b0e/src/Microsoft.Diagnostics.Monitoring.WebApi/Metrics/MetricsStore.cs#L161-L178
The aggregated data is exported periodically as a "metrics producer" in the OTel world: https://github.com/CodeBlanch/dotnet-monitor/blob/b84ba9229f5996d99f934ef862008c138d294b0e/src/Tools/dotnet-monitor/OpenTelemetry/OpenTelemetryService.cs#L378
/cc @noahfalk @samsp-msft @rajkumar-rangaraj @wiktork