Include docs retrieved, docs output, and server time in FeedResponse diagnostics #2121

Closed
chwarr opened this issue Jan 13, 2021 · 6 comments
Labels
discussion-wanted (Need a discussion on an area), feature-request (New feature or request), QUERY

Comments

@chwarr
Member

chwarr commented Jan 13, 2021

Is your feature request related to a problem? Please describe.

Our application needs to log and emit metrics about the Cosmos DB queries it executes. We want to log/emit:

  • the client observed latency,
  • the service observed latency,
  • the number of documents the query retrieved, and
  • the number of documents the query output.

We want to log/emit these per "page" of the query that we read (each call to FeedIterator.ReadNextAsync). We will aggregate them ourselves per account, per database, per container, and per query spec.
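
For illustration, the per-page aggregation described above could be sketched as follows. This is a language-neutral sketch written in Python so that it runs on its own; `QueryKey`, `PageMetrics`, `Totals`, and `MetricsAggregator` are hypothetical names, not SDK types, and the per-page numbers are assumed to come from whatever the SDK exposes for each `ReadNextAsync` call.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class QueryKey:
    """Hypothetical bucket key: account, database, container, and query spec."""
    account: str
    database: str
    container: str
    query_spec: str


@dataclass
class PageMetrics:
    """Hypothetical per-page values, one instance per page read."""
    client_latency_ms: float
    server_latency_ms: float
    retrieved_docs: int
    output_docs: int


@dataclass
class Totals:
    """Running sums for one QueryKey."""
    pages: int = 0
    client_latency_ms: float = 0.0
    server_latency_ms: float = 0.0
    retrieved_docs: int = 0
    output_docs: int = 0


class MetricsAggregator:
    """Accumulates per-page metrics into one Totals record per QueryKey."""

    def __init__(self):
        self._totals = defaultdict(Totals)

    def record(self, key, page):
        # Add one page's values to the running sums for this key.
        totals = self._totals[key]
        totals.pages += 1
        totals.client_latency_ms += page.client_latency_ms
        totals.server_latency_ms += page.server_latency_ms
        totals.retrieved_docs += page.retrieved_docs
        totals.output_docs += page.output_docs

    def totals(self, key):
        # Returns zeroed totals for a key that has not been recorded yet.
        return self._totals[key]
```

Calling `record` once per page keeps the logging cost constant per page, and the same key can be reused across deployments to compare totals before and after a change.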

We can get the client observed latency using a Stopwatch or CosmosDiagnostics.GetClientElapsedTime. None of the other values are exposed in the v3 SDK.

We want docs retrieved and docs output details so that we can monitor and alert on how well we are using our indices.

We do not use Azure Monitor and it is not particularly easy for us to integrate with it. I also do not believe that Azure Monitor tracks these metrics per query spec.

Describe the solution you'd like

A well-typed, possibly lazily-parsed, possibly opt-in QueryMetrics property on FeedResponse or something that FeedResponse refers to. E.g., FeedResponse.Diagnostics.QueryMetrics.OutputDocumentCount would be fine.

Describe alternatives you've considered

We have looked at parsing the "QueryMetric" property from FeedResponse.Diagnostics.ToString() and extracting the "totalExecutionTimeInMs", "retrievedDocumentCount", and "outputDocumentCount" properties. However, the format of the diagnostics string is non-contractual, as far as we are aware.

Additional context

The v2 SDK exposed this via the FeedResponse<T>.QueryMetrics property.

@j82w j82w added discussion-wanted Need a discussion on an area feature-request New feature or request QUERY labels Jan 14, 2021
@j82w
Contributor

j82w commented Jan 14, 2021

What is your goal in recording just those specific values? If you are just looking for index usage on a query, I believe that is a different value.

The entire CosmosDiagnostics is needed to troubleshoot latency in the SDK, which is why individual values are not exposed. In the v2 SDK, many users recorded only a specific field or set of fields, and when an issue was hit, the Cosmos DB team could not troubleshoot the problem because of the missing information.

@chwarr
Member Author

chwarr commented Jan 14, 2021

Our larger goal is to make sure that the RU charges of the queries we execute are reasonable and haven't changed unexpectedly when we deploy new versions of our app or adjust the container's index policy. If a query's RU charge jumps up, we want to be able to make sure that is because of a required logic change, and not because we, say, accidentally changed a WHERE predicate from an indexed column to an unindexed column.

We typically look at the ratio of docs output to docs retrieved. Is that the same as IndexHitRatio? I just saw that metric today when looking again at what the v2 SDK has. If so, then the index hit ratio will be enough for our alerting.
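
As a rough sketch of the check described here, the ratio and a simple regression test against a baseline might look like this. The function names and the 20% tolerance are assumptions made for illustration, and whether this ratio matches IndexHitRatio is the open question above, so the sketch computes it directly from the two counts.

```python
def output_to_retrieved_ratio(output_docs, retrieved_docs):
    """Return output/retrieved, or None when no documents were retrieved."""
    if retrieved_docs == 0:
        return None
    return output_docs / retrieved_docs


def ratio_regressed(current, baseline, tolerance=0.2):
    """Return True when current fell more than `tolerance` (relative) below baseline.

    The 0.2 default is an arbitrary illustrative threshold, not a recommendation.
    """
    if current is None or baseline is None:
        return False
    return current < baseline * (1.0 - tolerance)
```

A ratio near 1 means almost every retrieved document was returned; a sharp drop after a deployment would be the signal that a predicate may have moved to an unindexed path.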

@timsander1
Contributor

Hi @chwarr, would something like this help?

#2097

@bchong95
Contributor

bchong95 commented Feb 4, 2021

I think the ask here is for strong contracts / a type system on our diagnostics story. For this scenario we will need to make the following types public:

https://github.com/Azure/azure-cosmos-dotnet-v3/blob/master/Microsoft.Azure.Cosmos/src/Tracing/ITrace.cs
https://github.com/Azure/azure-cosmos-dotnet-v3/blob/master/Microsoft.Azure.Cosmos/src/Tracing/TraceData/QueryMetricsTraceDatum.cs

@timsander1 can you work with planning folk for this? We can sync on the details offline.

@chwarr
Member Author

chwarr commented Feb 4, 2021

@timsander1, skimming #2097, I came to a similar conclusion as @bchong95.

Without the strongly-typed objects exposed, it looks like the "structured" way to consume metrics after #2097 would be to serialize them to JSON and then parse that. If the JSON is specified somewhere, that's better than the unspecified key-value pairs that exist today.

@Maya-Painter
Contributor

Maya-Painter commented Sep 13, 2023

Closing this, as #4001 was checked in, which exposes these metrics via FeedResponse.Diagnostics.GetQueryMetrics().
