How to identify the most expensive queries? #8881
-
Hi team, I want to analyse the query stats that the query-frontend logs, for example:
Is there anywhere in these stats where I can find the complexity of "finding the IDs that match", as in how hard it was to find the matching IDs out of the total IDs, before even fetching the postings and data points of the matching series? Thanks
-
This isn't something exposed by either the store-gateway or the ingester today. The store-gateway exports an aggregated metric with the amount of time it takes to resolve the matchers to a set of series IDs (…).
There are some optimizations: if the regular expression has a literal prefix, the lookup doesn't match against all possible values, only the ones that share that prefix. In those cases the cost can be close to O(log N). I'm not sure what the best way to expose this data would be. Simply recording the time spent on regular expression matching might not make much sense if we don't have the full time spent in the storage layer (the combined time spent in each ingester and store-gateway).
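A minimal sketch of why a literal prefix brings the lookup close to O(log N). This is not Mimir's actual implementation; it only assumes the label values are kept sorted, so binary search locates the first candidate and only values sharing the prefix are scanned:

```go
package main

import (
	"fmt"
	"sort"
	"strings"
)

// findWithPrefix returns the label values starting with prefix.
// Binary search over the sorted slice finds the start of the range
// in O(log N); only actual matches are scanned afterwards, instead
// of running a regexp against every value.
func findWithPrefix(sorted []string, prefix string) []string {
	lo := sort.SearchStrings(sorted, prefix)
	hi := lo
	for hi < len(sorted) && strings.HasPrefix(sorted[hi], prefix) {
		hi++
	}
	return sorted[lo:hi]
}

func main() {
	values := []string{"api-1", "db-1", "db-2", "web-1"} // must be sorted
	fmt.Println(findWithPrefix(values, "db"))            // [db-1 db-2]
}
```

Without a usable prefix, the matcher has no choice but to evaluate the regexp against all N values, which is where the cost discussed above comes from.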
This one in particular would be very welcome. Generally, regular expressions with unbounded wildcards (ones with `*` or `+`) that appear before the end of the pattern can be very expensive.

Another optimization that we have is prefix matching: `db.*` is translated to `strings.HasPrefix(value, "db")`, which avoids running a regular expression.

One way to verify the impact of regular expressions is to inspect CPU profiles during these queries.
Perhaps worth noting that it may not always be the complexity of the regular expression, but the number of values the label has. Matching `pod!~"db.*"` against 1M strings …