Alessandro Benedetti edited this page Jun 29, 2018 · 2 revisions

Evaluation measures for an information retrieval system are used to assess how well the search results satisfied the user's query intent.
Such metrics are often split into two kinds: online metrics look at users' interactions with the search system, while offline metrics measure relevance, in other words how likely each result, or the search engine results page (SERP) as a whole, is to meet the information needs of the user.

(Wikipedia)

The current RRE release focuses on offline metrics. The following list includes the built-in leaf-level RRE metrics, which can be used out of the box. They are called "leaf" metrics because they are computed at the leaf level of the domain model, that is, at query level:

  • Precision: the fraction of retrieved documents that are relevant
  • Recall: the fraction of relevant documents that are retrieved
  • Precision at 1: this metric indicates whether the top result in the list is relevant.
  • Precision at 2: same as above, but it considers the first two results.
  • Precision at 3: same as above, but it considers the first three results.
  • Precision at 10: this metric measures the fraction of the top 10 search results that are relevant.
  • Reciprocal Rank: it is the multiplicative inverse of the rank of the first "correct" answer: 1 for first place, 1/2 for second place, 1/3 for third and so on.
  • Average Precision: the area under the precision-recall curve.
  • NDCG at 10: the Discounted Cumulative Gain (DCG) of the top 10 results, divided by the DCG of the ideal ordering of the judged documents, so that 1 means a perfectly ranked list. Unlike the metrics above, it takes graded relevance into account.
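
The definitions above can be sketched in a few lines of Python. This is only an illustration of the textbook formulas, not the RRE implementation: every function name is hypothetical, `relevant` is assumed to be a set of relevant document ids, and `gains` is assumed to be a mapping from document id to a graded judgment. The NDCG sketch uses the linear-gain form of DCG; other variants (for example with a gain of 2^judgment - 1) exist, and RRE may use a different one.

```python
import math

def precision(retrieved, relevant):
    """Fraction of retrieved documents that are relevant."""
    if not retrieved:
        return 0.0
    return sum(1 for d in retrieved if d in relevant) / len(retrieved)

def recall(retrieved, relevant):
    """Fraction of relevant documents that are retrieved."""
    if not relevant:
        return 0.0
    return sum(1 for d in set(retrieved) if d in relevant) / len(relevant)

def precision_at_k(retrieved, relevant, k):
    """Fraction of the first k results that are relevant.
    Assumption: we divide by k even when fewer than k results are returned."""
    return sum(1 for d in retrieved[:k] if d in relevant) / k

def reciprocal_rank(retrieved, relevant):
    """1 / rank of the first relevant result, or 0 if there is none."""
    for rank, d in enumerate(retrieved, start=1):
        if d in relevant:
            return 1.0 / rank
    return 0.0

def average_precision(retrieved, relevant):
    """Mean of the precision values taken at each relevant result,
    normalized by the total number of relevant documents."""
    if not relevant:
        return 0.0
    hits, total = 0, 0.0
    for rank, d in enumerate(retrieved, start=1):
        if d in relevant:
            hits += 1
            total += hits / rank
    return total / len(relevant)

def ndcg_at_k(retrieved, gains, k=10):
    """DCG of the first k results divided by the DCG of the ideal ordering.
    Assumption: linear gain, log2(position + 1) discount."""
    def dcg(values):
        return sum(g / math.log2(i + 2) for i, g in enumerate(values))
    actual = dcg([gains.get(d, 0) for d in retrieved[:k]])
    ideal = dcg(sorted(gains.values(), reverse=True)[:k])
    return actual / ideal if ideal > 0 else 0.0
```

For example, with results `["d1", "d2", "d3", "d4"]` and relevant documents `{"d2", "d4", "d9"}`, the first relevant result sits at rank 2, so the reciprocal rank is 1/2 and precision at 1 is 0.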

On top of those "leaf" metrics computed at query level, RRE computes them at the upper levels of the domain model (e.g. query group, topic, corpus) using an aggregation function. The result is a new set of metrics with several levels of granularity:

  • Mean Average Precision: the mean of the average precisions computed at query level.
  • Mean Reciprocal Rank: the average of the reciprocal ranks computed at query level.
  • All other metrics listed above, aggregated by their arithmetic mean.
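
As a rough illustration of the aggregation step, the sketch below rolls query-level values up to a parent level with an arithmetic mean. The names `mean_metric` and `roll_up` are hypothetical, and the choice to compute the top-level value over all query-level values (rather than over the group means) is an assumption, not a statement about how RRE does it.

```python
from statistics import fmean

def mean_metric(values):
    """Arithmetic mean of query-level metric values (0.0 when empty)."""
    values = list(values)
    return fmean(values) if values else 0.0

def roll_up(per_group):
    """Aggregate query-level values per query group, then overall.

    per_group maps a (hypothetical) query group name to the list of
    metric values computed for each query in that group.
    Assumption: the overall value averages all query-level values.
    """
    group_level = {name: mean_metric(vals) for name, vals in per_group.items()}
    all_values = [v for vals in per_group.values() for v in vals]
    return group_level, mean_metric(all_values)

# Example: average precision per query, grouped by query group.
# Feeding average precision values yields Mean Average Precision;
# feeding reciprocal ranks yields Mean Reciprocal Rank.
groups, overall = roll_up({"g1": [1.0, 0.5], "g2": [0.0]})
```

The same function applies to any of the leaf metrics, since the aggregation only needs the per-query numbers.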