Top-K Ranking Similarity

If you are building retrieval-augmented generation applications, you are likely iterating constantly on the embeddings and vector search index. Ranking similarity lets you check how much the top-K results change as you iterate.

Top-K rankings are evaluated with the following criteria:

  • Top results are more important than bottom-ranked results.
  • A drop from 2nd to 3rd is more significant than a drop from 5th to 6th, and so on.
  • A result appearing in one list but not the other indicates a bigger difference than a result merely dropping in rank, since it suggests that result's rank in the other list is greater than K.

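These criteria are characteristic of top-weighted rank similarity measures such as Rank-Biased Overlap (RBO); see the reference at the bottom of this page. As a rough sketch only, not necessarily DeepEval's exact implementation (the weighting parameter p is an assumption here), a normalized RBO-style score can be computed as follows:

def rank_biased_overlap(list1, list2, p=0.9):
    # Smaller p puts more weight on the top of the rankings.
    k = max(len(list1), len(list2))
    total = 0.0
    for d in range(1, k + 1):
        # Agreement at depth d: fraction of the top-d results shared.
        overlap = len(set(list1[:d]) & set(list2[:d]))
        total += (p ** (d - 1)) * (overlap / d)
    # Normalize so that two identical rankings score exactly 1.0.
    return total * (1 - p) / (1 - p ** k)

rank_biased_overlap(["Sentence-1", "Sentence-2"], ["Sentence-2", "Sentence-3"])  # ≈ 0.24

A result missing entirely from one list is penalized at every depth where it would have appeared, while a positional swap is only penalized at the depths where the two top-d prefixes differ, which is exactly the behavior the criteria above call for.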

Assert Ranking Similarity

To assert that two top-K rankings are similar, use assert_ranking_similarity:

from deepeval.test_utils import assert_ranking_similarity

result_list_1 = ["Sentence-1", "Sentence-2"]
result_list_2 = ["Sentence-2", "Sentence-3"]

assert_ranking_similarity(
    list1=result_list_1,
    list2=result_list_2,
    minimum_score=0.3,
)
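
A natural home for this assertion is inside a pytest test, so that the test fails whenever the two rankings are less similar than minimum_score (this usage pattern is an assumption; the page does not show it explicitly):

def test_search_ranking():
    assert_ranking_similarity(
        list1=result_list_1,
        list2=result_list_2,
        minimum_score=0.3,
    )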

Parameters

  • minimum_score: the minimum similarity score required for the two rankings to be considered similar; the assertion fails if the computed score falls below this threshold.

Ranking Similarity As A Metric

You can measure ranking similarity as a metric.

from deepeval.metrics.ranking_similarity import RankingSimilarity
from deepeval.run_test import assert_test
# Assumed import path; SearchTestCase's location may vary across deepeval versions.
from deepeval.test_case import SearchTestCase

list1 = ["Sentence-1", "Sentence-2"]
list2 = ["Sentence-2", "Sentence-3"]

metric = RankingSimilarity(minimum_score=0.5)
test_case = SearchTestCase(list1, list2)
assert_test(test_cases=[test_case], metrics=[metric])
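
Because assert_test takes a list of metrics, this form lets you check ranking similarity alongside other metrics on the same test case in a single call.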

Parameters

  • minimum_score: the minimum similarity score required for the metric to pass.

How it is measured

For a gentle introduction to the statistical technique used to compare top-K rankings, see:

https://medium.com/ai-in-plain-english/comparing-top-k-rankings-statistically-9adfc9cfc98b