evaluator#

class FlagEmbedding.evaluation.mkqa.MKQAEvaluator(eval_name: str, data_loader: AbsEvalDataLoader, overwrite: bool = False)[source]#

The evaluator class for the MKQA benchmark.
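
A minimal instantiation sketch. The MKQAEvalDataLoader class name, its constructor arguments, and the paths below are assumptions for illustration, not guaranteed by this API::

    from FlagEmbedding.evaluation.mkqa import MKQAEvaluator, MKQAEvalDataLoader

    # Loader name and arguments are assumed for illustration.
    data_loader = MKQAEvalDataLoader(
        eval_name="mkqa",
        dataset_dir="./mkqa/data",   # hypothetical local data directory
        cache_dir="./cache",         # hypothetical cache directory
    )
    evaluator = MKQAEvaluator(
        eval_name="mkqa",
        data_loader=data_loader,
        overwrite=False,
    )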

static compute_metrics(corpus_dict: Dict[str, str], qrels: Dict[str, List[str]], search_results: Dict[str, Dict[str, float]], k_values: List[int])[source]#

Compute Recall@k for the QA task. Note that recall in QA is defined differently from recall in IR: a query counts as a hit at cutoff k if any of its top-k retrieved passages contains a ground-truth answer. See the RocketQA paper for details: https://aclanthology.org/2021.naacl-main.466.pdf.

Parameters:
  • corpus_dict (Dict[str, str]) – Dictionary mapping document ids to document contents.

  • qrels (Dict[str, List[str]]) – Mapping from each query id to its list of ground-truth answers.

  • search_results (Dict[str, Dict[str, float]]) – Search results of the model to evaluate, mapping each query id to scored document ids.

  • k_values (List[int]) – Cutoff values at which to compute recall.

Returns:

The computed metric scores.

Return type:

dict
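
A hedged example of calling compute_metrics directly on toy inputs; the document ids, answer strings, scores, and the metric key names in the output are illustrative assumptions::

    from FlagEmbedding.evaluation.mkqa import MKQAEvaluator

    corpus_dict = {
        "doc_0": "Paris is the capital and largest city of France.",
        "doc_1": "Berlin is the capital of Germany.",
    }
    # For QA-style recall, each query id maps to its ground-truth answer strings.
    qrels = {"q_0": ["Paris"]}
    # Retrieval scores produced by the model under evaluation.
    search_results = {"q_0": {"doc_0": 0.92, "doc_1": 0.35}}

    metrics = MKQAEvaluator.compute_metrics(
        corpus_dict=corpus_dict,
        qrels=qrels,
        search_results=search_results,
        k_values=[1, 10],
    )
    print(metrics)  # dict of Recall@k scores; exact key names may differ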

evaluate_results(search_results_save_dir: str, k_values: List[int] = [1, 3, 5, 10, 100, 1000])[source]#

Compute the metrics and return the evaluation results.

Parameters:
  • search_results_save_dir (str) – Directory where the search results are saved.

  • k_values (List[int], optional) – Cutoff values for the metrics. Defaults to [1, 3, 5, 10, 100, 1000].

Returns:

The evaluation results.

Return type:

dict
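
A usage sketch, assuming search results were already written to disk by a prior retrieval step; the directory path is hypothetical::

    # `evaluator` is an MKQAEvaluator instance, constructed as sketched above.
    eval_results = evaluator.evaluate_results(
        search_results_save_dir="./search_results/bge-m3",  # hypothetical path
        k_values=[1, 3, 5, 10, 100, 1000],
    )
    print(eval_results)  # evaluation results as a dict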

get_corpus_embd_save_dir(retriever_name: str, corpus_embd_save_dir: str | None = None, dataset_name: str | None = None)[source]#

Get the directory for saving the corpus embeddings.

Parameters:
  • retriever_name (str) – Name of the retriever.

  • corpus_embd_save_dir (Optional[str], optional) – Directory in which to save the corpus embeddings. Defaults to None.

  • dataset_name (Optional[str], optional) – Name of the dataset. Defaults to None.

Returns:

The final directory in which the corpus embeddings will be saved.

Return type:

str
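
A usage sketch; the retriever name, root directory, and MKQA language split below are placeholders, not values mandated by the API::

    # `evaluator` is an MKQAEvaluator instance, constructed as sketched above.
    embd_dir = evaluator.get_corpus_embd_save_dir(
        retriever_name="bge-m3",               # placeholder retriever name
        corpus_embd_save_dir="./corpus_embd",  # hypothetical root directory
        dataset_name="en",                     # assumed language-split name
    )
    print(embd_dir)  # resolved directory for the corpus embeddings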