{ "cells": [ { "cell_type": "markdown", "metadata": {}, "source": [ "# Evaluate Reranker" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Reranker usually better captures the latent semantic meanings between sentences. But comparing to using an embedding model, it will take quadratic $O(N^2)$ running time for the whole dataset. Thus the most common use cases of rerankers in information retrieval or RAG is reranking the top k answers retrieved according to the embedding similarities.\n", "\n", "The evaluation of reranker has the similar idea. We compare how much better the rerankers can rerank the candidates searched by a same embedder. In this tutorial, we will evaluate two rerankers' performances on BEIR benchmark, with bge-large-en-v1.5 as the base embedding model." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Note: We highly recommend to run this notebook with GPU. The whole pipeline is very time consuming. For simplicity, we only use a single task FiQA in BEIR." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## 0. Installation" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "First install the required dependency" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "%pip install FlagEmbedding" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## 1. bge-reranker-large" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "The first model is bge-reranker-large, a BERT like reranker with about 560M parameters.\n", "\n", "We can use the evaluation pipeline of FlagEmbedding to directly run the whole process:" ] }, { "cell_type": "code", "execution_count": 3, "metadata": {}, "outputs": [ { "name": "stderr", "output_type": "stream", "text": [ "Split 'dev' not found in the dataset. Removing it from the list.\n", "ignore_identical_ids is set to True. This means that the search results will not contain identical ids. Note: Dataset such as MIRACL should NOT set this to True.\n", "pre tokenize: 100%|██████████| 57/57 [00:03<00:00, 14.68it/s]\n", "You're using a BertTokenizerFast tokenizer. Please note that with a fast tokenizer, using the `__call__` method is faster than using a method to encode the text followed by a call to the `pad` method to get a padded encoding.\n", "/share/project/xzy/Envs/ft/lib/python3.11/site-packages/_distutils_hack/__init__.py:54: UserWarning: Reliance on distutils from stdlib is deprecated. Users must rely on setuptools to provide the distutils module. Avoid importing distutils or import setuptools first, and avoid setting SETUPTOOLS_USE_DISTUTILS=stdlib. Register concerns at https://github.com/pypa/setuptools/issues/new?template=distutils-deprecation.yml\n", " warnings.warn(\n", "Inference Embeddings: 100%|██████████| 57/57 [00:44<00:00, 1.28it/s]\n", "pre tokenize: 100%|██████████| 1/1 [00:00<00:00, 61.59it/s]\n", "Inference Embeddings: 100%|██████████| 1/1 [00:00<00:00, 6.22it/s]\n", "Searching: 100%|██████████| 21/21 [00:00<00:00, 68.26it/s]\n", "pre tokenize: 0%| | 0/64 [00:00