BGE Auto Embedder#
FlagEmbedding provides a high-level class FlagAutoModel that unifies the inference of embedding models. Besides the BGE series, it also supports other popular open-source embedding models such as E5, GTE, and SFR. In this tutorial, we will walk through how to use it.
% pip install FlagEmbedding
1. Usage#
First, import FlagAutoModel from FlagEmbedding, and use the from_finetuned() function to initialize the model:
from FlagEmbedding import FlagAutoModel
model = FlagAutoModel.from_finetuned(
'BAAI/bge-base-en-v1.5',
query_instruction_for_retrieval="Represent this sentence for searching relevant passages: ",
devices="cuda:0", # if not specified, will use all available gpus or cpu when no gpu available
)
Then use the model exactly the same as FlagModel (BGEM3FlagModel if using BGE M3, FlagLLMModel if using BGE Multilingual Gemma2, or FlagICLModel if using BGE ICL):
queries = ["query 1", "query 2"]
corpus = ["passage 1", "passage 2"]
# encode the queries and corpus
q_embeddings = model.encode_queries(queries)
p_embeddings = model.encode_corpus(corpus)
# compute the similarity scores
scores = q_embeddings @ p_embeddings.T
print(scores)
You're using a BertTokenizerFast tokenizer. Please note that with a fast tokenizer, using the `__call__` method is faster than using a method to encode the text followed by a call to the `pad` method to get a padded encoding.
[[0.76 0.6714]
[0.6177 0.7603]]
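Since E5, GTE, and SFR models are included in the supported list, loading one of them goes through the same interface. The following is a minimal sketch, assuming an E5 model; the "query: " instruction follows E5's own documentation and is an assumption here, not output verified in this tutorial:
from FlagEmbedding import FlagAutoModel

# a minimal sketch: load an E5 model through the same unified interface
# (the "query: " instruction follows E5's documented usage and is an assumption here)
e5_model = FlagAutoModel.from_finetuned(
    'intfloat/e5-base-v2',
    query_instruction_for_retrieval="query: ",
    devices="cuda:0",
)
e5_embeddings = e5_model.encode_queries(["how to use FlagAutoModel?"])
print(e5_embeddings.shape)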
2. Explanation#
FlagAutoModel uses an OrderedDict, AUTO_EMBEDDER_MAPPING, to store the configurations of all supported models:
from FlagEmbedding.inference.embedder.model_mapping import AUTO_EMBEDDER_MAPPING
list(AUTO_EMBEDDER_MAPPING.keys())
['bge-en-icl',
'bge-multilingual-gemma2',
'bge-m3',
'bge-large-en-v1.5',
'bge-base-en-v1.5',
'bge-small-en-v1.5',
'bge-large-zh-v1.5',
'bge-base-zh-v1.5',
'bge-small-zh-v1.5',
'bge-large-en',
'bge-base-en',
'bge-small-en',
'bge-large-zh',
'bge-base-zh',
'bge-small-zh',
'e5-mistral-7b-instruct',
'e5-large-v2',
'e5-base-v2',
'e5-small-v2',
'multilingual-e5-large-instruct',
'multilingual-e5-large',
'multilingual-e5-base',
'multilingual-e5-small',
'e5-large',
'e5-base',
'e5-small',
'gte-Qwen2-7B-instruct',
'gte-Qwen2-1.5B-instruct',
'gte-Qwen1.5-7B-instruct',
'gte-multilingual-base',
'gte-large-en-v1.5',
'gte-base-en-v1.5',
'gte-large',
'gte-base',
'gte-small',
'gte-large-zh',
'gte-base-zh',
'gte-small-zh',
'SFR-Embedding-2_R',
'SFR-Embedding-Mistral',
'Linq-Embed-Mistral']
print(AUTO_EMBEDDER_MAPPING['bge-en-icl'])
EmbedderConfig(model_class=<class 'FlagEmbedding.inference.embedder.decoder_only.icl.ICLLLMEmbedder'>, pooling_method=<PoolingMethod.LAST_TOKEN: 'last_token'>, trust_remote_code=False, query_instruction_format='<instruct>{}\n<query>{}')
Each value in the mapping is an EmbedderConfig object, which consists of four attributes:
@dataclass
class EmbedderConfig:
model_class: Type[AbsEmbedder]
pooling_method: PoolingMethod
trust_remote_code: bool = False
query_instruction_format: str = "{}{}"
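To make the lookup concrete, below is a simplified sketch of how a model name could be resolved to its EmbedderConfig by matching against the mapping keys. This illustrates the idea only and is not the library's exact resolution logic:
from FlagEmbedding.inference.embedder.model_mapping import AUTO_EMBEDDER_MAPPING

# simplified illustration (not the library's exact logic): match the base name
# of a repo id against the mapping keys to find its EmbedderConfig
def lookup_config(model_name_or_path: str):
    base_name = model_name_or_path.split('/')[-1]
    for key, config in AUTO_EMBEDDER_MAPPING.items():
        if key.lower() in base_name.lower():
            return config
    return None

print(lookup_config('BAAI/bge-base-en-v1.5'))
Note that the order of the keys matters in a sketch like this: 'bge-base-en-v1.5' appears before 'bge-base-en' in the mapping, which is presumably why an OrderedDict is used.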
It supports not only the BGE series but also other models such as E5, GTE, and SFR, each with its own EmbedderConfig entry in the same format.
3. Customization#
If you want to use your own models through FlagAutoModel, consider the following steps:
Check the type of your embedding model and choose the appropriate model class: is it an encoder or a decoder?
What kind of pooling method does it use: CLS token, mean pooling, or last token?
Does your model need trust_remote_code=True to run?
Is there a query instruction format for retrieval?
Once these four attributes are determined, add your model name as the key and the corresponding EmbedderConfig as the value to AUTO_EMBEDDER_MAPPING. Now have a try!
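For instance, registering a hypothetical encoder model that uses CLS pooling might look like the sketch below. The model name and config values are placeholders, and the import paths and enum member names are assumptions inferred from the class paths and repr shown earlier:
from FlagEmbedding.inference.embedder.model_mapping import (
    AUTO_EMBEDDER_MAPPING,
    EmbedderConfig,
    PoolingMethod,
)
# import path assumed from the class paths printed above
from FlagEmbedding.inference.embedder.encoder_only.base import BaseEmbedder

# hypothetical registration: an encoder model with CLS pooling,
# no remote code, and the default query instruction format
AUTO_EMBEDDER_MAPPING['my-encoder-model'] = EmbedderConfig(
    model_class=BaseEmbedder,
    pooling_method=PoolingMethod.CLS,
    trust_remote_code=False,
    query_instruction_format="{}{}",
)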