AbsEmbedder#
- class FlagEmbedding.abc.inference.AbsEmbedder(model_name_or_path: str, normalize_embeddings: bool = True, use_fp16: bool = True, query_instruction_for_retrieval: str | None = None, query_instruction_format: str = '{}{}', devices: str | int | List[str] | List[int] | None = None, batch_size: int = 256, query_max_length: int = 512, passage_max_length: int = 512, convert_to_numpy: bool = True, **kwargs: Any)[source]#
Base class for embedders. Extend this class and implement encode_queries(), encode_corpus(), and encode() for custom embedders.
- Parameters:
model_name_or_path (str) – If it is a path to a local model, the model is loaded from that path. Otherwise, it tries to download and load the model from the HuggingFace Hub by name.
normalize_embeddings (bool, optional) – If True, normalize the embedding vectors. Defaults to True.
use_fp16 (bool, optional) – If True, use half-precision floating point to speed up computation with a slight performance degradation. Defaults to True.
query_instruction_for_retrieval (Optional[str], optional) – Query instruction for retrieval tasks, which will be used with query_instruction_format. Defaults to None.
query_instruction_format (str, optional) – The template for query_instruction_for_retrieval. Defaults to "{}{}".
devices (Optional[Union[str, int, List[str], List[int]]], optional) – Devices to use for model inference. Defaults to None.
batch_size (int, optional) – Batch size for inference. Defaults to 256.
query_max_length (int, optional) – Maximum token length for queries. Defaults to 512.
passage_max_length (int, optional) – Maximum token length for passages. Defaults to 512.
convert_to_numpy (bool, optional) – If True, the output embeddings are NumPy arrays; otherwise, they are Torch tensors. Defaults to True.
kwargs (Dict[Any], optional) – Additional parameters for the HuggingFace Transformers config or for child classes.
Methods#
- static AbsEmbedder.get_target_devices(devices: str | int | List[str] | List[int]) List[str] [source]#
- Parameters:
devices (Union[str, int, List[str], List[int]]) – Specified devices; can be a str, an int, a list of str, or a list of int.
- Raises:
ValueError – If devices is not a string, an integer, a list of strings, or a list of integers.
- Returns:
A list of target device strings.
- Return type:
List[str]
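As a rough illustration of what this normalization step has to do, the sketch below maps the flexible `devices` argument onto a list of device strings. This is a hypothetical reimplementation, not the library's code; the exact device strings FlagEmbedding produces may differ.

```python
# Hypothetical sketch of device normalization: accept a str, an int, or a
# list of either, and return a uniform List[str]. Not the library's code.
from typing import List, Union

def normalize_devices(devices: Union[str, int, List[str], List[int]]) -> List[str]:
    if isinstance(devices, str):
        return [devices]                        # e.g. "cuda:0" -> ["cuda:0"]
    if isinstance(devices, int):
        return [f"cuda:{devices}"]              # e.g. 1 -> ["cuda:1"]
    if isinstance(devices, list) and all(isinstance(d, str) for d in devices):
        return devices
    if isinstance(devices, list) and all(isinstance(d, int) for d in devices):
        return [f"cuda:{d}" for d in devices]
    raise ValueError(
        "devices should be a string, an integer, or a list of strings/integers."
    )
```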
- static AbsEmbedder.get_detailed_instruct(instruction_format: str, instruction: str, sentence: str)[source]#
Combine the instruction and the sentence using the instruction format.
- Parameters:
instruction_format (str) – Format for instruction.
instruction (str) – The text of instruction.
sentence (str) – The sentence to concatenate with.
- Returns:
The complete sentence with the instruction.
- Return type:
str
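Given the default format `"{}{}"`, the combination step presumably amounts to applying the format template to the instruction and the sentence. A minimal sketch, assuming the method simply calls `str.format` on the template:

```python
# Minimal sketch of what get_detailed_instruct computes: the instruction
# format is treated as a str.format template receiving (instruction, sentence).
def get_detailed_instruct(instruction_format: str, instruction: str, sentence: str) -> str:
    return instruction_format.format(instruction, sentence)

# With a more elaborate template, the instruction is woven around the query:
query = get_detailed_instruct(
    "Instruct: {}\nQuery: {}",
    "Retrieve relevant passages.",
    "what is a dense embedding?",
)
```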
- AbsEmbedder.encode_queries(queries: List[str] | str, batch_size: int | None = None, max_length: int | None = None, convert_to_numpy: bool | None = None, **kwargs: Any)[source]#
Encode the queries using the instruction if provided.
- Parameters:
queries (Union[List[str], str]) – Input queries to encode.
batch_size (Optional[int], optional) – Number of sentences per iteration. Defaults to None.
max_length (Optional[int], optional) – Maximum token length. Defaults to None.
convert_to_numpy (Optional[bool], optional) – If True, the output embeddings are NumPy arrays; otherwise, they are Torch tensors. Defaults to None.
- Returns:
The embedding vectors as a NumPy array or a Torch tensor.
- Return type:
Union[torch.Tensor, np.ndarray]
- AbsEmbedder.encode_corpus(corpus: List[str] | str, batch_size: int | None = None, max_length: int | None = None, convert_to_numpy: bool | None = None, **kwargs: Any)[source]#
Encode the corpus using the instruction if provided.
- Parameters:
corpus (Union[List[str], str]) – Input corpus to encode.
batch_size (Optional[int], optional) – Number of sentences per iteration. Defaults to None.
max_length (Optional[int], optional) – Maximum token length. Defaults to None.
convert_to_numpy (Optional[bool], optional) – If True, the output embeddings are NumPy arrays; otherwise, they are Torch tensors. Defaults to None.
- Returns:
The embedding vectors as a NumPy array or a Torch tensor.
- Return type:
Union[torch.Tensor, np.ndarray]
- AbsEmbedder.encode(sentences: List[str] | str, batch_size: int | None = None, max_length: int | None = None, convert_to_numpy: bool | None = None, instruction: str | None = None, instruction_format: str | None = None, **kwargs: Any)[source]#
Encode the input sentences with the embedding model.
- Parameters:
sentences (Union[List[str], str]) – Input sentences to encode.
batch_size (Optional[int], optional) – Number of sentences per iteration. Defaults to None.
max_length (Optional[int], optional) – Maximum token length. Defaults to None.
convert_to_numpy (Optional[bool], optional) – If True, the output embeddings are NumPy arrays; otherwise, they are Torch tensors. Defaults to None.
instruction (Optional[str], optional) – The text of the instruction. Defaults to None.
instruction_format (Optional[str], optional) – Format for the instruction. Defaults to None.
- Returns:
The embedding vectors as a NumPy array or a Torch tensor.
- Return type:
Union[torch.Tensor, np.ndarray]
- abstract AbsEmbedder.encode_single_device(sentences: List[str] | str, batch_size: int = 256, max_length: int = 512, convert_to_numpy: bool = True, device: str | None = None, **kwargs: Any)[source]#
This method should encode sentences and return embeddings on a single device.
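To make the subclassing contract concrete, here is a toy sketch: a custom embedder only has to implement encode_single_device, while the base class handles batching and dispatch. Both classes below are hypothetical stand-ins written so the example runs without FlagEmbedding installed; the hash-free character-count "embedding" replaces a real model forward pass.

```python
# Toy illustration of the AbsEmbedder subclass contract. ToyAbsEmbedder
# stands in for AbsEmbedder: its encode() splits the input into batches and
# delegates each batch to encode_single_device, which subclasses implement.
from typing import List, Union

class ToyAbsEmbedder:
    def __init__(self, batch_size: int = 256):
        self.batch_size = batch_size

    def encode(self, sentences: Union[List[str], str]) -> List[List[float]]:
        if isinstance(sentences, str):
            sentences = [sentences]
        embeddings: List[List[float]] = []
        for i in range(0, len(sentences), self.batch_size):
            embeddings.extend(self.encode_single_device(sentences[i:i + self.batch_size]))
        return embeddings

    def encode_single_device(self, sentences: List[str]) -> List[List[float]]:
        raise NotImplementedError

class CharCountEmbedder(ToyAbsEmbedder):
    """Toy subclass: a 2-d 'embedding' of (length, vowel count) per sentence."""
    def encode_single_device(self, sentences: List[str]) -> List[List[float]]:
        return [
            [float(len(s)), float(sum(c in "aeiou" for c in s.lower()))]
            for s in sentences
        ]

vecs = CharCountEmbedder(batch_size=2).encode(["hello", "world", "hi"])
```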
- AbsEmbedder.start_multi_process_pool(process_target_func: Any) Dict[Literal['input', 'output', 'processes'], Any] [source]#
Starts a multi-process pool to run the encoding with several independent processes, following SentenceTransformer.encode_multi_process. This method is recommended if you want to encode on multiple GPUs or CPUs. It is advised to start only one process per GPU. This method works together with encode_multi_process and stop_multi_process_pool.
- Returns:
A dictionary with the target processes, an input queue, and an output queue.
- Return type:
Dict[str, Any]
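The pool's input/output-queue pattern can be sketched as follows. This simplified illustration uses threads in place of the per-device processes the real pool starts, and a toy `encode` function in place of the model; it only shows the data flow (tagged chunks in, tagged results out), not the library's actual implementation.

```python
# Simplified pool pattern: workers pull (chunk_id, batch) items from an
# input queue, encode them, and push (chunk_id, result) onto an output
# queue so results can be reordered afterwards. Threads stand in for the
# per-device worker processes of the real pool.
import queue
import threading

def worker(encode, in_q: "queue.Queue", out_q: "queue.Queue") -> None:
    while True:
        item = in_q.get()
        if item is None:          # sentinel: stop this worker
            break
        chunk_id, batch = item
        out_q.put((chunk_id, encode(batch)))

def start_pool(encode, num_workers: int = 2):
    in_q, out_q = queue.Queue(), queue.Queue()
    threads = [
        threading.Thread(target=worker, args=(encode, in_q, out_q), daemon=True)
        for _ in range(num_workers)
    ]
    for t in threads:
        t.start()
    # Same shape as the documented return value: processes plus two queues.
    return {"input": in_q, "output": out_q, "processes": threads}

def stop_pool(pool) -> None:
    for _ in pool["processes"]:
        pool["input"].put(None)   # one sentinel per worker
    for t in pool["processes"]:
        t.join()

pool = start_pool(lambda batch: [len(s) for s in batch])
for i, chunk in enumerate([["ab", "c"], ["defg"]]):
    pool["input"].put((i, chunk))
results = sorted(pool["output"].get() for _ in range(2))
stop_pool(pool)
```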
- static AbsEmbedder._encode_multi_process_worker(target_device: str, model: AbsEmbedder, input_queue: Queue, results_queue: Queue) None [source]#
Internal worker process that encodes sentences in the multi-process setup.
- static AbsEmbedder.stop_multi_process_pool(pool: Dict[Literal['input', 'output', 'processes'], Any]) None [source]#
Stops all processes started with start_multi_process_pool.
- Parameters:
pool (Dict[str, object]) – A dictionary containing the input queue, output queue, and process list.
- Returns:
None
- AbsEmbedder.encode_multi_process(sentences: List[str], pool: Dict[Literal['input', 'output', 'processes'], Any], **kwargs)[source]#
- AbsEmbedder._concatenate_results_from_multi_process(results_list: List[Tensor | ndarray | Any])[source]#
Concatenate and return the results from all the processes.
- Parameters:
results_list (List[Union[torch.Tensor, np.ndarray, Any]]) – A list of results from all the processes.
- Raises:
NotImplementedError – Unsupported type for results_list
- Returns:
The concatenated embedding vectors as a NumPy array or a Torch tensor.
- Return type:
Union[torch.Tensor, np.ndarray]
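A plausible sketch of this concatenation step, using plain Python lists in place of torch.Tensor / np.ndarray chunks (the real method would dispatch to torch.cat or np.concatenate depending on the chunk type, and likewise raises for unsupported types):

```python
# Sketch of the result-concatenation step: merge per-process result chunks
# (here plain lists standing in for tensors/arrays) into one flat result,
# raising NotImplementedError for unsupported chunk types.
from typing import Any, List

def concatenate_results(results_list: List[Any]) -> List[Any]:
    if not all(isinstance(chunk, list) for chunk in results_list):
        raise NotImplementedError(f"Unsupported type in results_list: {results_list!r}")
    merged: List[Any] = []
    for chunk in results_list:    # chunks arrive in process order
        merged.extend(chunk)
    return merged
```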