AbsEmbedder#

class FlagEmbedding.abc.inference.AbsEmbedder(model_name_or_path: str, normalize_embeddings: bool = True, use_fp16: bool = True, query_instruction_for_retrieval: str | None = None, query_instruction_format: str = '{}{}', devices: str | int | List[str] | List[int] | None = None, batch_size: int = 256, query_max_length: int = 512, passage_max_length: int = 512, convert_to_numpy: bool = True, **kwargs: Any)[source]#

Base class for embedders. Extend this class and implement encode_queries(), encode_corpus(), and encode() to build a custom embedder.

Parameters:
  • model_name_or_path (str) – If it’s a path to a local model, the model is loaded from that path. Otherwise, the model is downloaded and loaded from the HuggingFace Hub by name.

  • normalize_embeddings (bool, optional) – If True, normalize the embedding vector. Defaults to True.

  • use_fp16 (bool, optional) – If True, use half-precision floating point to speed up computation, with a slight performance degradation. Defaults to True.

  • query_instruction_for_retrieval (Optional[str], optional) – Query instruction for retrieval tasks, used together with query_instruction_format. Defaults to None.

  • query_instruction_format (str, optional) – The template for query_instruction_for_retrieval. Defaults to "{}{}".

  • devices (Optional[Union[str, int, List[str], List[int]]], optional) – Devices to use for model inference. Defaults to None.

  • batch_size (int, optional) – Batch size for inference. Defaults to 256.

  • query_max_length (int, optional) – Maximum token length for queries. Defaults to 512.

  • passage_max_length (int, optional) – Maximum token length for passages. Defaults to 512.

  • convert_to_numpy (bool, optional) – If True, the output embedding will be a NumPy array. Otherwise, it will be a PyTorch tensor. Defaults to True.

  • kwargs (Any, optional) – Additional parameters for the HuggingFace Transformers config or for child classes.
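AbsEmbedder itself is abstract, so the following is a minimal usage sketch with an assumed concrete subclass (FlagModel from the same library); the model name and instruction text are illustrative choices, not prescriptive.

    from FlagEmbedding import FlagModel  # a concrete AbsEmbedder subclass

    # Model name and instruction text below are illustrative.
    model = FlagModel(
        "BAAI/bge-base-en-v1.5",
        normalize_embeddings=True,
        use_fp16=True,
        query_instruction_for_retrieval="Represent this sentence for searching relevant passages: ",
        query_instruction_format="{}{}",
        devices="cuda:0",
        batch_size=256,
    )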

Methods#

static AbsEmbedder.get_target_devices(devices: str | int | List[str] | List[int]) → List[str][source]#

Resolve the specified devices into a list of device strings.

Parameters:

devices (Union[str, int, List[str], List[int]]) – The specified devices: a str, an int, a list of str, or a list of int.

Raises:

ValueError – If devices is not a string, an integer, a list of strings, or a list of integers.

Returns:

A list of target devices as strings.

Return type:

List[str]
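Illustrative calls based on the signature above; the exact normalization rules (for example, whether an integer maps to a CUDA device index) are assumptions, not documented behavior.

    from FlagEmbedding.abc.inference import AbsEmbedder

    # Illustrative only; the exact normalization is an assumption.
    AbsEmbedder.get_target_devices("cuda:0")              # ["cuda:0"]
    AbsEmbedder.get_target_devices(0)                     # presumably ["cuda:0"]
    AbsEmbedder.get_target_devices(["cuda:0", "cuda:1"])  # ["cuda:0", "cuda:1"]
    AbsEmbedder.get_target_devices(1.5)                   # raises ValueError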

static AbsEmbedder.get_detailed_instruct(instruction_format: str, instruction: str, sentence: str)[source]#

Combine the instruction and the sentence using the instruction format.

Parameters:
  • instruction_format (str) – Format template for the instruction.

  • instruction (str) – The instruction text.

  • sentence (str) – The sentence to combine with the instruction.

Returns:

The complete sentence with the instruction applied.

Return type:

str
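Given the default template "{}{}", the combination is presumably a plain str.format() call with the instruction first and the sentence second; the instruction text here is illustrative.

    from FlagEmbedding.abc.inference import AbsEmbedder

    prompt = AbsEmbedder.get_detailed_instruct(
        "{}{}",
        "Represent this sentence for searching relevant passages: ",
        "What is the capital of France?",
    )
    # "Represent this sentence for searching relevant passages: What is the capital of France?"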

AbsEmbedder.encode_queries(queries: List[str] | str, batch_size: int | None = None, max_length: int | None = None, convert_to_numpy: bool | None = None, **kwargs: Any)[source]#

Encode the queries, applying the instruction if provided.

Parameters:
  • queries (Union[List[str], str]) – Input queries to encode.

  • batch_size (Optional[int], optional) – Number of sentences per iteration. Defaults to None, in which case the batch_size set at initialization is used.

  • max_length (Optional[int], optional) – Maximum token length. Defaults to None, in which case the query_max_length set at initialization is used.

  • convert_to_numpy (Optional[bool], optional) – If True, the output embeddings will be a NumPy array; otherwise a PyTorch tensor. Defaults to None, in which case the value set at initialization is used.

Returns:

The embedding vectors as a NumPy array or a PyTorch tensor.

Return type:

Union[torch.Tensor, np.ndarray]

AbsEmbedder.encode_corpus(corpus: List[str] | str, batch_size: int | None = None, max_length: int | None = None, convert_to_numpy: bool | None = None, **kwargs: Any)[source]#

Encode the corpus, applying the instruction if provided.

Parameters:
  • corpus (Union[List[str], str]) – Input corpus to encode.

  • batch_size (Optional[int], optional) – Number of sentences per iteration. Defaults to None, in which case the batch_size set at initialization is used.

  • max_length (Optional[int], optional) – Maximum token length. Defaults to None, in which case the passage_max_length set at initialization is used.

  • convert_to_numpy (Optional[bool], optional) – If True, the output embeddings will be a NumPy array; otherwise a PyTorch tensor. Defaults to None, in which case the value set at initialization is used.

Returns:

The embedding vectors as a NumPy array or a PyTorch tensor.

Return type:

Union[torch.Tensor, np.ndarray]
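A sketch of a typical retrieval flow using both methods; the concrete subclass (FlagModel) and the model name are assumptions, as in the constructor example above.

    from FlagEmbedding import FlagModel  # assumed concrete subclass

    model = FlagModel("BAAI/bge-base-en-v1.5")  # illustrative model name

    q_embs = model.encode_queries(["What is the capital of France?"], batch_size=32, max_length=64)
    p_embs = model.encode_corpus(["Paris is the capital and largest city of France."], max_length=256)

    # With normalize_embeddings=True (the default), the inner product is cosine similarity.
    scores = q_embs @ p_embs.T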

AbsEmbedder.encode(sentences: List[str] | str, batch_size: int | None = None, max_length: int | None = None, convert_to_numpy: bool | None = None, instruction: str | None = None, instruction_format: str | None = None, **kwargs: Any)[source]#

Encode the input sentences with the embedding model.

Parameters:
  • sentences (Union[List[str], str]) – Input sentences to encode.

  • batch_size (Optional[int], optional) – Number of sentences per iteration. Defaults to None, in which case the batch_size set at initialization is used.

  • max_length (Optional[int], optional) – Maximum token length. Defaults to None, in which case the corresponding maximum length set at initialization is used.

  • convert_to_numpy (Optional[bool], optional) – If True, the output embeddings will be a NumPy array; otherwise a PyTorch tensor. Defaults to None, in which case the value set at initialization is used.

  • instruction (Optional[str], optional) – The instruction text. Defaults to None.

  • instruction_format (Optional[str], optional) – Format template for the instruction. Defaults to None.

Returns:

The embedding vectors as a NumPy array or a PyTorch tensor.

Return type:

Union[torch.Tensor, np.ndarray]
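Because encode() accepts a per-call instruction and format, it can presumably override the instruction set at initialization for a single call; the subclass, model name, and instruction text below are illustrative.

    from FlagEmbedding import FlagModel  # assumed concrete subclass

    model = FlagModel("BAAI/bge-base-en-v1.5")  # illustrative model name

    embs = model.encode(
        ["find the best pizza in town"],
        instruction="Represent this query: ",  # illustrative instruction
        instruction_format="{}{}",
        convert_to_numpy=True,
    )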

abstract AbsEmbedder.encode_single_device(sentences: List[str] | str, batch_size: int = 256, max_length: int = 512, convert_to_numpy: bool = True, device: str | None = None, **kwargs: Any)[source]#

This method should encode sentences and return embeddings on a single device.
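A skeletal override showing only the expected shape of an implementation; the forward pass is a hypothetical placeholder, not the library's actual code.

    from FlagEmbedding.abc.inference import AbsEmbedder

    class MyEmbedder(AbsEmbedder):
        def encode_single_device(self, sentences, batch_size=256, max_length=512,
                                 convert_to_numpy=True, device=None, **kwargs):
            if isinstance(sentences, str):
                sentences = [sentences]
            # Placeholder: tokenize, truncate to max_length, batch by batch_size,
            # and run the model on `device` here.
            embeddings = my_forward_pass(sentences, device=device)  # hypothetical helper
            return embeddings.cpu().numpy() if convert_to_numpy else embeddings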

AbsEmbedder.start_multi_process_pool(process_target_func: Any) → Dict[Literal['input', 'output', 'processes'], Any][source]#

Starts a multi-process pool to run encoding across several independent processes, as in SentenceTransformer.encode_multi_process.

This method is recommended if you want to encode on multiple GPUs or CPUs. It is advised to start only one process per GPU. This method works together with encode_multi_process and stop_multi_process_pool.

Returns:

A dictionary with the target processes, an input queue, and an output queue.

Return type:

Dict[str, Any]

static AbsEmbedder._encode_multi_process_worker(target_device: str, model: AbsEmbedder, input_queue: Queue, results_queue: Queue) → None[source]#

Internal worker process that encodes sentences in the multi-process setup.

static AbsEmbedder.stop_multi_process_pool(pool: Dict[Literal['input', 'output', 'processes'], Any]) → None[source]#

Stops all processes started with start_multi_process_pool.

Parameters:

pool (Dict[str, object]) – A dictionary containing the input queue, output queue, and process list.

Returns:

None

AbsEmbedder.encode_multi_process(sentences: List[str], pool: Dict[Literal['input', 'output', 'processes'], Any], **kwargs)[source]#

Encode sentences using the process pool created by start_multi_process_pool().
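A hypothetical sketch of the pool lifecycle, inferred only from the signatures above; in normal use the library drives this workflow internally when multiple devices are configured.

    from FlagEmbedding import FlagModel  # assumed concrete subclass
    from FlagEmbedding.abc.inference import AbsEmbedder

    model = FlagModel("BAAI/bge-base-en-v1.5", devices=["cuda:0", "cuda:1"])  # illustrative

    pool = model.start_multi_process_pool(AbsEmbedder._encode_multi_process_worker)
    try:
        embeddings = model.encode_multi_process(["some sentence"] * 10_000, pool, batch_size=256)
    finally:
        AbsEmbedder.stop_multi_process_pool(pool)
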
AbsEmbedder._concatenate_results_from_multi_process(results_list: List[Tensor | ndarray | Any])[source]#

Concatenate and return the results from all the processes.

Parameters:

results_list (List[Union[torch.Tensor, np.ndarray, Any]]) – A list of results from all the processes.

Raises:

NotImplementedError – If the type of the results in results_list is unsupported.

Returns:

The embedding vectors as a NumPy array or a PyTorch tensor.

Return type:

Union[torch.Tensor, np.ndarray]
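A plausible equivalent of the concatenation logic, written out for clarity; the library's actual code may differ in details.

    import numpy as np
    import torch

    def concatenate_results(results_list):
        # Merge per-process outputs into a single tensor or array,
        # mirroring the documented behavior above.
        if isinstance(results_list[0], torch.Tensor):
            return torch.cat(results_list, dim=0)
        if isinstance(results_list[0], np.ndarray):
            return np.concatenate(results_list, axis=0)
        raise NotImplementedError(f"Unsupported type for results_list: {type(results_list[0])}")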