ICLLLMEmbedder

class FlagEmbedding.inference.embedder.decoder_only.icl.ICLLLMEmbedder(model_name_or_path: str, normalize_embeddings: bool = True, use_fp16: bool = True, query_instruction_for_retrieval: str | None = None, query_instruction_format: str = '<instruct>{}\n<query>{}', suffix: str = '\n<response>', devices: None | str | List[str] = None, examples_for_task: List[dict] | None = None, examples_instruction_format: str = '<instruct>{}\n<query>{}\n<response>{}', trust_remote_code: bool = False, cache_dir: str | None = None, batch_size: int = 256, query_max_length: int = 512, passage_max_length: int = 512, convert_to_numpy: bool = True, **kwargs: Any)

Embedder class for BGE-EN-icl.

Parameters:
  • model_name_or_path (str) – If it’s a path to a local model, it loads the model from the path. Otherwise tries to download and load a model from HuggingFace Hub with the name.

  • normalize_embeddings (bool, optional) – If True, normalize the embedding vector. Defaults to True.

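  • use_fp16 (bool, optional) – If True, use half-precision floating point to speed up computation, at the cost of a slight performance degradation. Defaults to True.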

  • query_instruction_for_retrieval (Optional[str], optional) – Query instruction for retrieval tasks, which is combined with each query through query_instruction_format. Defaults to None.

  • query_instruction_format (str, optional) – The template that combines query_instruction_for_retrieval with a query. Defaults to "<instruct>{}\n<query>{}".

  • suffix (str, optional) – Text appended after the formatted query. Defaults to "\n<response>".

  • devices (Optional[Union[str, int, List[str], List[int]]], optional) – Devices to use for model inference. Defaults to None.

  • examples_for_task (Optional[List[dict]], optional) – Few-shot examples for the task, used to build the prompt prefix. Defaults to None.

  • examples_instruction_format (str, optional) – The template applied to each entry of examples_for_task. Defaults to "<instruct>{}\n<query>{}\n<response>{}".

  • trust_remote_code (bool, optional) – Whether to trust remote code when loading Hugging Face datasets or models. Defaults to False.

  • cache_dir (Optional[str], optional) – Cache directory for the model. Defaults to None.

  • batch_size (int, optional) – Batch size for inference. Defaults to 256.

  • query_max_length (int, optional) – Maximum length for query. Defaults to 512.

  • passage_max_length (int, optional) – Maximum length for passage. Defaults to 512.

  • convert_to_numpy (bool, optional) – If True, the output embedding will be a Numpy array. Otherwise, it will be a Torch Tensor. Defaults to True.
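
As a rough illustration of how the default templates in the signature fit together, the sketch below fills them with plain `str.format`. It assumes the placeholders are filled positionally in (instruction, query) order and that `suffix` is appended to the formatted query. The helper `format_query` and the sample strings are hypothetical and are not part of FlagEmbedding.

```python
# Defaults copied from the constructor signature above.
QUERY_INSTRUCTION_FORMAT = "<instruct>{}\n<query>{}"
SUFFIX = "\n<response>"


def format_query(instruction: str, query: str) -> str:
    """Hypothetical helper: fill the query template, then append the suffix."""
    return QUERY_INSTRUCTION_FORMAT.format(instruction, query) + SUFFIX


prompt = format_query(
    "Given a web search query, retrieve relevant passages.",
    "what is in-context learning",
)
print(prompt)
```

Printing the result makes it easy to check a custom template before passing it to the constructor.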

DEFAULT_POOLING_METHOD

The default pooling method when running the model.
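
The page does not say which pooling method this attribute names. Decoder-only embedders commonly use last-token pooling, so the sketch below assumes that choice purely for illustration. The function `last_token_pool` and the toy data are hypothetical; real code would operate on tensors rather than nested lists.

```python
from typing import List


def last_token_pool(hidden_states: List[List[float]], attention_mask: List[int]) -> List[float]:
    """Return the hidden state at the last position whose mask value is 1.

    Works for both left-padded and right-padded sequences.
    """
    last_index = max(i for i, m in enumerate(attention_mask) if m == 1)
    return hidden_states[last_index]


# Three token positions with two-dimensional hidden states (toy values).
states = [[0.1, 0.2], [0.3, 0.4], [0.5, 0.6]]
right_padded = last_token_pool(states, [1, 1, 0])  # picks position 1
left_padded = last_token_pool(states, [0, 1, 1])   # picks position 2
```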

Methods

ICLLLMEmbedder.encode_queries(queries: List[str] | str, batch_size: int | None = None, max_length: int | None = None, convert_to_numpy: bool | None = None, **kwargs: Any) → ndarray | Tensor

Encode the queries.

Parameters:
  • queries (Union[List[str], str]) – Input queries to encode.

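  • batch_size (Optional[int], optional) – Number of queries per batch. If None, the batch_size given to the constructor is used. Defaults to None.

  • max_length (Optional[int], optional) – Maximum length in tokens. If None, the query_max_length given to the constructor is used. Defaults to None.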

  • convert_to_numpy (Optional[bool], optional) – If True, the output embedding will be a Numpy array. Otherwise, it will be a Torch Tensor. Defaults to None.

Returns:

Return the embedding vectors in a numpy array or tensor.

Return type:

Union[torch.Tensor, np.ndarray]

ICLLLMEmbedder.encode_corpus(corpus: List[str] | str, batch_size: int | None = None, max_length: int | None = None, convert_to_numpy: bool | None = None, **kwargs: Any) → ndarray | Tensor

Encode the corpus.

Parameters:
  • corpus (Union[List[str], str]) – Input corpus to encode.

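  • batch_size (Optional[int], optional) – Number of passages per batch. If None, the batch_size given to the constructor is used. Defaults to None.

  • max_length (Optional[int], optional) – Maximum length in tokens. If None, the passage_max_length given to the constructor is used. Defaults to None.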

  • convert_to_numpy (Optional[bool], optional) – If True, the output embedding will be a Numpy array. Otherwise, it will be a Torch Tensor. Defaults to None.

Returns:

Return the embedding vectors in a numpy array or tensor.

Return type:

Union[torch.Tensor, np.ndarray]
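
With normalize_embeddings left at True, query and corpus vectors have unit length, so a plain dot product between them equals their cosine similarity. The sketch below shows that relationship with the standard library only. The helpers `l2_normalize` and `dot` and the two toy vectors are hypothetical stand-ins for real model outputs.

```python
import math
from typing import List


def l2_normalize(vec: List[float]) -> List[float]:
    """Scale a vector to unit L2 norm (a zero vector is returned unchanged)."""
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec] if norm > 0 else list(vec)


def dot(a: List[float], b: List[float]) -> float:
    """Dot product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


query_vec = l2_normalize([3.0, 4.0])    # becomes [0.6, 0.8]
passage_vec = l2_normalize([4.0, 3.0])  # becomes [0.8, 0.6]
score = dot(query_vec, passage_vec)     # approximately 0.96
```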

ICLLLMEmbedder.encode(sentences: List[str] | str, batch_size: int | None = None, max_length: int | None = None, convert_to_numpy: bool | None = None, **kwargs: Any) → ndarray | Tensor

Encode the input sentences with the embedding model.

Parameters:
  • sentences (Union[List[str], str]) – Input sentences to encode.

  • batch_size (Optional[int], optional) – Number of sentences per batch. Defaults to None.

  • max_length (Optional[int], optional) – Maximum length in tokens. Defaults to None.

  • convert_to_numpy (Optional[bool], optional) – If True, the output embedding will be a Numpy array. Otherwise, it will be a Torch Tensor. Defaults to None.

Returns:

Return the embedding vectors in a numpy array or tensor.

Return type:

Union[torch.Tensor, np.ndarray]
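
To make the role of batch_size concrete, the sketch below splits a list of sentences into consecutive batches of at most that size. It is a simplified illustration only: the library may reorder or pad inputs before batching, and the helper `iter_batches` is a hypothetical name.

```python
from typing import Iterator, List


def iter_batches(sentences: List[str], batch_size: int) -> Iterator[List[str]]:
    """Yield consecutive slices of `sentences`, each with at most `batch_size` items."""
    for start in range(0, len(sentences), batch_size):
        yield sentences[start:start + batch_size]


batches = list(iter_batches(["a", "b", "c", "d", "e"], batch_size=2))
# Five sentences with batch_size=2 give three batches; the last one is shorter.
```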

ICLLLMEmbedder.set_examples(examples_for_task: List[dict] | None = None)

Set the prefix to the provided examples.

Parameters:

examples_for_task (Optional[List[dict]], optional) – Few-shot examples for the task, used to build the prompt prefix. Defaults to None.
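
One plausible way to turn a list of examples into a prefix is sketched below. It assumes each example is a dict with the keys "instruct", "query", and "response", and that rendered examples are joined by a blank line; neither detail is stated on this page. The function `build_prefix` is a hypothetical stand-in for what set_examples stores.

```python
from typing import Dict, List, Optional

# Default copied from the constructor signature.
EXAMPLES_INSTRUCTION_FORMAT = "<instruct>{}\n<query>{}\n<response>{}"


def build_prefix(examples_for_task: Optional[List[Dict[str, str]]]) -> str:
    """Render each example with the template and join them into one prefix string."""
    if not examples_for_task:
        return ""  # no examples means an empty prefix
    rendered = [
        # The dict keys are an assumption made for this sketch.
        EXAMPLES_INSTRUCTION_FORMAT.format(ex["instruct"], ex["query"], ex["response"])
        for ex in examples_for_task
    ]
    # The blank-line separator is also an assumption.
    return "\n\n".join(rendered) + "\n\n"


examples = [
    {"instruct": "Retrieve relevant passages.", "query": "what is FAISS", "response": "FAISS is a similarity search library."},
    {"instruct": "Retrieve relevant passages.", "query": "what is BM25", "response": "BM25 is a ranking function."},
]
prefix = build_prefix(examples)
```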

static ICLLLMEmbedder.get_detailed_example(instruction_format: str, instruction: str, query: str, response: str)

Combine the instruction, example query, and example response according to the instruction format.

Parameters:
  • instruction_format (str) – Format for instruction.

  • instruction (str) – The text of instruction.

  • query (str) – The text of example query.

  • response (str) – The text of example response.

Returns:

The complete example following the given format.

Return type:

str
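
A minimal sketch of the behaviour described above, assuming the format string has three positional placeholders filled in (instruction, query, response) order. The standalone function mirrors the documented signature but is an illustration rather than the library's source, and the sample strings are invented.

```python
def get_detailed_example(instruction_format: str, instruction: str, query: str, response: str) -> str:
    """Stand-in for the static method: fill the three placeholders in order."""
    return instruction_format.format(instruction, query, response)


example = get_detailed_example(
    "<instruct>{}\n<query>{}\n<response>{}",  # default examples_instruction_format
    "Retrieve relevant passages.",
    "what is BM25",
    "BM25 is a ranking function.",
)
print(example)
```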

ICLLLMEmbedder.encode_queries_single_device(queries: List[str] | str, batch_size: int = 256, max_length: int = 512, convert_to_numpy: bool = True, device: str | None = None, **kwargs: Any)

Encode queries on a single device.

Parameters:
  • queries (Union[List[str], str]) – Input queries to encode.

  • batch_size (int, optional) – Number of queries per batch. Defaults to 256.

  • max_length (int, optional) – Maximum length in tokens. Defaults to 512.

  • convert_to_numpy (bool, optional) – If True, the output embedding will be a Numpy array. Otherwise, it will be a Torch Tensor. Defaults to True.

  • device (Optional[str], optional) – Device to use for encoding. Defaults to None.

Returns:

Return the embedding vectors in a numpy array or tensor.

Return type:

Union[torch.Tensor, np.ndarray]

ICLLLMEmbedder.encode_single_device(sentences: List[str] | str, batch_size: int = 256, max_length: int = 512, convert_to_numpy: bool = True, device: str | None = None, **kwargs: Any)

Encode input sentences on a single device.

Parameters:
  • sentences (Union[List[str], str]) – Input sentences to encode.

  • batch_size (int, optional) – Number of sentences per batch. Defaults to 256.

  • max_length (int, optional) – Maximum length in tokens. Defaults to 512.

  • convert_to_numpy (bool, optional) – If True, the output embedding will be a Numpy array. Otherwise, it will be a Torch Tensor. Defaults to True.

  • device (Optional[str], optional) – Device to use for encoding. Defaults to None.

Returns:

Return the embedding vectors in a numpy array or tensor.

Return type:

Union[torch.Tensor, np.ndarray]