Runner

class FlagEmbedding.finetune.embedder.encoder_only.m3.EncoderOnlyEmbedderM3Runner(model_args: EncoderOnlyEmbedderM3ModelArguments, data_args: AbsEmbedderDataArguments, training_args: EncoderOnlyEmbedderM3TrainingArguments)

M3 model runner for finetuning.

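Parameters:
  • model_args (EncoderOnlyEmbedderM3ModelArguments) – Model arguments.

  • data_args (AbsEmbedderDataArguments) – Data arguments.

  • training_args (EncoderOnlyEmbedderM3TrainingArguments) – Training arguments.
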
static get_model(model_name_or_path: str, trust_remote_code: bool = False, colbert_dim: int = -1, cache_dir: str | None = None)

Get the model.

Parameters:
  • model_name_or_path (str) – Path to a local model, or the name of a model on the Hugging Face Hub. A local path is loaded directly; otherwise the model with that name is downloaded from the Hub and loaded.

  • trust_remote_code (bool, optional) – Whether to trust custom model code from the Hugging Face Hub repository when loading the model. Defaults to False.

  • colbert_dim (int, optional) – Dimension of the ColBERT linear layer. Defaults to -1.

  • cache_dir (str, optional) – Hugging Face cache directory in which to store the model. Defaults to None.

Returns:

A dictionary containing the model, the ColBERT linear layer, and the sparse linear layer.

Return type:

dict
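
The shape of the returned dictionary can be illustrated with a small stand-in that uses no deep-learning library. This is a sketch under stated assumptions, not the library's implementation: the key names `model`, `colbert_linear`, and `sparse_linear`, the rule that a non-positive `colbert_dim` falls back to the encoder hidden size, and the single-output sparse head are all assumptions not stated in this section. `LinearSpec` and `sketch_get_model` are hypothetical names introduced only for illustration.

```python
from dataclasses import dataclass
from typing import Any, Dict


@dataclass
class LinearSpec:
    """Stand-in for a linear layer; it records only the layer's shape."""
    in_features: int
    out_features: int


def sketch_get_model(hidden_size: int, colbert_dim: int = -1) -> Dict[str, Any]:
    """Return a dictionary shaped like the one described above (hypothetical helper)."""
    # ASSUMPTION: a non-positive colbert_dim means "reuse the encoder hidden size".
    colbert_out = colbert_dim if colbert_dim > 0 else hidden_size
    return {
        # Placeholder string standing in for the loaded encoder backbone.
        "model": f"encoder(hidden_size={hidden_size})",
        "colbert_linear": LinearSpec(hidden_size, colbert_out),
        # ASSUMPTION: the sparse head maps each token vector to a single weight.
        "sparse_linear": LinearSpec(hidden_size, 1),
    }


components = sketch_get_model(hidden_size=1024)
print(sorted(components))
```

With the real method, the same three entries would hold the actual encoder and layer objects rather than these placeholders.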

load_tokenizer_and_model() → Tuple[PreTrainedTokenizer, AbsEmbedderModel]

Load the tokenizer and model.

Returns:

Tokenizer and model instances.

Return type:

Tuple[PreTrainedTokenizer, AbsEmbedderModel]

load_trainer() → EncoderOnlyEmbedderM3Trainer

Load the M3 trainer.

Returns:

M3 Trainer instance.

Return type:

EncoderOnlyEmbedderM3Trainer
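
A minimal sketch of how the runner might be driven from already-parsed argument objects is given below. It assumes FlagEmbedding is installed and that the three argument objects have been built elsewhere (for example from command-line flags). The wrapper name `finetune_m3` is hypothetical, and `run()` is assumed to be inherited from the abstract base runner; it is not documented in this section.

```python
def finetune_m3(model_args, data_args, training_args):
    """Build the M3 runner from parsed argument objects and start finetuning."""
    # Imported lazily so this sketch can be defined without FlagEmbedding installed.
    from FlagEmbedding.finetune.embedder.encoder_only.m3 import (
        EncoderOnlyEmbedderM3Runner,
    )

    runner = EncoderOnlyEmbedderM3Runner(
        model_args=model_args,
        data_args=data_args,
        training_args=training_args,
    )
    # ASSUMPTION: run() comes from the base runner class and starts training.
    runner.run()
    return runner
```

Keeping argument parsing outside the wrapper leaves the caller free to construct the argument objects in whatever way suits the training script.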