Runner

class FlagEmbedding.finetune.embedder.encoder_only.m3.EncoderOnlyEmbedderM3Runner(model_args: EncoderOnlyEmbedderM3ModelArguments, data_args: AbsEmbedderDataArguments, training_args: EncoderOnlyEmbedderM3TrainingArguments)

M3 model runner for finetuning.

Parameters:

model_args (EncoderOnlyEmbedderM3ModelArguments) – Model arguments.
data_args (AbsEmbedderDataArguments) – Data arguments.
training_args (EncoderOnlyEmbedderM3TrainingArguments) – Training arguments.
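
The constructor takes one argument object per parameter. As a rough sketch of how these might be produced from command-line style options, the helper below uses `HfArgumentParser` from `transformers`. The helper name `build_m3_runner` is hypothetical, and the import paths for the argument classes are assumptions inferred from the documented module names; the imports are deferred so that defining the helper does not require either library.

```python
from typing import List, Optional


def build_m3_runner(argv: Optional[List[str]] = None):
    """Hypothetical helper: parse CLI-style arguments and build the runner."""
    # Deferred imports keep this module importable without the libraries.
    from transformers import HfArgumentParser

    # Assumed import paths, inferred from the documented module names.
    from FlagEmbedding.abc.finetune.embedder import AbsEmbedderDataArguments
    from FlagEmbedding.finetune.embedder.encoder_only.m3 import (
        EncoderOnlyEmbedderM3ModelArguments,
        EncoderOnlyEmbedderM3Runner,
        EncoderOnlyEmbedderM3TrainingArguments,
    )

    # One dataclass per constructor parameter, in the same order.
    parser = HfArgumentParser((
        EncoderOnlyEmbedderM3ModelArguments,
        AbsEmbedderDataArguments,
        EncoderOnlyEmbedderM3TrainingArguments,
    ))
    # With argv=None the parser falls back to sys.argv[1:].
    model_args, data_args, training_args = parser.parse_args_into_dataclasses(
        args=argv
    )

    return EncoderOnlyEmbedderM3Runner(
        model_args=model_args,
        data_args=data_args,
        training_args=training_args,
    )
```

Which options each argument class accepts is not covered on this page, so any concrete flag list passed as `argv` should be checked against the argument classes themselves.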

static get_model(model_name_or_path: str, trust_remote_code: bool = False, colbert_dim: int = -1, cache_dir: str = None)

Get the model.

Parameters:

model_name_or_path (str) – Path to a local model, or the name of a model on the HuggingFace Hub. A local path is loaded directly; otherwise the model is downloaded from the Hub by name and then loaded.
trust_remote_code (bool, optional) – Whether to trust custom code shipped with the model when loading it from the HuggingFace Hub. Defaults to False.
colbert_dim (int, optional) – Output dimension of the ColBERT linear layer. Defaults to -1.
cache_dir (str, optional) – HuggingFace cache directory in which to store the downloaded model. Defaults to None.

Returns:

A dictionary containing the model, the ColBERT linear layer, and the sparse linear layer.

Return type:

dict
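
Because `get_model` is a static method, it can be called without constructing a runner. The sketch below wraps that call in a small helper; the helper name `load_m3_components` is hypothetical, and the import is deferred so that defining the helper does not require FlagEmbedding to be installed.

```python
from typing import Any, Dict, Optional


def load_m3_components(
    model_name_or_path: str,
    colbert_dim: int = -1,
    cache_dir: Optional[str] = None,
) -> Dict[str, Any]:
    """Hypothetical helper: fetch the model and its two linear layers."""
    # Deferred import so this module can be imported without FlagEmbedding.
    from FlagEmbedding.finetune.embedder.encoder_only.m3 import (
        EncoderOnlyEmbedderM3Runner,
    )

    # get_model is static, so no runner instance is needed.
    return EncoderOnlyEmbedderM3Runner.get_model(
        model_name_or_path,
        trust_remote_code=False,  # the documented default
        colbert_dim=colbert_dim,
        cache_dir=cache_dir,
    )
```

A call such as `load_m3_components("BAAI/bge-m3")` (an assumed checkpoint name) would return the dictionary described above. The exact key names are not listed on this page, so inspecting `components.keys()` is safer than hard-coding them.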

load_tokenizer_and_model() → Tuple[PreTrainedTokenizer, AbsEmbedderModel]

Load the tokenizer and model.

Load the M3 trainer.