Runner
- class FlagEmbedding.finetune.embedder.encoder_only.m3.EncoderOnlyEmbedderM3Runner(model_args: EncoderOnlyEmbedderM3ModelArguments, data_args: AbsEmbedderDataArguments, training_args: EncoderOnlyEmbedderM3TrainingArguments)[source]
M3 model runner for finetuning.
- Parameters:
model_args (EncoderOnlyEmbedderM3ModelArguments) – Model arguments.
data_args (AbsEmbedderDataArguments) – Data arguments.
training_args (EncoderOnlyEmbedderM3TrainingArguments) – Training arguments.
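A minimal usage sketch of the runner is shown below. The import paths, the CLI-driven argument parsing, and the run() entry point are assumptions inferred from the argument types listed above and the usual finetuning workflow; they are not confirmed by this page, so adjust them to your installed FlagEmbedding version.

```python
from transformers import HfArgumentParser

# NOTE: these import paths are assumptions based on the class path shown above.
from FlagEmbedding.abc.finetune.embedder import AbsEmbedderDataArguments
from FlagEmbedding.finetune.embedder.encoder_only.m3 import (
    EncoderOnlyEmbedderM3ModelArguments,
    EncoderOnlyEmbedderM3TrainingArguments,
    EncoderOnlyEmbedderM3Runner,
)

# Parse the three argument groups from the command line, e.g.
#   --model_name_or_path BAAI/bge-m3 --train_data ./data --output_dir ./out
parser = HfArgumentParser((
    EncoderOnlyEmbedderM3ModelArguments,
    AbsEmbedderDataArguments,
    EncoderOnlyEmbedderM3TrainingArguments,
))
model_args, data_args, training_args = parser.parse_args_into_dataclasses()

runner = EncoderOnlyEmbedderM3Runner(
    model_args=model_args,
    data_args=data_args,
    training_args=training_args,
)
runner.run()  # assumed entry point inherited from the base runner
```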
- static get_model(model_name_or_path: str, trust_remote_code: bool = False, colbert_dim: int = -1, cache_dir: str | None = None)[source]
Get the model.
- Parameters:
model_name_or_path (str) – If it is a path to a local model, the model is loaded from that path. Otherwise, it tries to download and load the model from the Hugging Face Hub by that name.
trust_remote_code (bool, optional) – trust_remote_code to use when loading models from HF. Defaults to False.
colbert_dim (int, optional) – ColBERT dimension to set. Defaults to -1.
cache_dir (str, optional) – HF cache dir to store the model. Defaults to None.
- Returns:
A dictionary containing the model, the ColBERT linear layer, and the sparse linear layer.
- Return type:
dict
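For illustration, a hedged sketch of calling get_model directly. The model name is only an example, and the dictionary keys ("model", "colbert_linear", "sparse_linear") are assumed from the description above; check them against your version.

```python
# Assumed keys: "model", "colbert_linear", "sparse_linear".
components = EncoderOnlyEmbedderM3Runner.get_model(
    "BAAI/bge-m3",          # local path or Hugging Face Hub model name (example)
    trust_remote_code=False,
    colbert_dim=-1,          # default value
    cache_dir=None,
)
model = components["model"]
colbert_linear = components["colbert_linear"]
sparse_linear = components["sparse_linear"]
```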
- load_tokenizer_and_model() → Tuple[PreTrainedTokenizer, AbsEmbedderModel][source]
Load the tokenizer and model.
- Returns:
Tokenizer and model instances.
- Return type:
Tuple[PreTrainedTokenizer, AbsEmbedderModel]
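A short sketch of unpacking the returned tuple, assuming a runner instance built as in the class-level example above:

```python
# `runner` is assumed to be an EncoderOnlyEmbedderM3Runner constructed earlier.
tokenizer, model = runner.load_tokenizer_and_model()
print(type(tokenizer).__name__, type(model).__name__)
```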
- load_trainer() → EncoderOnlyEmbedderM3Trainer[source]
Load the M3 trainer.
- Returns:
M3 Trainer instance.
- Return type: