SentenceTransformer based on sentence-transformers/all-MiniLM-L6-v2

This is a sentence-transformers model finetuned from sentence-transformers/all-MiniLM-L6-v2 on the train and test datasets. It maps sentences and paragraphs to a 384-dimensional dense vector space and can be used for semantic textual similarity, semantic search, paraphrase mining, text classification, clustering, and more.

Model Details

Model Description

  • Model Type: Sentence Transformer
  • Base model: sentence-transformers/all-MiniLM-L6-v2
  • Maximum Sequence Length: 256 tokens
  • Output Dimensionality: 384 dimensions
  • Similarity Function: Cosine Similarity
  • Training Datasets:
    • train
    • test

Full Model Architecture

SentenceTransformer(
  (0): Transformer({'max_seq_length': 256, 'do_lower_case': False, 'architecture': 'BertModel'})
  (1): Pooling({'word_embedding_dimension': 384, 'pooling_mode_cls_token': False, 'pooling_mode_mean_tokens': True, 'pooling_mode_max_tokens': False, 'pooling_mode_mean_sqrt_len_tokens': False, 'pooling_mode_weightedmean_tokens': False, 'pooling_mode_lasttoken': False, 'include_prompt': True})
  (2): Normalize()
)
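Modules (1) and (2) above amount to mean pooling over the non-padding tokens followed by L2 normalization. The following is a minimal pure-Python sketch of those two steps, using made-up 2-dimensional token vectors instead of real 384-dimensional ones. The helper names `mean_pool` and `l2_normalize` are illustrative and are not part of the library.

```python
import math

def mean_pool(token_embeddings, attention_mask):
    """Average the token vectors whose mask entry is 1 (padding is ignored)."""
    dim = len(token_embeddings[0])
    total = [0.0] * dim
    count = 0
    for vec, keep in zip(token_embeddings, attention_mask):
        if keep:
            count += 1
            for i, value in enumerate(vec):
                total[i] += value
    if count == 0:
        raise ValueError("attention_mask selects no tokens")
    return [t / count for t in total]

def l2_normalize(vec):
    """Scale a vector to unit length; a zero vector is returned unchanged."""
    norm = math.sqrt(sum(v * v for v in vec))
    return [v / norm for v in vec] if norm > 0 else list(vec)

# Toy input: two real tokens and one padding token (mask 0).
tokens = [[3.0, 0.0], [3.0, 8.0], [100.0, 100.0]]
mask = [1, 1, 0]

pooled = mean_pool(tokens, mask)
sentence_embedding = l2_normalize(pooled)
print(pooled, sentence_embedding)
```

Because the final vectors have unit length, the cosine similarity of two embeddings equals their dot product.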

Usage

Direct Usage (Sentence Transformers)

First install the Sentence Transformers library:

pip install -U sentence-transformers

Then you can load this model and run inference.

from sentence_transformers import SentenceTransformer

# Download from the 🤗 Hub
model = SentenceTransformer("Albertdebeauvais/all-MiniLM-L6-v2_bibliographie")
# Run inference
sentences = [
    'RITZ-GUILBERT, Anne (2018), "Les modèles du \'Bréviaire de Marie de Savoie\' par le Maître des \'Vitae Imperatorum\'", dans BORLÉE, Denise (éd.), TERRIER ALIFERIS, Laurence (éd.), Les modèles dans l\'art du Moyen Âge (XIIe-XVe siècles), Turnhout, Brepols (Répertoire iconographique de la littérature du Moyen Âge. Les études du RILMA, 10), p. 109-120',
    'GIL, Marc (2018), "Sources et circulation des modèles dans les arts figurés champenois, vers 1160-1180 : le cas de Notre-Dame-en-Vaux à Châlons-en-Champagne", dans BORLÉE, Denise (éd.), TERRIER ALIFERIS, Laurence (éd.), Les modèles dans l\'art du Moyen Âge (XIIe-XVe siècles), Turnhout, Brepols (Répertoire iconographique de la littérature du Moyen Âge. Les études du RILMA, 10), p. 179-192',
    "LEMAÎTRE, Jean-Loup (éd.) (2005), Un calendrier retrouvé : le calendrier des Heures de Saint-Pierre-du-Queyroix, Ussel, Musée du pays d'Ussel",
]
embeddings = model.encode(sentences)
print(embeddings.shape)
# (3, 384)

# Get the similarity scores for the embeddings
similarities = model.similarity(embeddings, embeddings)
print(similarities)
# tensor([[1.0000, 0.5206, 0.1709],
#         [0.5206, 1.0000, 0.2801],
#         [0.1709, 0.2801, 1.0000]])
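One way to read the matrix printed above is to look up, for each reference, its closest neighbour other than itself. The sketch below does this on the printed values only, so it runs without loading the model. `nearest_other` is a hypothetical helper name.

```python
# Similarity matrix copied from the printed output above.
similarities = [
    [1.0000, 0.5206, 0.1709],
    [0.5206, 1.0000, 0.2801],
    [0.1709, 0.2801, 1.0000],
]

def nearest_other(sim_matrix, i):
    """Return (index, score) of the item most similar to item i, excluding i itself."""
    candidates = [(j, score) for j, score in enumerate(sim_matrix[i]) if j != i]
    return max(candidates, key=lambda pair: pair[1])

matches = [nearest_other(similarities, i) for i in range(len(similarities))]
print(matches)
```

The first two references are chapters of the same collective volume, which is consistent with their score (0.5206) being the highest off-diagonal value.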

Evaluation

Metrics

Binary Classification

Metric                      Value
cosine_accuracy             0.9517
cosine_accuracy_threshold   0.8323
cosine_f1                   0.9499
cosine_f1_threshold         0.8267
cosine_precision            0.9391
cosine_recall               0.961
cosine_ap                   0.9647
cosine_mcc                  0.9036
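The thresholds in this table can be used to turn a cosine score into a binary decision. The sketch below is an illustration under two assumptions: that a pair counts as a match when its score reaches the threshold, and that the reported precision and recall belong to the same operating point as the reported F1. The function names are hypothetical, and the score 0.91 is a made-up input.

```python
# Values reported in the metrics table above.
COSINE_ACCURACY_THRESHOLD = 0.8323
PRECISION = 0.9391
RECALL = 0.961

def same_reference(cosine_score, threshold=COSINE_ACCURACY_THRESHOLD):
    """Label a pair as a match (1) when its cosine similarity reaches the threshold."""
    return 1 if cosine_score >= threshold else 0

def f1_score(precision, recall):
    """Harmonic mean of precision and recall."""
    return 2 * precision * recall / (precision + recall)

# Two scores from the usage example plus one made-up high score.
predictions = [same_reference(s) for s in (0.5206, 0.1709, 0.91)]

# Consistency check: F1 recomputed from the reported precision and recall.
f1 = round(f1_score(PRECISION, RECALL), 4)
print(predictions, f1)
```

The recomputed F1 rounds to 0.9499, which agrees with the reported cosine_f1.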

Training Details

Training Datasets

train

  • Dataset: train
  • Size: 26,107 training samples
  • Columns: text1, text2, and label
  • Approximate statistics based on the first 1000 samples:
    • text1 (string): min 5 tokens, mean 66.2 tokens, max 200 tokens
    • text2 (string): min 4 tokens, mean 63.27 tokens, max 199 tokens
    • label (int): 0: ~53.50%, 1: ~46.50%
  • Samples:
    • text1: FORONDA, François (dir.), BARRALIS, Christine (dir.), SÈRE, Bénédicte (dir.) (2010), Violences souveraines au Moyen Âge. Travaux d'une école historique, Paris (Le nœud gordien)
      text2: FORONDA, François (2010), "Une image de la violence d'Etat française : la mort de Pierre Ier de Castille", dans FORONDA, François (dir.), BARRALIS, Christine (dir.), SÈRE, Bénédicte (dir.), Violences souveraines au Moyen Âge. Travaux d'une école historique, Paris (Le nœud gordien), p. 249-259
      label: 0
    • text1: ORTOLEVA, Vincenzo (1994), "La cosiddetta tradizione "epitomata della Mulomedicina" di Vegezio. Recensio deterior o tradizionz indiretta ?", Revue d'histoire des textes, 24, p. 271-274
      text2: TOSCANO, Gennaro (1995), "Il Maestro di Isabella di Chiaromonte : note sulla miniatura a Napoli a metà Quattracento", Artes, 3, p. 34-45
      label: 0
    • text1: pp. XIV, 259-262, 264-265 John LOWDEN, The Making of the Bibles moralisées. T. 2 : The Book of Ruth, University Park, The Pennsylvania State University, 2000 Mss. [ 4° Impr. 2422 (2)
      text2: LOWDEN, John (2000), The Making of the Bibles Moralisées : I. The manuscripts; II. The book of Ruth, University Park (PA), The Pennsylvania State University Press
      label: 1
  • Loss: OnlineContrastiveLoss
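OnlineContrastiveLoss is a contrastive loss that only lets the hard pairs of each batch contribute. The following is a simplified pure-Python sketch of that idea, not the library's implementation. It assumes cosine distance (1 minus cosine similarity) and a margin of 0.5; the card does not state the margin that was used, and the function name and toy distances are illustrative.

```python
def online_contrastive_loss(distances, labels, margin=0.5):
    """Squared distances of hard positives plus squared margin violations of hard negatives.

    distances: cosine distance (1 - cosine similarity) of each pair in the batch.
    labels: 1 for matching pairs, 0 for non-matching pairs.
    margin: assumed value, not taken from this card.
    """
    positives = [d for d, y in zip(distances, labels) if y == 1]
    negatives = [d for d, y in zip(distances, labels) if y == 0]
    if not positives or not negatives:
        raise ValueError("this sketch needs at least one pair of each label")

    # Hard positives: matching pairs farther apart than the closest negative pair.
    hard_positives = [d for d in positives if d > min(negatives)]
    # Hard negatives: non-matching pairs closer together than the farthest positive pair.
    hard_negatives = [d for d in negatives if d < max(positives)]

    positive_loss = sum(d ** 2 for d in hard_positives)
    negative_loss = sum(max(0.0, margin - d) ** 2 for d in hard_negatives)
    return positive_loss + negative_loss

# Toy batch: two matching pairs and two non-matching pairs.
distances = [0.1, 0.6, 0.3, 0.9]
labels = [1, 1, 0, 0]
loss = online_contrastive_loss(distances, labels)
print(loss)
```

In this toy batch only the positive pair at distance 0.6 and the negative pair at distance 0.3 contribute; the easy pairs at 0.1 and 0.9 are ignored.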

test

  • Dataset: test
  • Size: 808 training samples
  • Columns: text1, text2, and label
  • Approximate statistics based on the first 808 samples:
    • text1 (string): min 12 tokens, mean 67.31 tokens, max 195 tokens
    • text2 (string): min 11 tokens, mean 63.32 tokens, max 195 tokens
    • label (int): 0: ~52.35%, 1: ~47.65%
  • Samples:
    • text1: Pfeffer, Wendy, The Oxford Companion to Chaucer, Oxford, Oxford University Press, 2003.
      text2: Eglal Doss-Quinby, Joan Tasker-Grimbert, Wendy Pfeffer et Elizabeth Aubrey, Song of the Women Trouvères, New Haven/London, Yale University Press, 2001.
      label: 0
    • text1: Rêves et vie spirituelle d'après Evagre le Pontique, Jérusalem, Presses Universitaires de France, 1969.
      text2: (1969), « Les songes et la vie quotidienne dans l'Antiquité tardive », Jérusalem, éd. Presses Universitaires de France.
      label: 0
    • text1: HUCHER, Eugène (éd.) (1875-1878), Le Saint Graal, ou Le Joseph d'Arimathie, première branche des romans de la Table ronde publié d'après des textes et des documents inédits, Le Mans, Ed. Monnoyer, 3 volumes
      text2: 1875-1878, Le Saint Graal, Ou Le Joseph D'arimathie, Première Branche Des Romans De La Table Ronde Publié D'après Des Textes Et Des Documents Inédits
      label: 1
  • Loss: OnlineContrastiveLoss

Evaluation Dataset

Unnamed Dataset

  • Size: 808 evaluation samples
  • Columns: text1, text2, and label
  • Approximate statistics based on the first 808 samples:
    • text1 (string): min 12 tokens, mean 67.31 tokens, max 195 tokens
    • text2 (string): min 11 tokens, mean 63.32 tokens, max 195 tokens
    • label (int): 0: ~52.35%, 1: ~47.65%
  • Samples:
    • text1: Pfeffer, Wendy, The Oxford Companion to Chaucer, Oxford, Oxford University Press, 2003.
      text2: Eglal Doss-Quinby, Joan Tasker-Grimbert, Wendy Pfeffer et Elizabeth Aubrey, Song of the Women Trouvères, New Haven/London, Yale University Press, 2001.
      label: 0
    • text1: Rêves et vie spirituelle d'après Evagre le Pontique, Jérusalem, Presses Universitaires de France, 1969.
      text2: (1969), « Les songes et la vie quotidienne dans l'Antiquité tardive », Jérusalem, éd. Presses Universitaires de France.
      label: 0
    • text1: HUCHER, Eugène (éd.) (1875-1878), Le Saint Graal, ou Le Joseph d'Arimathie, première branche des romans de la Table ronde publié d'après des textes et des documents inédits, Le Mans, Ed. Monnoyer, 3 volumes
      text2: 1875-1878, Le Saint Graal, Ou Le Joseph D'arimathie, Première Branche Des Romans De La Table Ronde Publié D'après Des Textes Et Des Documents Inédits
      label: 1
  • Loss: OnlineContrastiveLoss

Training Hyperparameters

Non-Default Hyperparameters

  • eval_strategy: epoch
  • per_device_train_batch_size: 160
  • per_device_eval_batch_size: 160
  • learning_rate: 3e-05
  • warmup_ratio: 0.03
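With a linear scheduler (lr_scheduler_type in the full list), the learning rate and warmup ratio above define a ramp up followed by a linear decay to zero. The following sketch works under stated assumptions: the total of 510 steps is taken from the last row of the training logs, and the warmup length is assumed to be the ratio times the total, rounded up. `lr_at` is a hypothetical helper, not a library function.

```python
import math

LEARNING_RATE = 3e-05   # learning_rate from the list above
WARMUP_RATIO = 0.03     # warmup_ratio from the list above
TOTAL_STEPS = 510       # final step in the training logs of this card

# Assumption: warmup steps are the ratio times the total, rounded up.
WARMUP_STEPS = math.ceil(TOTAL_STEPS * WARMUP_RATIO)

def lr_at(step):
    """Learning rate after `step` optimizer updates: linear warmup, then linear decay."""
    if step < WARMUP_STEPS:
        return LEARNING_RATE * step / max(1, WARMUP_STEPS)
    remaining = max(0, TOTAL_STEPS - step)
    return LEARNING_RATE * remaining / max(1, TOTAL_STEPS - WARMUP_STEPS)

print(WARMUP_STEPS, lr_at(0), lr_at(WARMUP_STEPS), lr_at(TOTAL_STEPS))
```

Under these assumptions the peak rate of 3e-05 is reached after 16 steps, which is within the first tenth of the first epoch.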

All Hyperparameters

  • overwrite_output_dir: False
  • do_predict: False
  • eval_strategy: epoch
  • prediction_loss_only: True
  • per_device_train_batch_size: 160
  • per_device_eval_batch_size: 160
  • per_gpu_train_batch_size: None
  • per_gpu_eval_batch_size: None
  • gradient_accumulation_steps: 1
  • eval_accumulation_steps: None
  • torch_empty_cache_steps: None
  • learning_rate: 3e-05
  • weight_decay: 0.0
  • adam_beta1: 0.9
  • adam_beta2: 0.999
  • adam_epsilon: 1e-08
  • max_grad_norm: 1.0
  • num_train_epochs: 3
  • max_steps: -1
  • lr_scheduler_type: linear
  • lr_scheduler_kwargs: {}
  • warmup_ratio: 0.03
  • warmup_steps: 0
  • log_level: passive
  • log_level_replica: warning
  • log_on_each_node: True
  • logging_nan_inf_filter: True
  • save_safetensors: True
  • save_on_each_node: False
  • save_only_model: False
  • restore_callback_states_from_checkpoint: False
  • no_cuda: False
  • use_cpu: False
  • use_mps_device: False
  • seed: 42
  • data_seed: None
  • jit_mode_eval: False
  • use_ipex: False
  • bf16: False
  • fp16: False
  • fp16_opt_level: O1
  • half_precision_backend: auto
  • bf16_full_eval: False
  • fp16_full_eval: False
  • tf32: None
  • local_rank: 0
  • ddp_backend: None
  • tpu_num_cores: None
  • tpu_metrics_debug: False
  • debug: []
  • dataloader_drop_last: False
  • dataloader_num_workers: 0
  • dataloader_prefetch_factor: None
  • past_index: -1
  • disable_tqdm: False
  • remove_unused_columns: True
  • label_names: None
  • load_best_model_at_end: False
  • ignore_data_skip: False
  • fsdp: []
  • fsdp_min_num_params: 0
  • fsdp_config: {'min_num_params': 0, 'xla': False, 'xla_fsdp_v2': False, 'xla_fsdp_grad_ckpt': False}
  • fsdp_transformer_layer_cls_to_wrap: None
  • accelerator_config: {'split_batches': False, 'dispatch_batches': None, 'even_batches': True, 'use_seedable_sampler': True, 'non_blocking': False, 'gradient_accumulation_kwargs': None}
  • parallelism_config: None
  • deepspeed: None
  • label_smoothing_factor: 0.0
  • optim: adamw_torch_fused
  • optim_args: None
  • adafactor: False
  • group_by_length: False
  • length_column_name: length
  • ddp_find_unused_parameters: None
  • ddp_bucket_cap_mb: None
  • ddp_broadcast_buffers: False
  • dataloader_pin_memory: True
  • dataloader_persistent_workers: False
  • skip_memory_metrics: True
  • use_legacy_prediction_loop: False
  • push_to_hub: False
  • resume_from_checkpoint: None
  • hub_model_id: None
  • hub_strategy: every_save
  • hub_private_repo: None
  • hub_always_push: False
  • hub_revision: None
  • gradient_checkpointing: False
  • gradient_checkpointing_kwargs: None
  • include_inputs_for_metrics: False
  • include_for_metrics: []
  • eval_do_concat_batches: True
  • fp16_backend: auto
  • push_to_hub_model_id: None
  • push_to_hub_organization: None
  • mp_parameters:
  • auto_find_batch_size: False
  • full_determinism: False
  • torchdynamo: None
  • ray_scope: last
  • ddp_timeout: 1800
  • torch_compile: False
  • torch_compile_backend: None
  • torch_compile_mode: None
  • include_tokens_per_second: False
  • include_num_input_tokens_seen: False
  • neftune_noise_alpha: None
  • optim_target_modules: None
  • batch_eval_metrics: False
  • eval_on_start: False
  • use_liger_kernel: False
  • liger_kernel_config: None
  • eval_use_gather_object: False
  • average_tokens_across_devices: False
  • prompts: None
  • batch_sampler: batch_sampler
  • multi_dataset_batch_sampler: proportional
  • router_mapping: {}
  • learning_rate_mapping: {}

Training Logs

Epoch Step Training Loss Validation Loss eval_cosine_ap
-1 -1 - - 0.5884
0.0353 6 7.9026 - -
0.0706 12 7.7888 - -
0.1059 18 7.0352 - -
0.1412 24 6.3592 - -
0.1765 30 5.8148 - -
0.2118 36 4.9098 - -
0.2471 42 5.1715 - -
0.2824 48 4.0856 - -
0.3176 54 4.3722 - -
0.3529 60 4.0175 - -
0.3882 66 3.9427 - -
0.4235 72 3.4966 - -
0.4588 78 3.5505 - -
0.4941 84 3.2389 - -
0.5294 90 3.5375 - -
0.5647 96 3.0543 - -
0.6 102 3.0486 - -
0.6353 108 2.5424 - -
0.6706 114 2.9492 - -
0.7059 120 3.353 - -
0.7412 126 2.7673 - -
0.7765 132 2.9456 - -
0.8118 138 2.3598 - -
0.8471 144 2.5187 - -
0.8824 150 2.2102 - -
0.9176 156 2.675 - -
0.9529 162 2.1735 - -
0.9882 168 2.4117 - -
1.0 170 - 2.0486 0.9545
1.0235 174 1.8135 - -
1.0588 180 2.1022 - -
1.0941 186 1.7459 - -
1.1294 192 1.7129 - -
1.1647 198 1.7023 - -
1.2 204 1.8 - -
1.2353 210 1.6906 - -
1.2706 216 2.0856 - -
1.3059 222 1.7216 - -
1.3412 228 1.8287 - -
1.3765 234 2.2071 - -
1.4118 240 1.8617 - -
1.4471 246 1.8148 - -
1.4824 252 1.6976 - -
1.5176 258 1.4774 - -
1.5529 264 1.8896 - -
1.5882 270 1.8389 - -
1.6235 276 2.2744 - -
1.6588 282 1.5614 - -
1.6941 288 1.3118 - -
1.7294 294 1.6211 - -
1.7647 300 1.3294 - -
1.8 306 2.2436 - -
1.8353 312 1.6333 - -
1.8706 318 1.6046 - -
1.9059 324 1.5298 - -
1.9412 330 1.7025 - -
1.9765 336 1.4742 - -
2.0 340 - 1.5898 0.9664
2.0118 342 1.5415 - -
2.0471 348 1.1568 - -
2.0824 354 1.3209 - -
2.1176 360 1.2234 - -
2.1529 366 1.7336 - -
2.1882 372 1.382 - -
2.2235 378 1.665 - -
2.2588 384 1.2707 - -
2.2941 390 1.1796 - -
2.3294 396 1.6894 - -
2.3647 402 1.06 - -
2.4 408 1.0879 - -
2.4353 414 1.2806 - -
2.4706 420 1.6574 - -
2.5059 426 1.5029 - -
2.5412 432 1.3803 - -
2.5765 438 1.2059 - -
2.6118 444 1.7823 - -
2.6471 450 1.2976 - -
2.6824 456 1.6891 - -
2.7176 462 0.9401 - -
2.7529 468 1.1141 - -
2.7882 474 1.1229 - -
2.8235 480 1.137 - -
2.8588 486 1.5186 - -
2.8941 492 1.4301 - -
2.9294 498 1.4644 - -
2.9647 504 0.9985 - -
3.0 510 0.6255 1.5778 0.9647
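The step and epoch columns can be cross-checked against the dataset sizes and batch size reported in this card. The arithmetic below assumes a single device and that every batch is drawn from exactly one dataset, so each dataset contributes its own final partial batch (dataloader_drop_last is False). These assumptions are not stated in the card, but they reproduce the 170 steps per epoch seen in the log.

```python
import math

TRAIN_SIZE = 26107   # size of the "train" dataset
TEST_SIZE = 808      # size of the "test" dataset
BATCH_SIZE = 160     # per_device_train_batch_size
EPOCHS = 3           # num_train_epochs

# Assumption: batches never mix datasets, so each dataset is rounded up separately.
steps_per_epoch = math.ceil(TRAIN_SIZE / BATCH_SIZE) + math.ceil(TEST_SIZE / BATCH_SIZE)
total_steps = steps_per_epoch * EPOCHS

def epoch_at(step):
    """Fractional epoch for a given step, as shown in the first column of the log."""
    return round(step / steps_per_epoch, 4)

print(steps_per_epoch, total_steps, epoch_at(6), epoch_at(174))
```

The result is 164 + 6 = 170 steps per epoch and 510 steps in total, and step 6 maps to epoch 0.0353 as in the first logged row.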

Framework Versions

  • Python: 3.9.21
  • Sentence Transformers: 5.1.0
  • Transformers: 4.56.1
  • PyTorch: 2.8.0+cu129
  • Accelerate: 1.10.1
  • Datasets: 4.1.0
  • Tokenizers: 0.22.0

Citation

BibTeX

Sentence Transformers

@inproceedings{reimers-2019-sentence-bert,
    title = "Sentence-BERT: Sentence Embeddings using Siamese BERT-Networks",
    author = "Reimers, Nils and Gurevych, Iryna",
    booktitle = "Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing",
    month = "11",
    year = "2019",
    publisher = "Association for Computational Linguistics",
    url = "https://arxiv.org/abs/1908.10084",
}