This is a sentence-transformers model finetuned from BAAI/bge-base-en-v1.5. It maps sentences & paragraphs to a 768-dimensional dense vector space and can be used for semantic textual similarity, semantic search, paraphrase mining, text classification, clustering, and more.
SentenceTransformer(
(0): Transformer({'max_seq_length': 512, 'do_lower_case': True}) with Transformer model: BertModel
(1): Pooling({'word_embedding_dimension': 768, 'pooling_mode_cls_token': True, 'pooling_mode_mean_tokens': False, 'pooling_mode_max_tokens': False, 'pooling_mode_mean_sqrt_len_tokens': False, 'pooling_mode_weightedmean_tokens': False, 'pooling_mode_lasttoken': False, 'include_prompt': True})
(2): Normalize()
)
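The three modules above run in sequence: the Transformer produces per-token vectors, Pooling keeps the first ([CLS]) token since `pooling_mode_cls_token` is `True`, and Normalize scales the result to unit length. A minimal NumPy sketch of the pooling and normalization steps, using random stand-ins for the BERT token embeddings:

```python
import numpy as np

# Stand-in for the Transformer output: a batch of 2 sequences,
# 5 tokens each, 768-dimensional token embeddings.
rng = np.random.default_rng(0)
token_embeddings = rng.normal(size=(2, 5, 768))

# Pooling with pooling_mode_cls_token=True: keep the first ([CLS]) token.
pooled = token_embeddings[:, 0, :]           # shape (2, 768)

# Normalize(): scale each sentence vector to unit L2 norm.
embeddings = pooled / np.linalg.norm(pooled, axis=1, keepdims=True)

print(embeddings.shape)                                       # (2, 768)
print(np.allclose(np.linalg.norm(embeddings, axis=1), 1.0))   # True
```

The real model does the same thing, only with actual BERT outputs in place of the random tensor.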
First install the Sentence Transformers library:
pip install -U sentence-transformers
Then you can load this model and run inference.
from sentence_transformers import SentenceTransformer
# Download from the 🤗 Hub
model = SentenceTransformer("CatkinChen/BAAI_bge-base-en-v1.5_retrieval_finetuned_2025-04-05_22-01-09")
# Run inference
sentences = [
'Represent this sentence for searching relevant passages: What does Harry discover about the Mirror of Erised in the first book, and how does this relate to a revelation in the seventh book?',
"But the room was empty. Breathing very fast, he turned slowly back to the mirror. There he was, reflected in it, white and scared-looking, and there, reflected behind him, were at least ten others. Harry looked over his shoulder - but still, no one was there. Or were they all invisible, too? Was he in fact in a room full of invisible people and this mirrors trick was that it reflected them, invisible or not? He looked in the mirror again. A woman standing right behind his reflection was smiling at him and waving. He reached out a hand and felt the air behind him. If she was really there, he'd touch her, their reflections were so close together, but he felt only air - she and the others existed only in the mirror.",
'said Ron urgently. "Harry, let\'s go and get it before he does!" "It\'s too late for that," said Harry. He could not help himself, but clutched his head, trying to help it resist. "He knows where it is. He\'s there now." "Harry!" Ron said furiously. "How long have you known this - why have we been wasting time? Why did you talk to Griphook first?',
]
embeddings = model.encode(sentences)
print(embeddings.shape)
# [3, 768]
# Get the similarity scores for the embeddings
similarities = model.similarity(embeddings, embeddings)
print(similarities.shape)
# [3, 3]
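Because the model ends with a `Normalize()` module, the embeddings are unit vectors, so the cosine similarity computed by `model.similarity` reduces to a plain dot product. A small NumPy sketch with random unit vectors standing in for real embeddings:

```python
import numpy as np

# Random stand-ins for three 768-dimensional sentence embeddings.
rng = np.random.default_rng(0)
embeddings = rng.normal(size=(3, 768))
embeddings /= np.linalg.norm(embeddings, axis=1, keepdims=True)

# For unit-norm vectors, cosine similarity is just the dot product.
similarities = embeddings @ embeddings.T     # shape (3, 3)

print(similarities.shape)                        # (3, 3)
print(np.allclose(np.diag(similarities), 1.0))   # True: self-similarity is 1
```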
InformationRetrievalEvaluator

| Metric | Value |
|---|---|
| cosine_accuracy@1 | 0.1111 |
| cosine_accuracy@3 | 0.1852 |
| cosine_accuracy@5 | 0.2346 |
| cosine_accuracy@10 | 0.284 |
| cosine_precision@1 | 0.1111 |
| cosine_precision@3 | 0.0617 |
| cosine_precision@5 | 0.0469 |
| cosine_precision@10 | 0.0296 |
| cosine_recall@1 | 0.0967 |
| cosine_recall@3 | 0.1543 |
| cosine_recall@5 | 0.1852 |
| cosine_recall@10 | 0.2222 |
| cosine_ndcg@10 | 0.1651 |
| cosine_mrr@10 | 0.1637 |
| cosine_map@100 | 0.1449 |
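These metrics follow the standard information-retrieval definitions: accuracy@k asks whether any relevant document appears in the top k results, recall@k counts the fraction of relevant documents found there, and NDCG@10 additionally discounts hits by log rank. A toy sketch of accuracy@k and recall@k for a single query (the ranking and relevance sets below are made up for illustration):

```python
# Toy example: ranked document ids for one query, and its relevant set.
ranked = ["d7", "d2", "d9", "d1", "d4"]   # retrieval order, best first
relevant = {"d2", "d4"}                   # ground-truth relevant documents

def accuracy_at_k(ranked, relevant, k):
    # 1.0 if any relevant document appears in the top k, else 0.0.
    return float(any(d in relevant for d in ranked[:k]))

def recall_at_k(ranked, relevant, k):
    # Fraction of relevant documents retrieved in the top k.
    hits = sum(d in relevant for d in ranked[:k])
    return hits / len(relevant)

print(accuracy_at_k(ranked, relevant, 1))  # 0.0 (d7 is not relevant)
print(recall_at_k(ranked, relevant, 2))    # 0.5 (d2 found, d4 not yet)
```

The values in the table are these per-query scores averaged over all evaluation queries.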
Columns: sentence_0, sentence_1, and sentence_2

| | sentence_0 | sentence_1 | sentence_2 |
|---|---|---|---|
| type | string | string | string |

Samples:

| sentence_0 | sentence_1 | sentence_2 |
|---|---|---|
| Represent this sentence for searching relevant passages: What is the name of the spell that causes a person to tell the truth? | For one wild moment, Harry thought Snape was about to pull out his wand and curse him - then he saw that Snape had drawn out a small crystal bottle of a completely clear potion. Harry stared at it. "Do you know what this is, Potter?" Snape said, his eyes glittering dangerously again. "No," said Harry, with complete honesty this time. "It is Veritaserum - a Truth Potion so powerful that three drops would have you spilling your innermost secrets for this entire class to hear," said Snape viciously. "Now, the use of this potion is controlled by very strict Ministry guidelines. But unless you watch your step, you might just find that my hand slips" - he shook the crystal bottle slightly - "right over your evening pumpkin juice. | FURTHER MISTAKES AT THE MINISTRY OF MAGIC |
| Represent this sentence for searching relevant passages: In the sixth book, what does Harry see in the Pensieve about Voldemort's mother, and how does it connect to the second book? | "Voldemort's grandfather, yes," said Dumbledore. "Marvolo, his son, Morfin, and his daughter, Merope, were the last of the Gaunts, a very ancient Wizarding family noted for a vein of instability and violence that flourished through the generations due to their habit of marrying their own cousins. Lack of sense coupled with a great liking for grandeur meant that the family gold was squandered several generations before Marvolo was born. He, as you saw, was left in squalor and poverty, with a very nasty temper, a fantastic amount of arrogance and pride, and a couple of family heirlooms that he treasured just as much as his son, and rather more than his daughter." "So Merope," said Harry, leaning forward in his chair and staring at Dumbledore, "so Merope was ... Sir, does that mean she was ... Voldemort's mother?" "It does," said Dumbledore. "And it so happens that we also had a glimpse of Voldemort's father. I wonder whether you noticed?" "The Muggle Morfin attacked? The man on the horse... | Unnoticed by either, he seized the bowl that contained the pod and began to try and open it by the noisiest and most energetic means he could think of; unfortunately, he could still hear every word of their conversation. "You were going to ask me?" asked Ron, in a completely different voice. "Yes," said Hermione angrily. "But obviously if you'd rather I hooked up with McLaggen ..." |
| Represent this sentence for searching relevant passages: What does Harry use to destroy the diary in the second book, and how does this object reappear in the seventh book? | Then, without thinking, without considering, as though he had meant to do it all along, Harry seized the basilisk fang on the floor next to him and plunged it straight into the heart of the book. There was a long, dreadful, piercing scream. Ink spurted out of the diary in torrents, streaming over Harry's hands, flooding the floor. Riddle was writhing and twisting, screaming and flailing and then - | For a moment he thought she was going to scream at him. Then she said, in her softest, most sweetly girlish voice, "Come here, Mr. Potter, dear." He kicked his chair aside, strode around Ron and Hermione and up to the teacher's desk. He could feel the rest of the class holding its breath. He felt so angry he did not care what happened next. Professor Umbridge pulled a small roll of pink parchment out of her handbag, stretched it out on the desk, dipped her quill into a bottle of ink, and started scribbling, hunched over so that Harry could not see what she was writing. Nobody spoke. After a minute or so she rolled up the parchment and tapped it with her wand; it sealed itself seamlessly so that he could not open it. "Take this to Professor McGonagall, dear," said Professor Umbridge, holding out the note to him. He took it from her without saying a word and left the room, not even looking back at Ron and Hermione, and slamming the classroom door shut behind him. |
TripletLoss with these parameters:

{
    "distance_metric": "TripletDistanceMetric.COSINE",
    "triplet_margin": 0.3
}
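With the cosine distance metric, TripletLoss pushes the anchor-positive distance to be at least `triplet_margin` smaller than the anchor-negative distance: loss = max(0, d(a, p) - d(a, n) + margin), with d(x, y) = 1 - cos(x, y). A NumPy sketch with the margin of 0.3 used here (the vectors are illustrative stand-ins for embeddings):

```python
import numpy as np

def cosine_distance(x, y):
    # TripletDistanceMetric.COSINE: 1 - cosine similarity.
    return 1.0 - (x @ y) / (np.linalg.norm(x) * np.linalg.norm(y))

def triplet_loss(anchor, positive, negative, margin=0.3):
    # loss = max(0, d(a, p) - d(a, n) + margin)
    return max(0.0, cosine_distance(anchor, positive)
                    - cosine_distance(anchor, negative) + margin)

anchor   = np.array([1.0, 0.0])
positive = np.array([1.0, 0.1])   # nearly parallel to the anchor
negative = np.array([0.0, 1.0])   # orthogonal to the anchor

# The positive is already margin-closer than the negative, so the loss is 0.
print(triplet_loss(anchor, positive, negative))  # 0.0
```

Swapping the positive and negative in the call above produces a large nonzero loss, which is the gradient signal that pulls matching query-passage pairs together during fine-tuning.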
Non-default hyperparameters:

eval_strategy: steps
per_device_train_batch_size: 3
per_device_eval_batch_size: 3
num_train_epochs: 5
fp16: True
batch_sampler: no_duplicates
multi_dataset_batch_sampler: round_robin

All hyperparameters:

overwrite_output_dir: False
do_predict: False
eval_strategy: steps
prediction_loss_only: True
per_device_train_batch_size: 3
per_device_eval_batch_size: 3
per_gpu_train_batch_size: None
per_gpu_eval_batch_size: None
gradient_accumulation_steps: 1
eval_accumulation_steps: None
torch_empty_cache_steps: None
learning_rate: 5e-05
weight_decay: 0.0
adam_beta1: 0.9
adam_beta2: 0.999
adam_epsilon: 1e-08
max_grad_norm: 1
num_train_epochs: 5
max_steps: -1
lr_scheduler_type: linear
lr_scheduler_kwargs: {}
warmup_ratio: 0.0
warmup_steps: 0
log_level: passive
log_level_replica: warning
log_on_each_node: True
logging_nan_inf_filter: True
save_safetensors: True
save_on_each_node: False
save_only_model: False
restore_callback_states_from_checkpoint: False
no_cuda: False
use_cpu: False
use_mps_device: False
seed: 42
data_seed: None
jit_mode_eval: False
use_ipex: False
bf16: False
fp16: True
fp16_opt_level: O1
half_precision_backend: auto
bf16_full_eval: False
fp16_full_eval: False
tf32: None
local_rank: 0
ddp_backend: None
tpu_num_cores: None
tpu_metrics_debug: False
debug: []
dataloader_drop_last: False
dataloader_num_workers: 0
dataloader_prefetch_factor: None
past_index: -1
disable_tqdm: False
remove_unused_columns: True
label_names: None
load_best_model_at_end: False
ignore_data_skip: False
fsdp: []
fsdp_min_num_params: 0
fsdp_config: {'min_num_params': 0, 'xla': False, 'xla_fsdp_v2': False, 'xla_fsdp_grad_ckpt': False}
tp_size: 0
fsdp_transformer_layer_cls_to_wrap: None
accelerator_config: {'split_batches': False, 'dispatch_batches': None, 'even_batches': True, 'use_seedable_sampler': True, 'non_blocking': False, 'gradient_accumulation_kwargs': None}
deepspeed: None
label_smoothing_factor: 0.0
optim: adamw_torch
optim_args: None
adafactor: False
group_by_length: False
length_column_name: length
ddp_find_unused_parameters: None
ddp_bucket_cap_mb: None
ddp_broadcast_buffers: False
dataloader_pin_memory: True
dataloader_persistent_workers: False
skip_memory_metrics: True
use_legacy_prediction_loop: False
push_to_hub: False
resume_from_checkpoint: None
hub_model_id: None
hub_strategy: every_save
hub_private_repo: None
hub_always_push: False
gradient_checkpointing: False
gradient_checkpointing_kwargs: None
include_inputs_for_metrics: False
include_for_metrics: []
eval_do_concat_batches: True
fp16_backend: auto
push_to_hub_model_id: None
push_to_hub_organization: None
mp_parameters:
auto_find_batch_size: False
full_determinism: False
torchdynamo: None
ray_scope: last
ddp_timeout: 1800
torch_compile: False
torch_compile_backend: None
torch_compile_mode: None
dispatch_batches: None
split_batches: None
include_tokens_per_second: False
include_num_input_tokens_seen: False
neftune_noise_alpha: None
optim_target_modules: None
batch_eval_metrics: False
eval_on_start: False
use_liger_kernel: False
eval_use_gather_object: False
average_tokens_across_devices: False
prompts: None
batch_sampler: no_duplicates
multi_dataset_batch_sampler: round_robin

| Epoch | Step | Training Loss | cosine_ndcg@10 |
|---|---|---|---|
| 0.1274 | 100 | - | 0.1595 |
| 0.2548 | 200 | - | 0.1707 |
| 0.3822 | 300 | - | 0.1955 |
| 0.5096 | 400 | - | 0.2150 |
| 0.6369 | 500 | 0.2609 | 0.2188 |
| 0.7643 | 600 | - | 0.2257 |
| 0.8917 | 700 | - | 0.2277 |
| 1.0 | 785 | - | 0.2003 |
| 1.0191 | 800 | - | 0.1983 |
| 1.1465 | 900 | - | 0.2167 |
| 1.2739 | 1000 | 0.1494 | 0.2241 |
| 1.4013 | 1100 | - | 0.2099 |
| 1.5287 | 1200 | - | 0.2063 |
| 1.6561 | 1300 | - | 0.2068 |
| 1.7834 | 1400 | - | 0.1911 |
| 1.9108 | 1500 | 0.1003 | 0.2010 |
| 2.0 | 1570 | - | 0.2045 |
| 2.0382 | 1600 | - | 0.1993 |
| 2.1656 | 1700 | - | 0.1897 |
| 2.2930 | 1800 | - | 0.1978 |
| 2.4204 | 1900 | - | 0.1852 |
| 2.5478 | 2000 | 0.0669 | 0.1831 |
| 2.6752 | 2100 | - | 0.1894 |
| 2.8025 | 2200 | - | 0.1785 |
| 2.9299 | 2300 | - | 0.1734 |
| 3.0 | 2355 | - | 0.1793 |
| 3.0573 | 2400 | - | 0.1832 |
| 3.1847 | 2500 | 0.0579 | 0.1717 |
| 3.3121 | 2600 | - | 0.1714 |
| 3.4395 | 2700 | - | 0.1665 |
| 3.5669 | 2800 | - | 0.1668 |
| 3.6943 | 2900 | - | 0.1692 |
| 3.8217 | 3000 | 0.0458 | 0.1678 |
| 3.9490 | 3100 | - | 0.1662 |
| 4.0 | 3140 | - | 0.1644 |
| 4.0764 | 3200 | - | 0.1628 |
| 4.2038 | 3300 | - | 0.1631 |
| 4.3312 | 3400 | - | 0.1648 |
| 4.4586 | 3500 | 0.0385 | 0.1658 |
| 4.5860 | 3600 | - | 0.1651 |
| 4.7134 | 3700 | - | 0.1649 |
| 4.8408 | 3800 | - | 0.1649 |
| 4.9682 | 3900 | - | 0.1656 |
| 5.0 | 3925 | - | 0.1651 |
| -1 | -1 | - | 0.1651 |
@inproceedings{reimers-2019-sentence-bert,
title = "Sentence-BERT: Sentence Embeddings using Siamese BERT-Networks",
author = "Reimers, Nils and Gurevych, Iryna",
booktitle = "Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing",
month = "11",
year = "2019",
publisher = "Association for Computational Linguistics",
url = "https://arxiv.org/abs/1908.10084",
}
@misc{hermans2017defense,
title={In Defense of the Triplet Loss for Person Re-Identification},
author={Alexander Hermans and Lucas Beyer and Bastian Leibe},
year={2017},
eprint={1703.07737},
archivePrefix={arXiv},
primaryClass={cs.CV}
}