Sentence-BERT: Sentence Embeddings using Siamese BERT-Networks (arXiv:1908.10084)
This is a sentence-transformers model finetuned from BAAI/bge-base-en-v1.5. It maps sentences & paragraphs to a 768-dimensional dense vector space and can be used for semantic textual similarity, semantic search, paraphrase mining, text classification, clustering, and more.
Full model architecture:

```
SentenceTransformer(
  (0): Transformer({'max_seq_length': 512, 'do_lower_case': True}) with Transformer model: BertModel
  (1): Pooling({'word_embedding_dimension': 768, 'pooling_mode_cls_token': True, 'pooling_mode_mean_tokens': False, 'pooling_mode_max_tokens': False, 'pooling_mode_mean_sqrt_len_tokens': False, 'pooling_mode_weightedmean_tokens': False, 'pooling_mode_lasttoken': False, 'include_prompt': True})
  (2): Normalize()
)
```
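A minimal sketch of what modules (1) and (2) do to the transformer output. Random arrays stand in for real token embeddings (the shapes follow the config above; numpy is used here in place of the actual torch implementation):

```python
import numpy as np

# Stand-in for the Transformer module's output: one embedding per token
token_embeddings = np.random.rand(12, 768)

# Pooling with pooling_mode_cls_token=True keeps only the first ([CLS]) token
sentence_embedding = token_embeddings[0]

# Normalize() divides by the L2 norm, so downstream cosine similarity
# reduces to a plain dot product
sentence_embedding = sentence_embedding / np.linalg.norm(sentence_embedding)
```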
First install the Sentence Transformers library:
```bash
pip install -U sentence-transformers
```
Then you can load this model and run inference.
```python
from sentence_transformers import SentenceTransformer

# Download from the 🤗 Hub
model = SentenceTransformer("aritrasen/bge-base-en-v1.5-ft_ragds")

# Run inference
sentences = [
    'PSY’s “Gangnam Style” T-Shirt Sold on German Online Store\nPSY’s “Gangnam Style” took the U.S. by storm last week, and now, it’s reached a German online shopping mall as well.\nRecently, an online t-shirt store, “Spreadshirt,” revealed a new product inspired by PSY’s “Gangnam Style.” The shirt comes with a picture of PSY’s signature “horse dance,” and lines that say, “Keep Calm and Gangnam Style.” The “Keep Calm” design is one of “Spreadshirt’s” most popular items, and the PSY’s edition is the latest one to come from the highly successful online store.\nIt’s unclear how many copies of the PSY’s shirt have sold out so far, but Korean press and netizens are taking it as a reflection of how popular and viral “Gangnam Style” has gone over the past week.\nNetizens commented, “’Gangnam Style’ is daebak,” “I need to order that shirt now,” and “I wonder who designed that.”\nWith over 300 employees, the Geremany-based “Spreadshirt” is one of the fastest growing and largest online t-shirt retailers. It is expected to reach $100 million in sales this year.\nYou can order your own “Keep Calm and Gangnam Style” shirt here!',
    'What is the design on the new product inspired by PSY’s “Gangnam Style” sold on the German online store "Spreadshirt"?',
    'Why is Talbots Inc. closing its Fashion Valley store?',
]
embeddings = model.encode(sentences)
print(embeddings.shape)
# [3, 768]

# Get the similarity scores for the embeddings
similarities = model.similarity(embeddings, embeddings)
print(similarities.shape)
# [3, 3]
```
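Because the model's final Normalize module L2-normalizes every embedding, cosine similarity reduces to a dot product, which makes ranking documents against a query cheap. A toy semantic-search sketch (the 2-d vectors below are stand-ins for real 768-d `model.encode` outputs):

```python
import numpy as np

# Stand-ins for model.encode(documents); the real model returns
# L2-normalized 768-dimensional vectors.
doc_embeddings = np.array([
    [1.0, 0.0],
    [0.0, 1.0],
    [0.6, 0.8],
])
query_embedding = np.array([0.8, 0.6])  # stand-in for model.encode(query)

# Cosine similarity via dot product (all vectors are unit-length)
scores = doc_embeddings @ query_embedding
ranking = np.argsort(-scores)  # document indices, best match first
```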
Training dataset columns: `positive` and `anchor` (token-count details were not recoverable):

| | positive | anchor |
|---|---|---|
| type | string | string |

Samples:

| positive | anchor |
|---|---|
| Caption: Tasmanian berry grower Nic Hansen showing Macau chef Antimo Merone around his property as part of export engagement activities. | What is the Berry Export Summary 2028 and what is its purpose? |
| RWSN Collaborations | What are some of the benefits reported from having access to Self-supply water sources? |
| All Android applications categories | What are the unique features of the Coolands for Twitter app? |
Loss: `MultipleNegativesRankingLoss` with these parameters:

```json
{
    "scale": 20.0,
    "similarity_fct": "cos_sim"
}
```
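`MultipleNegativesRankingLoss` treats each (anchor, positive) pair's in-batch companions as negatives: it scales the anchor-positive cosine-similarity matrix by `scale` and applies cross-entropy with the diagonal as the target. A standalone numpy sketch of that computation (illustrative only, not the library's torch implementation):

```python
import numpy as np

def multiple_negatives_ranking_loss(anchors, positives, scale=20.0):
    # Normalize rows so the dot product equals cosine similarity (cos_sim)
    a = anchors / np.linalg.norm(anchors, axis=1, keepdims=True)
    p = positives / np.linalg.norm(positives, axis=1, keepdims=True)
    scores = scale * (a @ p.T)  # (batch, batch): anchor i vs positive j

    # Cross-entropy with labels on the diagonal: anchor i should rank its
    # own positive i above every other positive in the batch.
    shifted = scores - scores.max(axis=1, keepdims=True)  # numerical stability
    log_probs = shifted - np.log(np.exp(shifted).sum(axis=1, keepdims=True))
    return float(-np.mean(np.diag(log_probs)))
```

With perfectly matched pairs the loss approaches zero; mismatched pairs drive it up, which is what pushes anchors toward their positives during fine-tuning.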
Evaluation dataset columns: `positive` and `anchor` (token-count details were not recoverable):

| | positive | anchor |
|---|---|---|
| type | string | string |

Samples:

| positive | anchor |
|---|---|
| Perhaps Not such a Good Idea | What is the author's personal view on DaveScot's blog persona? |
| Age reduction Academic atmosphere Beef tendon bottom Straight buckle low-heel cowhide Lefu shoes Mary Jane shoes Spring and summer Women's shoes 0.73 | What type of shoes are mentioned as being suitable for both men and women? |
| I just started a new blog on my ultralight gear. My gear list in all it's glory is located on: each item of gear, I'm writing an in-depth review for the item and how we have used it. Would love to get feedback and the site and our gear and/or comments from people on how we can fine tune.Currently my wifes pack is 7.5 lbs base weight, and mine is 10.5 lbs.Thanks!-Brett | What are the base weights of the blogger's and his wife's packs? |
Loss: `MultipleNegativesRankingLoss` with these parameters:

```json
{
    "scale": 20.0,
    "similarity_fct": "cos_sim"
}
```
Non-default hyperparameters:

- `eval_strategy`: steps
- `per_device_train_batch_size`: 10
- `per_device_eval_batch_size`: 10
- `num_train_epochs`: 1
- `warmup_ratio`: 0.1
- `fp16`: True
- `batch_sampler`: no_duplicates

All hyperparameters:

- `overwrite_output_dir`: False
- `do_predict`: False
- `eval_strategy`: steps
- `prediction_loss_only`: True
- `per_device_train_batch_size`: 10
- `per_device_eval_batch_size`: 10
- `per_gpu_train_batch_size`: None
- `per_gpu_eval_batch_size`: None
- `gradient_accumulation_steps`: 1
- `eval_accumulation_steps`: None
- `learning_rate`: 5e-05
- `weight_decay`: 0.0
- `adam_beta1`: 0.9
- `adam_beta2`: 0.999
- `adam_epsilon`: 1e-08
- `max_grad_norm`: 1.0
- `num_train_epochs`: 1
- `max_steps`: -1
- `lr_scheduler_type`: linear
- `lr_scheduler_kwargs`: {}
- `warmup_ratio`: 0.1
- `warmup_steps`: 0
- `log_level`: passive
- `log_level_replica`: warning
- `log_on_each_node`: True
- `logging_nan_inf_filter`: True
- `save_safetensors`: True
- `save_on_each_node`: False
- `save_only_model`: False
- `restore_callback_states_from_checkpoint`: False
- `no_cuda`: False
- `use_cpu`: False
- `use_mps_device`: False
- `seed`: 42
- `data_seed`: None
- `jit_mode_eval`: False
- `use_ipex`: False
- `bf16`: False
- `fp16`: True
- `fp16_opt_level`: O1
- `half_precision_backend`: auto
- `bf16_full_eval`: False
- `fp16_full_eval`: False
- `tf32`: None
- `local_rank`: 0
- `ddp_backend`: None
- `tpu_num_cores`: None
- `tpu_metrics_debug`: False
- `debug`: []
- `dataloader_drop_last`: False
- `dataloader_num_workers`: 0
- `dataloader_prefetch_factor`: None
- `past_index`: -1
- `disable_tqdm`: False
- `remove_unused_columns`: True
- `label_names`: None
- `load_best_model_at_end`: False
- `ignore_data_skip`: False
- `fsdp`: []
- `fsdp_min_num_params`: 0
- `fsdp_config`: {'min_num_params': 0, 'xla': False, 'xla_fsdp_v2': False, 'xla_fsdp_grad_ckpt': False}
- `fsdp_transformer_layer_cls_to_wrap`: None
- `accelerator_config`: {'split_batches': False, 'dispatch_batches': None, 'even_batches': True, 'use_seedable_sampler': True, 'non_blocking': False, 'gradient_accumulation_kwargs': None}
- `deepspeed`: None
- `label_smoothing_factor`: 0.0
- `optim`: adamw_torch
- `optim_args`: None
- `adafactor`: False
- `group_by_length`: False
- `length_column_name`: length
- `ddp_find_unused_parameters`: None
- `ddp_bucket_cap_mb`: None
- `ddp_broadcast_buffers`: False
- `dataloader_pin_memory`: True
- `dataloader_persistent_workers`: False
- `skip_memory_metrics`: True
- `use_legacy_prediction_loop`: False
- `push_to_hub`: False
- `resume_from_checkpoint`: None
- `hub_model_id`: None
- `hub_strategy`: every_save
- `hub_private_repo`: False
- `hub_always_push`: False
- `gradient_checkpointing`: False
- `gradient_checkpointing_kwargs`: None
- `include_inputs_for_metrics`: False
- `eval_do_concat_batches`: True
- `fp16_backend`: auto
- `push_to_hub_model_id`: None
- `push_to_hub_organization`: None
- `mp_parameters`:
- `auto_find_batch_size`: False
- `full_determinism`: False
- `torchdynamo`: None
- `ray_scope`: last
- `ddp_timeout`: 1800
- `torch_compile`: False
- `torch_compile_backend`: None
- `torch_compile_mode`: None
- `dispatch_batches`: None
- `split_batches`: None
- `include_tokens_per_second`: False
- `include_num_input_tokens_seen`: False
- `neftune_noise_alpha`: None
- `optim_target_modules`: None
- `batch_eval_metrics`: False
- `eval_on_start`: False
- `batch_sampler`: no_duplicates
- `multi_dataset_batch_sampler`: proportional

Training logs:

| Epoch | Step | Training Loss | Validation Loss |
|---|---|---|---|
| 0.0104 | 10 | 0.1231 | 0.0729 |
| 0.0208 | 20 | 0.0943 | 0.0501 |
| 0.0312 | 30 | 0.0432 | 0.0337 |
| 0.0417 | 40 | 0.1307 | 0.0247 |
| 0.0521 | 50 | 0.0191 | - |
| 0.1042 | 100 | 0.0558 | 0.0188 |
| 0.1562 | 150 | 0.0354 | - |
| 0.2083 | 200 | 0.0623 | 0.0178 |
| 0.2604 | 250 | 0.0692 | - |
| 0.3125 | 300 | 0.0428 | 0.0193 |
| 0.3646 | 350 | 0.0507 | - |
| 0.4167 | 400 | 0.0521 | 0.0250 |
| 0.4688 | 450 | 0.0352 | - |
| 0.5208 | 500 | 0.0285 | 0.0179 |
| 0.5729 | 550 | 0.0428 | - |
| 0.625 | 600 | 0.0315 | 0.0183 |
| 0.6771 | 650 | 0.0363 | - |
| 0.7292 | 700 | 0.0362 | 0.0167 |
| 0.7812 | 750 | 0.0288 | - |
| 0.8333 | 800 | 0.0211 | 0.0128 |
| 0.8854 | 850 | 0.0498 | - |
| 0.9375 | 900 | 0.0316 | 0.0138 |
| 0.9896 | 950 | 0.0336 | - |
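Under the sentence-transformers v3 trainer API (an assumption; the card does not include the training script), the non-default hyperparameters listed above would correspond roughly to this configuration:

```python
from sentence_transformers import SentenceTransformerTrainingArguments
from sentence_transformers.training_args import BatchSamplers

# Hypothetical reconstruction of the non-default settings above;
# output_dir is a placeholder, not taken from the card.
args = SentenceTransformerTrainingArguments(
    output_dir="bge-base-en-v1.5-ft_ragds",
    eval_strategy="steps",
    per_device_train_batch_size=10,
    per_device_eval_batch_size=10,
    num_train_epochs=1,
    warmup_ratio=0.1,
    fp16=True,
    batch_sampler=BatchSamplers.NO_DUPLICATES,  # avoid duplicate texts per batch
)
```

`no_duplicates` matters with `MultipleNegativesRankingLoss`: repeated texts in a batch would otherwise act as false negatives.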
```bibtex
@inproceedings{reimers-2019-sentence-bert,
    title = "Sentence-BERT: Sentence Embeddings using Siamese BERT-Networks",
    author = "Reimers, Nils and Gurevych, Iryna",
    booktitle = "Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing",
    month = "11",
    year = "2019",
    publisher = "Association for Computational Linguistics",
    url = "https://arxiv.org/abs/1908.10084",
}
```

```bibtex
@misc{henderson2017efficient,
    title = {Efficient Natural Language Response Suggestion for Smart Reply},
    author = {Matthew Henderson and Rami Al-Rfou and Brian Strope and Yun-hsuan Sung and Laszlo Lukacs and Ruiqi Guo and Sanjiv Kumar and Balint Miklos and Ray Kurzweil},
    year = {2017},
    eprint = {1705.00652},
    archivePrefix = {arXiv},
    primaryClass = {cs.CL},
}
```
Base model: BAAI/bge-base-en-v1.5