Sentence-BERT: Sentence Embeddings using Siamese BERT-Networks
Paper: arXiv:1908.10084 (2019)
This is a sentence-transformers model finetuned from sentence-transformers/LaBSE. It maps sentences & paragraphs to a 768-dimensional dense vector space and can be used for semantic textual similarity, semantic search, paraphrase mining, text classification, clustering, and more.
Full model architecture:

```
SentenceTransformer(
  (0): Transformer({'max_seq_length': 256, 'do_lower_case': False, 'architecture': 'BertModel'})
  (1): Pooling({'word_embedding_dimension': 768, 'pooling_mode_cls_token': True, 'pooling_mode_mean_tokens': False, 'pooling_mode_max_tokens': False, 'pooling_mode_mean_sqrt_len_tokens': False, 'pooling_mode_weightedmean_tokens': False, 'pooling_mode_lasttoken': False, 'include_prompt': True})
  (2): Dense({'in_features': 768, 'out_features': 768, 'bias': True, 'activation_function': 'torch.nn.modules.activation.Tanh'})
  (3): Normalize()
)
```
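The listing above shows the embedding head: CLS-token pooling, a 768-to-768 Dense layer with Tanh activation, then L2 normalization. A minimal NumPy sketch of what that head computes, using random stand-in tensors (the weights and token embeddings here are hypothetical, not the model's real parameters):

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical stand-ins for the real learned tensors:
token_embeddings = rng.normal(size=(7, 768))   # one sequence: 7 tokens x 768 dims
W = rng.normal(size=(768, 768)) * 0.02          # Dense layer weight (toy values)
b = np.zeros(768)                               # Dense layer bias (toy values)

# (0)-(1) Pooling with pooling_mode_cls_token=True: keep only the [CLS] token (index 0)
cls = token_embeddings[0]

# (2) Dense head: 768 -> 768 with Tanh activation
dense_out = np.tanh(cls @ W + b)

# (3) Normalize: scale to unit L2 norm, so cosine similarity becomes a dot product
sentence_embedding = dense_out / np.linalg.norm(dense_out)

print(sentence_embedding.shape)  # (768,)
```

Because of the final Normalize() module, every sentence embedding has unit length, which is why downstream similarity can be computed as a plain matrix product.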
First install the Sentence Transformers library:

```bash
pip install -U sentence-transformers
```
Then you can load this model and run inference.
```python
from sentence_transformers import SentenceTransformer

# Download from the 🤗 Hub
model = SentenceTransformer("sentence_transformers_model_id")

# Run inference
sentences = [
    "Word: a l o o h l| Context: I i ' n i i y a t s h l p i p e - - n i i g y a ' a w i l s g i h l p i p e a l o o h l h a ' n i i y o ' o x s x w .| Translation: And I hit the pipe-- I saw there was a pipe on the sink.",
    'Morpheme: h l | Gloss: CN',
    'Morpheme: i i | Gloss: CCNJ',
]
embeddings = model.encode(sentences)
print(embeddings.shape)
# [3, 768]

# Get the similarity scores for the embeddings
similarities = model.similarity(embeddings, embeddings)
print(similarities)
# tensor([[1.0000, 0.6834, 0.3528],
#         [0.6834, 1.0000, 0.4257],
#         [0.3528, 0.4257, 1.0000]])
```
Information Retrieval metrics on the `validation` dataset (evaluator: `main.IREvaluatorWithLogging`):

| Metric | Value |
|---|---|
| cosine_accuracy@1 | 0.7584 |
| cosine_accuracy@3 | 0.8959 |
| cosine_accuracy@5 | 0.9331 |
| cosine_accuracy@10 | 0.9591 |
| cosine_precision@1 | 0.7584 |
| cosine_precision@3 | 0.3656 |
| cosine_precision@5 | 0.2335 |
| cosine_precision@10 | 0.1223 |
| cosine_recall@1 | 0.6247 |
| cosine_recall@3 | 0.8255 |
| cosine_recall@5 | 0.8721 |
| cosine_recall@10 | 0.9062 |
| cosine_ndcg@10 | 0.8269 |
| cosine_mrr@10 | 0.8325 |
| cosine_map@100 | 0.7885 |
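For reference, the @k metrics in the table are defined per query over a ranked candidate list. A minimal sketch with toy data (not the real evaluation set), assuming each query has a ranked list of document ids and a set of relevant ids:

```python
def metrics_at_k(ranked, relevant, k):
    """Standard IR cutoff metrics for one query."""
    top_k = ranked[:k]
    hits = [d for d in top_k if d in relevant]
    return {
        "accuracy": 1.0 if hits else 0.0,      # any relevant doc in the top k?
        "precision": len(hits) / k,            # share of top k that is relevant
        "recall": len(hits) / len(relevant),   # share of relevant docs retrieved
    }

# Two hypothetical queries
queries = [
    (["d3", "d1", "d9"], {"d1", "d2"}),   # one relevant doc found, at rank 2
    (["d7", "d8", "d2"], {"d7"}),         # the only relevant doc is at rank 1
]

for ranked, relevant in queries:
    print(metrics_at_k(ranked, relevant, k=3))
# query 1: accuracy 1.0, precision 1/3, recall 0.5
# query 2: accuracy 1.0, precision 1/3, recall 1.0
```

The reported figures are these per-query values averaged over all queries, which is why accuracy@1 and precision@1 coincide (at k=1 both reduce to "was the top hit relevant").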
Training dataset columns: `sentence_0`, `sentence_1`, and `label`.

| | sentence_0 | sentence_1 | label |
|---|---|---|---|
| type | string | string | int |

Examples:

| sentence_0 | sentence_1 | label |
|---|---|---|
| Word: h l a g ̲ o o k ̲\| Context: I i h l a g ̲ o o k ̲ d i m h a ' w i ' y i i k y ' a a i s x w i ' y g ̲ o o h l w i l p x s e e k ̲ .\| Translation: And before I went home I had a short pee in the bathroom. | Morpheme: g ̲ o o k ̲ \| Gloss: first | 90 |
| Word: x s a ' a k ̲ x w i ' y\| Context: H l a a x s a ' a k ̲ x w i ' y ' n i i g ̲ a y o o t s ' i m i l t ' a a h l i h l j a b i ' y g ̲ o o h l t s ' i m w i l p x s e e k ̲ .\| Translation: When I made it out, then I put what I had done (the rubble) back in the bathroom. | Morpheme: x s i \| Gloss: out | 228 |
| Word: n e e d i i\| Context: I i ' n a k w h l ' w i h l w i l i ' m , g w i l a ' l h l g ̲ a n u u t x w , g ̲ a n w i h l n e e d i i l a x ̲ ' n i s x w i ' y g ̲ o o h l G i g e e n i x .\| Translation: And we were away a long time, three weeks, and that's why I didn't hear from Gigeenix. | Morpheme: n e e \| Gloss: NEG | 67 |
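The card only exposes the `main.LossLogger` wrapper, not the underlying loss. As an illustration only (this is an assumption, not the documented training objective), a common objective for such (anchor, candidate) pairs is an in-batch contrastive loss: softmax cross-entropy over the pairwise similarity matrix, where row i's positive is column i and every other column serves as a negative. A NumPy sketch:

```python
import numpy as np

def in_batch_contrastive_loss(anchors, positives, scale=20.0):
    """Cross-entropy over the batch similarity matrix: row i's positive is
    column i; all other columns act as in-batch negatives."""
    a = anchors / np.linalg.norm(anchors, axis=1, keepdims=True)
    p = positives / np.linalg.norm(positives, axis=1, keepdims=True)
    logits = scale * (a @ p.T)                       # (batch, batch) cosine * scale
    logits -= logits.max(axis=1, keepdims=True)      # numerical stability
    log_probs = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))
    return -np.mean(np.diag(log_probs))              # NLL of the matched pairs

rng = np.random.default_rng(0)
anchors = rng.normal(size=(4, 8))                    # toy 8-d "embeddings"
positives = anchors + 0.05 * rng.normal(size=(4, 8)) # near-duplicates -> low loss

loss = in_batch_contrastive_loss(anchors, positives)
print(loss)
```

When each positive nearly matches its anchor, as here, the loss is close to zero; mismatched pairs drive it up.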
Loss: `main.LossLogger`

Non-default hyperparameters:

- eval_strategy: steps
- per_device_train_batch_size: 128
- per_device_eval_batch_size: 128
- num_train_epochs: 1000
- fp16: True
- multi_dataset_batch_sampler: round_robin

All hyperparameters:

- overwrite_output_dir: False
- do_predict: False
- eval_strategy: steps
- prediction_loss_only: True
- per_device_train_batch_size: 128
- per_device_eval_batch_size: 128
- per_gpu_train_batch_size: None
- per_gpu_eval_batch_size: None
- gradient_accumulation_steps: 1
- eval_accumulation_steps: None
- torch_empty_cache_steps: None
- learning_rate: 5e-05
- weight_decay: 0.0
- adam_beta1: 0.9
- adam_beta2: 0.999
- adam_epsilon: 1e-08
- max_grad_norm: 1
- num_train_epochs: 1000
- max_steps: -1
- lr_scheduler_type: linear
- lr_scheduler_kwargs: {}
- warmup_ratio: 0.0
- warmup_steps: 0
- log_level: passive
- log_level_replica: warning
- log_on_each_node: True
- logging_nan_inf_filter: True
- save_safetensors: True
- save_on_each_node: False
- save_only_model: False
- restore_callback_states_from_checkpoint: False
- no_cuda: False
- use_cpu: False
- use_mps_device: False
- seed: 42
- data_seed: None
- jit_mode_eval: False
- use_ipex: False
- bf16: False
- fp16: True
- fp16_opt_level: O1
- half_precision_backend: auto
- bf16_full_eval: False
- fp16_full_eval: False
- tf32: None
- local_rank: 0
- ddp_backend: None
- tpu_num_cores: None
- tpu_metrics_debug: False
- debug: []
- dataloader_drop_last: False
- dataloader_num_workers: 0
- dataloader_prefetch_factor: None
- past_index: -1
- disable_tqdm: False
- remove_unused_columns: True
- label_names: None
- load_best_model_at_end: False
- ignore_data_skip: False
- fsdp: []
- fsdp_min_num_params: 0
- fsdp_config: {'min_num_params': 0, 'xla': False, 'xla_fsdp_v2': False, 'xla_fsdp_grad_ckpt': False}
- fsdp_transformer_layer_cls_to_wrap: None
- accelerator_config: {'split_batches': False, 'dispatch_batches': None, 'even_batches': True, 'use_seedable_sampler': True, 'non_blocking': False, 'gradient_accumulation_kwargs': None}
- parallelism_config: None
- deepspeed: None
- label_smoothing_factor: 0.0
- optim: adamw_torch_fused
- optim_args: None
- adafactor: False
- group_by_length: False
- length_column_name: length
- ddp_find_unused_parameters: None
- ddp_bucket_cap_mb: None
- ddp_broadcast_buffers: False
- dataloader_pin_memory: True
- dataloader_persistent_workers: False
- skip_memory_metrics: True
- use_legacy_prediction_loop: False
- push_to_hub: False
- resume_from_checkpoint: None
- hub_model_id: None
- hub_strategy: every_save
- hub_private_repo: None
- hub_always_push: False
- hub_revision: None
- gradient_checkpointing: False
- gradient_checkpointing_kwargs: None
- include_inputs_for_metrics: False
- include_for_metrics: []
- eval_do_concat_batches: True
- fp16_backend: auto
- push_to_hub_model_id: None
- push_to_hub_organization: None
- mp_parameters:
- auto_find_batch_size: False
- full_determinism: False
- torchdynamo: None
- ray_scope: last
- ddp_timeout: 1800
- torch_compile: False
- torch_compile_backend: None
- torch_compile_mode: None
- include_tokens_per_second: False
- include_num_input_tokens_seen: False
- neftune_noise_alpha: None
- optim_target_modules: None
- batch_eval_metrics: False
- eval_on_start: False
- use_liger_kernel: False
- liger_kernel_config: None
- eval_use_gather_object: False
- average_tokens_across_devices: False
- prompts: None
- batch_sampler: batch_sampler
- multi_dataset_batch_sampler: round_robin
- router_mapping: {}
- learning_rate_mapping: {}

Training logs:

| Epoch | Step | Training Loss | validation_cosine_ndcg@10 |
|---|---|---|---|
| 1.0 | 4 | - | 0.0647 |
| 2.0 | 8 | - | 0.0761 |
| 3.0 | 12 | - | 0.1236 |
| 4.0 | 16 | - | 0.2528 |
| 5.0 | 20 | - | 0.3902 |
| 6.0 | 24 | - | 0.4764 |
| 7.0 | 28 | - | 0.5325 |
| 8.0 | 32 | - | 0.6067 |
| 9.0 | 36 | - | 0.6709 |
| 10.0 | 40 | - | 0.7043 |
| 11.0 | 44 | - | 0.7018 |
| 12.0 | 48 | - | 0.6915 |
| 13.0 | 52 | - | 0.7073 |
| 14.0 | 56 | - | 0.7310 |
| 15.0 | 60 | - | 0.7335 |
| 16.0 | 64 | - | 0.7389 |
| 17.0 | 68 | - | 0.7586 |
| 18.0 | 72 | - | 0.7615 |
| 19.0 | 76 | - | 0.7586 |
| 20.0 | 80 | - | 0.7472 |
| 21.0 | 84 | - | 0.7588 |
| 22.0 | 88 | - | 0.7641 |
| 23.0 | 92 | - | 0.7740 |
| 24.0 | 96 | - | 0.7633 |
| 25.0 | 100 | - | 0.7721 |
| 26.0 | 104 | - | 0.7669 |
| 27.0 | 108 | - | 0.7728 |
| 28.0 | 112 | - | 0.7868 |
| 29.0 | 116 | - | 0.7735 |
| 30.0 | 120 | - | 0.7829 |
| 31.0 | 124 | - | 0.7937 |
| 32.0 | 128 | - | 0.7902 |
| 33.0 | 132 | - | 0.7656 |
| 34.0 | 136 | - | 0.7838 |
| 35.0 | 140 | - | 0.7821 |
| 36.0 | 144 | - | 0.7871 |
| 37.0 | 148 | - | 0.7869 |
| 38.0 | 152 | - | 0.7920 |
| 39.0 | 156 | - | 0.7905 |
| 40.0 | 160 | - | 0.7954 |
| 41.0 | 164 | - | 0.7966 |
| 42.0 | 168 | - | 0.7835 |
| 43.0 | 172 | - | 0.7800 |
| 44.0 | 176 | - | 0.8047 |
| 45.0 | 180 | - | 0.7990 |
| 46.0 | 184 | - | 0.7860 |
| 47.0 | 188 | - | 0.7891 |
| 48.0 | 192 | - | 0.7958 |
| 49.0 | 196 | - | 0.7813 |
| 50.0 | 200 | - | 0.7778 |
| 51.0 | 204 | - | 0.8001 |
| 52.0 | 208 | - | 0.7870 |
| 53.0 | 212 | - | 0.8027 |
| 54.0 | 216 | - | 0.7905 |
| 55.0 | 220 | - | 0.7827 |
| 56.0 | 224 | - | 0.8020 |
| 57.0 | 228 | - | 0.7919 |
| 58.0 | 232 | - | 0.7817 |
| 59.0 | 236 | - | 0.7994 |
| 60.0 | 240 | - | 0.8164 |
| 61.0 | 244 | - | 0.7788 |
| 62.0 | 248 | - | 0.7900 |
| 63.0 | 252 | - | 0.8173 |
| 64.0 | 256 | - | 0.7976 |
| 65.0 | 260 | - | 0.7905 |
| 66.0 | 264 | - | 0.7923 |
| 67.0 | 268 | - | 0.8071 |
| 68.0 | 272 | - | 0.7958 |
| 69.0 | 276 | - | 0.7871 |
| 70.0 | 280 | - | 0.8020 |
| 71.0 | 284 | - | 0.8103 |
| 72.0 | 288 | - | 0.8123 |
| 73.0 | 292 | - | 0.8118 |
| 74.0 | 296 | - | 0.7934 |
| 75.0 | 300 | - | 0.7882 |
| 76.0 | 304 | - | 0.8015 |
| 77.0 | 308 | - | 0.8201 |
| 78.0 | 312 | - | 0.8240 |
| 79.0 | 316 | - | 0.7994 |
| 80.0 | 320 | - | 0.8042 |
| 81.0 | 324 | - | 0.8114 |
| 82.0 | 328 | - | 0.8100 |
| 83.0 | 332 | - | 0.8041 |
| 84.0 | 336 | - | 0.8179 |
| 85.0 | 340 | - | 0.8197 |
| 86.0 | 344 | - | 0.7973 |
| 87.0 | 348 | - | 0.7985 |
| 88.0 | 352 | - | 0.8123 |
| 89.0 | 356 | - | 0.7997 |
| 90.0 | 360 | - | 0.8043 |
| 91.0 | 364 | - | 0.8057 |
| 92.0 | 368 | - | 0.7991 |
| 93.0 | 372 | - | 0.7983 |
| 94.0 | 376 | - | 0.8052 |
| 95.0 | 380 | - | 0.8026 |
| 96.0 | 384 | - | 0.8109 |
| 97.0 | 388 | - | 0.7929 |
| 98.0 | 392 | - | 0.8025 |
| 99.0 | 396 | - | 0.8218 |
| 100.0 | 400 | - | 0.8194 |
| 101.0 | 404 | - | 0.8023 |
| 102.0 | 408 | - | 0.8099 |
| 103.0 | 412 | - | 0.8110 |
| 104.0 | 416 | - | 0.8118 |
| 105.0 | 420 | - | 0.8004 |
| 106.0 | 424 | - | 0.8012 |
| 107.0 | 428 | - | 0.8070 |
| 108.0 | 432 | - | 0.8088 |
| 109.0 | 436 | - | 0.8073 |
| 110.0 | 440 | - | 0.8084 |
| 111.0 | 444 | - | 0.8038 |
| 112.0 | 448 | - | 0.8115 |
| 113.0 | 452 | - | 0.8169 |
| 114.0 | 456 | - | 0.8145 |
| 115.0 | 460 | - | 0.8020 |
| 116.0 | 464 | - | 0.7984 |
| 117.0 | 468 | - | 0.8077 |
| 118.0 | 472 | - | 0.8174 |
| 119.0 | 476 | - | 0.8200 |
| 120.0 | 480 | - | 0.8080 |
| 121.0 | 484 | - | 0.8093 |
| 122.0 | 488 | - | 0.8216 |
| 123.0 | 492 | - | 0.8240 |
| 124.0 | 496 | - | 0.8097 |
| 125.0 | 500 | 1.3195 | 0.8115 |
| 126.0 | 504 | - | 0.8176 |
| 127.0 | 508 | - | 0.8099 |
| 128.0 | 512 | - | 0.7977 |
| 129.0 | 516 | - | 0.7985 |
| 130.0 | 520 | - | 0.8015 |
| 131.0 | 524 | - | 0.8078 |
| 132.0 | 528 | - | 0.7985 |
| 133.0 | 532 | - | 0.8029 |
| 134.0 | 536 | - | 0.8087 |
| 135.0 | 540 | - | 0.8031 |
| 136.0 | 544 | - | 0.7999 |
| 137.0 | 548 | - | 0.8107 |
| 138.0 | 552 | - | 0.8110 |
| 139.0 | 556 | - | 0.7980 |
| 140.0 | 560 | - | 0.7977 |
| 141.0 | 564 | - | 0.8034 |
| 142.0 | 568 | - | 0.8053 |
| 143.0 | 572 | - | 0.7996 |
| 144.0 | 576 | - | 0.8014 |
| 145.0 | 580 | - | 0.8137 |
| 146.0 | 584 | - | 0.8221 |
| 147.0 | 588 | - | 0.8144 |
| 148.0 | 592 | - | 0.8020 |
| 149.0 | 596 | - | 0.7987 |
| 150.0 | 600 | - | 0.8029 |
| 151.0 | 604 | - | 0.8024 |
| 152.0 | 608 | - | 0.8045 |
| 153.0 | 612 | - | 0.8061 |
| 154.0 | 616 | - | 0.8035 |
| 155.0 | 620 | - | 0.8018 |
| 156.0 | 624 | - | 0.7992 |
| 157.0 | 628 | - | 0.8053 |
| 158.0 | 632 | - | 0.8134 |
| 159.0 | 636 | - | 0.8173 |
| 160.0 | 640 | - | 0.8118 |
| 161.0 | 644 | - | 0.8144 |
| 162.0 | 648 | - | 0.8145 |
| 163.0 | 652 | - | 0.8105 |
| 164.0 | 656 | - | 0.8011 |
| 165.0 | 660 | - | 0.8073 |
| 166.0 | 664 | - | 0.8111 |
| 167.0 | 668 | - | 0.8139 |
| 168.0 | 672 | - | 0.8030 |
| 169.0 | 676 | - | 0.8035 |
| 170.0 | 680 | - | 0.7993 |
| 171.0 | 684 | - | 0.8023 |
| 172.0 | 688 | - | 0.8081 |
| 173.0 | 692 | - | 0.8097 |
| 174.0 | 696 | - | 0.8060 |
| 175.0 | 700 | - | 0.8063 |
| 176.0 | 704 | - | 0.8114 |
| 177.0 | 708 | - | 0.8087 |
| 178.0 | 712 | - | 0.8090 |
| 179.0 | 716 | - | 0.8094 |
| 180.0 | 720 | - | 0.8071 |
| 181.0 | 724 | - | 0.8077 |
| 182.0 | 728 | - | 0.8108 |
| 183.0 | 732 | - | 0.8160 |
| 184.0 | 736 | - | 0.8099 |
| 185.0 | 740 | - | 0.8052 |
| 186.0 | 744 | - | 0.8103 |
| 187.0 | 748 | - | 0.8115 |
| 188.0 | 752 | - | 0.8105 |
| 189.0 | 756 | - | 0.8057 |
| 190.0 | 760 | - | 0.8157 |
| 191.0 | 764 | - | 0.8096 |
| 192.0 | 768 | - | 0.7998 |
| 193.0 | 772 | - | 0.8080 |
| 194.0 | 776 | - | 0.8207 |
| 195.0 | 780 | - | 0.8136 |
| 196.0 | 784 | - | 0.8029 |
| 197.0 | 788 | - | 0.8009 |
| 198.0 | 792 | - | 0.8150 |
| 199.0 | 796 | - | 0.8173 |
| 200.0 | 800 | - | 0.8070 |
| 201.0 | 804 | - | 0.8075 |
| 202.0 | 808 | - | 0.8164 |
| 203.0 | 812 | - | 0.8148 |
| 204.0 | 816 | - | 0.8077 |
| 205.0 | 820 | - | 0.8116 |
| 206.0 | 824 | - | 0.8148 |
| 207.0 | 828 | - | 0.8141 |
| 208.0 | 832 | - | 0.8085 |
| 209.0 | 836 | - | 0.8066 |
| 210.0 | 840 | - | 0.8154 |
| 211.0 | 844 | - | 0.8168 |
| 212.0 | 848 | - | 0.8132 |
| 213.0 | 852 | - | 0.8139 |
| 214.0 | 856 | - | 0.8200 |
| 215.0 | 860 | - | 0.8203 |
| 216.0 | 864 | - | 0.8100 |
| 217.0 | 868 | - | 0.8084 |
| 218.0 | 872 | - | 0.8115 |
| 219.0 | 876 | - | 0.8126 |
| 220.0 | 880 | - | 0.8126 |
| 221.0 | 884 | - | 0.8079 |
| 222.0 | 888 | - | 0.8101 |
| 223.0 | 892 | - | 0.8136 |
| 224.0 | 896 | - | 0.8124 |
| 225.0 | 900 | - | 0.8180 |
| 226.0 | 904 | - | 0.8173 |
| 227.0 | 908 | - | 0.8110 |
| 228.0 | 912 | - | 0.7991 |
| 229.0 | 916 | - | 0.8009 |
| 230.0 | 920 | - | 0.8096 |
| 231.0 | 924 | - | 0.8153 |
| 232.0 | 928 | - | 0.8177 |
| 233.0 | 932 | - | 0.8107 |
| 234.0 | 936 | - | 0.8066 |
| 235.0 | 940 | - | 0.8067 |
| 236.0 | 944 | - | 0.8198 |
| 237.0 | 948 | - | 0.8175 |
| 238.0 | 952 | - | 0.8077 |
| 239.0 | 956 | - | 0.8099 |
| 240.0 | 960 | - | 0.8073 |
| 241.0 | 964 | - | 0.8117 |
| 242.0 | 968 | - | 0.8148 |
| 243.0 | 972 | - | 0.8144 |
| 244.0 | 976 | - | 0.8050 |
| 245.0 | 980 | - | 0.8046 |
| 246.0 | 984 | - | 0.8107 |
| 247.0 | 988 | - | 0.8114 |
| 248.0 | 992 | - | 0.8065 |
| 249.0 | 996 | - | 0.8071 |
| 250.0 | 1000 | 1.081 | 0.8105 |
| 251.0 | 1004 | - | 0.8142 |
| 252.0 | 1008 | - | 0.8123 |
| 253.0 | 1012 | - | 0.8123 |
| 254.0 | 1016 | - | 0.8104 |
| 255.0 | 1020 | - | 0.8168 |
| 256.0 | 1024 | - | 0.8171 |
| 257.0 | 1028 | - | 0.8188 |
| 258.0 | 1032 | - | 0.8210 |
| 259.0 | 1036 | - | 0.8221 |
| 260.0 | 1040 | - | 0.8156 |
| 261.0 | 1044 | - | 0.8118 |
| 262.0 | 1048 | - | 0.8078 |
| 263.0 | 1052 | - | 0.8108 |
| 264.0 | 1056 | - | 0.8121 |
| 265.0 | 1060 | - | 0.8146 |
| 266.0 | 1064 | - | 0.8116 |
| 267.0 | 1068 | - | 0.8149 |
| 268.0 | 1072 | - | 0.8122 |
| 269.0 | 1076 | - | 0.8125 |
| 270.0 | 1080 | - | 0.8114 |
| 271.0 | 1084 | - | 0.8139 |
| 272.0 | 1088 | - | 0.8240 |
| 273.0 | 1092 | - | 0.8240 |
| 274.0 | 1096 | - | 0.8196 |
| 275.0 | 1100 | - | 0.8233 |
| 276.0 | 1104 | - | 0.8228 |
| 277.0 | 1108 | - | 0.8165 |
| 278.0 | 1112 | - | 0.8183 |
| 279.0 | 1116 | - | 0.8217 |
| 280.0 | 1120 | - | 0.8166 |
| 281.0 | 1124 | - | 0.8106 |
| 282.0 | 1128 | - | 0.8117 |
| 283.0 | 1132 | - | 0.8152 |
| 284.0 | 1136 | - | 0.8222 |
| 285.0 | 1140 | - | 0.8230 |
| 286.0 | 1144 | - | 0.8123 |
| 287.0 | 1148 | - | 0.8080 |
| 288.0 | 1152 | - | 0.8125 |
| 289.0 | 1156 | - | 0.8192 |
| 290.0 | 1160 | - | 0.8267 |
| 291.0 | 1164 | - | 0.8232 |
| 292.0 | 1168 | - | 0.8086 |
| 293.0 | 1172 | - | 0.8081 |
| 294.0 | 1176 | - | 0.8215 |
| 295.0 | 1180 | - | 0.8211 |
| 296.0 | 1184 | - | 0.8147 |
| 297.0 | 1188 | - | 0.8107 |
| 298.0 | 1192 | - | 0.8123 |
| 299.0 | 1196 | - | 0.8113 |
| 300.0 | 1200 | - | 0.8161 |
| 301.0 | 1204 | - | 0.8161 |
| 302.0 | 1208 | - | 0.8181 |
| 303.0 | 1212 | - | 0.8167 |
| 304.0 | 1216 | - | 0.8167 |
| 305.0 | 1220 | - | 0.8257 |
| 306.0 | 1224 | - | 0.8269 |
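The log above shows validation NDCG@10 climbing steeply for the first forty or so epochs and then plateauing around 0.80-0.82, peaking at epoch 306. A small sketch of selecting the best checkpoint from such a log, using a toy subset of the values above:

```python
# (epoch, validation_cosine_ndcg@10) pairs taken from the training log
log = [
    (1.0, 0.0647),
    (10.0, 0.7043),
    (100.0, 0.8194),
    (290.0, 0.8267),
    (306.0, 0.8269),
]

# Pick the checkpoint with the highest validation NDCG@10
best_epoch, best_ndcg = max(log, key=lambda row: row[1])
print(best_epoch, best_ndcg)  # 306.0 0.8269
```

Note that training here used `load_best_model_at_end: False`, so the published weights are simply the final state rather than the checkpoint this selection would pick.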
```bibtex
@inproceedings{reimers-2019-sentence-bert,
    title = "Sentence-BERT: Sentence Embeddings using Siamese BERT-Networks",
    author = "Reimers, Nils and Gurevych, Iryna",
    booktitle = "Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing",
    month = "11",
    year = "2019",
    publisher = "Association for Computational Linguistics",
    url = "https://arxiv.org/abs/1908.10084",
}
```
Base model: sentence-transformers/LaBSE