This is a sentence-transformers model fine-tuned from sentence-transformers/all-MiniLM-L6-v2 on the manupande21/msmarco_train_hard_negatives dataset. It maps sentences and paragraphs to a 384-dimensional dense vector space and can be used for semantic textual similarity, semantic search, paraphrase mining, text classification, clustering, and more.
SentenceTransformer(
  (0): Transformer({'max_seq_length': 256, 'do_lower_case': False}) with Transformer model: BertModel
  (1): Pooling({'word_embedding_dimension': 384, 'pooling_mode_cls_token': False, 'pooling_mode_mean_tokens': True, 'pooling_mode_max_tokens': False, 'pooling_mode_mean_sqrt_len_tokens': False, 'pooling_mode_weightedmean_tokens': False, 'pooling_mode_lasttoken': False, 'include_prompt': True})
)
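The Pooling module above has only `pooling_mode_mean_tokens` enabled, so the sentence vector is an average of the token vectors. The following is a minimal pure-Python sketch of that idea, not the library's implementation; the function name `mean_pool` and the toy 2-dimensional inputs are illustrative assumptions (the real model works on 384-dimensional token vectors).

```python
from typing import List


def mean_pool(token_embeddings: List[List[float]], attention_mask: List[int]) -> List[float]:
    """Average token vectors, skipping positions where the mask is 0 (padding)."""
    dim = len(token_embeddings[0])
    total = [0.0] * dim
    count = 0
    for vec, keep in zip(token_embeddings, attention_mask):
        if keep:
            count += 1
            for i, value in enumerate(vec):
                total[i] += value
    if count == 0:
        raise ValueError("attention_mask selects no tokens")
    return [t / count for t in total]


# Toy input (hypothetical values): three token positions, the last one is padding.
tokens = [[1.0, 2.0], [3.0, 4.0], [100.0, 100.0]]
mask = [1, 1, 0]
pooled = mean_pool(tokens, mask)
print(pooled)  # [2.0, 3.0]
```

The padded position does not affect the result because the mask excludes it from both the sum and the count.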
First install the Sentence Transformers library:
pip install -U sentence-transformers
Then you can load this model and run inference.
from sentence_transformers import SentenceTransformer
# Download from the 🤗 Hub
model = SentenceTransformer("manupande21/all-MiniLM-L6-v2-finetuned-triples_hard_negatives")
# Run inference
sentences = [
    'viral meningitis contagious',
    'meningitis is contagious prolonged close contact can spread the bacteria that cause meningitis the bacteria can be spread through kissing coughs and sneezes shared cutlery or sharing items like toothbrushes or cigarettes',
    'Infectious refers to a disease involving a microorganism that can be transmitted from one person to another only by a specific kind of contact; venereal diseases are usually infectious. In nontechnical senses, contagious emphasizes the rapidity with which something spreads: Contagious laughter ran through the hall. Infectious suggests the pleasantly irresistible quality of something: Her infectious good humor made her a popular guest.',
]
embeddings = model.encode(sentences)
print(embeddings.shape)
# [3, 384]
# Get the similarity scores for the embeddings
similarities = model.similarity(embeddings, embeddings)
print(similarities.shape)
# [3, 3]
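For semantic search, the similarity scores are typically used to rank candidate passages against a query. The sketch below shows that ranking step on toy vectors in pure Python. It assumes cosine similarity is the scoring function (which I believe is the default for `model.similarity`, but check the model configuration); `cosine_similarity` and `rank_passages` are hypothetical helper names, not library functions.

```python
import math
from typing import List, Sequence


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine of the angle between two vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def rank_passages(query_vec: Sequence[float], passage_vecs: Sequence[Sequence[float]]) -> List[int]:
    """Return passage indices ordered from most to least similar to the query."""
    scores = [cosine_similarity(query_vec, p) for p in passage_vecs]
    return sorted(range(len(passage_vecs)), key=lambda i: scores[i], reverse=True)


# Toy 3-dimensional vectors standing in for real 384-dimensional embeddings.
query = [1.0, 0.0, 1.0]
passages = [[0.0, 1.0, 0.0], [2.0, 0.0, 2.0], [1.0, 1.0, 0.0]]
order = rank_passages(query, passages)
print(order)  # [1, 2, 0]
```

With the real model, the first entry of `embeddings` from the snippet above would play the role of the query and the remaining entries the passages.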
Dataset: test-eval. Evaluated with TripletEvaluator.

| Metric | Value |
|---|---|
| cosine_accuracy | 0.8426 |
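As a rough illustration of what a cosine accuracy over triplets measures: it is the fraction of (anchor, positive, negative) triplets in which the anchor is more similar to the positive than to the negative. The sketch below computes that on hypothetical toy vectors; it does not reproduce the exact internals of TripletEvaluator (for example, how ties are handled is an assumption here).

```python
import math
from typing import Sequence, Tuple

Vector = Sequence[float]


def cosine_similarity(a: Vector, b: Vector) -> float:
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b)))


def triplet_cosine_accuracy(triplets: Sequence[Tuple[Vector, Vector, Vector]]) -> float:
    """Share of triplets where sim(anchor, positive) > sim(anchor, negative)."""
    correct = sum(
        1
        for anchor, positive, negative in triplets
        if cosine_similarity(anchor, positive) > cosine_similarity(anchor, negative)
    )
    return correct / len(triplets)


# Hypothetical embeddings: the first three triplets are ordered correctly, the last is not.
toy_triplets = [
    ([1.0, 0.0], [1.0, 0.1], [0.0, 1.0]),
    ([0.0, 1.0], [0.2, 1.0], [1.0, 0.0]),
    ([1.0, 1.0], [1.0, 0.9], [-1.0, -1.0]),
    ([1.0, 0.0], [0.0, 1.0], [1.0, 0.2]),
]
accuracy = triplet_cosine_accuracy(toy_triplets)
print(accuracy)  # 0.75
```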
Columns: sentence_0, sentence_1, and sentence_2

| | sentence_0 | sentence_1 | sentence_2 |
|---|---|---|---|
| type | string | string | string |
| details | | | |
| sentence_0 | sentence_1 | sentence_2 |
|---|---|---|
| how much of an ira distribution is taxable | Instead of a distributionof $10,000, let's say the IRA ownertakes out a distribution of $25,000.This pushes the MAGI above$44,000, making $17,000 of SocialSecurity taxable.However, you can control howmuch tax you pay on Social Securityin many instances by controllingyour IRA distribution strategy. | And regardless, whatever earnings you have on your contributions won't be taxed until you withdraw that money many years later. For example, let's say you made $30,000 during the year, and you put $2,000 of it into an IRA. You would pay income tax on only $28,000. |
| who is archer fate ubl | Archer is a Shirou who is from a different universe and who faced different circumstances and got betrayed and jumped to the conclusion that Kiritsugu's ideals were nothing but bs and that world peace can never be achieved because conflict is a part of human nature. | Archer's True Name is Gilgamesh, the great half-god, half-human king born from the union between the King of Uruk, Lugalbanda, and goddess Rimat-Ninsun. He ruled the Sumerian city-state of Uruk, the capital city of ancient Mesopotamia in B.C. era. |
| what is comvault | Commvault software is an enterprise data protection and information management suite built on a scalable, single platform and unifying code base.The product uses a common set of advanced capabilities related to the storage and access of data and are administered through one console application.ommvault software is an enterprise data protection and information management suite built on a scalable, single platform and unifying code base. | Definition of commensurate. 1 1 : equal in measure or extent : coextensive lived a life commensurate with the early years of the republic. 2 2 : corresponding in size, extent, amount, or degree : proportionate was given a job commensurate with her abilities. 3 3 : commensurable 1. |
TripletLoss with these parameters:
{
    "distance_metric": "TripletDistanceMetric.EUCLIDEAN",
    "triplet_margin": 5
}
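With a Euclidean distance metric and a margin of 5, the triplet loss for one (anchor, positive, negative) triplet is max(d(a, p) - d(a, n) + 5, 0), averaged over the batch. The following pure-Python sketch shows that formula on toy vectors; the helper names and the example values are assumptions for illustration, and the real loss operates on batched tensors.

```python
import math
from typing import Sequence

Vector = Sequence[float]


def euclidean(a: Vector, b: Vector) -> float:
    """L2 distance between two vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def triplet_loss(anchor: Vector, positive: Vector, negative: Vector, margin: float = 5.0) -> float:
    """Hinge on the gap between positive and negative distances."""
    return max(euclidean(anchor, positive) - euclidean(anchor, negative) + margin, 0.0)


anchor = [0.0, 0.0]
positive = [3.0, 4.0]        # distance 5 from the anchor
far_negative = [6.0, 8.0]    # distance 10: the margin of 5 is exactly met
near_negative = [0.0, 1.0]   # distance 1: closer than the positive

easy = triplet_loss(anchor, positive, far_negative)
hard = triplet_loss(anchor, positive, near_negative)
print(easy)               # 0.0
print(hard)               # 9.0
print((easy + hard) / 2)  # 4.5, the mean over this two-triplet batch
```

A triplet contributes zero loss only once the negative is at least `margin` farther from the anchor than the positive is.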
Non-default hyperparameters:

- eval_strategy: steps
- per_device_train_batch_size: 1024
- per_device_eval_batch_size: 1024
- num_train_epochs: 5
- fp16: True
- multi_dataset_batch_sampler: round_robin

All hyperparameters:

- overwrite_output_dir: False
- do_predict: False
- eval_strategy: steps
- prediction_loss_only: True
- per_device_train_batch_size: 1024
- per_device_eval_batch_size: 1024
- per_gpu_train_batch_size: None
- per_gpu_eval_batch_size: None
- gradient_accumulation_steps: 1
- eval_accumulation_steps: None
- learning_rate: 5e-05
- weight_decay: 0.0
- adam_beta1: 0.9
- adam_beta2: 0.999
- adam_epsilon: 1e-08
- max_grad_norm: 1
- num_train_epochs: 5
- max_steps: -1
- lr_scheduler_type: linear
- lr_scheduler_kwargs: {}
- warmup_ratio: 0.0
- warmup_steps: 0
- log_level: passive
- log_level_replica: warning
- log_on_each_node: True
- logging_nan_inf_filter: True
- save_safetensors: True
- save_on_each_node: False
- save_only_model: False
- restore_callback_states_from_checkpoint: False
- no_cuda: False
- use_cpu: False
- use_mps_device: False
- seed: 42
- data_seed: None
- jit_mode_eval: False
- use_ipex: False
- bf16: False
- fp16: True
- fp16_opt_level: O1
- half_precision_backend: auto
- bf16_full_eval: False
- fp16_full_eval: False
- tf32: None
- local_rank: 0
- ddp_backend: None
- tpu_num_cores: None
- tpu_metrics_debug: False
- debug: []
- dataloader_drop_last: False
- dataloader_num_workers: 0
- dataloader_prefetch_factor: None
- past_index: -1
- disable_tqdm: False
- remove_unused_columns: True
- label_names: None
- load_best_model_at_end: False
- ignore_data_skip: False
- fsdp: []
- fsdp_min_num_params: 0
- fsdp_config: {'min_num_params': 0, 'xla': False, 'xla_fsdp_v2': False, 'xla_fsdp_grad_ckpt': False}
- fsdp_transformer_layer_cls_to_wrap: None
- accelerator_config: {'split_batches': False, 'dispatch_batches': None, 'even_batches': True, 'use_seedable_sampler': True, 'non_blocking': False, 'gradient_accumulation_kwargs': None}
- deepspeed: None
- label_smoothing_factor: 0.0
- optim: adamw_torch
- optim_args: None
- adafactor: False
- group_by_length: False
- length_column_name: length
- ddp_find_unused_parameters: None
- ddp_bucket_cap_mb: None
- ddp_broadcast_buffers: False
- dataloader_pin_memory: True
- dataloader_persistent_workers: False
- skip_memory_metrics: True
- use_legacy_prediction_loop: False
- push_to_hub: False
- resume_from_checkpoint: None
- hub_model_id: None
- hub_strategy: every_save
- hub_private_repo: False
- hub_always_push: False
- gradient_checkpointing: False
- gradient_checkpointing_kwargs: None
- include_inputs_for_metrics: False
- eval_do_concat_batches: True
- fp16_backend: auto
- push_to_hub_model_id: None
- push_to_hub_organization: None
- mp_parameters:
- auto_find_batch_size: False
- full_determinism: False
- torchdynamo: None
- ray_scope: last
- ddp_timeout: 1800
- torch_compile: False
- torch_compile_backend: None
- torch_compile_mode: None
- dispatch_batches: None
- split_batches: None
- include_tokens_per_second: False
- include_num_input_tokens_seen: False
- neftune_noise_alpha: None
- optim_target_modules: None
- batch_eval_metrics: False
- prompts: None
- batch_sampler: batch_sampler
- multi_dataset_batch_sampler: round_robin

| Epoch | Step | Training Loss | test-eval_cosine_accuracy |
|---|---|---|---|
| 1.0 | 393 | - | 0.8191 |
| 1.2723 | 500 | 3.0326 | - |
| 2.0 | 786 | - | 0.8353 |
| 2.5445 | 1000 | 2.4038 | 0.8376 |
| 3.0 | 1179 | - | 0.8426 |
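The Epoch and Step columns in the log are related by the number of optimizer steps per epoch, which the table puts at 393 (epoch 1.0 is reached at step 393). The short sketch below reproduces the fractional epochs and derives a rough bound on the training set size. It assumes a single device, which the card does not state; with more devices the effective batch per step, and so the size bound, would scale accordingly.

```python
steps_per_epoch = 393  # from the log: epoch 1.0 is reached at step 393
batch_size = 1024      # per_device_train_batch_size
num_epochs = 5         # num_train_epochs


def epoch_at(step: int) -> float:
    """Fractional epoch reached after a given number of optimizer steps."""
    return round(step / steps_per_epoch, 4)


print(epoch_at(500))   # 1.2723
print(epoch_at(1000))  # 2.5445

# Assumption: one device and gradient_accumulation_steps = 1, so each step
# consumes one batch. With dataloader_drop_last = False the last batch of an
# epoch may be partial, which gives a range rather than an exact count.
min_examples = (steps_per_epoch - 1) * batch_size + 1
max_examples = steps_per_epoch * batch_size
print(min_examples, max_examples)  # 401409 402432

# Number of steps a full 5-epoch schedule would take under the same assumption.
total_steps = steps_per_epoch * num_epochs
print(total_steps)  # 1965
```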
@inproceedings{reimers-2019-sentence-bert,
    title = "Sentence-BERT: Sentence Embeddings using Siamese BERT-Networks",
    author = "Reimers, Nils and Gurevych, Iryna",
    booktitle = "Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing",
    month = "11",
    year = "2019",
    publisher = "Association for Computational Linguistics",
    url = "https://arxiv.org/abs/1908.10084",
}
@misc{hermans2017defense,
    title={In Defense of the Triplet Loss for Person Re-Identification},
    author={Alexander Hermans and Lucas Beyer and Bastian Leibe},
    year={2017},
    eprint={1703.07737},
    archivePrefix={arXiv},
    primaryClass={cs.CV}
}