This is a sentence-transformers model finetuned from hkunlp/instructor-xl. It maps sentences & paragraphs to a 768-dimensional dense vector space and can be used for semantic textual similarity, semantic search, paraphrase mining, text classification, clustering, and more.
SentenceTransformer(
(0): Transformer({'max_seq_length': 128, 'do_lower_case': False, 'architecture': 'T5EncoderModel'})
(1): Pooling({'word_embedding_dimension': 768, 'pooling_mode_cls_token': False, 'pooling_mode_mean_tokens': True, 'pooling_mode_max_tokens': False, 'pooling_mode_mean_sqrt_len_tokens': False, 'pooling_mode_weightedmean_tokens': False, 'pooling_mode_lasttoken': False, 'include_prompt': False})
(2): Dense({'in_features': 1024, 'out_features': 768, 'bias': False, 'activation_function': 'torch.nn.modules.linear.Identity'})
(3): Normalize()
)
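The module stack above can be read as a pipeline: the transformer produces one vector per token, the pooling module averages them, the bias-free dense layer projects the result, and the last module scales it to unit length. The following is a minimal pure-Python sketch of the last three steps on toy data. The function names and the tiny dimensions are illustrative assumptions, not the library's internals.

```python
import math

def mean_pool(token_vectors, attention_mask):
    """Average the token vectors whose mask entry is 1 (mean-token pooling)."""
    dim = len(token_vectors[0])
    kept = [vec for vec, keep in zip(token_vectors, attention_mask) if keep]
    return [sum(vec[i] for vec in kept) / len(kept) for i in range(dim)]

def dense_no_bias(vector, weight):
    """Apply a linear map without bias; weight has shape (out_features, in_features)."""
    return [sum(w * x for w, x in zip(row, vector)) for row in weight]

def l2_normalize(vector):
    """Scale the vector to unit Euclidean length."""
    norm = math.sqrt(sum(x * x for x in vector))
    return [x / norm for x in vector]

# Toy stand-ins (assumed values): 3 tokens, the last one padding; hidden size 4; output size 2.
tokens = [[1.0, 2.0, 3.0, 4.0], [3.0, 2.0, 1.0, 0.0], [9.0, 9.0, 9.0, 9.0]]
mask = [1, 1, 0]
weight = [[1.0, 0.0, 0.0, 0.0], [0.0, 0.0, 0.0, 1.0]]

pooled = mean_pool(tokens, mask)           # padding token is ignored
projected = dense_no_bias(pooled, weight)  # 4 dimensions down to 2
embedding = l2_normalize(projected)        # unit-length output vector
print(embedding)
```

In the real model the projection maps 1024 input features to 768 output features, as listed in the `Dense` module.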
First install the Sentence Transformers library:
pip install -U sentence-transformers
Then you can load this model and run inference.
from sentence_transformers import SentenceTransformer
# Download from the 🤗 Hub
model = SentenceTransformer("ahmedHamdi/ir-fr-en-masked-instructor-xl")
# Run inference
sentences = [
'Represent the plot: Lord ORG, PERSON, PERSON, Lieutenant PERSON, and Princess PERSON have returned victorious from the tournament. According to the tournament rules, LOC is saved for a generation thanks to their victory. But ORG, PERSON, decides to break the rules by immediately invading LOC, thus opening dimensional portals that allow LOC to merge with GPE. Raiden and his friends have only six days to defeat PERSON and his warriors, and thus prevent the merger with Outworld...',
"Represent the plot: The Outworld emperor PERSON opens a portal to PERSON and has resurrected PERSON, Princess PERSON's long-deceased mother, to facilitate his invasion. Thunder god GPE and PERSON warriors PERSON, PERSON, and PERSON try to defend themselves, but PERSON kills Cage. The PERSON warriors retreat to seek allies. PERSON enlists the help of her ORG partner, PERSON, while GPE and PERSON search for a Native American shaman named PERSON, who seemingly knows the key to defeating PERSON. Scorpion appears and kidnaps GPE. Rayden meets with the Elder Gods and asks them why PERSON was allowed to break the tournament rules and force his way into PERSON, and how he can be stopped. One says that reuniting GPE with her mother, ORG, is the key to breaking PERSON's hold on PERSON, but another Elder God insists that the defeat of PERSON himself is the solution. GPE is then asked by the Elder Gods about his feelings and obligations towards the mortals, and what he would be willing to do to ensure their survival. PERSON finds PERSON, who teaches him about the power of the ORG, a form of shapeshifting which utilizes the caster's strengths and abilities. To achieve the mindset needed to acquire this power, PERSON must pass three tests. The first is a trial of his self-esteem, courage, and focus. The second comes in the form of temptation, which manifests itself in the form of Jade, a mysterious warrior who attempts to PERSON and offers her assistance after he resists her advances. PERSON accepts PERSON's offer and takes her with him to the Elder Gods' temple, where he and his friends meet with GPE. The third test is never revealed. The PERSON warriors learn that GPE has sacrificed his immortality to freely fight alongside them. Together, they infiltrate Outworld to rescue GPE and reunite her with ORG in hopes of restoring her soul and closing the Outworld portal to LOC. PERSON rescues PERSON while the others incapacitate ORG. However, ORG remains under PERSON's control and escapes during an ambush. Jade reveals herself to be a double agent sent by PERSON to disrupt the heroes' plans. PERSON feeds Jade to a gargoyle for her failure. Rayden reveals that PERSON is his brother and that the former Elder God GPE is their father. He realizes that GPE is supporting PERSON. Rayden and the PERSON warriors make their way to PERSON, ORG, and his remaining generals. Shinnok demands that GPE submit to him and restore their broken family, at the expense of his mortal friends. GPE refuses and is killed by an energy blast from PERSON. Jax, GPE, and GPE emerge victorious over ORG's generals (with Jax defeating PERSON, GPE defeating her mother, and GPE defeating Ermac). PERSON struggles with PERSON. PERSON's Animality proves effective, exposing a cut to PERSON that proves he is now mortal, as a consequence of his breaking the sacred rules. Shinnok attempts to intervene and kill PERSON on PERSON's behalf, but two of the Elder Gods arrive, having uncovered GPE's treachery. They declare that the fate of LOC shall be decided in PERSON. PERSON defeats PERSON, and GPE is banished to the PERSON. Earthrealm reverts to its former state. With PERSON's hold over ORG broken, she reunites with GPE. GPE is revived by the Elder Gods, who bestow upon him his father's former position. Before departing to the immortal realm, he enjoins the PERSON warriors to be there for one another. The PERSON warriors return home.",
"Represent the plot: 12 year-old Carol and her mother PERSON visit their family's hometown in GPE, during the Civil War in 1938. It is ORG's first time in the country, as she grew up in GPE in GPE. Her American father, PERSON, is fighting in the frontlines as a pilot with ORG. PERSON keeps in touch with her husband by writing letters, which are carried to the frontlines by a Portuguese smuggler. PERSON's family is conservative and middle-class; her and ORG's liberal American manners bring culture shock to the community, especially to the Catholic clergy. In a visit to her former teacher and best friend PERSON, PERSON reveals that she is seriously ill, and that she in fact came home to die. After her mother passes away, PERSON asks her grandfather, PERSON, to keep it secret from her father so as not to add to his worries. She also convinces PERSON to write letters to PERSON in her mother's name. Carol goes to live with her aunt Dolores and cousin Blanca; she befriends three local boys, including ORG, with whom she is attracted romantically. After GPE falls and the Republican faction is defeated in the war, PERSON, who is the only Republican sympathizer in a family supportive of General PERSON, is forced to burn his pro-Republican books. PERSON sneaks home, and PERSON is overjoyed to see her father again. The local authorities immediately search PERSON house for the fugitive. In the pursuit, ORG, whom PERSON wanted to introduce to her father, is accidentally shot and killed. In the epilogue, PERSON returns to GPE to her paternal grandparents' care. PERSON expresses hope that PERSON's father, who has been taken prisoner, would suffer only a few months in jail at worst, being a citizen of the influential GPE. In the car ride on the way to the port, ORG's surviving friends catch up on their bikes to say farewell; she imagines Tomiche with them, saying goodbye as well.",
]
embeddings = model.encode(sentences)
print(embeddings.shape)
# [3, 768]
# Get the similarity scores for the embeddings
similarities = model.similarity(embeddings, embeddings)
print(similarities)
# tensor([[ 1.0000, 0.7034, -0.1149],
# [ 0.7034, 1.0000, -0.0322],
# [-0.1149, -0.0322, 1.0000]])
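The similarity matrix above can be used to rank candidate plots against a query plot. As a rough illustration, and assuming the scores are cosine similarities (the similarity function named in the loss configuration), the sketch below ranks a few toy vectors in pure Python. The helper names and the 3-dimensional vectors are hypothetical stand-ins for the 768-dimensional embeddings.

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def rank_by_similarity(query, candidates):
    """Return candidate indices sorted from most to least similar to the query."""
    scores = [cosine_similarity(query, c) for c in candidates]
    return sorted(range(len(candidates)), key=lambda i: scores[i], reverse=True)

# Hypothetical embeddings: candidate 1 points almost the same way as the query,
# candidate 0 is orthogonal to it, candidate 2 points the opposite way.
query = [1.0, 0.0, 0.0]
candidates = [[0.0, 1.0, 0.0], [0.9, 0.1, 0.0], [-1.0, 0.0, 0.0]]

ranking = rank_by_similarity(query, candidates)
print(ranking)  # [1, 0, 2]
```

Since the model ends with a `Normalize()` module, its embeddings already have unit length, so the cosine similarity of two embeddings equals their dot product.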
Columns: sentence_0 and sentence_1

| | sentence_0 | sentence_1 |
|---|---|---|
| type | string | string |
| details | | |

| sentence_0 | sentence_1 |
|---|---|
| Represent the plot: This is the story of PERSON, the legendary PERSON master and future mentor of PERSON, in GPE during the 1930s and 40s, and up to the early 1950s, when he began teaching his art in GPE. Devastated by the Japanese invasion, the country was going through a period of chaos, which nevertheless coincided with the golden age of Chinese martial arts. | Represent the plot: PERSON grandmaster PERSON (PERSON) reflects on the nature of martial arts as he battles a dozen combatants during a rainstorm in GPE. Ip wins and experiences flashbacks of his life, from his early training at the age of seven, to his induction into PERSON by his master, PERSON (PERSON), and his marriage to PERSON (PERSON). Ip Man's peaceful existence is threatened by the arrival of ORG (PERSON), the PERSON martial arts grandmaster from northern GPE, who announces that he has already retired and appointed PERSON (PERSON) as his heir in the LOC. He then concedes that the LOC should have its own heir. A fight erupts as various masters attempt to challenge Gong, but they are deterred by PERSON. As the Southern masters deliberate on a representative, PERSON daughter ORG (PERSON) arrives and tries to convince her father not to continue the fight, as she feels the Southern masters are unworthy. Meanwhile, the Southern masters decide on ORG to represent them, and Ip procee... |
| Represent the plot: PERSON is arrested after murdering her husband and his mistress. Pregnant, she is sentenced to life imprisonment, with only the minister able to decide on her release. Years later, PERSON is the pastor in the small parish of PERSON, a tiny English village where everyone knows everyone else, especially the pastor's family. He is busy with the incessant demands of his parishioners and preparing a sermon for his fellow church members. So busy, in fact, that he doesn't notice that his wife, PERSON, is ready to leave with ORG, her American golf instructor, because of his absence; that his son is being bullied by his peers; and that his daughter is frequently changing boyfriends. PERSON, now elderly, is hired as a housekeeper under the name PERSON, without the family knowing her past. Noticing that the neighbor's dog barks constantly, disturbing the family, PERSON kills it. The neighbor, discovering his dog's fate, unwittingly provokes ORG into killing it as well. PERSON ... | Represent the plot: When a young pregnant woman named PERSON (PERSON) boards a train, her enormous trunk starts leaking blood in the luggage compartment. Questioned by the police about the dead bodies inside, PERSON calmly reveals they are her unfaithful husband and his mistress. Convicted of manslaughter, she is imprisoned in a unit for the criminally insane due to diminished responsibility. Forty-three years later, PERSON (PERSON), the village vicar of PERSON, is very busy writing the perfect sermon for a convention. He's completely oblivious to his family's problems: his wife, PERSON (PERSON), has unfulfilled emotional/sexual needs and starts an affair with her golf instructor, PERSON (PERSON); his daughter, PERSON (Tamsin Egerton), has a growing sex drive and physical maturity and constantly changes boyfriends without any reason; and his son, ORG (PERSON), has been a victim of bullying at school for quite some time. New housekeeper, PERSON (PERSON), becomes involved in their lives,... |
| Represent the plot: During ORG, GPE is a ORG prisoner-of-war camp. Its completely isolated location, in a region populated by many Native Americans, makes escape impossible. To deter any attempt to flee, Captain PERSON enforces an iron discipline and severely punishes those who venture outside the camp. But this strict military approach displeases many of his fellow soldiers. A group of prisoners, however, is preparing an escape plan to return to GPE, with the help of a woman newly arrived at the fort, PERSON. | Represent the plot: GPE is a Union prison camp with a strict disciplinarian named Captain PERSON (PERSON). A pretty woman named PERSON (PERSON) shows up to help with the wedding of her friend, but has really come to assist in freeing some prisoners including her previous beau ORG Captain PERSON (PERSON). Roper falls in love with her (and she with him) and the escape happens after the wedding celebrations and PERSON unexpectedly leaves with the four ORG escapees. This gives PERSON an additional motive to recapture the escapees. He does just that, but on the way back to the fort, they are attacked by fierce PERSON who are hostile to both sides and the group ends up trapped in a shallow exposed depression. Roper frees and arms his prisoners, but even then, it looks like the Apaches will wipe them out. PERSON (PERSON), a proven coward, escapes when one of their loose horses returns in the night. One by one, the rest of the group are killed, including ORG (PERSON), PERSON (PERSON), and the ... |

MultipleNegativesRankingLoss with these parameters:
{
"scale": 20.0,
"similarity_fct": "cos_sim",
"gather_across_devices": false
}
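To make the `scale` and `similarity_fct` parameters concrete, here is a minimal pure-Python sketch of the idea behind this loss: each `sentence_0` is scored against every `sentence_1` in the batch with scaled cosine similarity, and cross-entropy rewards the matching pair while treating the other pairs as in-batch negatives. The function names and toy vectors are illustrative assumptions, and this is not the library's implementation.

```python
import math

def cos_sim(a, b):
    """Cosine similarity between two vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b)))

def multiple_negatives_ranking_loss(anchors, positives, scale=20.0):
    """Mean cross-entropy where positives[i] is the correct match for anchors[i]
    and every other positive in the batch acts as a negative."""
    losses = []
    for i, anchor in enumerate(anchors):
        logits = [scale * cos_sim(anchor, p) for p in positives]
        # Numerically stable log-sum-exp over the batch
        max_logit = max(logits)
        log_sum_exp = max_logit + math.log(sum(math.exp(l - max_logit) for l in logits))
        losses.append(log_sum_exp - logits[i])
    return sum(losses) / len(losses)

# Toy batch of two pairs (assumed values).
anchors = [[1.0, 0.0], [0.0, 1.0]]
aligned = [[0.9, 0.1], [0.1, 0.9]]  # each positive sits close to its own anchor
swapped = [[0.1, 0.9], [0.9, 0.1]]  # each positive sits close to the wrong anchor

loss_aligned = multiple_negatives_ranking_loss(anchors, aligned)
loss_swapped = multiple_negatives_ranking_loss(anchors, swapped)
print(loss_aligned, loss_swapped)
```

The aligned batch gives a loss near zero and the swapped batch a much larger one. A higher `scale` sharpens the softmax, so the same similarity gap is penalized more strongly.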
- multi_dataset_batch_sampler: round_robin

- overwrite_output_dir: False
- do_predict: False
- eval_strategy: no
- prediction_loss_only: True
- per_device_train_batch_size: 8
- per_device_eval_batch_size: 8
- per_gpu_train_batch_size: None
- per_gpu_eval_batch_size: None
- gradient_accumulation_steps: 1
- eval_accumulation_steps: None
- torch_empty_cache_steps: None
- learning_rate: 5e-05
- weight_decay: 0.0
- adam_beta1: 0.9
- adam_beta2: 0.999
- adam_epsilon: 1e-08
- max_grad_norm: 1
- num_train_epochs: 3
- max_steps: -1
- lr_scheduler_type: linear
- lr_scheduler_kwargs: None
- warmup_ratio: 0.0
- warmup_steps: 0
- log_level: passive
- log_level_replica: warning
- log_on_each_node: True
- logging_nan_inf_filter: True
- save_safetensors: True
- save_on_each_node: False
- save_only_model: False
- restore_callback_states_from_checkpoint: False
- no_cuda: False
- use_cpu: False
- use_mps_device: False
- seed: 42
- data_seed: None
- jit_mode_eval: False
- bf16: False
- fp16: False
- fp16_opt_level: O1
- half_precision_backend: auto
- bf16_full_eval: False
- fp16_full_eval: False
- tf32: None
- local_rank: 0
- ddp_backend: None
- tpu_num_cores: None
- tpu_metrics_debug: False
- debug: []
- dataloader_drop_last: False
- dataloader_num_workers: 0
- dataloader_prefetch_factor: None
- past_index: -1
- disable_tqdm: False
- remove_unused_columns: True
- label_names: None
- load_best_model_at_end: False
- ignore_data_skip: False
- fsdp: []
- fsdp_min_num_params: 0
- fsdp_config: {'min_num_params': 0, 'xla': False, 'xla_fsdp_v2': False, 'xla_fsdp_grad_ckpt': False}
- fsdp_transformer_layer_cls_to_wrap: None
- accelerator_config: {'split_batches': False, 'dispatch_batches': None, 'even_batches': True, 'use_seedable_sampler': True, 'non_blocking': False, 'gradient_accumulation_kwargs': None}
- parallelism_config: None
- deepspeed: None
- label_smoothing_factor: 0.0
- optim: adamw_torch_fused
- optim_args: None
- adafactor: False
- group_by_length: False
- length_column_name: length
- project: huggingface
- trackio_space_id: trackio
- ddp_find_unused_parameters: None
- ddp_bucket_cap_mb: None
- ddp_broadcast_buffers: False
- dataloader_pin_memory: True
- dataloader_persistent_workers: False
- skip_memory_metrics: True
- use_legacy_prediction_loop: False
- push_to_hub: False
- resume_from_checkpoint: None
- hub_model_id: None
- hub_strategy: every_save
- hub_private_repo: None
- hub_always_push: False
- hub_revision: None
- gradient_checkpointing: False
- gradient_checkpointing_kwargs: None
- include_inputs_for_metrics: False
- include_for_metrics: []
- eval_do_concat_batches: True
- fp16_backend: auto
- push_to_hub_model_id: None
- push_to_hub_organization: None
- mp_parameters:
- auto_find_batch_size: False
- full_determinism: False
- torchdynamo: None
- ray_scope: last
- ddp_timeout: 1800
- torch_compile: False
- torch_compile_backend: None
- torch_compile_mode: None
- include_tokens_per_second: False
- include_num_input_tokens_seen: no
- neftune_noise_alpha: None
- optim_target_modules: None
- batch_eval_metrics: False
- eval_on_start: False
- use_liger_kernel: False
- liger_kernel_config: None
- eval_use_gather_object: False
- average_tokens_across_devices: True
- prompts: None
- batch_sampler: batch_sampler
- multi_dataset_batch_sampler: round_robin
- router_mapping: {}
- learning_rate_mapping: {}

| Epoch | Step | Training Loss |
|---|---|---|
| 0.2458 | 500 | 0.1774 |
| 0.4916 | 1000 | 0.0812 |
| 0.7375 | 1500 | 0.0781 |
| 0.9833 | 2000 | 0.0632 |
| 1.2291 | 2500 | 0.0332 |
| 1.4749 | 3000 | 0.0352 |
| 1.7207 | 3500 | 0.0288 |
| 1.9666 | 4000 | 0.0326 |
| 2.2124 | 4500 | 0.0164 |
| 2.4582 | 5000 | 0.0134 |
| 2.7040 | 5500 | 0.0133 |
| 2.9499 | 6000 | 0.0111 |
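
The epoch and step columns of the training log allow a rough estimate of the training set size. The sketch below divides each step count by its epoch fraction to estimate the steps per epoch, then multiplies by the batch size of 8. It assumes a single training device (not stated in the card) and relies on the listed `gradient_accumulation_steps` of 1, so the result is an approximation rather than a reported figure.

```python
# (epoch, step) pairs copied from the training log
log = [
    (0.2458, 500), (0.4916, 1000), (0.7375, 1500), (0.9833, 2000),
    (1.2291, 2500), (1.4749, 3000), (1.7207, 3500), (1.9666, 4000),
    (2.2124, 4500), (2.4582, 5000), (2.7040, 5500), (2.9499, 6000),
]
per_device_train_batch_size = 8  # from the hyperparameter list

# Each row gives an independent estimate of the number of steps in one epoch.
estimates = [step / epoch for epoch, step in log]
steps_per_epoch = round(sum(estimates) / len(estimates))

# Assumption: one device, so each step consumes exactly one batch of pairs.
approx_pairs = steps_per_epoch * per_device_train_batch_size
print(steps_per_epoch, approx_pairs)  # 2034 16272
```

Because the final batch of an epoch may be smaller than 8, the pair count is an upper-bound estimate of roughly 16,000 training pairs.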
@inproceedings{reimers-2019-sentence-bert,
title = "Sentence-BERT: Sentence Embeddings using Siamese BERT-Networks",
author = "Reimers, Nils and Gurevych, Iryna",
booktitle = "Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing",
month = "11",
year = "2019",
publisher = "Association for Computational Linguistics",
url = "https://arxiv.org/abs/1908.10084",
}
@misc{henderson2017efficient,
title={Efficient Natural Language Response Suggestion for Smart Reply},
author={Matthew Henderson and Rami Al-Rfou and Brian Strope and Yun-hsuan Sung and Laszlo Lukacs and Ruiqi Guo and Sanjiv Kumar and Balint Miklos and Ray Kurzweil},
year={2017},
eprint={1705.00652},
archivePrefix={arXiv},
primaryClass={cs.CL}
}