SentenceTransformer based on hkunlp/instructor-xl

This is a sentence-transformers model finetuned from hkunlp/instructor-xl. It maps sentences & paragraphs to a 768-dimensional dense vector space and can be used for semantic textual similarity, semantic search, paraphrase mining, text classification, clustering, and more.

Model Details

Model Description

  • Model Type: Sentence Transformer
  • Base model: hkunlp/instructor-xl
  • Maximum Sequence Length: 128 tokens
  • Output Dimensionality: 768 dimensions
  • Similarity Function: Cosine Similarity

Full Model Architecture

SentenceTransformer(
  (0): Transformer({'max_seq_length': 128, 'do_lower_case': False, 'architecture': 'T5EncoderModel'})
  (1): Pooling({'word_embedding_dimension': 768, 'pooling_mode_cls_token': False, 'pooling_mode_mean_tokens': True, 'pooling_mode_max_tokens': False, 'pooling_mode_mean_sqrt_len_tokens': False, 'pooling_mode_weightedmean_tokens': False, 'pooling_mode_lasttoken': False, 'include_prompt': False})
  (2): Dense({'in_features': 1024, 'out_features': 768, 'bias': False, 'activation_function': 'torch.nn.modules.linear.Identity'})
  (3): Normalize()
)
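
The stack above can be read as a pipeline: the T5 encoder produces one vector per token, Pooling averages the non-padding tokens, Dense applies a bias-free linear map to 768 dimensions, and Normalize scales each vector to unit length. The following is a minimal NumPy sketch of the last three stages only. The helper name `pool_project_normalize` is hypothetical, random arrays stand in for the encoder output and the Dense weights, and the pooled width is assumed to be 1024 (the Dense layer's `in_features`), even though the Pooling config lists 768.

```python
import numpy as np

def pool_project_normalize(token_embeddings, attention_mask, dense_weight):
    """Mean-pool token vectors, apply a bias-free linear map, then L2-normalize."""
    mask = attention_mask[..., None].astype(token_embeddings.dtype)  # (batch, seq, 1)
    summed = (token_embeddings * mask).sum(axis=1)                   # ignore padding tokens
    counts = np.clip(mask.sum(axis=1), 1e-9, None)                   # avoid division by zero
    pooled = summed / counts                                         # (batch, hidden)
    projected = pooled @ dense_weight.T                              # (batch, out_features)
    norms = np.linalg.norm(projected, axis=1, keepdims=True)
    return projected / np.clip(norms, 1e-12, None)

# Stand-in data (assumption): 2 sequences, 128 tokens, 1024-dim token vectors.
rng = np.random.default_rng(0)
tokens = rng.normal(size=(2, 128, 1024))
mask = np.ones((2, 128), dtype=np.int64)
mask[1, 50:] = 0                              # second sequence has 50 real tokens
weight = rng.normal(size=(768, 1024))         # placeholder for the Dense weights

sentence_vectors = pool_project_normalize(tokens, mask, weight)
print(sentence_vectors.shape)
```

Because the final step normalizes every vector, the cosine similarity between two outputs equals their dot product.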

Usage

Direct Usage (Sentence Transformers)

First install the Sentence Transformers library:

pip install -U sentence-transformers

Then you can load this model and run inference.

from sentence_transformers import SentenceTransformer

# Download from the 🤗 Hub
model = SentenceTransformer("ahmedHamdi/ir-pt-en-masked-instructor-xl")
# Run inference
sentences = [
    "Represent the plot: In 1900, during the Belle Époque, playwright ORG is staying at ORG in GPE, a popular meeting place for romantic encounters. He needs to write a new play but is suffering from writer's block. He decides to observe the other guests and spots the couple (in a clandestine meeting): Mr. PERSON, married to a domineering wife, and PERSON, the beautiful and neglected wife of ORG, a building inspector. To the couple's surprise, ORG goes to the hotel to investigate rumors of ghosts (which he believes are noises caused by leaks in the plumbing). Also unexpectedly arriving are ORG's nephew and PERSON's maid, PERSON, as well as the Bonifaces' friend, ORG, with his four daughters. PERSON and PERSON try to hide from all the known guests but are thwarted by the inept employee PERSON and the arrival of the police.",
    "Represent the plot: ORG is staying in ORG. He needs to write a new play, but has writer's block. He takes the opportunity to observe his fellow guests: PERSON, henpecked by his domineering wife, and PERSON, the beautiful but neglected wife of ORG, a building inspector. ORG is sent to the hotel to investigate rumours of ghosts (which turn out to be caused by drains). However, the hotel is the trysting place of GPE and PERSON, who are having an affair. In the 'by-the-hour' hotel, there are two husbands and one wife, plus ORG's nephew and PERSON's maid, who are also having an affair. PERSON and PERSON's affair is severely compromised (not least by a police raid). All these events provide ORG with the material for his play, which becomes the succès fou of the next season.",
    "Represent the plot: Thanks to falsified dental records supplied by his former neighbor PERSON, retired hitman PERSON The Tulip Tudeski spends his days compulsively cleaning his house and perfecting his culinary skills with his wife, PERSON, a purported assassin who has yet to pull off a clean hiteveryone she is hired to kill dies in bizarre accidents before she has a chance. Oz now owns a dental practice in GPE and has married PERSON's ex-wife PERSON and expecting their first child, but the relationship is strained by ORG's excessive paranoia as well as PERSON's secret continued contact with GPE. Their lives are further complicated by the return of PERSON, PERSON's former mob boss and father figure, whose son PERSON was killed by PERSON and PERSON while PERSON was in prison. Having deduced that PERSON faked his death, Laszlo abducts PERSON and threatens Oz to try to learn PERSON's location, but Oz escapes. A desperate Oz contacts PERSON and PERSON, but PERSON refuses to help until PERSON's men attack, having followed Oz to PERSON. PERSON remaining son PERSON, PERSON offers to trade ORG. Oz triggers further conflict between PERSON and PERSON when he reveals GPE still wears a crucifix from PERSON. At a bar, PERSON becomes increasingly depressed at his failure to father a child with PERSON, and aggravates Oz by discussing his and PERSON's old sex life, culminating in Oz and PERSON becoming so drunk they wake up in the same bed. Frustrated by her poor sex life with PERSON, PERSON attempts to seduce Oz but is interrupted by PERSON, who knocks Oz out and regains his passion for PERSON and his work as the two have sex in the bathroom. Re-arming themselves at Oz's house, the three are attacked by an unknown marksman and ORG is killed in the crossfire. PERSON insults PERSON’s capabilities and coldly dismisses Oz; PERSON leaves. Oz retreats to his practice where he is met by PERSON, who apologizes for recent events. \n"
    "The two are chloroformed by ORG's new receptionist PERSON, revealed to be the sister of PERSON, seeking revenge for Oz and PERSON's role in her brother's death. Waking up beside PERSON and PERSON in PERSON's apartment, Oz is shocked to learn that the entire situation has been planned by PERSON and PERSON to find PERSON's half of the first dollar he ever stole, which he had torn and divided between PERSON and PERSON as kids. As Laszlo prepares to kill them, PERSON arrives, having tied up ORG's body in her car with explosives to appear alive, and threatens to detonate unless Laszlo releases Oz and PERSON. Asking to join PERSON's organization, PERSON is ordered to kill PERSON, who tells her she'll never be a successful hitter before she shoots him in the heart. PERSON's car explodes as PERSON's men try to release GPE, revealing PERSON was in on the plan and shot Jimmy with blanks. PERSON is exposed as the shooter who killed PERSON, and PERSON shoots her. PERSON, unable to kill the man who raised him, has PERSON shoot PERSON in the foot. PERSON and PERSON further reveal that PERSON's half of the dollar combines with PERSON to reveal the number for a $280 million bank account. PERSON reveals she is pregnant, and the four drive away as PERSON is arrested.",
]
embeddings = model.encode(sentences)
print(embeddings.shape)
# (3, 768)

# Get the similarity scores for the embeddings
similarities = model.similarity(embeddings, embeddings)
print(similarities)
# tensor([[1.0000, 0.7329, 0.0920],
#         [0.7329, 1.0000, 0.0465],
#         [0.0920, 0.0465, 1.0000]])
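
Since the model's similarity function is cosine similarity and its last module is Normalize(), the scores above should match plain dot products of the embeddings. The sketch below illustrates that computation and a simple ranking step in NumPy. It uses toy 2-dimensional vectors rather than real embeddings, and `cosine_similarity_matrix` is a hypothetical helper, not part of the Sentence Transformers API.

```python
import numpy as np

def cosine_similarity_matrix(a, b):
    """Pairwise cosine similarity between the rows of a and the rows of b."""
    a_unit = a / np.linalg.norm(a, axis=1, keepdims=True)
    b_unit = b / np.linalg.norm(b, axis=1, keepdims=True)
    return a_unit @ b_unit.T

# Toy vectors standing in for embeddings (assumption, not model output).
toy = np.array([[1.0, 0.0], [0.8, 0.6], [0.0, 1.0]])
scores = cosine_similarity_matrix(toy, toy)

# Rank the other rows by similarity to row 0, most similar first.
ranking = [int(i) for i in np.argsort(-scores[0]) if i != 0]
print(np.round(scores, 2))
print(ranking)  # [1, 2]
```

The same ranking step applies to real embeddings: pass a query embedding as `a`, the candidate plot embeddings as `b`, and sort the resulting row.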

Training Details

Training Dataset

Unnamed Dataset

  • Size: 5,800 training samples
  • Columns: sentence_0 and sentence_1
  • Approximate statistics based on the first 1000 samples:
    column      type    min tokens  mean tokens  max tokens
    sentence_0  string  19          99.6         128
    sentence_1  string  20          124.71       128
  • Samples:
    Sample 1
      sentence_0: Represent the plot: PERSON is a saxophonist and part-time law student who, in 1930, travels with his financially struggling band to GPE. They make a quick stop in the small town of PERSON, where PERSON sees the beautiful bank teller ORG and decides to leave the band and get a job as a clerk in the store owned by the elderly Mr. PERSON. The wealthy, benevolent elderly woman, PERSON, is concerned about the idle local youth, so PERSON offers to be the leader of a PERSON troop that would bring together all the boys in town. Only the rebellious PERSON hesitates to join the troop, as he is ashamed of his drunken father. Lem goes ahead with the project, despite all the difficulties.
      sentence_1: Represent the plot: In 1930, PERSON, a saxophonist in a traveling band, dreams of becoming a lawyer. When the band's bus reaches the small town of ORG suddenly decides to leave the band and settle down, finding a job as a clerk in the general store owned by PERSON. At the town civic meeting, PERSON again notices PERSON, a bank teller whom PERSON had seen on his first day in town, and eventually attempts to woo away from her boyfriend PERSON. Lem notices ORG crosses off the YMCA and the 4-H from her list of three possible organizations to keep the town's boys off the streets, leaving only PERSON, and he decides to suggest and volunteer to become PERSON of the newly formed ORG 1. Some time later, PERSON becomes an all-around natural leader with the PERSON troop, even putting a plan to become a lawyer aside as he helps the town's boys mature into men. Meanwhile, the town's troublemaker boy, PERSON, refuses to join the troop. One night, while PERSON and PERSON are on a date, they catch PER...
    Sample 2
      sentence_0: Represent the plot: A baby from a wealthy family is kidnapped by three bumbling criminals disguised as photographers. The baby, however, manages to escape through the streets of GPE, involving his pursuers in various mishaps.
      sentence_1: Represent the plot: PERSON, the infant son of socialites GPE and PERSON, lives in a mansion in a suburb of GPE and is just about to appear in the social pages of the newspaper. Three very clumsy criminals, PERSON, PERSON, and PERSON, disguise themselves as baby photographers from the newspaper and kidnap him, demanding a $5 million ransom. After the kidnapping, however, the criminals have difficulty controlling PERSON at their apartment. ORG attempts to put him to sleep by reading his favorite storybook, Baby's Day Out (or ORG as he calls it), only to fall asleep himself from boredom, leaving PERSON unattended. Looking through the book, PERSON notices a bird on the page and then one by the window; he follows it out and successfully gets away from his kidnappers. The ensuing chase culminates in GPE falling off the building and into a garbage bin. PERSON and GPE rescue him, and they begin pursuing PERSON across the city. The ORG arrives at the mansion, headed by PERSON, where they try to...
    Sample 3
      sentence_0: Represent the plot: Internationally recognized and holder of numerous triumphs in different sports, the Native American athlete PERSON has to confront prejudice and the revocation of his Olympic medals, as he was considered a professional athlete at a time when amateurism was mandatory for Olympic competitions.
      sentence_1: Represent the plot: During a banquet, legendary football coach Pop ORG rises and gives a speech praising PERSON, which leads to a flashback. Youngster PERSON runs all the way home before his first day at an Indian reservation school, but his father talks him into going back, telling him that he wants his son to make something of himself. Years later, a now-adult PERSON arrives on the campus of ORG to continue his education. He likes his roommates at the boarding school well enough, fast-talking PERSON and the huge PERSON Like Bear, but nearly gets into a fight with upperclassman and football star PERSON. When the academic pressure becomes too much for him, PERSON goes for a long run, during which he outraces some practicing track athletes. Witnessing this, coach Pop ORG talks PERSON into joining the track team. PERSON is so talented, versatile, and quick to learn that, at the next meet, Pop's team consists of just him (competing in all but the distance running events) and one other man...
  • Loss: MultipleNegativesRankingLoss with these parameters:
    {
        "scale": 20.0,
        "similarity_fct": "cos_sim",
        "gather_across_devices": false
    }
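
With these parameters, each sentence_0 in a batch is scored against every sentence_1 in the same batch by cosine similarity, the scores are multiplied by 20, and a cross-entropy loss rewards the matching pair over the in-batch negatives. The NumPy sketch below illustrates that idea under stated assumptions: it is not the library's implementation, the function name is hypothetical, and random vectors stand in for embeddings.

```python
import numpy as np

def multiple_negatives_ranking_loss(anchors, positives, scale=20.0):
    """Cross-entropy over scaled cosine scores; row i's target is column i."""
    a = anchors / np.linalg.norm(anchors, axis=1, keepdims=True)
    p = positives / np.linalg.norm(positives, axis=1, keepdims=True)
    logits = scale * (a @ p.T)                               # (batch, batch)
    logits = logits - logits.max(axis=1, keepdims=True)      # numerical stability
    log_probs = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))
    return float(-np.mean(np.diag(log_probs)))

# Random stand-ins for a batch of 8 embedding pairs (assumption).
rng = np.random.default_rng(0)
anchors = rng.normal(size=(8, 768))
aligned = anchors + 0.01 * rng.normal(size=(8, 768))   # each positive is close to its anchor
shuffled = np.roll(aligned, 1, axis=0)                 # positives paired with the wrong anchors

loss_aligned = multiple_negatives_ranking_loss(anchors, aligned)
loss_shuffled = multiple_negatives_ranking_loss(anchors, shuffled)
print(loss_aligned, loss_shuffled)
```

The loss is near zero when every anchor is closest to its own positive and large when the pairs are misaligned, which is the behavior the training loss column tracks.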
    

Training Hyperparameters

Non-Default Hyperparameters

  • multi_dataset_batch_sampler: round_robin

All Hyperparameters

  • overwrite_output_dir: False
  • do_predict: False
  • eval_strategy: no
  • prediction_loss_only: True
  • per_device_train_batch_size: 8
  • per_device_eval_batch_size: 8
  • per_gpu_train_batch_size: None
  • per_gpu_eval_batch_size: None
  • gradient_accumulation_steps: 1
  • eval_accumulation_steps: None
  • torch_empty_cache_steps: None
  • learning_rate: 5e-05
  • weight_decay: 0.0
  • adam_beta1: 0.9
  • adam_beta2: 0.999
  • adam_epsilon: 1e-08
  • max_grad_norm: 1
  • num_train_epochs: 3
  • max_steps: -1
  • lr_scheduler_type: linear
  • lr_scheduler_kwargs: None
  • warmup_ratio: 0.0
  • warmup_steps: 0
  • log_level: passive
  • log_level_replica: warning
  • log_on_each_node: True
  • logging_nan_inf_filter: True
  • save_safetensors: True
  • save_on_each_node: False
  • save_only_model: False
  • restore_callback_states_from_checkpoint: False
  • no_cuda: False
  • use_cpu: False
  • use_mps_device: False
  • seed: 42
  • data_seed: None
  • jit_mode_eval: False
  • bf16: False
  • fp16: False
  • fp16_opt_level: O1
  • half_precision_backend: auto
  • bf16_full_eval: False
  • fp16_full_eval: False
  • tf32: None
  • local_rank: 0
  • ddp_backend: None
  • tpu_num_cores: None
  • tpu_metrics_debug: False
  • debug: []
  • dataloader_drop_last: False
  • dataloader_num_workers: 0
  • dataloader_prefetch_factor: None
  • past_index: -1
  • disable_tqdm: False
  • remove_unused_columns: True
  • label_names: None
  • load_best_model_at_end: False
  • ignore_data_skip: False
  • fsdp: []
  • fsdp_min_num_params: 0
  • fsdp_config: {'min_num_params': 0, 'xla': False, 'xla_fsdp_v2': False, 'xla_fsdp_grad_ckpt': False}
  • fsdp_transformer_layer_cls_to_wrap: None
  • accelerator_config: {'split_batches': False, 'dispatch_batches': None, 'even_batches': True, 'use_seedable_sampler': True, 'non_blocking': False, 'gradient_accumulation_kwargs': None}
  • parallelism_config: None
  • deepspeed: None
  • label_smoothing_factor: 0.0
  • optim: adamw_torch_fused
  • optim_args: None
  • adafactor: False
  • group_by_length: False
  • length_column_name: length
  • project: huggingface
  • trackio_space_id: trackio
  • ddp_find_unused_parameters: None
  • ddp_bucket_cap_mb: None
  • ddp_broadcast_buffers: False
  • dataloader_pin_memory: True
  • dataloader_persistent_workers: False
  • skip_memory_metrics: True
  • use_legacy_prediction_loop: False
  • push_to_hub: False
  • resume_from_checkpoint: None
  • hub_model_id: None
  • hub_strategy: every_save
  • hub_private_repo: None
  • hub_always_push: False
  • hub_revision: None
  • gradient_checkpointing: False
  • gradient_checkpointing_kwargs: None
  • include_inputs_for_metrics: False
  • include_for_metrics: []
  • eval_do_concat_batches: True
  • fp16_backend: auto
  • push_to_hub_model_id: None
  • push_to_hub_organization: None
  • mp_parameters:
  • auto_find_batch_size: False
  • full_determinism: False
  • torchdynamo: None
  • ray_scope: last
  • ddp_timeout: 1800
  • torch_compile: False
  • torch_compile_backend: None
  • torch_compile_mode: None
  • include_tokens_per_second: False
  • include_num_input_tokens_seen: no
  • neftune_noise_alpha: None
  • optim_target_modules: None
  • batch_eval_metrics: False
  • eval_on_start: False
  • use_liger_kernel: False
  • liger_kernel_config: None
  • eval_use_gather_object: False
  • average_tokens_across_devices: True
  • prompts: None
  • batch_sampler: batch_sampler
  • multi_dataset_batch_sampler: round_robin
  • router_mapping: {}
  • learning_rate_mapping: {}
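
To make the schedule settings concrete, the sketch below evaluates a linear decay with `learning_rate` 5e-05 and `warmup_steps` 0, following the usual linear-with-warmup formula. The total of 2175 optimizer steps is an assumption: it is derived from 5,800 samples, a batch size of 8, and 3 epochs on a single device, none of which the card states as a step count. The function name `linear_lr` is hypothetical.

```python
def linear_lr(step, base_lr=5e-05, warmup_steps=0, total_steps=2175):
    """Linear warmup (if any) followed by linear decay to zero."""
    if step < warmup_steps:
        return base_lr * step / max(1, warmup_steps)
    remaining = max(0, total_steps - step)
    return base_lr * remaining / max(1, total_steps - warmup_steps)

# Learning rate at the logged steps and at the assumed final step.
for step in (0, 500, 1000, 1500, 2000, 2175):
    print(step, f"{linear_lr(step):.3e}")
```

Under these assumptions the rate starts at 5e-05 and reaches zero at the last step, with no warmup phase.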

Training Logs

Epoch Step Training Loss
0.6897 500 0.1473
1.3793 1000 0.0467
2.0690 1500 0.032
2.7586 2000 0.0178
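
The fractional epoch values in the log can be checked against the dataset size and batch size reported earlier. The short calculation below assumes training ran on a single device, so that one optimizer step consumes exactly `per_device_train_batch_size` samples (`gradient_accumulation_steps` is 1 in the hyperparameter list).

```python
import math

train_samples = 5800   # from the Training Dataset section
batch_size = 8         # per_device_train_batch_size
epochs = 3             # num_train_epochs

steps_per_epoch = math.ceil(train_samples / batch_size)
total_steps = steps_per_epoch * epochs
print(steps_per_epoch, total_steps)  # 725 2175

# Convert each logged step into a fractional epoch.
for step in (500, 1000, 1500, 2000):
    print(step, round(step / steps_per_epoch, 4))
```

The results (0.6897, 1.3793, 2.069, 2.7586) agree with the Epoch column, which is consistent with the single-device assumption.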

Framework Versions

  • Python: 3.9.21
  • Sentence Transformers: 5.1.2
  • Transformers: 4.57.6
  • PyTorch: 2.8.0+cu128
  • Accelerate: 1.10.1
  • Datasets: 4.5.0
  • Tokenizers: 0.22.2

Citation

BibTeX

Sentence Transformers

@inproceedings{reimers-2019-sentence-bert,
    title = "Sentence-BERT: Sentence Embeddings using Siamese BERT-Networks",
    author = "Reimers, Nils and Gurevych, Iryna",
    booktitle = "Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing",
    month = "11",
    year = "2019",
    publisher = "Association for Computational Linguistics",
    url = "https://arxiv.org/abs/1908.10084",
}

MultipleNegativesRankingLoss

@misc{henderson2017efficient,
    title={Efficient Natural Language Response Suggestion for Smart Reply},
    author={Matthew Henderson and Rami Al-Rfou and Brian Strope and Yun-hsuan Sung and Laszlo Lukacs and Ruiqi Guo and Sanjiv Kumar and Balint Miklos and Ray Kurzweil},
    year={2017},
    eprint={1705.00652},
    archivePrefix={arXiv},
    primaryClass={cs.CL}
}

Model size: 1B params (Safetensors, tensor type F32)