SentenceTransformer based on hkunlp/instructor-xl

This is a sentence-transformers model finetuned from hkunlp/instructor-xl. It maps sentences & paragraphs to a 768-dimensional dense vector space and can be used for semantic textual similarity, semantic search, paraphrase mining, text classification, clustering, and more.

Model Details

Model Description

  • Model Type: Sentence Transformer
  • Base model: hkunlp/instructor-xl
  • Maximum Sequence Length: 128 tokens
  • Output Dimensionality: 768 dimensions
  • Similarity Function: Cosine Similarity
  • Model Size: ~1B parameters (F32, safetensors)

Full Model Architecture

SentenceTransformer(
  (0): Transformer({'max_seq_length': 128, 'do_lower_case': False, 'architecture': 'T5EncoderModel'})
  (1): Pooling({'word_embedding_dimension': 768, 'pooling_mode_cls_token': False, 'pooling_mode_mean_tokens': True, 'pooling_mode_max_tokens': False, 'pooling_mode_mean_sqrt_len_tokens': False, 'pooling_mode_weightedmean_tokens': False, 'pooling_mode_lasttoken': False, 'include_prompt': False})
  (2): Dense({'in_features': 1024, 'out_features': 768, 'bias': False, 'activation_function': 'torch.nn.modules.linear.Identity'})
  (3): Normalize()
)
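Conceptually, the three modules after the T5 encoder apply attention-masked mean pooling, a learned 1024→768 projection, and L2 normalization. A minimal NumPy sketch of that post-encoder pipeline, using random stand-ins for the encoder output and the learned Dense weights (shapes follow the printout above):

```python
import numpy as np

# Sketch of modules (1)-(3) above; tokens and W are random stand-ins
# for the T5 encoder output and the learned Dense weights.
def mean_pool(token_embeddings, attention_mask):
    # Average only over non-padding tokens (pooling_mode_mean_tokens).
    mask = attention_mask[..., None].astype(float)          # (batch, seq, 1)
    return (token_embeddings * mask).sum(axis=1) / np.clip(
        mask.sum(axis=1), 1e-9, None
    )

rng = np.random.default_rng(0)
tokens = rng.normal(size=(2, 128, 1024))   # fake encoder output
mask = np.ones((2, 128))                   # pretend no padding
W = rng.normal(size=(1024, 768)) * 0.01    # stand-in for Dense (bias=False)

pooled = mean_pool(tokens, mask)           # (1) Pooling   -> (2, 1024)
projected = pooled @ W                     # (2) Dense     -> (2, 768)
embeddings = projected / np.linalg.norm(projected, axis=1, keepdims=True)  # (3) Normalize
print(embeddings.shape)                    # (2, 768)
```

Because of the final Normalize() step, every embedding the model emits has unit length, which is why cosine similarity can be computed as a plain dot product downstream.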

Usage

Direct Usage (Sentence Transformers)

First install the Sentence Transformers library:

pip install -U sentence-transformers

Then you can load this model and run inference.

from sentence_transformers import SentenceTransformer

# Download from the 🤗 Hub
model = SentenceTransformer("ahmedHamdi/narrative-similarity-fr-en-masked-instructor-xl")
# Run inference
sentences = [
    "Represent the plot: In GPE, a mysterious organization, plotting to establish fascism in LOC, takes the young son of General GPE's family hostage. The organization finds a formidable adversary in the American PERSON, GPE's son's personal trainer. From GPE to GPE, PERSON manages to thwart the organization's plans by obtaining a compromising list of its members, threatening to release it to the media if they do not free their hostage.",
    "Represent the plot: PERSON is an American writer who has recently retired from boxing. Now unemployed and broke in GPE, he encounters the wealthy widow of a French general. PERSON is attracted to GPE, and he to her, but she keeps him at arm's length. She also hires him to tutor her eight-year-old son PERSON. The real reason she wants PERSON is for protection. PERSON is led to believe that PERSON's husband was killed in the Algerian conflict, and he is troubled by PERSON's intense fear that PERSON will be kidnapped. He then discovers the family has ties to a fascist organization that plans to take over all of LOC. He takes on the shady psychiatrist ORG and mysterious family friend PERSON, both of whom frighten Anne whenever they are around. Reno is framed for his best friend's murder as he and PERSON become the targets of the ambitious and maniacal schemers who wish to rule LOC. Reno and PERSON are hunted around GPE while protecting PERSON from being abducted. The chase ends at the GPE in GPE, where PERSON and the villains engage in a showdown.",
    "Represent the plot: Princess PERSON, nicknamed PERSON, is the second oldest daughter of PERSON in GPE and Princess Ludovika of GPE. She is a carefree, impulsive and nature-loving child. She is raised with her seven siblings at the family seat PERSON on the shores of LOC in GPE. She has a happy childhood free of constraints associated with her royal status. With her mother and her demure older sister PERSON (called Néné), 16-year-old PERSON travels from PERSON to the spa town of ORG in GPE. PERSON's sister, PERSON, is the mother of the young emperor PERSON I of GPE. Helene is called by ORG to meet the young emperor PERSON in the imperial villa so that the two might be immediately engaged. PERSON is unaware of the real reason for the journey and is forbidden by her aunt to participate in any social events due to her girlishly impetuous ways. PERSON spends her time fishing in the forest where by chance she meets PERSON. The emperor is unaware that the girl is his cousin PERSON. He takes a liking to her and invites her for an afternoon hunting trip in the LOC. They meet as arranged in the mountains where they talk and become acquainted. PERSON falls in love with him but does not reveal her true identity. During their trip, PERSON learns of the planned marriage between PERSON with her sister. The Emperor confesses that he envies the man who will marry PERSON and confesses that he feels no connection to GPE. Upon hearing his indirect declaration of love, PERSON becomes distraught due to her loyalty to GPE. She runs away from PERSON without any explanation. When PERSON returns to their residence, Néné reveals the reason for the trip to GPE: to become engaged with PERSON. Unexpectedly, a new guest, the Prince of Lippe, arrives and PERSON is invited by the Archduchess to act as his partner at the Emperor's birthday celebration. At his birthday party, PERSON is suddenly confronted by PERSON's appearance there with her mother and sister. He realises who PERSON is and tries to talk to her, openly confessing his love and asking her to marry him. ORG rejects PERSON in order not to betray her sister. He defies his mother's reservations and ORG's resistance and announces, to the surprise of his guests, his betrothal to PERSON. Néné is heartbroken and leaves the party crying. PERSON, in a state of shock, is forced to obey the Emperor's wishes. In PERSON, preparations for the wedding have started. PERSON is not excited about her impending marriage, as the hurt GPE has left for an indefinite period. For her sister's sake, PERSON attempts to break her engagement, however, Néné returns with a new suitor, PERSON, PERSON of Thurn and Taxis. The sisters reunite and Néné gives her blessing to PERSON for her marriage. For the wedding ceremony, PERSON travels with her family on the steamboat PERSON down the PERSON to GPE. People line the banks, waving flags and cheering their future ORG. As part of a grand procession, PERSON enters the city in a gilded carriage. The wedding takes place in ORG on April 24, 1854.",
]
embeddings = model.encode(sentences)
print(embeddings.shape)
# (3, 768)

# Get the similarity scores for the embeddings
similarities = model.similarity(embeddings, embeddings)
print(similarities)
# tensor([[ 1.0000,  0.4458,  0.0298],
#         [ 0.4458,  1.0000, -0.0352],
#         [ 0.0298, -0.0352,  1.0000]])
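Since the model's final Normalize() module L2-normalizes its outputs, the cosine similarity that model.similarity computes reduces to a matrix product. A small illustration with toy 2-D vectors standing in for real embeddings (the query/corpus split mirrors a semantic-search setup):

```python
import numpy as np

# Toy vectors standing in for model embeddings; after L2 normalization,
# cosine similarity is just a @ b.T.
def cos_sim(a, b):
    a = a / np.linalg.norm(a, axis=1, keepdims=True)
    b = b / np.linalg.norm(b, axis=1, keepdims=True)
    return a @ b.T

corpus = np.array([[1.0, 0.0], [0.8, 0.6], [0.0, 1.0]])
query = np.array([[1.0, 0.1]])
scores = cos_sim(query, corpus)   # shape (1, 3), one score per corpus entry
best = int(scores.argmax())       # index of the most similar corpus entry
print(best)  # 0
```

The same pattern scales to semantic search over thousands of plot summaries: encode the corpus once, encode each query, and rank by the dot product of the normalized vectors.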

Training Details

Training Dataset

Unnamed Dataset

  • Size: 17,624 training samples
  • Columns: sentence_0 and sentence_1
  • Approximate statistics based on the first 1000 samples:
    • sentence_0 (string): min 12 tokens, mean 99.0 tokens, max 128 tokens
    • sentence_1 (string): min 7 tokens, mean 122.21 tokens, max 128 tokens
  • Samples:
    • sentence_0: Represent the plot: PERSON (PERSON), a bookseller, and PERSON (PERSON), a florist, are two old friends living in GPE. Since they're having serious money problems, PERSON comes up with an idea: he decides to become a pimp... but PERSON's pimp! PERSON, after initially finding the idea completely stupid and refusing, is eventually persuaded...
      sentence_1: Represent the plot: Dr. PERSON (PERSON), a wealthy dermatologist, mentions to her patient PERSON (PERSON) that she and a woman friend, ORG), wish to experience a ménage à trois and asks if he knows a willing man. PERSON, whose used bookstore has failed as a business, convinces his friend and former employee PERSON (PERSON) to take the gig, as both are short of money. Soon, they build a thriving gigolo trade with PERSON as the pimp. PERSON lives with PERSON (PERSON) and her children, one of whom gets head lice. PERSON takes the boy to PERSON (PERSON), the attractive widow of a Hassidic rabbi, for treatment. PERSON tells PERSON, claiming he's a massage healer who can help her, before taking her to see him. Being too religiously observant to even shake hands with him, she nonetheless allows PERSON to massage her back. That touch, the first in two years since the passing of her husband, brings her to tears. Meanwhile, PERSON (Liev Schreiber), who works for ORG, a GPE, GPE neighborhood patr...
    • sentence_0: Represent the plot: After a suicide bombing in the GPE, ORG is tasked with preventing a nuclear attack on American soil. This attack is orchestrated by Arab terrorists, who are prepared to do anything to liberate LOC from Judeo-Western domination. ORG embarks on a new adventure in GPE, GPE, and GPE to save GPE and the world, and to capture PERSON, the leader of the Lebanese-Palestinian terrorists. Before being captured, Kadal entrusts his brother with a mission: to plant an atomic bomb in GPE. ORG then travels to GPE to destroy the fortress
      sentence_1: Represent the plot: Terrorist leader PERSON has threatened to bomb GPE, GPE unless western influence is removed from LOC. In order to keep PERSON from detonating an atomic bomb in GPE, ORG, with new leaders PERSON and PERSON, are ordered to team up with a group of Russian Spetsnaz commandos and head to ORG, PERSON's hometown in the fictional country of GPE, so they can hunt for PERSON. PERSON sends PERSON to GPE, with orders to detonate the bomb live on TV, where PERSON arranges for an assault on TV news producer PERSON. Turning up in time to save her, PERSON chases off the attackers and uses the situation to become close to PERSON. With friction between PERSON and PERSON growing, the mission doesn't start well, losing a Soviet team member almost immediately. Eventually they locate and apprehend PERSON, but at a cost, PERSON and PERSON are killed and PERSON is wounded. They learn PERSON has already sent his suicide bomber, PERSON and GPE interrogate PERSON and learn of the bombers iden...
    • sentence_0: Represent the plot: Police Commissioner PERSON (PERSON) investigates the suspicious deaths of three high-ranking military officers who apparently committed suicide. Among them is General PERSON. Under pressure from his superiors, investigating magistrate PERSON (PERSON) orders ORG to quickly conclude the investigation. ORG then receives help from his girlfriend, journalist PERSON (PERSON), and an agent of the Italian secret service, PERSON (Tomás Milián), who also has doubts about the official version of the officers' suicides. In a country villa, GPE, a wealthy man, is murdered. Suspicion falls on an escort known as "La Tunisina," who was the last person to see ORG. Located and questioned by ORG, she proclaims her innocence. She claims she escaped because she saw the killer's face. The man will be identified as PERSON, nicknamed GPE. Unfortunately, the woman is later found and strangled by GPE. Commissioner PERSON orders his colleagues (Deputy Commissioner PERSON and Marshal De Luca) ...
      sentence_1: Represent the plot: Three army officials are murdered but made to look as if they committed suicide. After wealthy master electrician ORG is found dead, three policemen—Inspector PERSON, Lt. PERSON and Office De Luca—investigate. PERSON and PERSON visit retired madam PERSON, who has connections to ORG, and when they discover she's running a brothel out of her estate, they offer her immunity for information. They go to interrogate ORG's last visitor, ORG—AKA la GPE—only to find she's attempted suicide. At the hospital, they tell her she's the prime suspect and she admits guilt. ORG grows suspicious of her admission and suspects a blackmail plot, and she eventually admits she saw the true killer. PERSON and PERSON keep watch over ORG's villa overnight and witness a man break in and attempt to steal tape recordings. The man, PERSON, is taken into custody and ORG listens to the recordings, which contain a conversation between the murdered General PERSON and a lawyer named GPE, in which ...
  • Loss: MultipleNegativesRankingLoss with these parameters:
    {
        "scale": 20.0,
        "similarity_fct": "cos_sim",
        "gather_across_devices": false
    }
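MultipleNegativesRankingLoss uses the other positives in a batch as negatives: the scaled cosine similarities between anchors and positives become logits for a cross-entropy whose target for anchor i is positive i. A hedged NumPy sketch of that objective (toy random vectors; scale=20.0 matches the parameter above):

```python
import numpy as np

def mnr_loss(anchors, positives, scale=20.0):
    # Cosine similarity matrix: entry (i, j) compares anchor i with positive j.
    a = anchors / np.linalg.norm(anchors, axis=1, keepdims=True)
    p = positives / np.linalg.norm(positives, axis=1, keepdims=True)
    logits = scale * (a @ p.T)
    # Cross-entropy with labels [0..n-1]: the true pair sits on the diagonal,
    # all other in-batch positives act as negatives.
    z = logits - logits.max(axis=1, keepdims=True)   # numerical stability
    log_probs = z - np.log(np.exp(z).sum(axis=1, keepdims=True))
    n = len(anchors)
    return -log_probs[np.arange(n), np.arange(n)].mean()

rng = np.random.default_rng(0)
anchors = rng.normal(size=(4, 8))
low = mnr_loss(anchors, anchors)                    # perfectly matched pairs
high = mnr_loss(anchors, rng.normal(size=(4, 8)))   # unrelated "positives"
print(low < high)  # matched pairs should give a much smaller loss
```

This is why larger batch sizes tend to help with this loss: each extra in-batch pair contributes one more negative per anchor.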
    

Training Hyperparameters

Non-Default Hyperparameters

  • multi_dataset_batch_sampler: round_robin

All Hyperparameters

  • overwrite_output_dir: False
  • do_predict: False
  • eval_strategy: no
  • prediction_loss_only: True
  • per_device_train_batch_size: 8
  • per_device_eval_batch_size: 8
  • per_gpu_train_batch_size: None
  • per_gpu_eval_batch_size: None
  • gradient_accumulation_steps: 1
  • eval_accumulation_steps: None
  • torch_empty_cache_steps: None
  • learning_rate: 5e-05
  • weight_decay: 0.0
  • adam_beta1: 0.9
  • adam_beta2: 0.999
  • adam_epsilon: 1e-08
  • max_grad_norm: 1
  • num_train_epochs: 3
  • max_steps: -1
  • lr_scheduler_type: linear
  • lr_scheduler_kwargs: None
  • warmup_ratio: 0.0
  • warmup_steps: 0
  • log_level: passive
  • log_level_replica: warning
  • log_on_each_node: True
  • logging_nan_inf_filter: True
  • save_safetensors: True
  • save_on_each_node: False
  • save_only_model: False
  • restore_callback_states_from_checkpoint: False
  • no_cuda: False
  • use_cpu: False
  • use_mps_device: False
  • seed: 42
  • data_seed: None
  • jit_mode_eval: False
  • bf16: False
  • fp16: False
  • fp16_opt_level: O1
  • half_precision_backend: auto
  • bf16_full_eval: False
  • fp16_full_eval: False
  • tf32: None
  • local_rank: 0
  • ddp_backend: None
  • tpu_num_cores: None
  • tpu_metrics_debug: False
  • debug: []
  • dataloader_drop_last: False
  • dataloader_num_workers: 0
  • dataloader_prefetch_factor: None
  • past_index: -1
  • disable_tqdm: False
  • remove_unused_columns: True
  • label_names: None
  • load_best_model_at_end: False
  • ignore_data_skip: False
  • fsdp: []
  • fsdp_min_num_params: 0
  • fsdp_config: {'min_num_params': 0, 'xla': False, 'xla_fsdp_v2': False, 'xla_fsdp_grad_ckpt': False}
  • fsdp_transformer_layer_cls_to_wrap: None
  • accelerator_config: {'split_batches': False, 'dispatch_batches': None, 'even_batches': True, 'use_seedable_sampler': True, 'non_blocking': False, 'gradient_accumulation_kwargs': None}
  • parallelism_config: None
  • deepspeed: None
  • label_smoothing_factor: 0.0
  • optim: adamw_torch_fused
  • optim_args: None
  • adafactor: False
  • group_by_length: False
  • length_column_name: length
  • project: huggingface
  • trackio_space_id: trackio
  • ddp_find_unused_parameters: None
  • ddp_bucket_cap_mb: None
  • ddp_broadcast_buffers: False
  • dataloader_pin_memory: True
  • dataloader_persistent_workers: False
  • skip_memory_metrics: True
  • use_legacy_prediction_loop: False
  • push_to_hub: False
  • resume_from_checkpoint: None
  • hub_model_id: None
  • hub_strategy: every_save
  • hub_private_repo: None
  • hub_always_push: False
  • hub_revision: None
  • gradient_checkpointing: False
  • gradient_checkpointing_kwargs: None
  • include_inputs_for_metrics: False
  • include_for_metrics: []
  • eval_do_concat_batches: True
  • fp16_backend: auto
  • push_to_hub_model_id: None
  • push_to_hub_organization: None
  • mp_parameters:
  • auto_find_batch_size: False
  • full_determinism: False
  • torchdynamo: None
  • ray_scope: last
  • ddp_timeout: 1800
  • torch_compile: False
  • torch_compile_backend: None
  • torch_compile_mode: None
  • include_tokens_per_second: False
  • include_num_input_tokens_seen: no
  • neftune_noise_alpha: None
  • optim_target_modules: None
  • batch_eval_metrics: False
  • eval_on_start: False
  • use_liger_kernel: False
  • liger_kernel_config: None
  • eval_use_gather_object: False
  • average_tokens_across_devices: True
  • prompts: None
  • batch_sampler: batch_sampler
  • multi_dataset_batch_sampler: round_robin
  • router_mapping: {}
  • learning_rate_mapping: {}
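The non-default settings above (batch size 8, learning rate 5e-5, 3 epochs, round-robin multi-dataset sampling, seed 42) could be reproduced with the Sentence Transformers trainer roughly as follows. This is a sketch only: output_dir and train_dataset are placeholders, not values taken from this card.

```python
from sentence_transformers import SentenceTransformer, SentenceTransformerTrainer
from sentence_transformers.losses import MultipleNegativesRankingLoss
from sentence_transformers.training_args import SentenceTransformerTrainingArguments

# Values mirror the hyperparameter list above; everything else stays default.
args = SentenceTransformerTrainingArguments(
    output_dir="output",                       # placeholder path
    num_train_epochs=3,
    per_device_train_batch_size=8,
    learning_rate=5e-5,
    multi_dataset_batch_sampler="round_robin",
    seed=42,
)

# Sketch only -- dataset loading is omitted here:
# model = SentenceTransformer("hkunlp/instructor-xl")
# loss = MultipleNegativesRankingLoss(model, scale=20.0)
# trainer = SentenceTransformerTrainer(
#     model=model, args=args, train_dataset=train_dataset, loss=loss
# )
# trainer.train()
```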

Training Logs

Epoch    Step    Training Loss
0.2270    500           0.1981
0.4539   1000           0.0840
0.6809   1500           0.0751
0.9079   2000           0.0643
1.1348   2500           0.0480
1.3618   3000           0.0389
1.5887   3500           0.0336
1.8157   4000           0.0366
2.0427   4500           0.0302
2.2696   5000           0.0131
2.4966   5500           0.0157
2.7236   6000           0.0145
2.9505   6500           0.0156

Framework Versions

  • Python: 3.9.18
  • Sentence Transformers: 5.1.2
  • Transformers: 4.57.6
  • PyTorch: 2.8.0+cu128
  • Accelerate: 1.10.1
  • Datasets: 4.5.0
  • Tokenizers: 0.22.2

Citation

BibTeX

Sentence Transformers

@inproceedings{reimers-2019-sentence-bert,
    title = "Sentence-BERT: Sentence Embeddings using Siamese BERT-Networks",
    author = "Reimers, Nils and Gurevych, Iryna",
    booktitle = "Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing",
    month = "11",
    year = "2019",
    publisher = "Association for Computational Linguistics",
    url = "https://arxiv.org/abs/1908.10084",
}

MultipleNegativesRankingLoss

@misc{henderson2017efficient,
    title={Efficient Natural Language Response Suggestion for Smart Reply},
    author={Matthew Henderson and Rami Al-Rfou and Brian Strope and Yun-hsuan Sung and Laszlo Lukacs and Ruiqi Guo and Sanjiv Kumar and Balint Miklos and Ray Kurzweil},
    year={2017},
    eprint={1705.00652},
    archivePrefix={arXiv},
    primaryClass={cs.CL}
}