SentenceTransformer based on PaDaS-Lab/xlm-roberta-base-msmarco

This is a sentence-transformers model fine-tuned from PaDaS-Lab/xlm-roberta-base-msmarco. It maps sentences and paragraphs to a 768-dimensional dense vector space and can be used for semantic textual similarity, semantic search, paraphrase mining, text classification, clustering, and more.

Model Details

Model Description

  • Model Type: Sentence Transformer
  • Base model: PaDaS-Lab/xlm-roberta-base-msmarco
  • Maximum Sequence Length: 512 tokens
  • Output Dimensionality: 768 dimensions
  • Similarity Function: Cosine Similarity

Full Model Architecture

SentenceTransformer(
  (0): Transformer({'max_seq_length': 512, 'do_lower_case': False, 'architecture': 'XLMRobertaModel'})
  (1): Pooling({'word_embedding_dimension': 768, 'pooling_mode_cls_token': False, 'pooling_mode_mean_tokens': True, 'pooling_mode_max_tokens': False, 'pooling_mode_mean_sqrt_len_tokens': False, 'pooling_mode_weightedmean_tokens': False, 'pooling_mode_lasttoken': False, 'include_prompt': True})
)
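
The Pooling module above has pooling_mode_mean_tokens set to True, so a sentence embedding is the average of its token embeddings. The following is a minimal pure-Python sketch of that idea, assuming padding positions are excluded through an attention mask. The function name mean_pool and the toy 3-dimensional vectors are illustrative only; the real model works on 768-dimensional vectors.

```python
def mean_pool(token_embeddings, attention_mask):
    """Average the token vectors whose mask entry is 1 (padding is skipped)."""
    dim = len(token_embeddings[0])
    total = [0.0] * dim
    count = 0
    for vector, keep in zip(token_embeddings, attention_mask):
        if keep:
            count += 1
            for i, value in enumerate(vector):
                total[i] += value
    if count == 0:
        raise ValueError("attention_mask selects no tokens")
    return [value / count for value in total]


# Two real tokens followed by one padding token (hypothetical values).
tokens = [[1.0, 2.0, 3.0], [3.0, 4.0, 5.0], [9.0, 9.0, 9.0]]
mask = [1, 1, 0]
pooled = mean_pool(tokens, mask)
print(pooled)  # [2.0, 3.0, 4.0]
```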

Usage

Direct Usage (Sentence Transformers)

First install the Sentence Transformers library:

pip install -U sentence-transformers

Then you can load this model and run inference.

from sentence_transformers import SentenceTransformer

# Download from the 🤗 Hub
model = SentenceTransformer("IrvinTopi/mnrl-hardnegatives-denoised14")
# Run inference
sentences = [
    'Quando rivolgersi all’ortopedico?',
    'È opportuno avvalersi di un consulto specialistico da parte di un medico ortopedico nel caso in cui il paziente abbia subito lesioni traumatiche come quelle sopra menzionate oppure manifesti sintomi quali dolore locale e difficoltà motorie e articolatorie a carico degli arti o delle strutture muscolari.',
    'In caso di epicondilite si raccomanda di rivolgersi ad uno specialista (ad esempio un ortopedico); il punto cardine del trattamento del gomito del tennista è in ogni caso la sospensione del movimento che causa dolore; impacchi di ghiaccio ed antinfiammatori possono ridurre il fastidio e il medico specialista può inoltre suggerire il ricorso a tutori specifici che allevino la tensione sul tendine. Anche un approccio fisioterapico può sicuramente aiutare, ma se queste terapie (od altre più avanzate come ) non dovessero funzionare è possibile valutare come ultima risorsa una gestione chirurgica (non prima di 6-12 mesi di trattamento tradizionale).',
]
embeddings = model.encode(sentences)
print(embeddings.shape)
# (3, 768)

# Get the similarity scores for the embeddings
similarities = model.similarity(embeddings, embeddings)
print(similarities)
# tensor([[1.0000, 0.6583, 0.3812],
#         [0.6583, 1.0000, 0.5606],
#         [0.3812, 0.5606, 1.0000]])
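
In the example above, the first sentence is a question and the other two are candidate passages, so the first row of the similarity matrix can be read as retrieval scores. The sketch below shows one way to rank passages by cosine similarity in plain Python. The helper names and the toy 2-dimensional vectors are assumptions for illustration; in practice the vectors would come from model.encode.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def rank_passages(query_vec, passage_vecs):
    """Return (passage_index, score) pairs, best match first."""
    scores = [cosine_similarity(query_vec, p) for p in passage_vecs]
    return sorted(enumerate(scores), key=lambda pair: pair[1], reverse=True)


# Hypothetical embeddings standing in for one query and three passages.
query = [1.0, 0.0]
passages = [[0.0, 1.0], [1.0, 1.0], [2.0, 0.0]]
ranking = rank_passages(query, passages)
print([index for index, _ in ranking])  # [2, 1, 0]
```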

Training Details

Training Dataset

Unnamed Dataset

  • Size: 1,275,279 training samples
  • Columns: sentence_0, sentence_1, sentence_2, sentence_3, sentence_4, and sentence_5
  • Approximate statistics based on the first 1000 samples:
    column      type    min        mean           max
    sentence_0  string  5 tokens   15.57 tokens   117 tokens
    sentence_1  string  11 tokens  72.16 tokens   512 tokens
    sentence_2  string  11 tokens  98.51 tokens   512 tokens
    sentence_3  string  11 tokens  104.17 tokens  512 tokens
    sentence_4  string  12 tokens  99.14 tokens   512 tokens
    sentence_5  string  10 tokens  98.24 tokens   512 tokens
  • Samples:
    Sample 1
    sentence_0: What kind of careers could I pursue?
    sentence_1 to sentence_5: The sky is pretty much the limit! Earning a degree from AUM helps provide a strong foundation for a career in business, healthcare, science, fine arts, nursing and a lot more. Many of our 32,000 graduates are nurses, teachers, theatre professionals, historians, business executives, economists, IT managers, kinesiologists–in addition to many other job titles. If you’re already passionate about a specific career direction, great! We can help you determine the College and the major that will help you get there. However, if you do not know what career field is right for you, we can help with valuable resources like the Career Development Center. Our career development specialists will help you assess your natural skills and interests, explore your academic options, and assist you in developing an implementation plan. This way, on graduation day you’ll be ready to hit the job market and jumpstart your career! There is a greater demand for CFP certificate holders in the BFSI domain and they would have an upper hand over others. A CFP professional could look at working with:     Banks     Wealth managements firms and distribution houses     Mutual fund and insurance companies     Boutique financial planning firms     Financial planning software firms The certification would open up the doors of entrepreneurship. Aspiring individuals could look at starting on her own and pursue a career as a practicing financial planner. Primarily, Developmental Disability Nurses work in any role that brings them in contact with patients. This could be in a hospital, a clinic, in a group home, in the community, or in any other institutional environment. Some DD nurses also pursue careers in teaching, medical administration, and some policy work. After graduating, you could have the opportunity to play volleyball professionally, either domestically or internationally. You could also pursue coaching or related athletic professions.
    Additionally, the degree you earn will open doors to careers in your chosen field of study. A bachelor’s degree in interior design opens many doors. It will open the doors to a wide variety of opportunities. In addition to working in the field of residential design, you can pursue a career in commercial interior design, facility management, set design, showroom design, and sales. No matter what your interests are, you’ll find a career that suits you and your interests. Here are some of the different kinds of jobs you can get with an interiors degree. In addition to being a talented designer, you’ll be in high demand. A large number of businesses and restaurants seek a fresh look that will attract new customers. Individuals hire interior designers to help them remodel or flip existing spaces, or to create their dream home. The field of interior design is very diverse, and you’ll need a thorough knowledge of color theory, textiles, and other design elements to be successful.

    Sample 2
    sentence_0: ¿Cuáles son las actividades que se realizan en una excursión?
    sentence_1 to sentence_5: Puedes saber qué incluye una excursión haciendo clic en una actividad. Entonces verás una descripción completa de esta bajo el título ‘Experiencia’. En especial, encontrarás información sobre lo que está incluido y no incluido en el precio de la excursión. Por ejemplo, transporte, refrescos o guía.
    La duración de la actividad aparece en la sección ‘Sobre esta experiencia’. Para conocer las distintas horas de salida antes de reservar, tendrás que introducir la siguiente información:
    Número de participantes
    Fecha
    Idioma
    Tras la reserva, encontrarás la duración y la hora de salida en el bono de confirmación.
    Excursiones y actividades cercanas: Punta Cana le ofrece innumerables planes para combinar el ambiente relajante de los hoteles todo incluido con unas vacaciones más activas que consisten en excursiones, actividades y entretenidas atracciones. Hay varios parques naturales para los amantes de la naturaleza como Los Haitises (cerca de Samaná), e islas semidesiertas como la Isla Saona y la Isla Catalina (cerca de La Romana). Además, si te apetece descubrir la mezcla de la cultura nativa, te encantarán los Altos de Chavón en La Romana y, por supuesto, la ciudad de Santo Domingo. Sí, ofrecemos una variedad de actividades extracurriculares y oportunidades de inmersión cultural diseñadas específicamente para personas mayores. Estas actividades pueden incluir excursiones a museos, eventos culturales, talleres o clases de tango, lo que brinda a los estudiantes la oportunidad de sumergirse en la cultura argentina mientras mejoran sus habilidades lingüísticas. Las actividades que se realizan en Auckland son excursiones, visitar las playas, museos, actividad deportiva, pasear por los parques, entre otras, todas llenas de mucha diversión y facilitan la práctica del idioma inglés. Las excursiones incluye el transporte de ida y vuelta y la recogida en tu hotel de Mallorca. Dependiendo del proveedor algunas solo pasan por determinadas zonas de a isla a la hora de recoger a la personas. Te recomendamos que antes hacer la reserva te informes bien de esto.
    Además del transporte las excursiones incluyen también la entrada a las cuevas con paseo en baro por el Lago Martel y otras actividades por Mallorca como un tour por Porto Cristo, un tour a la fábrica de perlas Majorica o la entrada a las cuevas dels Hams.

    Sample 3
    sentence_0: Why is chicken used for coward?
    sentence_1 to sentence_5: The belief is that hens were used in this way because they were characterized as timid while roosters were portrayed as brave. Powerful leaders and other important men were sometimes referred to as “cocks” (in a good way) in the mid-16th century and hens were compared to them as weak. If someone calls you a chicken, they mean that you are a coward or afraid to do something. Definitely! You can to use chicken thighs, chicken breasts as well ss chicken drumsticks here. Just keep a note that if you are using chicken breasts, do not use small cut pieces, keep large chunks and cook. You can opt to use boneless chicken in this recipe. However, chicken on bone gives the best flavor, so I would recommend that. Sure, why not! It won't be the same, but it will be delicious. Shred up the cooked chicken and toss it with the marinade. Let it sit for 15 minutes before lightly sautéeing it just to warm through. Chickens are not considered prey of rats so they are safe from predation. However, rats will be attracted to chicken feed. That is why it is so important to keep the area clean, not only of poop but excess food on the ground.
  • Loss: CachedMultipleNegativesRankingLoss with these parameters:
    {
        "scale": 20.0,
        "similarity_fct": "cos_sim",
        "mini_batch_size": 32,
        "gather_across_devices": false
    }
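
The loss parameters above (scale 20.0, cos_sim) describe an in-batch ranking objective: each anchor should score its own positive higher than every other candidate in the batch. The sketch below is a simplified pure-Python illustration of that objective, not the library implementation. It assumes sentence_0 is the anchor, sentence_1 the positive, and the remaining columns hard negatives, which the card does not state explicitly. The cached variant's mini_batch_size affects memory use only and is not modelled here. All names and vectors are hypothetical.

```python
import math


def cos_sim(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b)))


def mnr_loss(anchors, candidates, scale=20.0):
    """Mean cross-entropy where candidates[i] is the positive for anchors[i].

    Every other candidate (other positives and any hard negatives appended
    after them) serves as a negative for that anchor.
    """
    losses = []
    for i, anchor in enumerate(anchors):
        logits = [scale * cos_sim(anchor, c) for c in candidates]
        peak = max(logits)  # subtract the max for numerical stability
        log_sum_exp = peak + math.log(sum(math.exp(l - peak) for l in logits))
        losses.append(log_sum_exp - logits[i])
    return sum(losses) / len(losses)


anchors = [[1.0, 0.0], [0.0, 1.0]]
positives = [[1.0, 0.1], [0.1, 1.0]]
hard_negatives = [[0.7, 0.7]]

good = mnr_loss(anchors, positives + hard_negatives)        # positives aligned
bad = mnr_loss(anchors, positives[::-1] + hard_negatives)   # positives swapped
print(good < bad)  # True
```

A lower loss for the aligned pairing than for the swapped one is the behaviour the objective rewards during training.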
    

Training Hyperparameters

Non-Default Hyperparameters

  • per_device_train_batch_size: 128
  • per_device_eval_batch_size: 128
  • num_train_epochs: 1
  • fp16: True
  • multi_dataset_batch_sampler: round_robin

All Hyperparameters

  • overwrite_output_dir: False
  • do_predict: False
  • eval_strategy: no
  • prediction_loss_only: True
  • per_device_train_batch_size: 128
  • per_device_eval_batch_size: 128
  • per_gpu_train_batch_size: None
  • per_gpu_eval_batch_size: None
  • gradient_accumulation_steps: 1
  • eval_accumulation_steps: None
  • torch_empty_cache_steps: None
  • learning_rate: 5e-05
  • weight_decay: 0.0
  • adam_beta1: 0.9
  • adam_beta2: 0.999
  • adam_epsilon: 1e-08
  • max_grad_norm: 1
  • num_train_epochs: 1
  • max_steps: -1
  • lr_scheduler_type: linear
  • lr_scheduler_kwargs: {}
  • warmup_ratio: 0.0
  • warmup_steps: 0
  • log_level: passive
  • log_level_replica: warning
  • log_on_each_node: True
  • logging_nan_inf_filter: True
  • save_safetensors: True
  • save_on_each_node: False
  • save_only_model: False
  • restore_callback_states_from_checkpoint: False
  • no_cuda: False
  • use_cpu: False
  • use_mps_device: False
  • seed: 42
  • data_seed: None
  • jit_mode_eval: False
  • bf16: False
  • fp16: True
  • fp16_opt_level: O1
  • half_precision_backend: auto
  • bf16_full_eval: False
  • fp16_full_eval: False
  • tf32: None
  • local_rank: 0
  • ddp_backend: None
  • tpu_num_cores: None
  • tpu_metrics_debug: False
  • debug: []
  • dataloader_drop_last: False
  • dataloader_num_workers: 0
  • dataloader_prefetch_factor: None
  • past_index: -1
  • disable_tqdm: False
  • remove_unused_columns: True
  • label_names: None
  • load_best_model_at_end: False
  • ignore_data_skip: False
  • fsdp: []
  • fsdp_min_num_params: 0
  • fsdp_config: {'min_num_params': 0, 'xla': False, 'xla_fsdp_v2': False, 'xla_fsdp_grad_ckpt': False}
  • fsdp_transformer_layer_cls_to_wrap: None
  • accelerator_config: {'split_batches': False, 'dispatch_batches': None, 'even_batches': True, 'use_seedable_sampler': True, 'non_blocking': False, 'gradient_accumulation_kwargs': None}
  • parallelism_config: None
  • deepspeed: None
  • label_smoothing_factor: 0.0
  • optim: adamw_torch_fused
  • optim_args: None
  • adafactor: False
  • group_by_length: False
  • length_column_name: length
  • project: huggingface
  • trackio_space_id: trackio
  • ddp_find_unused_parameters: None
  • ddp_bucket_cap_mb: None
  • ddp_broadcast_buffers: False
  • dataloader_pin_memory: True
  • dataloader_persistent_workers: False
  • skip_memory_metrics: True
  • use_legacy_prediction_loop: False
  • push_to_hub: False
  • resume_from_checkpoint: None
  • hub_model_id: None
  • hub_strategy: every_save
  • hub_private_repo: None
  • hub_always_push: False
  • hub_revision: None
  • gradient_checkpointing: False
  • gradient_checkpointing_kwargs: None
  • include_inputs_for_metrics: False
  • include_for_metrics: []
  • eval_do_concat_batches: True
  • fp16_backend: auto
  • push_to_hub_model_id: None
  • push_to_hub_organization: None
  • mp_parameters:
  • auto_find_batch_size: False
  • full_determinism: False
  • torchdynamo: None
  • ray_scope: last
  • ddp_timeout: 1800
  • torch_compile: False
  • torch_compile_backend: None
  • torch_compile_mode: None
  • include_tokens_per_second: False
  • include_num_input_tokens_seen: no
  • neftune_noise_alpha: None
  • optim_target_modules: None
  • batch_eval_metrics: False
  • eval_on_start: False
  • use_liger_kernel: False
  • liger_kernel_config: None
  • eval_use_gather_object: False
  • average_tokens_across_devices: True
  • prompts: None
  • batch_sampler: batch_sampler
  • multi_dataset_batch_sampler: round_robin
  • router_mapping: {}
  • learning_rate_mapping: {}
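
With learning_rate 5e-05, lr_scheduler_type linear, and warmup_steps 0, the learning rate decays in a straight line from its starting value to zero over the run. The sketch below illustrates that schedule in plain Python. The total step count is derived from the dataset size and batch size under the assumption of a single device, which the card does not state; the function name linear_lr is hypothetical.

```python
import math

BASE_LR = 5e-05
# Assumption: one device, so one optimizer step consumes 128 samples.
TOTAL_STEPS = math.ceil(1_275_279 / 128)


def linear_lr(step, base_lr=BASE_LR, total_steps=TOTAL_STEPS, warmup_steps=0):
    """Linear warmup (if any) followed by linear decay to zero."""
    if step < warmup_steps:
        return base_lr * step / max(1, warmup_steps)
    remaining = max(0, total_steps - step)
    return base_lr * remaining / max(1, total_steps - warmup_steps)


for step in (0, 5000, TOTAL_STEPS):
    print(step, linear_lr(step))
```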

Training Logs

Epoch   Step  Training Loss
0.0502  500   1.601
0.1004  1000  0.5685
0.1505  1500  0.5108
0.2007  2000  0.4857
0.2509  2500  0.4684
0.3011  3000  0.4544
0.3513  3500  0.4383
0.4014  4000  0.4292
0.4516  4500  0.4209
0.5018  5000  0.4125
0.5520  5500  0.4097
0.6022  6000  0.4045
0.6523  6500  0.4039
0.7025  7000  0.3965
0.7527  7500  0.3946
0.8029  8000  0.3912
0.8531  8500  0.3908
0.9033  9000  0.3872
0.9534  9500  0.3849
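
The epoch fractions in the log can be checked against the dataset size and batch size reported earlier. The short calculation below assumes training ran on a single device (the card does not say so) with gradient_accumulation_steps 1 and dataloader_drop_last False, so the last partial batch counts as a step.

```python
import math

TRAIN_SAMPLES = 1_275_279
BATCH_SIZE = 128  # per_device_train_batch_size; one device is assumed

steps_per_epoch = math.ceil(TRAIN_SAMPLES / BATCH_SIZE)


def epoch_fraction(step):
    """Fraction of the single epoch completed after `step` optimizer steps."""
    return round(step / steps_per_epoch, 4)


print(steps_per_epoch)                             # 9964
print(epoch_fraction(500), epoch_fraction(9500))   # 0.0502 0.9534
```

Under these assumptions the computed fractions match the logged values, which suggests the run covered about 9,964 steps in total.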

Framework Versions

  • Python: 3.10.4
  • Sentence Transformers: 5.2.0
  • Transformers: 4.57.3
  • PyTorch: 2.9.1+cu128
  • Accelerate: 1.12.0
  • Datasets: 2.21.0
  • Tokenizers: 0.22.1

Citation

BibTeX

Sentence Transformers

@inproceedings{reimers-2019-sentence-bert,
    title = "Sentence-BERT: Sentence Embeddings using Siamese BERT-Networks",
    author = "Reimers, Nils and Gurevych, Iryna",
    booktitle = "Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing",
    month = "11",
    year = "2019",
    publisher = "Association for Computational Linguistics",
    url = "https://arxiv.org/abs/1908.10084",
}

CachedMultipleNegativesRankingLoss

@misc{gao2021scaling,
    title={Scaling Deep Contrastive Learning Batch Size under Memory Limited Setup},
    author={Luyu Gao and Yunyi Zhang and Jiawei Han and Jamie Callan},
    year={2021},
    eprint={2101.06983},
    archivePrefix={arXiv},
    primaryClass={cs.LG}
}

Model Repository

  • Model ID: IrvinTopi/mnrl-hardnegatives-denoised14
  • Model size: 0.3B params
  • Tensor type: F32 (Safetensors)