SentenceTransformer based on FacebookAI/roberta-large

This is a sentence-transformers model finetuned from FacebookAI/roberta-large on the all-nli dataset. It maps sentences & paragraphs to a 1024-dimensional dense vector space and can be used for semantic textual similarity, semantic search, paraphrase mining, text classification, clustering, and more.

Model Details

Model Description

  • Model Type: Sentence Transformer
  • Base model: FacebookAI/roberta-large
  • Maximum Sequence Length: 512 tokens
  • Output Dimensionality: 1024 dimensions
  • Similarity Function: Cosine Similarity
  • Training Dataset: all-nli
  • Language: en

Full Model Architecture

SentenceTransformer(
  (0): Transformer({'max_seq_length': 512, 'do_lower_case': False, 'architecture': 'RobertaModel'})
  (1): Pooling({'word_embedding_dimension': 1024, 'pooling_mode_cls_token': False, 'pooling_mode_mean_tokens': True, 'pooling_mode_max_tokens': False, 'pooling_mode_mean_sqrt_len_tokens': False, 'pooling_mode_weightedmean_tokens': False, 'pooling_mode_lasttoken': False, 'include_prompt': True})
)
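The Pooling layer above uses mean pooling: RoBERTa's token embeddings are averaged over non-padding positions to produce one 1024-dimensional vector per input. A minimal numpy sketch of that operation (the function name and toy inputs are illustrative, not from the library):

```python
import numpy as np

def mean_pool(token_embeddings: np.ndarray, attention_mask: np.ndarray) -> np.ndarray:
    """Average token embeddings over non-padding positions.

    token_embeddings: (batch, seq_len, hidden) float array
    attention_mask:   (batch, seq_len) array of 0/1, where 1 marks a real token
    """
    mask = attention_mask[..., None].astype(token_embeddings.dtype)  # (batch, seq, 1)
    summed = (token_embeddings * mask).sum(axis=1)                   # (batch, hidden)
    counts = np.clip(mask.sum(axis=1), 1e-9, None)                   # avoid div by zero
    return summed / counts

# Toy batch of 1: two real tokens and one padding token that must be ignored
emb = np.array([[[1.0, 3.0], [3.0, 5.0], [100.0, 100.0]]])
mask = np.array([[1, 1, 0]])
print(mean_pool(emb, mask))  # [[2. 4.]]
```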

Usage

Direct Usage (Sentence Transformers)

First install the Sentence Transformers library:

pip install -U sentence-transformers

Then you can load this model and run inference.

from sentence_transformers import SentenceTransformer

# Download from the 🤗 Hub
model = SentenceTransformer("sobamchan/roberta-large-mrl-768-512-256-128-64")
# Run inference
sentences = [
    'A construction worker peeking out of a manhole while his coworker sits on the sidewalk smiling.',
    'A worker is looking out of a manhole.',
    'The workers are both inside the manhole.',
]
embeddings = model.encode(sentences)
print(embeddings.shape)
# (3, 1024)

# Get the similarity scores for the embeddings
similarities = model.similarity(embeddings, embeddings)
print(similarities)
# tensor([[1.0000, 0.6658, 0.1268],
#         [0.6658, 1.0000, 0.2392],
#         [0.1268, 0.2392, 1.0000]])
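Because the model was trained with MatryoshkaLoss (dimensions 1024 down to 64, see Training Details), embeddings can be truncated to a leading prefix and renormalized with only modest quality loss; Sentence Transformers exposes this through the truncate_dim argument of SentenceTransformer. The underlying operation is just a slice plus L2 normalization, sketched here on random stand-in vectors rather than real model output:

```python
import numpy as np

def truncate_and_normalize(embeddings: np.ndarray, dim: int) -> np.ndarray:
    """Keep the first `dim` components of each embedding and L2-normalize the rows."""
    truncated = embeddings[:, :dim]
    norms = np.linalg.norm(truncated, axis=1, keepdims=True)
    return truncated / np.clip(norms, 1e-12, None)

# Random stand-ins for model.encode(...) output
rng = np.random.default_rng(0)
full = rng.normal(size=(3, 1024))

small = truncate_and_normalize(full, 256)
print(small.shape)  # (3, 256)

# On normalized vectors, cosine similarity reduces to a plain dot product
similarities = small @ small.T
print(np.allclose(np.diag(similarities), 1.0))  # True
```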

Evaluation

Metrics

Semantic Similarity

Metric           sts-dev  sts-test
pearson_cosine   0.8232   0.7945
spearman_cosine  0.8229   0.7977
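Both numbers are correlations between the model's cosine similarities on STS sentence pairs and the human-annotated gold scores: pearson_cosine on the raw values, spearman_cosine on their ranks. A toy numpy sketch of the computation (the score values below are made up for illustration, not benchmark data):

```python
import numpy as np

def pearson(x: np.ndarray, y: np.ndarray) -> float:
    """Pearson correlation of two 1-D arrays."""
    x = x - x.mean()
    y = y - y.mean()
    return float((x @ y) / (np.linalg.norm(x) * np.linalg.norm(y)))

def spearman(x: np.ndarray, y: np.ndarray) -> float:
    """Spearman = Pearson on rank-transformed values (no ties in this toy data)."""
    ranks = lambda v: np.argsort(np.argsort(v)).astype(float)
    return pearson(ranks(x), ranks(y))

# Toy stand-ins for model cosine similarities and 0-5 human STS scores
predicted = np.array([0.95, 0.80, 0.40, 0.15, 0.60])
gold = np.array([5.0, 4.2, 1.5, 0.3, 3.1])

pearson_cosine = pearson(predicted, gold)
spearman_cosine = spearman(predicted, gold)
print(f"pearson={pearson_cosine:.4f} spearman={spearman_cosine:.4f}")
# spearman is 1.0 (up to float rounding) because the two rankings agree perfectly
```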

Training Details

Training Dataset

all-nli

  • Dataset: all-nli at d482672
  • Size: 557,850 training samples
  • Columns: anchor, positive, and negative
  • Approximate statistics based on the first 1000 samples:
    • anchor: string (min 7 / mean 10.38 / max 45 tokens)
    • positive: string (min 6 / mean 12.8 / max 39 tokens)
    • negative: string (min 6 / mean 13.4 / max 50 tokens)
  • Samples:
    • anchor: A person on a horse jumps over a broken down airplane.
      positive: A person is outdoors, on a horse.
      negative: A person is at a diner, ordering an omelette.
    • anchor: Children smiling and waving at camera
      positive: There are children present
      negative: The kids are frowning
    • anchor: A boy is jumping on skateboard in the middle of a red bridge.
      positive: The boy does a skateboarding trick.
      negative: The boy skates down the sidewalk.
  • Loss: MatryoshkaLoss with these parameters:
    {
        "loss": "MultipleNegativesRankingLoss",
        "matryoshka_dims": [
            1024,
            768,
            512,
            256,
            128,
            64
        ],
        "matryoshka_weights": [
            1,
            1,
            1,
            1,
            1,
            1
        ],
        "n_dims_per_step": -1
    }
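In other words, the inner MultipleNegativesRankingLoss (in-batch cross-entropy where anchor i's positive is row i and every other in-batch positive acts as a negative) is computed on embeddings truncated to each listed dimension, and the six equally weighted results are summed. A numpy sketch under those assumptions, with random stand-in embeddings and a hypothetical scale factor:

```python
import numpy as np

def l2norm(x: np.ndarray) -> np.ndarray:
    return x / np.clip(np.linalg.norm(x, axis=1, keepdims=True), 1e-12, None)

def mnr_loss(anchors: np.ndarray, positives: np.ndarray, scale: float = 20.0) -> float:
    """In-batch cross-entropy: anchor i's positive is row i; all other rows are negatives."""
    scores = scale * (l2norm(anchors) @ l2norm(positives).T)            # (batch, batch)
    log_probs = scores - np.log(np.exp(scores).sum(axis=1, keepdims=True))
    return float(-np.mean(np.diag(log_probs)))

def matryoshka_mnr_loss(anchors, positives,
                        dims=(1024, 768, 512, 256, 128, 64), weights=None) -> float:
    """Sum the weighted MNR loss over each truncated embedding prefix."""
    weights = weights or [1.0] * len(dims)
    return sum(w * mnr_loss(anchors[:, :d], positives[:, :d])
               for w, d in zip(weights, dims))

rng = np.random.default_rng(0)
anchors = rng.normal(size=(8, 1024))
positives = anchors + 0.1 * rng.normal(size=(8, 1024))  # positives close to their anchors
print(matryoshka_mnr_loss(anchors, positives) >= 0.0)   # cross-entropy losses are non-negative
```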
    

Evaluation Dataset

all-nli

  • Dataset: all-nli at d482672
  • Size: 6,584 evaluation samples
  • Columns: anchor, positive, and negative
  • Approximate statistics based on the first 1000 samples:
    • anchor: string (min 6 / mean 18.02 / max 66 tokens)
    • positive: string (min 5 / mean 9.81 / max 29 tokens)
    • negative: string (min 5 / mean 10.37 / max 29 tokens)
  • Samples:
    • anchor: Two women are embracing while holding to go packages.
      positive: Two woman are holding packages.
      negative: The men are fighting outside a deli.
    • anchor: Two young children in blue jerseys, one with the number 9 and one with the number 2 are standing on wooden steps in a bathroom and washing their hands in a sink.
      positive: Two kids in numbered jerseys wash their hands.
      negative: Two kids in jackets walk to school.
    • anchor: A man selling donuts to a customer during a world exhibition event held in the city of Angeles
      positive: A man selling donuts to a customer.
      negative: A woman drinks her coffee in a small cafe.
  • Loss: MatryoshkaLoss with these parameters:
    {
        "loss": "MultipleNegativesRankingLoss",
        "matryoshka_dims": [
            1024,
            768,
            512,
            256,
            128,
            64
        ],
        "matryoshka_weights": [
            1,
            1,
            1,
            1,
            1,
            1
        ],
        "n_dims_per_step": -1
    }
    

Training Hyperparameters

Non-Default Hyperparameters

  • eval_strategy: steps
  • per_device_train_batch_size: 32
  • per_device_eval_batch_size: 32
  • num_train_epochs: 15
  • warmup_ratio: 0.1
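With the default linear scheduler and warmup_ratio 0.1, the learning rate climbs from 0 to 5e-5 over the first 10% of optimizer steps and then decays linearly to 0. A small sketch of that schedule (mirroring, not importing, the transformers implementation; the total step count is a rough read of the training logs):

```python
def linear_schedule_with_warmup(step: int, total_steps: int,
                                base_lr: float = 5e-5, warmup_ratio: float = 0.1) -> float:
    """Linear warmup to base_lr over the first warmup_ratio of steps, then linear decay to 0."""
    warmup_steps = int(total_steps * warmup_ratio)
    if step < warmup_steps:
        return base_lr * step / max(1, warmup_steps)
    return base_lr * max(0.0, (total_steps - step) / max(1, total_steps - warmup_steps))

total = 261_000  # roughly the final step count in the training logs
print(linear_schedule_with_warmup(0, total))       # 0.0  (start of warmup)
print(linear_schedule_with_warmup(26_100, total))  # 5e-05 (peak, end of warmup)
print(linear_schedule_with_warmup(total, total))   # 0.0  (fully decayed)
```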

All Hyperparameters

  • overwrite_output_dir: False
  • do_predict: False
  • eval_strategy: steps
  • prediction_loss_only: True
  • per_device_train_batch_size: 32
  • per_device_eval_batch_size: 32
  • per_gpu_train_batch_size: None
  • per_gpu_eval_batch_size: None
  • gradient_accumulation_steps: 1
  • eval_accumulation_steps: None
  • torch_empty_cache_steps: None
  • learning_rate: 5e-05
  • weight_decay: 0.0
  • adam_beta1: 0.9
  • adam_beta2: 0.999
  • adam_epsilon: 1e-08
  • max_grad_norm: 1.0
  • num_train_epochs: 15
  • max_steps: -1
  • lr_scheduler_type: linear
  • lr_scheduler_kwargs: {}
  • warmup_ratio: 0.1
  • warmup_steps: 0
  • log_level: passive
  • log_level_replica: warning
  • log_on_each_node: True
  • logging_nan_inf_filter: True
  • save_safetensors: True
  • save_on_each_node: False
  • save_only_model: False
  • restore_callback_states_from_checkpoint: False
  • no_cuda: False
  • use_cpu: False
  • use_mps_device: False
  • seed: 42
  • data_seed: None
  • jit_mode_eval: False
  • bf16: False
  • fp16: False
  • fp16_opt_level: O1
  • half_precision_backend: auto
  • bf16_full_eval: False
  • fp16_full_eval: False
  • tf32: None
  • local_rank: 0
  • ddp_backend: None
  • tpu_num_cores: None
  • tpu_metrics_debug: False
  • debug: []
  • dataloader_drop_last: False
  • dataloader_num_workers: 0
  • dataloader_prefetch_factor: None
  • past_index: -1
  • disable_tqdm: False
  • remove_unused_columns: True
  • label_names: None
  • load_best_model_at_end: False
  • ignore_data_skip: False
  • fsdp: []
  • fsdp_min_num_params: 0
  • fsdp_config: {'min_num_params': 0, 'xla': False, 'xla_fsdp_v2': False, 'xla_fsdp_grad_ckpt': False}
  • fsdp_transformer_layer_cls_to_wrap: None
  • accelerator_config: {'split_batches': False, 'dispatch_batches': None, 'even_batches': True, 'use_seedable_sampler': True, 'non_blocking': False, 'gradient_accumulation_kwargs': None}
  • parallelism_config: None
  • deepspeed: None
  • label_smoothing_factor: 0.0
  • optim: adamw_torch_fused
  • optim_args: None
  • adafactor: False
  • group_by_length: False
  • length_column_name: length
  • project: huggingface
  • trackio_space_id: trackio
  • ddp_find_unused_parameters: None
  • ddp_bucket_cap_mb: None
  • ddp_broadcast_buffers: False
  • dataloader_pin_memory: True
  • dataloader_persistent_workers: False
  • skip_memory_metrics: True
  • use_legacy_prediction_loop: False
  • push_to_hub: False
  • resume_from_checkpoint: None
  • hub_model_id: None
  • hub_strategy: every_save
  • hub_private_repo: None
  • hub_always_push: False
  • hub_revision: None
  • gradient_checkpointing: False
  • gradient_checkpointing_kwargs: None
  • include_inputs_for_metrics: False
  • include_for_metrics: []
  • eval_do_concat_batches: True
  • fp16_backend: auto
  • push_to_hub_model_id: None
  • push_to_hub_organization: None
  • mp_parameters:
  • auto_find_batch_size: False
  • full_determinism: False
  • torchdynamo: None
  • ray_scope: last
  • ddp_timeout: 1800
  • torch_compile: False
  • torch_compile_backend: None
  • torch_compile_mode: None
  • include_tokens_per_second: False
  • include_num_input_tokens_seen: no
  • neftune_noise_alpha: None
  • optim_target_modules: None
  • batch_eval_metrics: False
  • eval_on_start: False
  • use_liger_kernel: False
  • liger_kernel_config: None
  • eval_use_gather_object: False
  • average_tokens_across_devices: True
  • prompts: None
  • batch_sampler: batch_sampler
  • multi_dataset_batch_sampler: proportional
  • router_mapping: {}
  • learning_rate_mapping: {}

Training Logs

Epoch Step Training Loss Validation Loss sts-dev_spearman_cosine sts-test_spearman_cosine
-1 -1 - - 0.5730 -
0.0287 500 13.1307 3.8711 0.8322 -
0.0574 1000 4.9118 2.2331 0.8644 -
0.0860 1500 3.9032 1.8578 0.8656 -
0.1147 2000 3.4689 1.6312 0.8714 -
0.1434 2500 3.1647 1.5546 0.8716 -
0.1721 3000 2.9685 1.4834 0.8801 -
0.2008 3500 2.8231 1.3738 0.8712 -
0.2294 4000 2.6908 1.3634 0.8718 -
0.2581 4500 2.6483 1.3831 0.8789 -
0.2868 5000 2.5545 1.3328 0.8756 -
0.3155 5500 2.428 1.2995 0.8737 -
0.3442 6000 2.4545 1.2703 0.8752 -
0.3729 6500 2.4895 1.2613 0.8704 -
0.4015 7000 2.406 1.2644 0.8632 -
0.4302 7500 2.2968 1.3179 0.8635 -
0.4589 8000 2.2072 1.3236 0.8724 -
0.4876 8500 2.3579 1.3357 0.8599 -
0.5163 9000 2.3808 1.2971 0.8592 -
0.5449 9500 2.2616 1.3413 0.8692 -
0.5736 10000 2.199 1.3170 0.8601 -
0.6023 10500 2.2254 1.3450 0.8590 -
0.6310 11000 2.16 1.3072 0.8606 -
0.6597 11500 2.1753 1.3070 0.8662 -
0.6883 12000 2.0891 1.2685 0.8687 -
0.7170 12500 2.1434 1.3496 0.8605 -
0.7457 13000 2.133 1.2944 0.8533 -
0.7744 13500 2.0775 1.3831 0.8528 -
0.8031 14000 2.0856 1.3325 0.8558 -
0.8318 14500 2.0905 1.3525 0.8713 -
0.8604 15000 2.0856 1.3079 0.8513 -
0.8891 15500 2.1206 1.3687 0.8509 -
0.9178 16000 2.0854 1.3752 0.8429 -
0.9465 16500 2.0765 1.4162 0.8423 -
0.9752 17000 2.0011 1.3374 0.8487 -
1.0038 17500 2.1728 1.4115 0.8504 -
1.0325 18000 1.8636 1.6978 0.8451 -
1.0612 18500 2.2661 1.3775 0.8501 -
1.0899 19000 1.9163 1.3913 0.8407 -
1.1186 19500 1.8524 1.3511 0.8495 -
1.1472 20000 1.9746 1.4419 0.8480 -
1.1759 20500 1.9949 1.4820 0.8406 -
1.2046 21000 2.0087 1.4877 0.8444 -
1.2333 21500 2.0073 1.3913 0.8475 -
1.2620 22000 1.9374 1.5215 0.8539 -
1.2907 22500 1.979 1.5082 0.8585 -
1.3193 23000 1.9629 1.4630 0.8454 -
1.3480 23500 2.0761 1.5292 0.8397 -
1.3767 24000 2.0052 1.5718 0.8464 -
1.4054 24500 2.0406 1.4504 0.8462 -
1.4341 25000 1.9931 1.6408 0.8470 -
1.4627 25500 2.0963 1.6715 0.8469 -
1.4914 26000 2.0744 1.7317 0.8344 -
1.5201 26500 2.0192 1.7196 0.8301 -
1.5488 27000 2.072 1.8939 0.8272 -
1.5775 27500 2.1233 1.6756 0.8381 -
1.6061 28000 2.1309 1.6725 0.8399 -
1.6348 28500 2.1307 1.7151 0.8337 -
1.6635 29000 2.0551 1.6822 0.8348 -
1.6922 29500 2.0623 1.6315 0.8463 -
1.7209 30000 2.1413 1.6249 0.8420 -
1.7496 30500 2.0578 1.8499 0.8327 -
1.7782 31000 2.1535 1.6687 0.8427 -
1.8069 31500 2.1894 1.6501 0.8369 -
1.8356 32000 1.979 1.7413 0.8341 -
1.8643 32500 2.1038 1.7312 0.8345 -
1.8930 33000 2.0746 1.7323 0.8313 -
1.9216 33500 2.2393 1.9724 0.8354 -
1.9503 34000 2.171 1.7513 0.8395 -
1.9790 34500 2.0399 1.7751 0.8345 -
2.0077 35000 2.0002 1.7431 0.8411 -
2.0364 35500 1.6843 1.7284 0.8365 -
2.0650 36000 1.772 1.8173 0.8394 -
2.0937 36500 1.7372 1.8277 0.8381 -
2.1224 37000 1.8665 1.7897 0.8392 -
2.1511 37500 1.8157 1.8601 0.8326 -
2.1798 38000 1.8641 1.6849 0.8388 -
2.2085 38500 1.7293 1.6760 0.8388 -
2.2371 39000 1.7038 1.6455 0.8348 -
2.2658 39500 1.8139 1.7665 0.8205 -
2.2945 40000 1.7791 1.7799 0.8302 -
2.3232 40500 2.8435 1.9061 0.8196 -
2.3519 41000 1.9696 1.8294 0.8344 -
2.3805 41500 1.9685 1.9805 0.8169 -
2.4092 42000 1.7893 1.8200 0.8283 -
2.4379 42500 1.74 1.7132 0.8366 -
2.4666 43000 1.7877 1.7723 0.8433 -
2.4953 43500 1.8317 1.6720 0.8367 -
2.5239 44000 1.7922 1.7199 0.8249 -
2.5526 44500 1.7841 1.7628 0.8300 -
2.5813 45000 1.8367 1.8752 0.8328 -
2.6100 45500 1.773 1.8062 0.8367 -
2.6387 46000 1.8124 1.8124 0.8347 -
2.6674 46500 1.7595 1.7697 0.8340 -
2.6960 47000 1.7422 1.8300 0.8231 -
2.7247 47500 1.8007 1.7629 0.8303 -
2.7534 48000 1.7744 1.7752 0.8287 -
2.7821 48500 1.6891 1.6854 0.8341 -
2.8108 49000 1.7044 1.8094 0.8213 -
2.8394 49500 1.6808 1.6874 0.8243 -
2.8681 50000 1.6774 1.7517 0.8229 -
2.8968 50500 1.684 1.7038 0.8314 -
2.9255 51000 1.7204 1.7657 0.8268 -
2.9542 51500 1.6877 1.7660 0.8306 -
2.9828 52000 1.8228 1.7883 0.8241 -
3.0115 52500 1.5882 1.7963 0.8278 -
3.0402 53000 1.4159 1.8232 0.8262 -
3.0689 53500 1.4347 1.8152 0.8246 -
3.0976 54000 1.5007 1.8113 0.8216 -
3.1263 54500 1.5196 1.7677 0.8250 -
3.1549 55000 1.4994 1.7585 0.8302 -
3.1836 55500 1.5854 1.7113 0.8310 -
3.2123 56000 1.4578 1.8058 0.8238 -
3.2410 56500 1.525 1.7659 0.8255 -
3.2697 57000 1.4602 1.8074 0.8262 -
3.2983 57500 2.1095 1.7176 0.8306 -
3.3270 58000 1.4814 1.8732 0.8312 -
3.3557 58500 1.6221 1.7636 0.8297 -
3.3844 59000 1.4695 1.7405 0.8232 -
3.4131 59500 1.5805 1.7804 0.8376 -
3.4417 60000 1.4774 1.7737 0.8289 -
3.4704 60500 1.4614 1.7717 0.8369 -
3.4991 61000 1.5027 1.8026 0.8312 -
3.5278 61500 1.4788 1.8565 0.8187 -
3.5565 62000 1.5613 1.8170 0.8268 -
3.5852 62500 1.529 1.9040 0.8255 -
3.6138 63000 1.5549 2.0451 0.8263 -
3.6425 63500 1.5604 1.8017 0.8323 -
3.6712 64000 1.4462 1.8130 0.8291 -
3.6999 64500 1.5074 1.8368 0.8236 -
3.7286 65000 1.4982 1.8025 0.8251 -
3.7572 65500 1.5496 1.7867 0.8235 -
3.7859 66000 1.5688 1.8279 0.8200 -
3.8146 66500 1.4988 1.8225 0.8315 -
3.8433 67000 1.5178 1.7781 0.8211 -
3.8720 67500 1.4558 1.8183 0.8279 -
3.9006 68000 1.52 1.7993 0.8219 -
3.9293 68500 1.4339 1.7622 0.8211 -
3.9580 69000 1.4377 1.8122 0.8131 -
3.9867 69500 1.5208 1.8728 0.8200 -
4.0154 70000 1.3749 1.7789 0.8285 -
4.0441 70500 1.3293 1.9063 0.8301 -
4.0727 71000 1.2638 2.1195 0.8174 -
4.1014 71500 1.454 1.8508 0.8265 -
4.1301 72000 1.3227 1.8074 0.8169 -
4.1588 72500 1.3982 1.9820 0.8127 -
4.1875 73000 1.3168 1.9202 0.8223 -
4.2161 73500 1.2791 1.7751 0.8279 -
4.2448 74000 1.2821 1.7690 0.8259 -
4.2735 74500 1.2799 1.8505 0.8202 -
4.3022 75000 1.2453 1.7584 0.8334 -
4.3309 75500 1.2421 1.7923 0.8297 -
4.3595 76000 1.332 1.8744 0.8216 -
4.3882 76500 1.3413 1.8049 0.8292 -
4.4169 77000 1.3342 1.7446 0.8271 -
4.4456 77500 1.2565 1.7859 0.8229 -
4.4743 78000 1.2976 2.0875 0.8284 -
4.5030 78500 1.2861 1.8081 0.8260 -
4.5316 79000 1.2982 1.7828 0.8267 -
4.5603 79500 1.3014 1.7792 0.8200 -
4.5890 80000 1.2867 1.8072 0.8251 -
4.6177 80500 1.3247 1.7776 0.8256 -
4.6464 81000 1.3646 1.7684 0.8210 -
4.6750 81500 1.4309 1.8437 0.8186 -
4.7037 82000 1.3742 1.9158 0.8157 -
4.7324 82500 4.1451 1.8141 0.8249 -
4.7611 83000 1.3416 1.7796 0.8277 -
4.7898 83500 1.3342 1.7990 0.8238 -
4.8184 84000 1.3027 1.9050 0.8180 -
4.8471 84500 1.3237 1.7734 0.8227 -
4.8758 85000 1.2319 1.8478 0.8243 -
4.9045 85500 1.279 1.7974 0.8255 -
4.9332 86000 1.2646 1.7283 0.8305 -
4.9619 86500 1.1886 1.9569 0.8212 -
4.9905 87000 1.2567 1.7428 0.8295 -
5.0192 87500 1.1228 1.8055 0.8306 -
5.0479 88000 1.0618 1.7539 0.8274 -
5.0766 88500 1.0226 1.8684 0.8298 -
5.1053 89000 1.0808 1.7666 0.8208 -
5.1339 89500 1.044 1.7659 0.8211 -
5.1626 90000 1.0438 1.7997 0.8260 -
5.1913 90500 1.1137 1.8361 0.8175 -
5.2200 91000 1.0646 1.8627 0.8264 -
5.2487 91500 1.0429 1.8203 0.8254 -
5.2773 92000 1.108 1.7993 0.8255 -
5.3060 92500 1.031 1.8521 0.8239 -
5.3347 93000 1.1251 1.8060 0.8244 -
5.3634 93500 1.1004 1.8634 0.8247 -
5.3921 94000 1.1488 1.8411 0.8288 -
5.4208 94500 1.0505 1.7374 0.8253 -
5.4494 95000 1.1083 1.8251 0.8238 -
5.4781 95500 1.1011 1.7409 0.8328 -
5.5068 96000 1.1082 1.6637 0.8354 -
5.5355 96500 1.105 1.8248 0.8252 -
5.5642 97000 1.1057 1.7803 0.8261 -
5.5928 97500 1.0842 1.8545 0.8195 -
5.6215 98000 1.0576 1.8561 0.8216 -
5.6502 98500 1.0779 1.7923 0.8272 -
5.6789 99000 1.0408 1.7872 0.8314 -
5.7076 99500 1.1003 1.7327 0.8320 -
5.7362 100000 1.123 1.8170 0.8191 -
5.7649 100500 1.0396 1.6897 0.8258 -
5.7936 101000 1.0426 1.7272 0.8331 -
5.8223 101500 1.0729 1.7506 0.8244 -
5.8510 102000 1.0641 1.7722 0.8306 -
5.8797 102500 1.0518 1.6785 0.8273 -
5.9083 103000 1.0955 1.9886 0.8288 -
5.9370 103500 1.21 2.0539 0.8261 -
5.9657 104000 3.942 1.7589 0.8341 -
5.9944 104500 1.1229 1.8700 0.8185 -
6.0231 105000 0.9885 1.8072 0.8278 -
6.0517 105500 0.9292 1.8001 0.8230 -
6.0804 106000 0.8982 1.8051 0.8317 -
6.1091 106500 0.8904 1.7529 0.8260 -
6.1378 107000 0.8534 1.7874 0.8190 -
6.1665 107500 0.9079 1.7166 0.8289 -
6.1951 108000 0.9005 1.7859 0.8209 -
6.2238 108500 0.9184 1.7757 0.8215 -
6.2525 109000 0.9333 1.7849 0.8261 -
6.2812 109500 0.9627 1.8212 0.8209 -
6.3099 110000 0.9174 1.7716 0.8239 -
6.3386 110500 0.9259 1.8290 0.8278 -
6.3672 111000 0.8882 1.7430 0.8272 -
6.3959 111500 0.8686 1.8061 0.8245 -
6.4246 112000 0.9222 1.8112 0.8221 -
6.4533 112500 0.9037 1.8119 0.8211 -
6.4820 113000 0.8855 1.8029 0.8118 -
6.5106 113500 0.9046 1.8553 0.8245 -
6.5393 114000 0.9272 1.7863 0.8176 -
6.5680 114500 0.931 1.8363 0.8161 -
6.5967 115000 1.0015 1.9976 0.8130 -
6.6254 115500 1.0549 1.8178 0.8212 -
6.6540 116000 0.9827 1.7530 0.8265 -
6.6827 116500 0.9652 1.8149 0.8206 -
6.7114 117000 0.9022 1.8423 0.8259 -
6.7401 117500 0.9249 1.7947 0.8176 -
6.7688 118000 0.8837 1.8191 0.8204 -
6.7975 118500 0.9227 1.7489 0.8259 -
6.8261 119000 0.925 1.7993 0.8160 -
6.8548 119500 0.9141 1.8146 0.8196 -
6.8835 120000 0.8956 1.7155 0.8253 -
6.9122 120500 0.8889 1.7959 0.8347 -
6.9409 121000 0.9551 1.7828 0.8286 -
6.9695 121500 0.8899 1.7918 0.8205 -
6.9982 122000 0.9091 1.7596 0.8236 -
7.0269 122500 0.7642 1.8320 0.8206 -
7.0556 123000 0.9046 1.9385 0.8222 -
7.0843 123500 0.766 1.9060 0.8259 -
7.1129 124000 0.789 1.8289 0.8264 -
7.1416 124500 0.7838 1.8628 0.8281 -
7.1703 125000 0.778 1.7788 0.8154 -
7.1990 125500 0.7838 1.8501 0.8157 -
7.2277 126000 0.8169 1.7758 0.8273 -
7.2564 126500 0.7384 1.9127 0.8194 -
7.2850 127000 0.7868 1.9334 0.8216 -
7.3137 127500 0.7678 1.7547 0.8274 -
7.3424 128000 0.7264 1.7854 0.8275 -
7.3711 128500 0.7925 1.8186 0.8318 -
7.3998 129000 0.7739 1.8462 0.8284 -
7.4284 129500 0.7604 1.8237 0.8277 -
7.4571 130000 0.7492 1.8005 0.8266 -
7.4858 130500 0.7634 1.8099 0.8191 -
7.5145 131000 0.724 1.7840 0.8328 -
7.5432 131500 0.7595 1.8725 0.8219 -
7.5718 132000 0.7308 1.8795 0.8154 -
7.6005 132500 0.7312 1.7966 0.8291 -
7.6292 133000 0.743 1.7927 0.8278 -
7.6579 133500 0.7586 1.7952 0.8294 -
7.6866 134000 0.8312 1.7386 0.8213 -
7.7153 134500 0.7633 1.7157 0.8260 -
7.7439 135000 0.7448 1.8266 0.8261 -
7.7726 135500 0.818 1.7873 0.8275 -
7.8013 136000 0.8235 1.7485 0.8198 -
7.8300 136500 0.7899 1.8871 0.8176 -
7.8587 137000 0.8828 1.9689 0.8184 -
7.8873 137500 0.7736 1.7805 0.8213 -
7.9160 138000 0.7228 1.8282 0.8248 -
7.9447 138500 0.7677 1.7306 0.8244 -
7.9734 139000 0.7351 1.8036 0.8220 -
8.0021 139500 0.7599 1.7727 0.8148 -
8.0307 140000 0.6264 1.7673 0.8177 -
8.0594 140500 0.6125 1.7771 0.8227 -
8.0881 141000 0.6353 1.7675 0.8195 -
8.1168 141500 0.6346 1.7946 0.8229 -
8.1455 142000 0.6101 1.7527 0.8280 -
8.1742 142500 0.5788 1.7372 0.8236 -
8.2028 143000 0.6028 1.7798 0.8248 -
8.2315 143500 0.649 1.7616 0.8198 -
8.2602 144000 0.6672 1.7052 0.8319 -
8.2889 144500 0.665 1.8043 0.8249 -
8.3176 145000 0.619 1.8087 0.8207 -
8.3462 145500 0.6151 1.7635 0.8305 -
8.3749 146000 0.6022 1.7403 0.8313 -
8.4036 146500 0.6258 1.7289 0.8250 -
8.4323 147000 0.6407 1.7225 0.8277 -
8.4610 147500 0.6372 1.7056 0.8284 -
8.4896 148000 0.6761 1.7248 0.8212 -
8.5183 148500 0.6568 1.7226 0.8265 -
8.5470 149000 0.6383 1.6703 0.8281 -
8.5757 149500 0.624 1.7020 0.8245 -
8.6044 150000 0.6188 1.7051 0.8293 -
8.6331 150500 0.6376 1.6799 0.8298 -
8.6617 151000 0.6795 1.7103 0.8247 -
8.6904 151500 0.6274 1.6895 0.8208 -
8.7191 152000 0.6165 1.7270 0.8241 -
8.7478 152500 0.6016 1.7310 0.8217 -
8.7765 153000 0.5853 1.7136 0.8252 -
8.8051 153500 0.666 1.7093 0.8288 -
8.8338 154000 0.61 1.7469 0.8250 -
8.8625 154500 0.6542 1.7309 0.8237 -
8.8912 155000 0.6038 1.6728 0.8213 -
8.9199 155500 0.6195 1.6677 0.8189 -
8.9485 156000 0.646 1.7323 0.8253 -
8.9772 156500 0.6538 1.6865 0.8238 -
9.0059 157000 0.592 1.7343 0.8209 -
9.0346 157500 0.5138 1.7442 0.8233 -
9.0633 158000 0.4933 1.7031 0.8232 -
9.0920 158500 0.4745 1.7306 0.8272 -
9.1206 159000 0.4669 1.7311 0.8289 -
9.1493 159500 0.5194 1.6786 0.8285 -
9.1780 160000 0.536 1.7298 0.8257 -
9.2067 160500 0.4942 1.7287 0.8260 -
9.2354 161000 0.5187 1.6976 0.8235 -
9.2640 161500 0.4831 1.6702 0.8305 -
9.2927 162000 0.5253 1.7145 0.8242 -
9.3214 162500 0.4667 1.6928 0.8245 -
9.3501 163000 0.5022 1.6803 0.8252 -
9.3788 163500 0.5203 1.7851 0.8236 -
9.4074 164000 0.4864 1.6996 0.8217 -
9.4361 164500 0.5125 1.7387 0.8176 -
9.4648 165000 0.4808 1.6818 0.8287 -
9.4935 165500 0.5257 1.7030 0.8255 -
9.5222 166000 0.4963 1.7088 0.8237 -
9.5509 166500 0.5304 1.6953 0.8275 -
9.5795 167000 0.5243 1.6535 0.8236 -
9.6082 167500 0.5012 1.6995 0.8259 -
9.6369 168000 0.5155 1.6797 0.8267 -
9.6656 168500 0.511 1.6843 0.8258 -
9.6943 169000 0.4822 1.6736 0.8308 -
9.7229 169500 0.4908 1.6450 0.8233 -
9.7516 170000 0.5098 1.6952 0.8243 -
9.7803 170500 0.5232 1.7315 0.8263 -
9.8090 171000 0.5174 1.7310 0.8273 -
9.8377 171500 0.5064 1.6783 0.8290 -
9.8663 172000 0.5096 1.7544 0.8248 -
9.8950 172500 0.4885 1.6620 0.8270 -
9.9237 173000 0.4612 1.6874 0.8210 -
9.9524 173500 0.5025 1.7113 0.8221 -
9.9811 174000 0.5071 1.7020 0.8237 -
10.0098 174500 0.4593 1.7157 0.8234 -
10.0384 175000 0.3894 1.7493 0.8260 -
10.0671 175500 0.3875 1.7702 0.8223 -
10.0958 176000 0.4322 1.8000 0.8227 -
10.1245 176500 0.4227 1.7576 0.8276 -
10.1532 177000 0.4368 1.7613 0.8285 -
10.1818 177500 0.4236 1.7270 0.8299 -
10.2105 178000 0.4149 1.7487 0.8298 -
10.2392 178500 0.4108 1.7098 0.8274 -
10.2679 179000 0.4116 1.6843 0.8284 -
10.2966 179500 0.3987 1.6513 0.8284 -
10.3252 180000 0.4667 1.7409 0.8261 -
10.3539 180500 0.4278 1.7262 0.8249 -
10.3826 181000 0.4427 1.7291 0.8250 -
10.4113 181500 0.4157 1.6731 0.8270 -
10.4400 182000 0.4301 1.6889 0.8266 -
10.4687 182500 0.3917 1.7171 0.8221 -
10.4973 183000 0.3984 1.6740 0.8204 -
10.5260 183500 0.3972 1.6973 0.8226 -
10.5547 184000 0.3958 1.7018 0.8276 -
10.5834 184500 0.4144 1.7134 0.8218 -
10.6121 185000 0.3967 1.7309 0.8196 -
10.6407 185500 0.43 1.6597 0.8273 -
10.6694 186000 0.409 1.7010 0.8212 -
10.6981 186500 0.4209 1.6872 0.8224 -
10.7268 187000 0.4165 1.6793 0.8230 -
10.7555 187500 0.3769 1.6417 0.8254 -
10.7841 188000 0.4232 1.6933 0.8187 -
10.8128 188500 0.3872 1.6862 0.8225 -
10.8415 189000 0.4396 1.6578 0.8212 -
10.8702 189500 0.409 1.6828 0.8242 -
10.8989 190000 0.3897 1.6644 0.8239 -
10.9276 190500 0.4072 1.6723 0.8299 -
10.9562 191000 0.4102 1.7174 0.8263 -
10.9849 191500 0.4437 1.6481 0.8246 -
11.0136 192000 0.3587 1.7165 0.8224 -
11.0423 192500 0.3295 1.6652 0.8264 -
11.0710 193000 0.3662 1.7058 0.8245 -
11.0996 193500 0.3254 1.6834 0.8217 -
11.1283 194000 0.3416 1.6786 0.8236 -
11.1570 194500 0.3161 1.7102 0.8244 -
11.1857 195000 0.3641 1.7259 0.8240 -
11.2144 195500 0.3503 1.7683 0.8211 -
11.2430 196000 0.3574 1.7092 0.8207 -
11.2717 196500 0.3519 1.7105 0.8204 -
11.3004 197000 0.3439 1.6659 0.8255 -
11.3291 197500 0.3401 1.6938 0.8194 -
11.3578 198000 0.3542 1.6713 0.8204 -
11.3865 198500 0.3451 1.6958 0.8229 -
11.4151 199000 0.3548 1.6717 0.8213 -
11.4438 199500 0.3607 1.6450 0.8270 -
11.4725 200000 0.3242 1.7143 0.8214 -
11.5012 200500 0.3547 1.6688 0.8206 -
11.5299 201000 0.3443 1.6909 0.8218 -
11.5585 201500 0.3799 1.6252 0.8224 -
11.5872 202000 0.3599 1.6647 0.8211 -
11.6159 202500 0.3385 1.6586 0.8227 -
11.6446 203000 0.3176 1.6887 0.8225 -
11.6733 203500 0.3387 1.7232 0.8247 -
11.7019 204000 0.3399 1.6772 0.8265 -
11.7306 204500 0.3491 1.7123 0.8213 -
11.7593 205000 0.3416 1.6950 0.8233 -
11.7880 205500 0.3029 1.6988 0.8207 -
11.8167 206000 0.3348 1.6667 0.8259 -
11.8454 206500 0.3491 1.6693 0.8238 -
11.8740 207000 0.3096 1.6617 0.8236 -
11.9027 207500 0.2888 1.6873 0.8261 -
11.9314 208000 0.3492 1.6676 0.8253 -
11.9601 208500 0.344 1.6592 0.8254 -
11.9888 209000 0.2991 1.6427 0.8289 -
12.0174 209500 0.2895 1.6966 0.8244 -
12.0461 210000 0.2764 1.6716 0.8227 -
12.0748 210500 0.3001 1.6863 0.8220 -
12.1035 211000 0.2832 1.6749 0.8250 -
12.1322 211500 0.2937 1.6697 0.8267 -
12.1608 212000 0.2737 1.6615 0.8236 -
12.1895 212500 0.2909 1.7160 0.8206 -
12.2182 213000 0.2847 1.6509 0.8268 -
12.2469 213500 0.2711 1.6814 0.8233 -
12.2756 214000 0.2868 1.6701 0.8241 -
12.3043 214500 0.2898 1.6717 0.8223 -
12.3329 215000 0.2847 1.7059 0.8233 -
12.3616 215500 0.3015 1.6790 0.8240 -
12.3903 216000 0.2793 1.6922 0.8261 -
12.4190 216500 0.2803 1.7192 0.8230 -
12.4477 217000 0.2892 1.6702 0.8260 -
12.4763 217500 0.2903 1.6929 0.8237 -
12.5050 218000 0.295 1.6340 0.8264 -
12.5337 218500 0.293 1.6505 0.8270 -
12.5624 219000 0.2701 1.6945 0.8271 -
12.5911 219500 0.267 1.6784 0.8278 -
12.6197 220000 0.3009 1.6514 0.8269 -
12.6484 220500 0.266 1.6717 0.8261 -
12.6771 221000 0.3 1.6844 0.8280 -
12.7058 221500 0.3059 1.6771 0.8314 -
12.7345 222000 0.2901 1.6663 0.8319 -
12.7632 222500 0.279 1.6392 0.8314 -
12.7918 223000 0.2949 1.6556 0.8270 -
12.8205 223500 0.2616 1.6746 0.8265 -
12.8492 224000 0.2809 1.6477 0.8284 -
12.8779 224500 0.2609 1.6443 0.8281 -
12.9066 225000 0.2799 1.6440 0.8274 -
12.9352 225500 0.2869 1.6878 0.8258 -
12.9639 226000 0.253 1.6778 0.8246 -
12.9926 226500 0.2926 1.6454 0.8255 -
13.0213 227000 0.2348 1.6859 0.8242 -
13.0500 227500 0.2353 1.6554 0.8231 -
13.0786 228000 0.2488 1.6847 0.8226 -
13.1073 228500 0.259 1.6820 0.8255 -
13.1360 229000 0.2341 1.6892 0.8237 -
13.1647 229500 0.2603 1.7153 0.8228 -
13.1934 230000 0.2411 1.6844 0.8235 -
13.2221 230500 0.2626 1.6940 0.8240 -
13.2507 231000 0.241 1.6811 0.8247 -
13.2794 231500 0.2342 1.6801 0.8262 -
13.3081 232000 0.2334 1.6911 0.8261 -
13.3368 232500 0.2575 1.6722 0.8236 -
13.3655 233000 0.2329 1.6650 0.8244 -
13.3941 233500 0.2547 1.6775 0.8251 -
13.4228 234000 0.2234 1.6631 0.8239 -
13.4515 234500 0.2365 1.6691 0.8235 -
13.4802 235000 0.2268 1.7275 0.8231 -
13.5089 235500 0.2306 1.6805 0.8245 -
13.5375 236000 0.2388 1.6765 0.8258 -
13.5662 236500 0.2474 1.6769 0.8240 -
13.5949 237000 0.2499 1.7176 0.8228 -
13.6236 237500 0.2406 1.6807 0.8241 -
13.6523 238000 0.2481 1.7075 0.8234 -
13.6809 238500 0.2472 1.6630 0.8234 -
13.7096 239000 0.231 1.7123 0.8238 -
13.7383 239500 0.2294 1.6875 0.8243 -
13.7670 240000 0.2459 1.7007 0.8250 -
13.7957 240500 0.2512 1.6751 0.8278 -
13.8244 241000 0.2355 1.7079 0.8262 -
13.8530 241500 0.2265 1.7144 0.8263 -
13.8817 242000 0.2324 1.7026 0.8268 -
13.9104 242500 0.2299 1.6978 0.8273 -
13.9391 243000 0.2362 1.7243 0.8267 -
13.9678 243500 0.2315 1.6821 0.8290 -
13.9964 244000 0.2386 1.7134 0.8270 -
14.0251 244500 0.2062 1.6998 0.8269 -
14.0538 245000 0.219 1.7169 0.8249 -
14.0825 245500 0.2071 1.7173 0.8264 -
14.1112 246000 0.2178 1.7058 0.8257 -
14.1398 246500 0.2071 1.7181 0.8251 -
14.1685 247000 0.1918 1.7252 0.8243 -
14.1972 247500 0.2307 1.7096 0.8241 -
14.2259 248000 0.2288 1.7527 0.8235 -
14.2546 248500 0.2097 1.7030 0.8250 -
14.2833 249000 0.2275 1.7006 0.8249 -
14.3119 249500 0.2361 1.7337 0.8235 -
14.3406 250000 0.2023 1.7084 0.8234 -
14.3693 250500 0.2112 1.7090 0.8232 -
14.3980 251000 0.2193 1.7033 0.8241 -
14.4267 251500 0.2157 1.7041 0.8236 -
14.4553 252000 0.2059 1.7023 0.8236 -
14.4840 252500 0.194 1.7170 0.8240 -
14.5127 253000 0.1852 1.7050 0.8246 -
14.5414 253500 0.2043 1.7011 0.8246 -
14.5701 254000 0.2103 1.7024 0.8245 -
14.5987 254500 0.1906 1.7177 0.8242 -
14.6274 255000 0.2176 1.7233 0.8237 -
14.6561 255500 0.2065 1.7247 0.8231 -
14.6848 256000 0.222 1.7163 0.8236 -
14.7135 256500 0.2234 1.7166 0.8232 -
14.7422 257000 0.2093 1.7230 0.8223 -
14.7708 257500 0.2321 1.7148 0.8222 -
14.7995 258000 0.2046 1.7102 0.8225 -
14.8282 258500 0.1773 1.7073 0.8230 -
14.8569 259000 0.1961 1.7131 0.8231 -
14.8856 259500 0.2092 1.7097 0.8232 -
14.9142 260000 0.208 1.7093 0.8232 -
14.9429 260500 0.2159 1.7110 0.8230 -
14.9716 261000 0.2106 1.7138 0.8229 -
-1 -1 - - - 0.7977

Framework Versions

  • Python: 3.13.0
  • Sentence Transformers: 5.1.2
  • Transformers: 4.57.1
  • PyTorch: 2.9.1+cu128
  • Accelerate: 1.11.0
  • Datasets: 4.4.1
  • Tokenizers: 0.22.1

Citation

BibTeX

Sentence Transformers

@inproceedings{reimers-2019-sentence-bert,
    title = "Sentence-BERT: Sentence Embeddings using Siamese BERT-Networks",
    author = "Reimers, Nils and Gurevych, Iryna",
    booktitle = "Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing",
    month = "11",
    year = "2019",
    publisher = "Association for Computational Linguistics",
    url = "https://arxiv.org/abs/1908.10084",
}

MatryoshkaLoss

@misc{kusupati2024matryoshka,
    title={Matryoshka Representation Learning},
    author={Aditya Kusupati and Gantavya Bhatt and Aniket Rege and Matthew Wallingford and Aditya Sinha and Vivek Ramanujan and William Howard-Snyder and Kaifeng Chen and Sham Kakade and Prateek Jain and Ali Farhadi},
    year={2024},
    eprint={2205.13147},
    archivePrefix={arXiv},
    primaryClass={cs.LG}
}

MultipleNegativesRankingLoss

@misc{henderson2017efficient,
    title={Efficient Natural Language Response Suggestion for Smart Reply},
    author={Matthew Henderson and Rami Al-Rfou and Brian Strope and Yun-hsuan Sung and Laszlo Lukacs and Ruiqi Guo and Sanjiv Kumar and Balint Miklos and Ray Kurzweil},
    year={2017},
    eprint={1705.00652},
    archivePrefix={arXiv},
    primaryClass={cs.CL}
}