SentenceTransformer based on google-bert/bert-large-uncased

This is a sentence-transformers model finetuned from google-bert/bert-large-uncased on the all-nli dataset. It maps sentences & paragraphs to a 1024-dimensional dense vector space and can be used for semantic textual similarity, semantic search, paraphrase mining, text classification, clustering, and more.

Model Details

Model Description

  • Model Type: Sentence Transformer
  • Base model: google-bert/bert-large-uncased
  • Maximum Sequence Length: 512 tokens
  • Output Dimensionality: 1024 dimensions
  • Similarity Function: Cosine Similarity
  • Training Dataset: all-nli
  • Language: en

Full Model Architecture

SentenceTransformer(
  (0): Transformer({'max_seq_length': 512, 'do_lower_case': False, 'architecture': 'BertModel'})
  (1): Pooling({'word_embedding_dimension': 1024, 'pooling_mode_cls_token': False, 'pooling_mode_mean_tokens': True, 'pooling_mode_max_tokens': False, 'pooling_mode_mean_sqrt_len_tokens': False, 'pooling_mode_weightedmean_tokens': False, 'pooling_mode_lasttoken': False, 'include_prompt': True})
)
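The Pooling module above has only pooling_mode_mean_tokens enabled, so the sentence embedding is the average of the token embeddings. The following is a minimal pure-Python sketch of that idea, not the library's implementation: the function name mean_pool and the toy 4-dimensional vectors are assumptions for illustration (the real model uses 1024 dimensions), and padding is assumed to be excluded through an attention mask.

```python
def mean_pool(token_embeddings, attention_mask):
    """Average the token vectors whose mask entry is 1; padding is ignored."""
    dim = len(token_embeddings[0])
    total = [0.0] * dim
    count = 0
    for vector, keep in zip(token_embeddings, attention_mask):
        if keep:
            count += 1
            for i, value in enumerate(vector):
                total[i] += value
    count = max(count, 1)  # guard against an all-padding input
    return [value / count for value in total]


# Toy 4-dimensional token vectors standing in for the 1024-dimensional outputs.
tokens = [[1.0, 2.0, 3.0, 4.0], [3.0, 4.0, 5.0, 6.0], [9.0, 9.0, 9.0, 9.0]]
mask = [1, 1, 0]  # the third position is padding
sentence_embedding = mean_pool(tokens, mask)
print(sentence_embedding)
```

Only the two unmasked tokens contribute to the average, so the padding vector has no effect on the result.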

Usage

Direct Usage (Sentence Transformers)

First install the Sentence Transformers library:

pip install -U sentence-transformers

Then you can load this model and run inference.

from sentence_transformers import SentenceTransformer

# Download from the 🤗 Hub
model = SentenceTransformer("sobamchan/bert-large-uncased-mrl-768-512-256-128-64")
# Run inference
sentences = [
    'A construction worker peeking out of a manhole while his coworker sits on the sidewalk smiling.',
    'A worker is looking out of a manhole.',
    'The workers are both inside the manhole.',
]
embeddings = model.encode(sentences)
print(embeddings.shape)
# (3, 1024)

# Get the similarity scores for the embeddings
similarities = model.similarity(embeddings, embeddings)
print(similarities)
# tensor([[1.0000, 0.8793, 0.6419],
#         [0.8793, 1.0000, 0.6977],
#         [0.6419, 0.6977, 1.0000]])
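Because the model was trained with MatryoshkaLoss at 768, 512, 256, 128, and 64 dimensions, its embeddings are presumably meant to remain usable when cut down to one of those sizes. The sketch below shows the mechanics in pure Python on toy 8-dimensional vectors (an assumption for brevity): keep the first dimensions, then compare with cosine similarity. The helper names truncate and cosine_similarity are illustrative, not library functions.

```python
import math


def truncate(embedding, dim):
    """Keep only the first `dim` values of an embedding."""
    return embedding[:dim]


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


# Toy 8-dimensional vectors; real embeddings from this model have 1024 values.
emb_a = [0.5, 0.1, -0.3, 0.8, 0.2, -0.1, 0.4, 0.0]
emb_b = [0.4, 0.2, -0.2, 0.7, -0.5, 0.3, 0.1, 0.6]

for dim in (8, 4, 2):
    score = cosine_similarity(truncate(emb_a, dim), truncate(emb_b, dim))
    print(dim, round(score, 4))
```

In Sentence Transformers itself, passing truncate_dim when constructing SentenceTransformer should give the same truncation without manual slicing.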

Evaluation

Metrics

Semantic Similarity

Metric            sts-dev   sts-test
pearson_cosine    0.4894    0.5408
spearman_cosine   0.5974    0.5987
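The metric names suggest that cosine similarities between sentence pairs are correlated with gold similarity scores, once with Pearson and once with Spearman (Pearson applied to ranks). The following pure-Python sketch shows that computation on made-up numbers; the gold and predicted values are illustrative only and are not taken from the STS benchmark.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


def ranks(values):
    """1-based ranks; tied values share the average of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = average_rank
        i = j + 1
    return result


def spearman(xs, ys):
    """Spearman correlation: Pearson correlation of the ranks."""
    return pearson(ranks(xs), ranks(ys))


# Made-up gold scores and cosine similarities for five sentence pairs.
gold = [0.0, 1.5, 2.5, 4.0, 5.0]
predicted = [0.10, 0.35, 0.30, 0.80, 0.95]
print(round(pearson(gold, predicted), 4), round(spearman(gold, predicted), 4))
```

Spearman only depends on the ordering of the predictions, which is why it can differ noticeably from Pearson, as it does in the table above.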

Training Details

Training Dataset

all-nli

  • Dataset: all-nli at d482672
  • Size: 557,850 training samples
  • Columns: anchor, positive, and negative
  • Approximate statistics based on the first 1000 samples:
    column     type     min        mean           max
    anchor     string   7 tokens   10.46 tokens   46 tokens
    positive   string   6 tokens   12.81 tokens   40 tokens
    negative   string   5 tokens   13.4 tokens    50 tokens
  • Samples:
    1. anchor: A person on a horse jumps over a broken down airplane.
       positive: A person is outdoors, on a horse.
       negative: A person is at a diner, ordering an omelette.
    2. anchor: Children smiling and waving at camera
       positive: There are children present
       negative: The kids are frowning
    3. anchor: A boy is jumping on skateboard in the middle of a red bridge.
       positive: The boy does a skateboarding trick.
       negative: The boy skates down the sidewalk.
  • Loss: MatryoshkaLoss with these parameters:
    {
        "loss": "MultipleNegativesRankingLoss",
        "matryoshka_dims": [
            768,
            512,
            256,
            128,
            64
        ],
        "matryoshka_weights": [
            1,
            1,
            1,
            1,
            1
        ],
        "n_dims_per_step": -1
    }
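To make the loss configuration above more concrete, here is a pure-Python sketch of the idea as commonly described: an in-batch ranking loss (the role played by MultipleNegativesRankingLoss) is evaluated on embeddings truncated to each listed dimension, and the weighted results are summed. This is an illustration, not the library code. The function names, the toy 4-dimensional vectors, the dims [4, 2], and the scale of 20.0 are assumptions; n_dims_per_step of -1 is taken to mean that every listed dimension is used at every step.

```python
import math


def cosine(a, b):
    """Cosine similarity; returns 0.0 if either vector has zero length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def ranking_loss(anchors, candidates, scale=20.0):
    """Cross-entropy over scaled similarities; candidate i matches anchor i."""
    total = 0.0
    for i, anchor in enumerate(anchors):
        logits = [scale * cosine(anchor, c) for c in candidates]
        peak = max(logits)  # subtract the maximum for numerical stability
        log_sum_exp = peak + math.log(sum(math.exp(v - peak) for v in logits))
        total += log_sum_exp - logits[i]
    return total / len(anchors)


def matryoshka_loss(anchors, candidates, dims, weights):
    """Weighted sum of the ranking loss on embeddings truncated to each dim."""
    total = 0.0
    for dim, weight in zip(dims, weights):
        short_anchors = [a[:dim] for a in anchors]
        short_candidates = [c[:dim] for c in candidates]
        total += weight * ranking_loss(short_anchors, short_candidates)
    return total


# Toy batch: positives come first, explicit negatives are extra candidates.
anchors = [[1.0, 0.0, 0.2, 0.1], [0.0, 1.0, 0.1, 0.3]]
positives = [[0.9, 0.1, 0.3, 0.0], [0.1, 0.8, 0.0, 0.4]]
negatives = [[-0.5, 0.2, 1.0, 0.0], [0.3, -0.6, 0.0, 1.0]]
candidates = positives + negatives

loss = matryoshka_loss(anchors, candidates, dims=[4, 2], weights=[1, 1])
print(loss)
```

With equal weights, as in the configuration above, each truncated size contributes equally to the total.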
    

Evaluation Dataset

all-nli

  • Dataset: all-nli at d482672
  • Size: 6,584 evaluation samples
  • Columns: anchor, positive, and negative
  • Approximate statistics based on the first 1000 samples:
    column     type     min        mean           max
    anchor     string   6 tokens   17.95 tokens   63 tokens
    positive   string   4 tokens   9.78 tokens    29 tokens
    negative   string   5 tokens   10.35 tokens   29 tokens
  • Samples:
    1. anchor: Two women are embracing while holding to go packages.
       positive: Two woman are holding packages.
       negative: The men are fighting outside a deli.
    2. anchor: Two young children in blue jerseys, one with the number 9 and one with the number 2 are standing on wooden steps in a bathroom and washing their hands in a sink.
       positive: Two kids in numbered jerseys wash their hands.
       negative: Two kids in jackets walk to school.
    3. anchor: A man selling donuts to a customer during a world exhibition event held in the city of Angeles
       positive: A man selling donuts to a customer.
       negative: A woman drinks her coffee in a small cafe.
  • Loss: MatryoshkaLoss with these parameters:
    {
        "loss": "MultipleNegativesRankingLoss",
        "matryoshka_dims": [
            768,
            512,
            256,
            128,
            64
        ],
        "matryoshka_weights": [
            1,
            1,
            1,
            1,
            1
        ],
        "n_dims_per_step": -1
    }
    

Training Hyperparameters

Non-Default Hyperparameters

  • eval_strategy: steps
  • per_device_train_batch_size: 32
  • per_device_eval_batch_size: 32
  • num_train_epochs: 15
  • warmup_ratio: 0.1

All Hyperparameters

  • overwrite_output_dir: False
  • do_predict: False
  • eval_strategy: steps
  • prediction_loss_only: True
  • per_device_train_batch_size: 32
  • per_device_eval_batch_size: 32
  • per_gpu_train_batch_size: None
  • per_gpu_eval_batch_size: None
  • gradient_accumulation_steps: 1
  • eval_accumulation_steps: None
  • torch_empty_cache_steps: None
  • learning_rate: 5e-05
  • weight_decay: 0.0
  • adam_beta1: 0.9
  • adam_beta2: 0.999
  • adam_epsilon: 1e-08
  • max_grad_norm: 1.0
  • num_train_epochs: 15
  • max_steps: -1
  • lr_scheduler_type: linear
  • lr_scheduler_kwargs: {}
  • warmup_ratio: 0.1
  • warmup_steps: 0
  • log_level: passive
  • log_level_replica: warning
  • log_on_each_node: True
  • logging_nan_inf_filter: True
  • save_safetensors: True
  • save_on_each_node: False
  • save_only_model: False
  • restore_callback_states_from_checkpoint: False
  • no_cuda: False
  • use_cpu: False
  • use_mps_device: False
  • seed: 42
  • data_seed: None
  • jit_mode_eval: False
  • bf16: False
  • fp16: False
  • fp16_opt_level: O1
  • half_precision_backend: auto
  • bf16_full_eval: False
  • fp16_full_eval: False
  • tf32: None
  • local_rank: 0
  • ddp_backend: None
  • tpu_num_cores: None
  • tpu_metrics_debug: False
  • debug: []
  • dataloader_drop_last: False
  • dataloader_num_workers: 0
  • dataloader_prefetch_factor: None
  • past_index: -1
  • disable_tqdm: False
  • remove_unused_columns: True
  • label_names: None
  • load_best_model_at_end: False
  • ignore_data_skip: False
  • fsdp: []
  • fsdp_min_num_params: 0
  • fsdp_config: {'min_num_params': 0, 'xla': False, 'xla_fsdp_v2': False, 'xla_fsdp_grad_ckpt': False}
  • fsdp_transformer_layer_cls_to_wrap: None
  • accelerator_config: {'split_batches': False, 'dispatch_batches': None, 'even_batches': True, 'use_seedable_sampler': True, 'non_blocking': False, 'gradient_accumulation_kwargs': None}
  • parallelism_config: None
  • deepspeed: None
  • label_smoothing_factor: 0.0
  • optim: adamw_torch_fused
  • optim_args: None
  • adafactor: False
  • group_by_length: False
  • length_column_name: length
  • project: huggingface
  • trackio_space_id: trackio
  • ddp_find_unused_parameters: None
  • ddp_bucket_cap_mb: None
  • ddp_broadcast_buffers: False
  • dataloader_pin_memory: True
  • dataloader_persistent_workers: False
  • skip_memory_metrics: True
  • use_legacy_prediction_loop: False
  • push_to_hub: False
  • resume_from_checkpoint: None
  • hub_model_id: None
  • hub_strategy: every_save
  • hub_private_repo: None
  • hub_always_push: False
  • hub_revision: None
  • gradient_checkpointing: False
  • gradient_checkpointing_kwargs: None
  • include_inputs_for_metrics: False
  • include_for_metrics: []
  • eval_do_concat_batches: True
  • fp16_backend: auto
  • push_to_hub_model_id: None
  • push_to_hub_organization: None
  • mp_parameters:
  • auto_find_batch_size: False
  • full_determinism: False
  • torchdynamo: None
  • ray_scope: last
  • ddp_timeout: 1800
  • torch_compile: False
  • torch_compile_backend: None
  • torch_compile_mode: None
  • include_tokens_per_second: False
  • include_num_input_tokens_seen: no
  • neftune_noise_alpha: None
  • optim_target_modules: None
  • batch_eval_metrics: False
  • eval_on_start: False
  • use_liger_kernel: False
  • liger_kernel_config: None
  • eval_use_gather_object: False
  • average_tokens_across_devices: True
  • prompts: None
  • batch_sampler: batch_sampler
  • multi_dataset_batch_sampler: proportional
  • router_mapping: {}
  • learning_rate_mapping: {}
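The hyperparameters above determine the length of the run and the shape of the learning-rate curve. The sketch below works through that arithmetic, assuming a single training device (the card does not state the device count) together with the listed values: gradient_accumulation_steps of 1, dataloader_drop_last False, a linear scheduler, and warmup_ratio 0.1. The helper linear_lr is a hypothetical name that follows the usual linear warmup and decay rule.

```python
import math

train_samples = 557_850   # size of the training dataset
batch_size = 32           # per_device_train_batch_size, one device assumed
epochs = 15               # num_train_epochs
warmup_ratio = 0.1
peak_lr = 5e-05           # learning_rate

# The last partial batch is kept because dataloader_drop_last is False.
steps_per_epoch = math.ceil(train_samples / batch_size)
total_steps = steps_per_epoch * epochs
warmup_steps = math.ceil(total_steps * warmup_ratio)


def linear_lr(step):
    """Linear warmup to peak_lr, then linear decay to zero at total_steps."""
    if step < warmup_steps:
        return peak_lr * step / warmup_steps
    remaining = total_steps - step
    return peak_lr * max(0.0, remaining / (total_steps - warmup_steps))


print(steps_per_epoch, total_steps, warmup_steps)
print(round(500 / steps_per_epoch, 4))  # epoch fraction reached at step 500
```

Under these assumptions an epoch is 17,433 steps, which agrees with the Epoch column of the training logs (step 500 corresponds to epoch 0.0287).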

Training Logs

Epoch Step Training Loss Validation Loss sts-dev_spearman_cosine sts-test_spearman_cosine
-1 -1 - - 0.5941 -
0.0287 500 10.355 4.0242 0.8005 -
0.0574 1000 4.8552 2.7486 0.8316 -
0.0860 1500 3.7074 2.1101 0.8432 -
0.1147 2000 3.1868 1.8142 0.8444 -
0.1434 2500 2.8527 1.6569 0.8495 -
0.1721 3000 2.6613 1.5804 0.8531 -
0.2008 3500 2.5256 1.4724 0.8501 -
0.2294 4000 2.3472 1.4474 0.8488 -
0.2581 4500 2.3468 1.3981 0.8544 -
0.2868 5000 2.2274 1.3540 0.8525 -
0.3155 5500 2.1392 1.3047 0.8573 -
0.3442 6000 2.14 1.2833 0.8573 -
0.3729 6500 2.0972 1.3091 0.8455 -
0.4015 7000 2.037 1.2466 0.8551 -
0.4302 7500 1.9468 1.2238 0.8457 -
0.4589 8000 1.8828 1.2378 0.8560 -
0.4876 8500 1.9429 1.2359 0.8506 -
0.5163 9000 1.9303 1.2447 0.8464 -
0.5449 9500 1.8625 1.2377 0.8480 -
0.5736 10000 1.7446 1.2215 0.8509 -
0.6023 10500 1.8013 1.2452 0.8491 -
0.6310 11000 1.7362 1.1845 0.8527 -
0.6597 11500 1.7445 1.2486 0.8427 -
0.6883 12000 1.7057 1.2014 0.8417 -
0.7170 12500 1.7171 1.2094 0.8464 -
0.7457 13000 1.7044 1.1872 0.8490 -
0.7744 13500 1.6819 1.2317 0.8390 -
0.8031 14000 1.6481 1.3047 0.8405 -
0.8318 14500 1.6511 1.2340 0.8511 -
0.8604 15000 1.6401 1.2043 0.8460 -
0.8891 15500 1.842 1.2450 0.8461 -
0.9178 16000 1.6811 1.2516 0.8556 -
0.9465 16500 1.6498 1.2838 0.8408 -
0.9752 17000 1.5387 1.2799 0.8419 -
1.0038 17500 1.5559 1.2691 0.8415 -
1.0325 18000 1.3248 1.2838 0.8460 -
1.0612 18500 1.3448 1.3150 0.8418 -
1.0899 19000 1.3609 1.2810 0.8377 -
1.1186 19500 1.399 1.2890 0.8490 -
1.1472 20000 1.425 1.3231 0.8464 -
1.1759 20500 1.4137 1.2938 0.8436 -
1.2046 21000 1.4393 1.3540 0.8398 -
1.2333 21500 1.4703 1.3168 0.8487 -
1.2620 22000 1.3895 1.3137 0.8449 -
1.2907 22500 1.4223 1.4062 0.8323 -
1.3193 23000 1.3869 1.3827 0.8366 -
1.3480 23500 1.4603 1.3854 0.8324 -
1.3767 24000 1.4658 1.3904 0.8328 -
1.4054 24500 1.4597 1.3903 0.8360 -
1.4341 25000 1.4348 1.4095 0.8352 -
1.4627 25500 1.4981 1.4556 0.8287 -
1.4914 26000 1.4574 1.5016 0.8255 -
1.5201 26500 1.4481 1.4601 0.8259 -
1.5488 27000 1.4383 1.4477 0.8301 -
1.5775 27500 1.5674 1.4995 0.8217 -
1.6061 28000 1.4565 1.4395 0.8324 -
1.6348 28500 1.5055 1.4260 0.8256 -
1.6635 29000 1.4896 1.4780 0.8383 -
1.6922 29500 1.4624 1.4142 0.8292 -
1.7209 30000 1.5277 1.4614 0.8240 -
1.7496 30500 1.4629 1.4094 0.8214 -
1.7782 31000 1.4363 1.3851 0.8330 -
1.8069 31500 1.4829 1.4118 0.8274 -
1.8356 32000 1.4333 1.4059 0.8120 -
1.8643 32500 1.4825 1.4340 0.8177 -
1.8930 33000 4.0454 1.4061 0.8243 -
1.9216 33500 1.4058 1.4723 0.8149 -
1.9503 34000 1.4291 1.4224 0.8223 -
1.9790 34500 1.8112 1.4338 0.7975 -
2.0077 35000 1.3598 1.4007 0.8167 -
2.0364 35500 1.0655 1.4467 0.8141 -
2.0650 36000 1.1357 1.4624 0.8219 -
2.0937 36500 1.1154 1.4044 0.8270 -
2.1224 37000 1.1348 1.4766 0.8262 -
2.1511 37500 1.1386 1.3919 0.8156 -
2.1798 38000 1.1874 1.4432 0.8238 -
2.2085 38500 1.1021 1.3983 0.8192 -
2.2371 39000 1.0822 1.4112 0.8054 -
2.2658 39500 1.1119 1.4791 0.8141 -
2.2945 40000 1.0663 1.4410 0.8157 -
2.3232 40500 1.1075 1.4826 0.8155 -
2.3519 41000 1.1116 1.5526 0.8124 -
2.3805 41500 1.1362 1.5145 0.8163 -
2.4092 42000 1.0799 1.4402 0.8211 -
2.4379 42500 1.0442 1.4477 0.8139 -
2.4666 43000 1.0819 1.4042 0.8046 -
2.4953 43500 1.0698 1.3716 0.8079 -
2.5239 44000 1.0722 1.3874 0.8146 -
2.5526 44500 1.0899 1.4420 0.8061 -
2.5813 45000 1.1281 1.3978 0.8160 -
2.6100 45500 1.0868 1.3467 0.8134 -
2.6387 46000 1.1829 1.3448 0.8095 -
2.6674 46500 1.1077 1.4623 0.8056 -
2.6960 47000 1.0832 1.4492 0.8156 -
2.7247 47500 1.1232 1.4450 0.8086 -
2.7534 48000 1.1361 1.3286 0.8257 -
2.7821 48500 1.0833 1.3992 0.8128 -
2.8108 49000 1.0762 1.3608 0.8170 -
2.8394 49500 1.0488 1.3706 0.8034 -
2.8681 50000 1.0635 1.3795 0.7940 -
2.8968 50500 1.0864 1.4441 0.8105 -
2.9255 51000 1.0826 1.4043 0.8022 -
2.9542 51500 1.0417 1.4268 0.8029 -
2.9828 52000 1.016 1.3703 0.8037 -
3.0115 52500 0.9401 1.4121 0.8042 -
3.0402 53000 0.8011 1.3880 0.7993 -
3.0689 53500 0.8216 1.3835 0.7995 -
3.0976 54000 0.8117 1.3809 0.8003 -
3.1263 54500 0.837 1.3512 0.8032 -
3.1549 55000 0.8256 1.3367 0.8117 -
3.1836 55500 0.8347 1.3854 0.7994 -
3.2123 56000 0.8285 1.3948 0.7833 -
3.2410 56500 0.8318 1.4231 0.7792 -
3.2697 57000 0.8414 1.4341 0.7720 -
3.2983 57500 0.7978 1.3501 0.7851 -
3.3270 58000 0.8374 1.3984 0.7787 -
3.3557 58500 0.8594 1.4647 0.7812 -
3.3844 59000 0.8458 1.4336 0.7758 -
3.4131 59500 0.8037 1.3944 0.7833 -
3.4417 60000 0.769 1.4044 0.7844 -
3.4704 60500 0.8258 1.3334 0.7725 -
3.4991 61000 0.8062 1.4497 0.7795 -
3.5278 61500 0.7956 1.3636 0.7869 -
3.5565 62000 0.862 1.3716 0.7891 -
3.5852 62500 0.8563 1.4215 0.7891 -
3.6138 63000 0.8313 1.3704 0.7939 -
3.6425 63500 0.9683 1.4436 0.7888 -
3.6712 64000 0.9055 1.4131 0.7905 -
3.6999 64500 0.842 1.4499 0.7862 -
3.7286 65000 0.8213 1.3862 0.8044 -
3.7572 65500 0.9589 1.3736 0.7886 -
3.7859 66000 0.8708 1.4274 0.7712 -
3.8146 66500 0.8578 1.3912 0.7696 -
3.8433 67000 0.873 1.4481 0.7865 -
3.8720 67500 0.8429 1.4216 0.7892 -
3.9006 68000 0.8215 1.3929 0.7576 -
3.9293 68500 0.7798 1.4538 0.7569 -
3.9580 69000 0.7911 1.4156 0.7859 -
3.9867 69500 0.8025 1.4104 0.7734 -
4.0154 70000 0.7144 1.4634 0.7706 -
4.0441 70500 0.6361 1.4545 0.7732 -
4.0727 71000 0.6433 1.4451 0.7734 -
4.1014 71500 0.6312 1.4321 0.7625 -
4.1301 72000 0.6169 1.4719 0.7749 -
4.1588 72500 0.6817 1.4117 0.7842 -
4.1875 73000 0.6209 1.4916 0.7582 -
4.2161 73500 0.645 1.4343 0.7686 -
4.2448 74000 0.6219 1.4041 0.7811 -
4.2735 74500 0.5946 1.4836 0.7688 -
4.3022 75000 0.625 1.4116 0.7675 -
4.3309 75500 0.6402 1.4092 0.7679 -
4.3595 76000 0.6537 1.4389 0.7753 -
4.3882 76500 0.6529 1.4040 0.7746 -
4.4169 77000 0.6648 1.4266 0.7773 -
4.4456 77500 0.6299 1.4609 0.7708 -
4.4743 78000 0.6426 1.4726 0.7470 -
4.5030 78500 0.6468 1.4197 0.7700 -
4.5316 79000 0.639 1.3696 0.7590 -
4.5603 79500 0.6359 1.4427 0.7592 -
4.5890 80000 0.6496 1.3982 0.7587 -
4.6177 80500 0.6946 1.4384 0.7640 -
4.6464 81000 0.6609 1.4581 0.7650 -
4.6750 81500 0.6488 1.4007 0.7689 -
4.7037 82000 0.6584 1.3845 0.7729 -
4.7324 82500 0.6143 1.4110 0.7556 -
4.7611 83000 0.6226 1.4088 0.7568 -
4.7898 83500 0.6351 1.3596 0.7580 -
4.8184 84000 0.6427 1.3896 0.7652 -
4.8471 84500 0.6657 1.4087 0.7523 -
4.8758 85000 0.6768 1.4284 0.7508 -
4.9045 85500 0.6685 1.4374 0.7560 -
4.9332 86000 0.647 1.3814 0.7611 -
4.9619 86500 0.625 1.4617 0.7552 -
4.9905 87000 0.627 1.4735 0.7452 -
5.0192 87500 0.5423 1.5290 0.7403 -
5.0479 88000 0.5088 1.3569 0.7596 -
5.0766 88500 0.5126 1.4418 0.7657 -
5.1053 89000 0.5021 1.3692 0.7591 -
5.1339 89500 0.4965 1.3838 0.7532 -
5.1626 90000 0.4897 1.3873 0.7635 -
5.1913 90500 0.5253 1.4022 0.7538 -
5.2200 91000 0.4859 1.3879 0.7645 -
5.2487 91500 0.481 1.4570 0.7545 -
5.2773 92000 0.5361 1.3843 0.7576 -
5.3060 92500 0.4917 1.4077 0.7509 -
5.3347 93000 0.5417 1.4428 0.7522 -
5.3634 93500 0.5235 1.3454 0.7616 -
5.3921 94000 0.5352 1.4935 0.7463 -
5.4208 94500 0.5046 1.4337 0.7622 -
5.4494 95000 0.5446 1.4203 0.7581 -
5.4781 95500 0.5185 1.3561 0.7648 -
5.5068 96000 0.5157 1.3719 0.7664 -
5.5355 96500 0.5284 1.4197 0.7620 -
5.5642 97000 0.5203 1.4400 0.7461 -
5.5928 97500 0.5004 1.4245 0.7488 -
5.6215 98000 0.5125 1.4085 0.7532 -
5.6502 98500 0.509 1.3631 0.7241 -
5.6789 99000 0.512 1.3855 0.7420 -
5.7076 99500 0.5208 1.3507 0.7389 -
5.7362 100000 0.5348 1.3741 0.7426 -
5.7649 100500 0.4981 1.3742 0.7434 -
5.7936 101000 0.4911 1.4284 0.7357 -
5.8223 101500 0.5198 1.4075 0.7425 -
5.8510 102000 0.5051 1.4203 0.7461 -
5.8797 102500 0.5021 1.3820 0.7437 -
5.9083 103000 0.5322 1.3781 0.7390 -
5.9370 103500 0.5013 1.3651 0.7555 -
5.9657 104000 0.5596 1.4418 0.7395 -
5.9944 104500 0.5032 1.4456 0.7254 -
6.0231 105000 0.439 1.5053 0.7176 -
6.0517 105500 0.3857 1.4350 0.7378 -
6.0804 106000 0.3577 1.4328 0.7171 -
6.1091 106500 0.4147 1.3704 0.7352 -
6.1378 107000 0.392 1.3877 0.7454 -
6.1665 107500 0.3889 1.4204 0.7323 -
6.1951 108000 0.407 1.3918 0.7390 -
6.2238 108500 0.4371 1.3977 0.7471 -
6.2525 109000 0.4026 1.4101 0.7316 -
6.2812 109500 0.4274 1.3953 0.7051 -
6.3099 110000 0.4131 1.4413 0.7267 -
6.3386 110500 0.9697 3.2298 0.7471 -
6.3672 111000 1.4298 3.0441 0.7370 -
6.3959 111500 1.4607 2.9880 0.7238 -
6.4246 112000 1.4573 2.9814 0.7088 -
6.4533 112500 1.472 2.8932 0.7224 -
6.4820 113000 1.4724 2.9743 0.7097 -
6.5106 113500 1.4518 2.9786 0.7057 -
6.5393 114000 1.3914 2.9617 0.6845 -
6.5680 114500 1.3547 2.9814 0.7040 -
6.5967 115000 1.3411 2.9400 0.7066 -
6.6254 115500 1.39 2.9816 0.7048 -
6.6540 116000 1.3326 2.9411 0.7132 -
6.6827 116500 1.3337 2.8797 0.6924 -
6.7114 117000 1.3782 3.0356 0.7177 -
6.7401 117500 0.9712 1.4070 0.7304 -
6.7688 118000 0.4331 1.4761 0.7016 -
6.7975 118500 0.4031 1.4213 0.7309 -
6.8261 119000 0.4264 1.4299 0.6950 -
6.8548 119500 0.3823 1.4266 0.7068 -
6.8835 120000 0.3975 1.4301 0.6907 -
6.9122 120500 0.4112 1.4532 0.6967 -
6.9409 121000 0.4145 1.4440 0.7171 -
6.9695 121500 0.4133 1.4214 0.7120 -
6.9982 122000 0.3889 1.3826 0.7209 -
7.0269 122500 0.3498 1.4130 0.7032 -
7.0556 123000 0.3592 1.3871 0.7060 -
7.0843 123500 0.324 1.4356 0.6964 -
7.1129 124000 0.3251 1.3874 0.7183 -
7.1416 124500 0.3473 1.4526 0.7063 -
7.1703 125000 0.3474 1.4215 0.7202 -
7.1990 125500 0.3367 1.5467 0.7001 -
7.2277 126000 0.3552 1.4116 0.7082 -
7.2564 126500 0.3106 1.4911 0.6844 -
7.2850 127000 0.3229 1.4940 0.6790 -
7.3137 127500 0.3281 1.4836 0.6900 -
7.3424 128000 0.325 1.4591 0.6882 -
7.3711 128500 0.3486 1.4907 0.7000 -
7.3998 129000 0.3432 1.4869 0.6770 -
7.4284 129500 0.34 1.4619 0.6886 -
7.4571 130000 0.3309 1.4884 0.6911 -
7.4858 130500 0.3391 1.4638 0.6868 -
7.5145 131000 0.3579 1.4425 0.7019 -
7.5432 131500 0.3261 1.4337 0.7004 -
7.5718 132000 0.3319 1.4724 0.6950 -
7.6005 132500 0.322 1.4390 0.7111 -
7.6292 133000 0.3961 1.4350 0.7082 -
7.6579 133500 0.3332 1.4276 0.7040 -
7.6866 134000 0.3773 1.4000 0.7102 -
7.7153 134500 0.3533 1.3680 0.7158 -
7.7439 135000 0.3403 1.4344 0.7351 -
7.7726 135500 0.3292 1.4417 0.7141 -
7.8013 136000 0.3444 1.4834 0.7175 -
7.8300 136500 0.3333 1.4475 0.7148 -
7.8587 137000 0.3219 1.5042 0.7096 -
7.8873 137500 0.3235 1.4297 0.7155 -
7.9160 138000 0.3382 1.4324 0.6983 -
7.9447 138500 0.3423 1.4360 0.6982 -
7.9734 139000 0.329 1.4325 0.6985 -
8.0021 139500 0.3323 1.4369 0.6963 -
8.0307 140000 0.2697 1.4855 0.7000 -
8.0594 140500 0.2602 1.4832 0.6916 -
8.0881 141000 0.2797 1.4846 0.7020 -
8.1168 141500 0.2794 1.4313 0.7004 -
8.1455 142000 0.2707 1.3881 0.7091 -
8.1742 142500 0.265 1.4229 0.7040 -
8.2028 143000 0.2594 1.4730 0.6874 -
8.2315 143500 0.2837 1.4256 0.6865 -
8.2602 144000 0.2851 1.4146 0.7036 -
8.2889 144500 0.2931 1.4502 0.6793 -
8.3176 145000 0.2715 1.4532 0.6775 -
8.3462 145500 0.2727 1.3900 0.7078 -
8.3749 146000 0.2719 1.3988 0.6948 -
8.4036 146500 0.2727 1.4218 0.6851 -
8.4323 147000 0.2643 1.4021 0.6888 -
8.4610 147500 0.2791 1.4483 0.6911 -
8.4896 148000 0.3177 1.4896 0.6745 -
8.5183 148500 0.3015 1.4526 0.6925 -
8.5470 149000 0.2851 1.4712 0.6938 -
8.5757 149500 0.2856 1.4443 0.6721 -
8.6044 150000 0.2523 1.4120 0.6756 -
8.6331 150500 0.2846 1.4410 0.7024 -
8.6617 151000 0.3001 1.4339 0.6762 -
8.6904 151500 0.2834 1.3906 0.7012 -
8.7191 152000 0.2838 1.3978 0.6902 -
8.7478 152500 0.2685 1.4554 0.6648 -
8.7765 153000 0.2632 1.4355 0.6953 -
8.8051 153500 0.2802 1.4225 0.6903 -
8.8338 154000 0.2659 1.4520 0.6762 -
8.8625 154500 0.2705 1.4594 0.6805 -
8.8912 155000 0.2893 1.4607 0.6811 -
8.9199 155500 0.2665 1.4272 0.6871 -
8.9485 156000 0.2593 1.4704 0.6788 -
8.9772 156500 0.2889 1.4628 0.6833 -
9.0059 157000 0.3095 1.5287 0.6839 -
9.0346 157500 0.2102 1.4937 0.6635 -
9.0633 158000 0.2281 1.4779 0.6709 -
9.0920 158500 0.2121 1.5082 0.6606 -
9.1206 159000 0.218 1.4729 0.6635 -
9.1493 159500 0.2376 1.4809 0.6668 -
9.1780 160000 0.2298 1.4782 0.6555 -
9.2067 160500 0.2426 1.4985 0.6794 -
9.2354 161000 0.2406 1.5425 0.6585 -
9.2640 161500 0.2165 1.5310 0.6624 -
9.2927 162000 0.2453 1.5199 0.6515 -
9.3214 162500 0.22 1.4485 0.6724 -
9.3501 163000 0.2159 1.5232 0.6505 -
9.3788 163500 0.2209 1.5175 0.6577 -
9.4074 164000 0.2226 1.4641 0.6742 -
9.4361 164500 0.2201 1.4779 0.6609 -
9.4648 165000 0.2204 1.5040 0.6653 -
9.4935 165500 0.2298 1.4994 0.6671 -
9.5222 166000 0.2415 1.5155 0.6610 -
9.5509 166500 0.2381 1.4781 0.6704 -
9.5795 167000 0.2318 1.4648 0.6551 -
9.6082 167500 0.2278 1.4846 0.6539 -
9.6369 168000 0.2245 1.4535 0.6645 -
9.6656 168500 0.2277 1.4760 0.6800 -
9.6943 169000 0.2152 1.4372 0.6724 -
9.7229 169500 0.2389 1.4583 0.6555 -
9.7516 170000 0.2229 1.4446 0.6619 -
9.7803 170500 0.246 1.4573 0.6435 -
9.8090 171000 0.2259 1.4830 0.6577 -
9.8377 171500 0.2104 1.4652 0.6518 -
9.8663 172000 0.2349 1.4833 0.6492 -
9.8950 172500 0.2139 1.4486 0.6749 -
9.9237 173000 0.2128 1.4969 0.6594 -
9.9524 173500 0.2209 1.4962 0.6539 -
9.9811 174000 0.223 1.5008 0.6706 -
10.0098 174500 0.194 1.5453 0.6578 -
10.0384 175000 0.1937 1.5244 0.6698 -
10.0671 175500 0.1893 1.5554 0.6551 -
10.0958 176000 0.1981 1.5355 0.6606 -
10.1245 176500 0.2051 1.5436 0.6501 -
10.1532 177000 0.2045 1.5270 0.6738 -
10.1818 177500 0.1821 1.5228 0.6604 -
10.2105 178000 0.1953 1.5424 0.6763 -
10.2392 178500 0.1872 1.5510 0.6620 -
10.2679 179000 0.2022 1.5117 0.6694 -
10.2966 179500 0.18 1.4946 0.6693 -
10.3252 180000 0.2026 1.5164 0.6580 -
10.3539 180500 0.2018 1.5015 0.6486 -
10.3826 181000 0.2184 1.5314 0.6388 -
10.4113 181500 0.1921 1.4772 0.6574 -
10.4400 182000 0.2074 1.4927 0.6555 -
10.4687 182500 0.1785 1.4927 0.6465 -
10.4973 183000 0.1688 1.4810 0.6602 -
10.5260 183500 0.1724 1.5047 0.6662 -
10.5547 184000 0.1741 1.5367 0.6549 -
10.5834 184500 0.1812 1.5166 0.6570 -
10.6121 185000 0.1869 1.5155 0.6492 -
10.6407 185500 0.1969 1.5284 0.6466 -
10.6694 186000 0.1883 1.4915 0.6733 -
10.6981 186500 0.1874 1.4977 0.6642 -
10.7268 187000 0.1914 1.4691 0.6627 -
10.7555 187500 0.1827 1.4595 0.6637 -
10.7841 188000 0.197 1.4824 0.6610 -
10.8128 188500 0.181 1.4731 0.6520 -
10.8415 189000 0.1964 1.4987 0.6540 -
10.8702 189500 0.1855 1.5029 0.6496 -
10.8989 190000 0.183 1.5363 0.6454 -
10.9276 190500 0.1881 1.5226 0.6651 -
10.9562 191000 0.1825 1.5043 0.6434 -
10.9849 191500 0.2019 1.4725 0.6582 -
11.0136 192000 0.1438 1.5152 0.6437 -
11.0423 192500 0.1464 1.4943 0.6388 -
11.0710 193000 0.1705 1.5132 0.6454 -
11.0996 193500 0.1631 1.5132 0.6551 -
11.1283 194000 0.1768 1.5080 0.6595 -
11.1570 194500 0.1477 1.5361 0.6460 -
11.1857 195000 0.184 1.4982 0.6514 -
11.2144 195500 0.1708 1.5617 0.6365 -
11.2430 196000 0.167 1.5113 0.6322 -
11.2717 196500 0.1607 1.5306 0.6305 -
11.3004 197000 0.1693 1.5225 0.6419 -
11.3291 197500 0.1613 1.5391 0.6309 -
11.3578 198000 0.1852 1.5269 0.6235 -
11.3865 198500 0.1533 1.5608 0.6388 -
11.4151 199000 0.1599 1.5506 0.6331 -
11.4438 199500 0.169 1.5540 0.6322 -
11.4725 200000 0.1523 1.5429 0.6306 -
11.5012 200500 0.1701 1.5451 0.6203 -
11.5299 201000 0.1647 1.5329 0.6218 -
11.5585 201500 0.1839 1.5192 0.6252 -
11.5872 202000 0.1767 1.5246 0.6336 -
11.6159 202500 0.1527 1.5210 0.6286 -
11.6446 203000 0.1497 1.5556 0.6316 -
11.6733 203500 0.1529 1.5994 0.6194 -
11.7019 204000 0.1568 1.5244 0.6249 -
11.7306 204500 0.1665 1.5081 0.6386 -
11.7593 205000 0.1633 1.5250 0.6336 -
11.7880 205500 0.1405 1.5075 0.6298 -
11.8167 206000 0.161 1.5371 0.6249 -
11.8454 206500 0.1586 1.5500 0.6354 -
11.8740 207000 0.1432 1.5284 0.6338 -
11.9027 207500 0.1354 1.5602 0.6346 -
11.9314 208000 0.1742 1.5325 0.6387 -
11.9601 208500 0.1546 1.5484 0.6351 -
11.9888 209000 0.1384 1.5627 0.6267 -
12.0174 209500 0.1422 1.5397 0.6369 -
12.0461 210000 0.1331 1.5993 0.6195 -
12.0748 210500 0.1447 1.6290 0.6175 -
12.1035 211000 0.1415 1.6163 0.6189 -
12.1322 211500 0.1379 1.5928 0.6192 -
12.1608 212000 0.14 1.6243 0.6019 -
12.1895 212500 0.1507 1.5876 0.6104 -
12.2182 213000 0.1257 1.5566 0.6150 -
12.2469 213500 0.1327 1.5573 0.6239 -
12.2756 214000 0.129 1.5612 0.6219 -
12.3043 214500 0.133 1.5828 0.6237 -
12.3329 215000 0.1374 1.5436 0.6276 -
12.3616 215500 0.1458 1.5864 0.6240 -
12.3903 216000 0.1364 1.6091 0.6191 -
12.4190 216500 0.1403 1.5761 0.6275 -
12.4477 217000 0.1459 1.5579 0.6373 -
12.4763 217500 0.1404 1.5792 0.6264 -
12.5050 218000 0.1496 1.5667 0.6222 -
12.5337 218500 0.1353 1.5411 0.6303 -
12.5624 219000 0.1325 1.5999 0.6128 -
12.5911 219500 0.1284 1.5736 0.6277 -
12.6197 220000 0.1618 1.5806 0.6223 -
12.6484 220500 0.13 1.5894 0.6258 -
12.6771 221000 0.1524 1.5816 0.6242 -
12.7058 221500 0.1372 1.6060 0.6098 -
12.7345 222000 0.1413 1.5833 0.6182 -
12.7632 222500 0.1332 1.6123 0.6044 -
12.7918 223000 0.1419 1.5952 0.6133 -
12.8205 223500 0.1294 1.6072 0.6172 -
12.8492 224000 0.1217 1.6113 0.6201 -
12.8779 224500 0.1282 1.5796 0.6298 -
12.9066 225000 0.1368 1.5873 0.6186 -
12.9352 225500 0.1366 1.5937 0.6183 -
12.9639 226000 0.126 1.5977 0.6112 -
12.9926 226500 0.1455 1.5434 0.6130 -
13.0213 227000 0.1158 1.5835 0.6062 -
13.0500 227500 0.1173 1.5982 0.6068 -
13.0786 228000 0.1227 1.5917 0.6023 -
13.1073 228500 0.1261 1.6078 0.5983 -
13.1360 229000 0.1091 1.6149 0.6072 -
13.1647 229500 0.1394 1.5894 0.6118 -
13.1934 230000 0.1303 1.5938 0.6075 -
13.2221 230500 0.1284 1.5855 0.6138 -
13.2507 231000 0.1242 1.6000 0.6106 -
13.2794 231500 0.112 1.6341 0.6055 -
13.3081 232000 0.1188 1.6140 0.6008 -
13.3368 232500 0.1386 1.6054 0.5970 -
13.3655 233000 0.1122 1.5873 0.6058 -
13.3941 233500 0.1245 1.5915 0.6038 -
13.4228 234000 0.1055 1.5970 0.6061 -
13.4515 234500 0.1184 1.5804 0.6127 -
13.4802 235000 0.1151 1.5959 0.6071 -
13.5089 235500 0.109 1.5995 0.6032 -
13.5375 236000 0.1154 1.5953 0.6065 -
13.5662 236500 0.1279 1.5881 0.6042 -
13.5949 237000 0.1238 1.5852 0.6022 -
13.6236 237500 0.1249 1.6056 0.6069 -
13.6523 238000 0.1258 1.6175 0.5998 -
13.6809 238500 0.1151 1.6109 0.6029 -
13.7096 239000 0.1276 1.6139 0.6011 -
13.7383 239500 0.1151 1.6032 0.6002 -
13.7670 240000 0.1291 1.5745 0.6055 -
13.7957 240500 0.1225 1.6236 0.6009 -
13.8244 241000 0.1088 1.6303 0.5968 -
13.8530 241500 0.1121 1.6098 0.6028 -
13.8817 242000 0.1235 1.5949 0.6014 -
13.9104 242500 0.113 1.6113 0.6013 -
13.9391 243000 0.1241 1.5945 0.6018 -
13.9678 243500 0.115 1.5894 0.6051 -
13.9964 244000 0.1219 1.5866 0.6074 -
14.0251 244500 0.1069 1.6148 0.6028 -
14.0538 245000 0.1145 1.6099 0.5967 -
14.0825 245500 0.1051 1.6074 0.6007 -
14.1112 246000 0.1069 1.6249 0.5948 -
14.1398 246500 0.1077 1.6126 0.5956 -
14.1685 247000 0.0948 1.6076 0.6037 -
14.1972 247500 0.1157 1.6284 0.5976 -
14.2259 248000 0.1196 1.6390 0.5979 -
14.2546 248500 0.1139 1.6163 0.5997 -
14.2833 249000 0.1165 1.6112 0.5975 -
14.3119 249500 0.1222 1.6213 0.5978 -
14.3406 250000 0.0947 1.6392 0.5958 -
14.3693 250500 0.0986 1.6357 0.5956 -
14.3980 251000 0.1102 1.6300 0.6016 -
14.4267 251500 0.1083 1.6390 0.5959 -
14.4553 252000 0.1147 1.6280 0.5976 -
14.4840 252500 0.0964 1.6362 0.5961 -
14.5127 253000 0.0904 1.6170 0.5964 -
14.5414 253500 0.1052 1.6171 0.5960 -
14.5701 254000 0.1064 1.6203 0.5971 -
14.5987 254500 0.0983 1.6127 0.5996 -
14.6274 255000 0.1118 1.6087 0.6014 -
14.6561 255500 0.1058 1.6164 0.6019 -
14.6848 256000 0.1135 1.6262 0.5986 -
14.7135 256500 0.1112 1.6155 0.6013 -
14.7422 257000 0.1097 1.6194 0.5994 -
14.7708 257500 0.1144 1.6188 0.5982 -
14.7995 258000 0.1026 1.6155 0.5984 -
14.8282 258500 0.0856 1.6180 0.5983 -
14.8569 259000 0.1061 1.6254 0.5977 -
14.8856 259500 0.1146 1.6255 0.5979 -
14.9142 260000 0.1067 1.6243 0.5978 -
14.9429 260500 0.1058 1.6253 0.5974 -
14.9716 261000 0.1163 1.6241 0.5974 -
-1 -1 - - - 0.5987
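A table of this length is easier to inspect programmatically. The following sketch parses a few rows copied verbatim from the log and reports the row with the highest sts-dev Spearman score among them; the dictionary keys and the helper names to_float and parse_row are assumptions made for this example.

```python
# A handful of rows copied from the training log above.
log_rows = """\
-1 -1 - - 0.5941 -
0.0287 500 10.355 4.0242 0.8005 -
0.3155 5500 2.1392 1.3047 0.8573 -
1.0038 17500 1.5559 1.2691 0.8415 -
6.3386 110500 0.9697 3.2298 0.7471 -
14.9716 261000 0.1163 1.6241 0.5974 -
"""


def to_float(cell):
    """Convert a table cell to float; '-' marks a missing value."""
    return None if cell == "-" else float(cell)


def parse_row(line):
    """Split one whitespace-separated log row into named fields."""
    epoch, step, train_loss, val_loss, sts_dev, sts_test = line.split()
    return {
        "epoch": float(epoch),
        "step": int(step),
        "training_loss": to_float(train_loss),
        "validation_loss": to_float(val_loss),
        "sts_dev_spearman": to_float(sts_dev),
        "sts_test_spearman": to_float(sts_test),
    }


rows = [parse_row(line) for line in log_rows.splitlines() if line.strip()]
scored = [row for row in rows if row["sts_dev_spearman"] is not None]
best = max(scored, key=lambda row: row["sts_dev_spearman"])
print(best["step"], best["sts_dev_spearman"])
```

Pasting the full table into log_rows would apply the same search to every logged step.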

Framework Versions

  • Python: 3.13.0
  • Sentence Transformers: 5.1.2
  • Transformers: 4.57.1
  • PyTorch: 2.9.1+cu128
  • Accelerate: 1.11.0
  • Datasets: 4.4.1
  • Tokenizers: 0.22.1

Citation

BibTeX

Sentence Transformers

@inproceedings{reimers-2019-sentence-bert,
    title = "Sentence-BERT: Sentence Embeddings using Siamese BERT-Networks",
    author = "Reimers, Nils and Gurevych, Iryna",
    booktitle = "Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing",
    month = "11",
    year = "2019",
    publisher = "Association for Computational Linguistics",
    url = "https://arxiv.org/abs/1908.10084",
}

MatryoshkaLoss

@misc{kusupati2024matryoshka,
    title={Matryoshka Representation Learning},
    author={Aditya Kusupati and Gantavya Bhatt and Aniket Rege and Matthew Wallingford and Aditya Sinha and Vivek Ramanujan and William Howard-Snyder and Kaifeng Chen and Sham Kakade and Prateek Jain and Ali Farhadi},
    year={2024},
    eprint={2205.13147},
    archivePrefix={arXiv},
    primaryClass={cs.LG}
}

MultipleNegativesRankingLoss

@misc{henderson2017efficient,
    title={Efficient Natural Language Response Suggestion for Smart Reply},
    author={Matthew Henderson and Rami Al-Rfou and Brian Strope and Yun-hsuan Sung and Laszlo Lukacs and Ruiqi Guo and Sanjiv Kumar and Balint Miklos and Ray Kurzweil},
    year={2017},
    eprint={1705.00652},
    archivePrefix={arXiv},
    primaryClass={cs.CL}
}