SentenceTransformer based on sentence-transformers/LaBSE

This is a sentence-transformers model finetuned from sentence-transformers/LaBSE. It maps sentences & paragraphs to a 768-dimensional dense vector space and can be used for semantic textual similarity, semantic search, paraphrase mining, text classification, clustering, and more.

Model Details

Model Description

  • Model Type: Sentence Transformer
  • Base model: sentence-transformers/LaBSE
  • Maximum Sequence Length: 256 tokens
  • Output Dimensionality: 768 dimensions
  • Similarity Function: Cosine Similarity

Model Sources

Full Model Architecture

SentenceTransformer(
  (0): Transformer({'max_seq_length': 256, 'do_lower_case': False, 'architecture': 'BertModel'})
  (1): Pooling({'word_embedding_dimension': 768, 'pooling_mode_cls_token': True, 'pooling_mode_mean_tokens': False, 'pooling_mode_max_tokens': False, 'pooling_mode_mean_sqrt_len_tokens': False, 'pooling_mode_weightedmean_tokens': False, 'pooling_mode_lasttoken': False, 'include_prompt': True})
  (2): Dense({'in_features': 768, 'out_features': 768, 'bias': True, 'activation_function': 'torch.nn.modules.activation.Tanh'})
  (3): Normalize()
)
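
For reference, an equivalent module stack can be assembled by hand from the sentence_transformers.models building blocks. This is a minimal sketch only: the released checkpoint already contains the trained Dense weights, so in practice the model should be loaded by its Hub ID as shown under Usage below.

from torch import nn
from sentence_transformers import SentenceTransformer, models

# LaBSE BERT backbone, truncating inputs at 256 tokens
word_embedding_model = models.Transformer("sentence-transformers/LaBSE", max_seq_length=256)

# CLS-token pooling (pooling_mode_cls_token=True above)
pooling = models.Pooling(word_embedding_model.get_word_embedding_dimension(), pooling_mode="cls")

# 768 -> 768 projection with Tanh activation, followed by L2 normalization
dense = models.Dense(in_features=768, out_features=768, bias=True, activation_function=nn.Tanh())
normalize = models.Normalize()

model = SentenceTransformer(modules=[word_embedding_model, pooling, dense, normalize])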

Usage

Direct Usage (Sentence Transformers)

First install the Sentence Transformers library:

pip install -U sentence-transformers

Then you can load this model and run inference.

from sentence_transformers import SentenceTransformer

# Download from the 🤗 Hub
model = SentenceTransformer("sentence_transformers_model_id")
# Run inference
sentences = [
    "Word: a l o o h l| Context: I i   ' n i i   y a t s h l   p i p e - -   n i i   g y a ' a   w i l   s g i h l   p i p e   a l o o h l   h a ' n i i y o ' o x s x w .| Translation: And I hit the pipe-- I saw there was a pipe on the sink.",
    'Morpheme: h l | Gloss: CN',
    'Morpheme: i i | Gloss: CCNJ',
]
embeddings = model.encode(sentences)
print(embeddings.shape)
# [3, 768]

# Get the similarity scores for the embeddings
similarities = model.similarity(embeddings, embeddings)
print(similarities)
# tensor([[1.0000, 0.6834, 0.3528],
#         [0.6834, 1.0000, 0.4257],
#         [0.3528, 0.4257, 1.0000]])
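
Since the query side is a word in context and the candidates are morpheme-gloss strings, the model lends itself to retrieval. Below is a minimal sketch using sentence_transformers.util.semantic_search; the two candidate strings are taken from the example above and are not the full gloss inventory.

from sentence_transformers import SentenceTransformer, util

model = SentenceTransformer("CMU-Wav2Gloss/Gitksan-encoder-bsz128-e1k-bsz32-e1k")

# Query: a word in its sentence context; corpus: candidate morpheme-gloss entries
query = "Word: a l o o h l| Context: I i   ' n i i   y a t s h l   p i p e - -   n i i   g y a ' a   w i l   s g i h l   p i p e   a l o o h l   h a ' n i i y o ' o x s x w .| Translation: And I hit the pipe-- I saw there was a pipe on the sink."
corpus = [
    "Morpheme: h l | Gloss: CN",
    "Morpheme: i i | Gloss: CCNJ",
]

query_embedding = model.encode(query, convert_to_tensor=True)
corpus_embeddings = model.encode(corpus, convert_to_tensor=True)

# Rank candidates by cosine similarity (the embeddings are already L2-normalized)
hits = util.semantic_search(query_embedding, corpus_embeddings, top_k=2)[0]
for hit in hits:
    print(corpus[hit["corpus_id"]], round(hit["score"], 4))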

Evaluation

Metrics

Information Retrieval

  • Dataset: validation
  • Evaluated with main.IREvaluatorWithLogging
Metric                 Value
cosine_accuracy@1      0.7584
cosine_accuracy@3      0.8959
cosine_accuracy@5      0.9331
cosine_accuracy@10     0.9591
cosine_precision@1     0.7584
cosine_precision@3     0.3656
cosine_precision@5     0.2335
cosine_precision@10    0.1223
cosine_recall@1        0.6247
cosine_recall@3        0.8255
cosine_recall@5        0.8721
cosine_recall@10       0.9062
cosine_ndcg@10         0.8269
cosine_mrr@10          0.8325
cosine_map@100         0.7885
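
main.IREvaluatorWithLogging is a project-specific evaluator; the same cosine-similarity retrieval metrics can also be computed with the library's standard InformationRetrievalEvaluator. The sketch below uses a tiny hypothetical validation split purely to illustrate the expected inputs.

from sentence_transformers import SentenceTransformer
from sentence_transformers.evaluation import InformationRetrievalEvaluator

model = SentenceTransformer("CMU-Wav2Gloss/Gitksan-encoder-bsz128-e1k-bsz32-e1k")

# Hypothetical example data: query id -> word-in-context string,
# corpus id -> morpheme-gloss string, query id -> ids of relevant corpus entries
queries = {"q1": "Word: n e e d i i| Context: ...| Translation: ..."}  # abbreviated placeholder
corpus = {"d1": "Morpheme: n e e | Gloss: NEG", "d2": "Morpheme: i i | Gloss: CCNJ"}
relevant_docs = {"q1": {"d1"}}

evaluator = InformationRetrievalEvaluator(queries, corpus, relevant_docs, name="validation")
results = evaluator(model)
print(results["validation_cosine_ndcg@10"])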

Training Details

Training Dataset

Unnamed Dataset

  • Size: 429 training samples
  • Columns: sentence_0, sentence_1, and label
  • Approximate statistics based on the first 429 samples:
    • sentence_0: string; min: 40 tokens, mean: 84.96 tokens, max: 131 tokens
    • sentence_1: string; min: 11 tokens, mean: 14.11 tokens, max: 20 tokens
    • label: int, with the following class distribution:
    • 0: ~0.47%
    • 1: ~0.47%
    • 2: ~0.23%
    • 3: ~0.47%
    • 4: ~0.47%
    • 5: ~0.23%
    • 6: ~0.47%
    • 7: ~0.23%
    • 8: ~0.47%
    • 9: ~0.23%
    • 10: ~0.23%
    • 11: ~0.23%
    • 12: ~0.47%
    • 13: ~0.47%
    • 14: ~0.47%
    • 15: ~0.23%
    • 16: ~0.93%
    • 17: ~0.23%
    • 18: ~0.47%
    • 19: ~0.47%
    • 20: ~0.23%
    • 21: ~0.23%
    • 22: ~0.23%
    • 23: ~0.47%
    • 24: ~0.23%
    • 25: ~0.23%
    • 26: ~0.23%
    • 27: ~0.47%
    • 28: ~1.17%
    • 29: ~0.47%
    • 30: ~0.47%
    • 31: ~0.47%
    • 32: ~0.23%
    • 33: ~0.23%
    • 34: ~0.70%
    • 35: ~0.23%
    • 36: ~0.23%
    • 37: ~0.23%
    • 38: ~0.23%
    • 39: ~0.70%
    • 40: ~0.23%
    • 41: ~0.70%
    • 42: ~0.47%
    • 43: ~0.23%
    • 44: ~0.23%
    • 45: ~0.23%
    • 46: ~0.47%
    • 47: ~0.23%
    • 48: ~0.23%
    • 49: ~0.47%
    • 50: ~0.47%
    • 51: ~0.23%
    • 52: ~0.23%
    • 53: ~0.23%
    • 54: ~0.47%
    • 55: ~0.47%
    • 56: ~0.23%
    • 57: ~0.23%
    • 58: ~0.47%
    • 59: ~0.23%
    • 60: ~0.47%
    • 61: ~0.23%
    • 62: ~0.47%
    • 63: ~0.47%
    • 64: ~0.23%
    • 65: ~0.23%
    • 66: ~0.47%
    • 67: ~0.47%
    • 68: ~0.70%
    • 69: ~0.47%
    • 70: ~0.47%
    • 71: ~0.23%
    • 72: ~0.47%
    • 73: ~0.47%
    • 74: ~0.70%
    • 75: ~0.23%
    • 76: ~0.47%
    • 77: ~0.70%
    • 78: ~0.23%
    • 79: ~0.70%
    • 80: ~0.23%
    • 81: ~0.23%
    • 82: ~0.47%
    • 83: ~0.23%
    • 84: ~0.47%
    • 85: ~0.47%
    • 86: ~0.47%
    • 87: ~0.47%
    • 88: ~0.23%
    • 89: ~0.23%
    • 90: ~0.47%
    • 91: ~0.23%
    • 92: ~0.47%
    • 93: ~0.23%
    • 94: ~0.23%
    • 95: ~0.47%
    • 96: ~0.47%
    • 97: ~0.23%
    • 98: ~0.23%
    • 99: ~0.23%
    • 100: ~0.70%
    • 101: ~0.47%
    • 102: ~0.23%
    • 103: ~0.47%
    • 104: ~0.70%
    • 105: ~0.23%
    • 106: ~0.23%
    • 107: ~0.23%
    • 108: ~0.47%
    • 109: ~0.23%
    • 110: ~0.47%
    • 111: ~0.23%
    • 112: ~0.47%
    • 113: ~0.23%
    • 114: ~0.47%
    • 115: ~0.23%
    • 116: ~0.23%
    • 117: ~0.23%
    • 118: ~0.70%
    • 119: ~0.47%
    • 120: ~0.23%
    • 121: ~0.23%
    • 122: ~0.47%
    • 123: ~0.70%
    • 124: ~0.23%
    • 125: ~0.47%
    • 126: ~0.23%
    • 127: ~0.23%
    • 128: ~0.23%
    • 129: ~0.47%
    • 130: ~0.23%
    • 131: ~0.70%
    • 132: ~0.47%
    • 133: ~0.23%
    • 134: ~0.23%
    • 135: ~0.47%
    • 136: ~0.23%
    • 137: ~0.23%
    • 138: ~0.47%
    • 139: ~0.23%
    • 140: ~0.47%
    • 141: ~0.23%
    • 142: ~0.23%
    • 143: ~0.47%
    • 144: ~0.23%
    • 145: ~0.70%
    • 146: ~0.93%
    • 147: ~0.47%
    • 148: ~0.23%
    • 149: ~0.47%
    • 150: ~0.47%
    • 151: ~0.47%
    • 152: ~0.23%
    • 153: ~0.47%
    • 154: ~0.47%
    • 155: ~0.23%
    • 156: ~0.23%
    • 157: ~0.47%
    • 158: ~0.47%
    • 159: ~0.23%
    • 160: ~0.23%
    • 161: ~0.70%
    • 162: ~0.23%
    • 163: ~0.23%
    • 164: ~0.47%
    • 165: ~0.47%
    • 166: ~0.93%
    • 167: ~0.23%
    • 168: ~0.47%
    • 169: ~0.70%
    • 170: ~0.23%
    • 171: ~0.23%
    • 172: ~0.47%
    • 173: ~0.23%
    • 174: ~0.47%
    • 175: ~0.70%
    • 176: ~0.23%
    • 177: ~0.23%
    • 178: ~0.23%
    • 179: ~0.47%
    • 180: ~0.47%
    • 181: ~0.47%
    • 182: ~0.23%
    • 183: ~0.23%
    • 184: ~0.47%
    • 185: ~0.23%
    • 186: ~0.23%
    • 187: ~0.70%
    • 188: ~0.70%
    • 189: ~0.23%
    • 190: ~0.47%
    • 191: ~0.23%
    • 192: ~0.23%
    • 193: ~0.70%
    • 194: ~0.23%
    • 195: ~0.23%
    • 196: ~0.47%
    • 197: ~0.23%
    • 198: ~0.47%
    • 199: ~0.47%
    • 200: ~0.23%
    • 201: ~0.23%
    • 202: ~0.23%
    • 203: ~0.47%
    • 204: ~0.47%
    • 205: ~0.23%
    • 206: ~0.47%
    • 207: ~0.23%
    • 208: ~0.23%
    • 209: ~0.47%
    • 210: ~0.70%
    • 211: ~0.47%
    • 212: ~0.47%
    • 213: ~0.47%
    • 214: ~0.23%
    • 215: ~0.23%
    • 216: ~0.47%
    • 217: ~0.47%
    • 218: ~0.23%
    • 219: ~0.23%
    • 220: ~0.23%
    • 221: ~0.23%
    • 222: ~0.23%
    • 223: ~0.70%
    • 224: ~0.23%
    • 225: ~0.47%
    • 226: ~0.47%
    • 227: ~0.23%
    • 228: ~0.70%
    • 229: ~0.47%
    • 230: ~0.47%
    • 231: ~0.23%
    • 232: ~0.70%
    • 233: ~0.70%
    • 234: ~0.47%
    • 235: ~0.23%
    • 236: ~0.23%
    • 237: ~0.23%
    • 238: ~0.23%
    • 239: ~0.47%
    • 240: ~0.47%
    • 241: ~0.23%
    • 242: ~0.93%
    • 243: ~0.47%
    • 244: ~0.23%
    • 245: ~0.70%
    • 246: ~0.23%
    • 247: ~0.70%
    • 248: ~0.47%
    • 249: ~0.23%
    • 250: ~0.47%
    • 251: ~0.23%
    • 252: ~0.23%
    • 253: ~0.23%
    • 254: ~0.23%
    • 255: ~0.47%
    • 256: ~0.47%
    • 257: ~0.70%
    • 258: ~0.23%
    • 259: ~0.23%
  • Samples:
    • sentence_0: Word: h l a g ̲ o o k ̲| Context: I i h l a g ̲ o o k ̲ d i m h a ' w i ' y i i k y ' a a i s x w i ' y g ̲ o o h l w i l p x s e e k ̲ .| Translation: And before I went home I had a short pee in the bathroom.
      sentence_1: Morpheme: g ̲ o o k ̲ | Gloss: first
      label: 90
    • sentence_0: Word: x s a ' a k ̲ x w i ' y| Context: H l a a x s a ' a k ̲ x w i ' y ' n i i g ̲ a y o o t s ' i m i l t ' a a h l i h l j a b i ' y g ̲ o o h l t s ' i m w i l p x s e e k ̲ .| Translation: When I made it out, then I put what I had done (the rubble) back in the bathroom.
      sentence_1: Morpheme: x s i | Gloss: out
      label: 228
    • sentence_0: Word: n e e d i i| Context: I i ' n a k w h l ' w i h l w i l i ' m , g w i l a ' l h l g ̲ a n u u t x w , g ̲ a n w i h l n e e d i i l a x ̲ ' n i s x w i ' y g ̲ o o h l G i g e e n i x .| Translation: And we were away a long time, three weeks, and that's why I didn't hear from Gigeenix.
      sentence_1: Morpheme: n e e | Gloss: NEG
      label: 67
  • Loss: main.LossLogger
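
A minimal sketch of how a dataset with this column layout could be assembled (the row below is taken from the samples above, with the context abbreviated; the card's loss, main.LossLogger, is project-specific and not reconstructed here):

from datasets import Dataset

# Column names must match the card: sentence_0 (word in context),
# sentence_1 (morpheme and gloss), label (integer class id)
train_dataset = Dataset.from_dict({
    "sentence_0": ["Word: n e e d i i| Context: ...| Translation: ..."],  # abbreviated; see the full sample above
    "sentence_1": ["Morpheme: n e e | Gloss: NEG"],
    "label": [67],
})
print(train_dataset)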

Training Hyperparameters

Non-Default Hyperparameters

  • eval_strategy: steps
  • per_device_train_batch_size: 128
  • per_device_eval_batch_size: 128
  • num_train_epochs: 1000
  • fp16: True
  • multi_dataset_batch_sampler: round_robin
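
These non-default settings correspond roughly to the following SentenceTransformerTrainingArguments, shown as a hedged sketch (output_dir is a hypothetical placeholder; the remaining values fall back to the defaults listed under All Hyperparameters):

from sentence_transformers import SentenceTransformerTrainingArguments

args = SentenceTransformerTrainingArguments(
    output_dir="gitksan-encoder",          # hypothetical placeholder
    eval_strategy="steps",
    per_device_train_batch_size=128,
    per_device_eval_batch_size=128,
    num_train_epochs=1000,
    fp16=True,
    multi_dataset_batch_sampler="round_robin",
)
# These arguments would then be passed to SentenceTransformerTrainer together
# with the training dataset and the card's custom main.LossLogger loss.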

All Hyperparameters

  • overwrite_output_dir: False
  • do_predict: False
  • eval_strategy: steps
  • prediction_loss_only: True
  • per_device_train_batch_size: 128
  • per_device_eval_batch_size: 128
  • per_gpu_train_batch_size: None
  • per_gpu_eval_batch_size: None
  • gradient_accumulation_steps: 1
  • eval_accumulation_steps: None
  • torch_empty_cache_steps: None
  • learning_rate: 5e-05
  • weight_decay: 0.0
  • adam_beta1: 0.9
  • adam_beta2: 0.999
  • adam_epsilon: 1e-08
  • max_grad_norm: 1
  • num_train_epochs: 1000
  • max_steps: -1
  • lr_scheduler_type: linear
  • lr_scheduler_kwargs: {}
  • warmup_ratio: 0.0
  • warmup_steps: 0
  • log_level: passive
  • log_level_replica: warning
  • log_on_each_node: True
  • logging_nan_inf_filter: True
  • save_safetensors: True
  • save_on_each_node: False
  • save_only_model: False
  • restore_callback_states_from_checkpoint: False
  • no_cuda: False
  • use_cpu: False
  • use_mps_device: False
  • seed: 42
  • data_seed: None
  • jit_mode_eval: False
  • use_ipex: False
  • bf16: False
  • fp16: True
  • fp16_opt_level: O1
  • half_precision_backend: auto
  • bf16_full_eval: False
  • fp16_full_eval: False
  • tf32: None
  • local_rank: 0
  • ddp_backend: None
  • tpu_num_cores: None
  • tpu_metrics_debug: False
  • debug: []
  • dataloader_drop_last: False
  • dataloader_num_workers: 0
  • dataloader_prefetch_factor: None
  • past_index: -1
  • disable_tqdm: False
  • remove_unused_columns: True
  • label_names: None
  • load_best_model_at_end: False
  • ignore_data_skip: False
  • fsdp: []
  • fsdp_min_num_params: 0
  • fsdp_config: {'min_num_params': 0, 'xla': False, 'xla_fsdp_v2': False, 'xla_fsdp_grad_ckpt': False}
  • fsdp_transformer_layer_cls_to_wrap: None
  • accelerator_config: {'split_batches': False, 'dispatch_batches': None, 'even_batches': True, 'use_seedable_sampler': True, 'non_blocking': False, 'gradient_accumulation_kwargs': None}
  • parallelism_config: None
  • deepspeed: None
  • label_smoothing_factor: 0.0
  • optim: adamw_torch_fused
  • optim_args: None
  • adafactor: False
  • group_by_length: False
  • length_column_name: length
  • ddp_find_unused_parameters: None
  • ddp_bucket_cap_mb: None
  • ddp_broadcast_buffers: False
  • dataloader_pin_memory: True
  • dataloader_persistent_workers: False
  • skip_memory_metrics: True
  • use_legacy_prediction_loop: False
  • push_to_hub: False
  • resume_from_checkpoint: None
  • hub_model_id: None
  • hub_strategy: every_save
  • hub_private_repo: None
  • hub_always_push: False
  • hub_revision: None
  • gradient_checkpointing: False
  • gradient_checkpointing_kwargs: None
  • include_inputs_for_metrics: False
  • include_for_metrics: []
  • eval_do_concat_batches: True
  • fp16_backend: auto
  • push_to_hub_model_id: None
  • push_to_hub_organization: None
  • mp_parameters:
  • auto_find_batch_size: False
  • full_determinism: False
  • torchdynamo: None
  • ray_scope: last
  • ddp_timeout: 1800
  • torch_compile: False
  • torch_compile_backend: None
  • torch_compile_mode: None
  • include_tokens_per_second: False
  • include_num_input_tokens_seen: False
  • neftune_noise_alpha: None
  • optim_target_modules: None
  • batch_eval_metrics: False
  • eval_on_start: False
  • use_liger_kernel: False
  • liger_kernel_config: None
  • eval_use_gather_object: False
  • average_tokens_across_devices: False
  • prompts: None
  • batch_sampler: batch_sampler
  • multi_dataset_batch_sampler: round_robin
  • router_mapping: {}
  • learning_rate_mapping: {}

Training Logs

Epoch Step Training Loss validation_cosine_ndcg@10
1.0 4 - 0.0647
2.0 8 - 0.0761
3.0 12 - 0.1236
4.0 16 - 0.2528
5.0 20 - 0.3902
6.0 24 - 0.4764
7.0 28 - 0.5325
8.0 32 - 0.6067
9.0 36 - 0.6709
10.0 40 - 0.7043
11.0 44 - 0.7018
12.0 48 - 0.6915
13.0 52 - 0.7073
14.0 56 - 0.7310
15.0 60 - 0.7335
16.0 64 - 0.7389
17.0 68 - 0.7586
18.0 72 - 0.7615
19.0 76 - 0.7586
20.0 80 - 0.7472
21.0 84 - 0.7588
22.0 88 - 0.7641
23.0 92 - 0.7740
24.0 96 - 0.7633
25.0 100 - 0.7721
26.0 104 - 0.7669
27.0 108 - 0.7728
28.0 112 - 0.7868
29.0 116 - 0.7735
30.0 120 - 0.7829
31.0 124 - 0.7937
32.0 128 - 0.7902
33.0 132 - 0.7656
34.0 136 - 0.7838
35.0 140 - 0.7821
36.0 144 - 0.7871
37.0 148 - 0.7869
38.0 152 - 0.7920
39.0 156 - 0.7905
40.0 160 - 0.7954
41.0 164 - 0.7966
42.0 168 - 0.7835
43.0 172 - 0.7800
44.0 176 - 0.8047
45.0 180 - 0.7990
46.0 184 - 0.7860
47.0 188 - 0.7891
48.0 192 - 0.7958
49.0 196 - 0.7813
50.0 200 - 0.7778
51.0 204 - 0.8001
52.0 208 - 0.7870
53.0 212 - 0.8027
54.0 216 - 0.7905
55.0 220 - 0.7827
56.0 224 - 0.8020
57.0 228 - 0.7919
58.0 232 - 0.7817
59.0 236 - 0.7994
60.0 240 - 0.8164
61.0 244 - 0.7788
62.0 248 - 0.7900
63.0 252 - 0.8173
64.0 256 - 0.7976
65.0 260 - 0.7905
66.0 264 - 0.7923
67.0 268 - 0.8071
68.0 272 - 0.7958
69.0 276 - 0.7871
70.0 280 - 0.8020
71.0 284 - 0.8103
72.0 288 - 0.8123
73.0 292 - 0.8118
74.0 296 - 0.7934
75.0 300 - 0.7882
76.0 304 - 0.8015
77.0 308 - 0.8201
78.0 312 - 0.8240
79.0 316 - 0.7994
80.0 320 - 0.8042
81.0 324 - 0.8114
82.0 328 - 0.8100
83.0 332 - 0.8041
84.0 336 - 0.8179
85.0 340 - 0.8197
86.0 344 - 0.7973
87.0 348 - 0.7985
88.0 352 - 0.8123
89.0 356 - 0.7997
90.0 360 - 0.8043
91.0 364 - 0.8057
92.0 368 - 0.7991
93.0 372 - 0.7983
94.0 376 - 0.8052
95.0 380 - 0.8026
96.0 384 - 0.8109
97.0 388 - 0.7929
98.0 392 - 0.8025
99.0 396 - 0.8218
100.0 400 - 0.8194
101.0 404 - 0.8023
102.0 408 - 0.8099
103.0 412 - 0.8110
104.0 416 - 0.8118
105.0 420 - 0.8004
106.0 424 - 0.8012
107.0 428 - 0.8070
108.0 432 - 0.8088
109.0 436 - 0.8073
110.0 440 - 0.8084
111.0 444 - 0.8038
112.0 448 - 0.8115
113.0 452 - 0.8169
114.0 456 - 0.8145
115.0 460 - 0.8020
116.0 464 - 0.7984
117.0 468 - 0.8077
118.0 472 - 0.8174
119.0 476 - 0.8200
120.0 480 - 0.8080
121.0 484 - 0.8093
122.0 488 - 0.8216
123.0 492 - 0.8240
124.0 496 - 0.8097
125.0 500 1.3195 0.8115
126.0 504 - 0.8176
127.0 508 - 0.8099
128.0 512 - 0.7977
129.0 516 - 0.7985
130.0 520 - 0.8015
131.0 524 - 0.8078
132.0 528 - 0.7985
133.0 532 - 0.8029
134.0 536 - 0.8087
135.0 540 - 0.8031
136.0 544 - 0.7999
137.0 548 - 0.8107
138.0 552 - 0.8110
139.0 556 - 0.7980
140.0 560 - 0.7977
141.0 564 - 0.8034
142.0 568 - 0.8053
143.0 572 - 0.7996
144.0 576 - 0.8014
145.0 580 - 0.8137
146.0 584 - 0.8221
147.0 588 - 0.8144
148.0 592 - 0.8020
149.0 596 - 0.7987
150.0 600 - 0.8029
151.0 604 - 0.8024
152.0 608 - 0.8045
153.0 612 - 0.8061
154.0 616 - 0.8035
155.0 620 - 0.8018
156.0 624 - 0.7992
157.0 628 - 0.8053
158.0 632 - 0.8134
159.0 636 - 0.8173
160.0 640 - 0.8118
161.0 644 - 0.8144
162.0 648 - 0.8145
163.0 652 - 0.8105
164.0 656 - 0.8011
165.0 660 - 0.8073
166.0 664 - 0.8111
167.0 668 - 0.8139
168.0 672 - 0.8030
169.0 676 - 0.8035
170.0 680 - 0.7993
171.0 684 - 0.8023
172.0 688 - 0.8081
173.0 692 - 0.8097
174.0 696 - 0.8060
175.0 700 - 0.8063
176.0 704 - 0.8114
177.0 708 - 0.8087
178.0 712 - 0.8090
179.0 716 - 0.8094
180.0 720 - 0.8071
181.0 724 - 0.8077
182.0 728 - 0.8108
183.0 732 - 0.8160
184.0 736 - 0.8099
185.0 740 - 0.8052
186.0 744 - 0.8103
187.0 748 - 0.8115
188.0 752 - 0.8105
189.0 756 - 0.8057
190.0 760 - 0.8157
191.0 764 - 0.8096
192.0 768 - 0.7998
193.0 772 - 0.8080
194.0 776 - 0.8207
195.0 780 - 0.8136
196.0 784 - 0.8029
197.0 788 - 0.8009
198.0 792 - 0.8150
199.0 796 - 0.8173
200.0 800 - 0.8070
201.0 804 - 0.8075
202.0 808 - 0.8164
203.0 812 - 0.8148
204.0 816 - 0.8077
205.0 820 - 0.8116
206.0 824 - 0.8148
207.0 828 - 0.8141
208.0 832 - 0.8085
209.0 836 - 0.8066
210.0 840 - 0.8154
211.0 844 - 0.8168
212.0 848 - 0.8132
213.0 852 - 0.8139
214.0 856 - 0.8200
215.0 860 - 0.8203
216.0 864 - 0.8100
217.0 868 - 0.8084
218.0 872 - 0.8115
219.0 876 - 0.8126
220.0 880 - 0.8126
221.0 884 - 0.8079
222.0 888 - 0.8101
223.0 892 - 0.8136
224.0 896 - 0.8124
225.0 900 - 0.8180
226.0 904 - 0.8173
227.0 908 - 0.8110
228.0 912 - 0.7991
229.0 916 - 0.8009
230.0 920 - 0.8096
231.0 924 - 0.8153
232.0 928 - 0.8177
233.0 932 - 0.8107
234.0 936 - 0.8066
235.0 940 - 0.8067
236.0 944 - 0.8198
237.0 948 - 0.8175
238.0 952 - 0.8077
239.0 956 - 0.8099
240.0 960 - 0.8073
241.0 964 - 0.8117
242.0 968 - 0.8148
243.0 972 - 0.8144
244.0 976 - 0.8050
245.0 980 - 0.8046
246.0 984 - 0.8107
247.0 988 - 0.8114
248.0 992 - 0.8065
249.0 996 - 0.8071
250.0 1000 1.081 0.8105
251.0 1004 - 0.8142
252.0 1008 - 0.8123
253.0 1012 - 0.8123
254.0 1016 - 0.8104
255.0 1020 - 0.8168
256.0 1024 - 0.8171
257.0 1028 - 0.8188
258.0 1032 - 0.8210
259.0 1036 - 0.8221
260.0 1040 - 0.8156
261.0 1044 - 0.8118
262.0 1048 - 0.8078
263.0 1052 - 0.8108
264.0 1056 - 0.8121
265.0 1060 - 0.8146
266.0 1064 - 0.8116
267.0 1068 - 0.8149
268.0 1072 - 0.8122
269.0 1076 - 0.8125
270.0 1080 - 0.8114
271.0 1084 - 0.8139
272.0 1088 - 0.8240
273.0 1092 - 0.8240
274.0 1096 - 0.8196
275.0 1100 - 0.8233
276.0 1104 - 0.8228
277.0 1108 - 0.8165
278.0 1112 - 0.8183
279.0 1116 - 0.8217
280.0 1120 - 0.8166
281.0 1124 - 0.8106
282.0 1128 - 0.8117
283.0 1132 - 0.8152
284.0 1136 - 0.8222
285.0 1140 - 0.8230
286.0 1144 - 0.8123
287.0 1148 - 0.8080
288.0 1152 - 0.8125
289.0 1156 - 0.8192
290.0 1160 - 0.8267
291.0 1164 - 0.8232
292.0 1168 - 0.8086
293.0 1172 - 0.8081
294.0 1176 - 0.8215
295.0 1180 - 0.8211
296.0 1184 - 0.8147
297.0 1188 - 0.8107
298.0 1192 - 0.8123
299.0 1196 - 0.8113
300.0 1200 - 0.8161
301.0 1204 - 0.8161
302.0 1208 - 0.8181
303.0 1212 - 0.8167
304.0 1216 - 0.8167
305.0 1220 - 0.8257
306.0 1224 - 0.8269

Framework Versions

  • Python: 3.11.4
  • Sentence Transformers: 5.1.1
  • Transformers: 4.56.2
  • PyTorch: 2.8.0+cu128
  • Accelerate: 1.10.1
  • Datasets: 4.1.1
  • Tokenizers: 0.22.1

Citation

BibTeX

Sentence Transformers

@inproceedings{reimers-2019-sentence-bert,
    title = "Sentence-BERT: Sentence Embeddings using Siamese BERT-Networks",
    author = "Reimers, Nils and Gurevych, Iryna",
    booktitle = "Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing",
    month = "11",
    year = "2019",
    publisher = "Association for Computational Linguistics",
    url = "https://arxiv.org/abs/1908.10084",
}