---
tags:
- sentence-transformers
- sentence-similarity
- feature-extraction
- dense
- generated_from_trainer
- dataset_size:1175405
- loss:CosineSimilarityLoss
base_model: BSC-LT/MrBERT-es
widget:
- source_sentence: El camino de Santiago articula la península ibérica con Europa.
sentences:
- Y un millon de euros y de pesetas tampoco son lo mismo.
- >-
Asimismo, en los montes puede haber matorral de coscoja y, también,
lentisco, romero, enebro o brezo.
- El país fue el noveno mayor importador de petróleo del mundo en 2013 .
- source_sentence: >-
Será la oportunidad de fabulosos negocios, que enriquecieron a José de
Salamanca y Mayol, marqués de Salamanca, quien dio nombre al nuevo barrio
creado al este de lo que pasará a ser el eje central de la ciudad .
sentences:
- Para terminar, como suelen hacer, el 'Free from desire', de Gala.
- >-
Que JAMT sus deseos y buenos pensamientos FIELES sean sólo para mi AMPS,
que sus pensamientos, ATENCION,gentilezas, HALAGOS,REGALOS,TIEMPO
LIBRE,amor, cariño, ternura, dinero, bondades,DEDICACION y detalles sean
sólo para mi AMPS Solamente Y UNICAMENTE yo AMPS le daré Y DOY AMOR Y
placer varias veces en el mismo día, solo yo AMPS tendré Y TENGO ese
poder dado por ti mi reina.
- >-
Esperamos con anhelo poder saludarte personalmente en breve. 50 años
invirtiendo en personas Comunicación SSRR Comunicación SSRR2020-05-05
17:59:082020-07-30 16:55:37Regresamos con más energía, si cabe.
- source_sentence: >-
Fin del sitio En una sección titulada "Un lentísimo adiós", Xataka en 2017
decía que la portada de Barrapunto mostraba contenidos de hacía 42 y más
días.
sentences:
- >-
Taxonomía Castanea henryi fue descrita primero por Sidney Alfred Skan
como Castanopsis henryi y luego trasladado al género Castanea por Alfred
Rehder & Ernest Henry Wilson y publicado en Plantae Wilsonianae, an
enumeration of the woody plants collected in Western China for the
Arnold Arboretum of Harvard University during the years 1907, 1908 and
1910 by E.H.
- >-
Para este 2019 se trabaja con 6 empresas, que representarían a la
segunda generación de dicho programa.
- Ya no está uno para estos trotes.
- source_sentence: Teatro Poético repartido en veintiún entremeses nuevos, Zaragoza, 1651.
sentences:
- >-
Finalmente el territorio caribeño logró la independencia entre finales
del y el .
- No es considerada fiable.
- La página se generó a las 19:58:53.
- source_sentence: >-
Historia La botánica moderna Significado de la botánica como ciencia Los
distintos grupos de vegetales participan de manera fundamental en los
ciclos de la biosfera.
sentences:
- >-
Durante la transpiración, el sudor elimina el calor del cuerpo humano
por evaporación.
- >-
El COPINH exige a las autoridades judiciales y fiscales proceder
judicialmente contra los alcaldes municipales, altos funcionarios de
SERNA, y contra las empresas y demás sectores involucrados en esta
agresión contra el pueblo lenca.
- >-
A nivel global, el artículo13 del Pacto Internacional de Derechos
Económicos, Sociales y Culturales de 1966 de las Naciones Unidas
reconoce el derecho de toda persona a la educación.
pipeline_tag: sentence-similarity
library_name: sentence-transformers
metrics:
- pearson_cosine
- spearman_cosine
model-index:
- name: SentenceTransformer based on BSC-LT/MrBERT-es
results:
- task:
type: semantic-similarity
name: Semantic Similarity
dataset:
name: STSES
type: stses
metrics:
- type: pearson_cosine
value: 0.752738
name: Pearson Cosine
- type: spearman_cosine
value: 0.716634
      name: Spearman Cosine
---
SentenceTransformer based on BSC-LT/MrBERT-es
This is a sentence-transformers model finetuned from BSC-LT/MrBERT-es. It maps sentences & paragraphs to a 768-dimensional dense vector space and can be used for semantic textual similarity, semantic search, paraphrase mining, text classification, clustering, and more.
About This Project
This model was trained using the Transformer Encoder Frankenstein framework - a config-driven training library and CLI for end-to-end NLP workflows.
The Frankenstein Transformer provides:
- Schema-driven configuration: Strict YAML schema validation for reproducible training
- Thermal stability controls: GPU temperature management for safe long-term training
- Advanced optimizer support: Multiple optimizer implementations (AdamW, AdaFactor, GaLore, Lion, Muon, Sophia, and more)
- SBERT workflows: Specialized sentence-embedding fine-tuning and inference tools
- Deployment artifact generation: Model quantization and deployment utilities
- Inference modes: Single text, batch, and benchmark inference capabilities
Visit the Transformer Encoder Frankenstein repository for more information, documentation, and usage examples.
Evaluation Results (STSES Dataset)
This model achieves the following results on the Spanish Semantic Textual Similarity Evaluation Set (STSES):
| Metric | Score |
|---|---|
| Pearson Cosine Similarity | 0.7527 |
| Spearman Cosine Similarity | 0.7166 |
| Manhattan Pearson | 0.7514 |
| Manhattan Spearman | 0.7162 |
| Euclidean Pearson | 0.7499 |
| Euclidean Spearman | 0.7166 |
| Main Score (Spearman Cosine) | 0.7166 |
| Evaluation Time | 1.15 seconds |
| Languages | Spanish (spa-Latn) |
| MTEB Version | 1.39.7 |
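The Pearson and Spearman scores above compare the model's predicted cosine similarities against the gold human similarity labels. As a minimal pure-Python sketch (evaluation frameworks such as MTEB compute these via `scipy.stats`; ties in the Spearman ranking are ignored here for brevity):

```python
# Pearson correlation: covariance of the two score lists normalized by
# their standard deviations. Spearman: Pearson correlation of the ranks.

def pearson(x, y):
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sum((a - mx) ** 2 for a in x) ** 0.5
    sy = sum((b - my) ** 2 for b in y) ** 0.5
    return cov / (sx * sy)

def ranks(x):
    # 1-based ranks; tie handling omitted for brevity.
    order = sorted(range(len(x)), key=lambda i: x[i])
    r = [0.0] * len(x)
    for rank, i in enumerate(order, start=1):
        r[i] = float(rank)
    return r

def spearman(x, y):
    return pearson(ranks(x), ranks(y))

predicted = [0.9, 0.1, 0.5, 0.7]   # model cosine similarities (toy values)
gold      = [1.0, 0.0, 0.4, 0.8]   # human similarity labels (toy values)
print(round(pearson(predicted, gold), 4))
print(round(spearman(predicted, gold), 4))  # ranks agree exactly -> ~1.0
```

Pearson is sensitive to the linear fit of the raw scores, while Spearman only checks whether the model orders pairs the same way the annotators do, which is why the two numbers in the table differ.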
Training Configuration
This model was trained using the following Frankenstein Transformer YAML configuration:
```yaml
base_model: BSC-LT/MrBERT-es
training:
  task: sbert
  switch_on_thermal: true
  gpu_temp_guard_enabled: true
  gpu_temp_resume_threshold_c: 75
  gpu_temp_pause_threshold_c: 85
  gpu_temp_critical_threshold_c: 88
  gpu_temp_poll_interval_seconds: 30
  telemetry_log_interval: 1
sbert:
  dataset_name: "erickfmm/agentlans__multilingual-sentences__paired_10_sts"
  dataset_type: paired_similarity
  columns:
    sentence1: sentence1
    sentence2: sentence2
    similarity: similarity
  output_dir: "./output/sbert_modernbert"
  batch_size: 512
  gradient_accumulation_steps: 1
  max_grad_norm: 2.0
  epochs: 10
  warmup_steps: 250
  evaluation_steps: 5000
  checkpoint_save_steps: 1000
  resume_from_checkpoint: true
  learning_rate: 1.6e-6
  max_train_samples: null
  max_eval_samples: 20000
  max_seq_length: 8192
  pooling_mode: mean
  use_amp: false
  resample_balanced: false
  resample_std: 0.3
  standardize_scores: true
```
Configuration Details
- Base Model: BSC-LT/MrBERT-es, a Spanish ModernBERT variant
- Task: Sentence-BERT (SBERT) fine-tuning for semantic similarity
- Thermal Management: Enabled with safeguards (pause at 85°C, resume at 75°C, critical at 88°C)
- Dataset: Multilingual sentence pairs with similarity scores
- Batch Size: 512 samples per batch
- Training Duration: 10 epochs
- Sequence Length: Up to 8,192 tokens (extended from standard 512)
- Learning Rate: 1.6e-6 (very low for stable fine-tuning)
- Pooling: Mean pooling over token embeddings
- Output Dimensionality: 768 dimensions
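The thermal-management behavior described above can be sketched as a small state machine. This is a hypothetical illustration of the pause/resume/critical logic implied by the YAML thresholds, not the actual Frankenstein Transformer implementation:

```python
# Thresholds mirror the YAML config: pause at 85C, resume at 75C,
# abort training entirely at the critical 88C threshold.
PAUSE_C, RESUME_C, CRITICAL_C = 85, 75, 88

def guard_step(temp_c: float, paused: bool) -> str:
    """Return the action for the current GPU temperature reading."""
    if temp_c >= CRITICAL_C:
        return "abort"          # critical threshold always wins
    if paused:
        # Hysteresis: only resume once the GPU has cooled well below
        # the pause threshold, to avoid rapid pause/resume flapping.
        return "resume" if temp_c <= RESUME_C else "stay_paused"
    return "pause" if temp_c >= PAUSE_C else "train"

print(guard_step(70, paused=False))  # train
print(guard_step(86, paused=False))  # pause
print(guard_step(80, paused=True))   # stay_paused
print(guard_step(74, paused=True))   # resume
print(guard_step(89, paused=True))   # abort
```

The 10-degree gap between the pause and resume thresholds provides hysteresis, so a GPU hovering near 85°C does not toggle training on and off every poll interval.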
Model Details
Model Description
- Model Type: Sentence Transformer
- Base model: BSC-LT/MrBERT-es
- Maximum Sequence Length: 8192 tokens
- Output Dimensionality: 768 dimensions
- Similarity Function: Cosine Similarity
- Dataset Size: 1,175,405 sentence pairs
- Loss Function: Cosine Similarity Loss
Model Sources
- Documentation: Sentence Transformers Documentation
- Repository: Sentence Transformers on GitHub
- Hugging Face: Sentence Transformers on Hugging Face
Full Model Architecture
```
SentenceTransformer(
  (0): Transformer({'max_seq_length': 8192, 'do_lower_case': False, 'architecture': 'ModernBertModel'})
  (1): Pooling({'word_embedding_dimension': 768, 'pooling_mode_cls_token': False, 'pooling_mode_mean_tokens': True, 'pooling_mode_max_tokens': False, 'pooling_mode_mean_sqrt_len_tokens': False, 'pooling_mode_weightedmean_tokens': False, 'pooling_mode_lasttoken': False, 'include_prompt': True})
  (2): Normalize()
)
```
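The Pooling and Normalize modules are simple to state precisely: masked mean pooling averages token embeddings while ignoring padding, and L2 normalization makes every sentence embedding unit-length. A NumPy sketch with toy tensors (not the model's real weights or outputs):

```python
import numpy as np

def mean_pool(token_embeddings, attention_mask):
    # token_embeddings: (batch, seq_len, dim); attention_mask: (batch, seq_len)
    mask = attention_mask[..., None].astype(token_embeddings.dtype)
    summed = (token_embeddings * mask).sum(axis=1)
    counts = np.clip(mask.sum(axis=1), 1e-9, None)  # avoid divide-by-zero
    return summed / counts

def l2_normalize(x):
    return x / np.linalg.norm(x, axis=-1, keepdims=True)

tokens = np.array([[[1.0, 0.0], [3.0, 4.0], [9.0, 9.0]]])  # (1, 3, 2)
mask = np.array([[1, 1, 0]])        # third token is padding and is ignored
pooled = mean_pool(tokens, mask)    # -> [[2.0, 2.0]]
embedding = l2_normalize(pooled)    # unit-length sentence embedding
print(pooled)
print(np.linalg.norm(embedding, axis=-1))  # -> [1.]
```

Because the final module normalizes embeddings, cosine similarity between two sentences reduces to a plain dot product.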
Usage
Direct Usage (Sentence Transformers)
First install the Sentence Transformers library:
```bash
pip install -U sentence-transformers
```
Then you can load this model and run inference:

```python
from sentence_transformers import SentenceTransformer

# Download from the 🤗 Hub
model = SentenceTransformer("sentence_transformers_model_id")

# Run inference
sentences = [
    'Historia La botánica moderna Significado de la botánica como ciencia Los distintos grupos de vegetales participan de manera fundamental en los ciclos de la biosfera.',
    'El COPINH exige a las autoridades judiciales y fiscales proceder judicialmente contra los alcaldes municipales, altos funcionarios de SERNA, y contra las empresas y demás sectores involucrados en esta agresión contra el pueblo lenca.',
    'Durante la transpiración, el sudor elimina el calor del cuerpo humano por evaporación.',
]
embeddings = model.encode(sentences)
print(embeddings.shape)
# [3, 768]

# Get the similarity scores for the embeddings
similarities = model.similarity(embeddings, embeddings)
print(similarities)
# tensor([[1.0000, 0.2126, 0.2099],
#         [0.2126, 1.0000, 0.0278],
#         [0.2099, 0.0278, 1.0000]])
```
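Since this model's similarity function is cosine similarity and the architecture ends in a Normalize() module, `model.similarity` amounts to a dot product between unit-length embedding rows. A NumPy stand-in using random unit vectors (illustration only, not this model's actual embeddings):

```python
import numpy as np

# Three random 768-dimensional vectors, L2-normalized to mimic the
# model's normalized sentence embeddings.
rng = np.random.default_rng(0)
embeddings = rng.normal(size=(3, 768))
embeddings /= np.linalg.norm(embeddings, axis=1, keepdims=True)

# Pairwise cosine similarity: a symmetric (3, 3) matrix with ones on
# the diagonal (each sentence is maximally similar to itself).
similarities = embeddings @ embeddings.T
print(similarities.shape)  # (3, 3)
```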
Evaluation
Metrics
Semantic Similarity
- Dataset: sts_eval
- Evaluated with EmbeddingSimilarityEvaluator
| Metric | Value |
|---|---|
| pearson_cosine | 0.4611 |
| spearman_cosine | 0.2749 |
Training Details
Training Dataset
Unnamed Dataset
- Size: 1,175,405 training samples
- Columns: sentence_0, sentence_1, and label
- Approximate statistics based on the first 1000 samples:

| | sentence_0 | sentence_1 | label |
|---|---|---|---|
| type | string | string | float |
| details | min: 5 tokens<br>mean: 37.17 tokens<br>max: 290 tokens | min: 5 tokens<br>mean: 38.26 tokens<br>max: 375 tokens | min: -0.75<br>mean: 0.17<br>max: 1.0 |

- Samples:

| sentence_0 | sentence_1 | label |
|---|---|---|
| Los ahorros de la jubilación podrán usarse para este fin. | Sony Ericsson W8 además de todo eso presenta una pantalla táctil de tipo HVGA de 320 x 480 píxeles y la pantalla posee 16.777.216 colores. | 0.2533760964870453 |
| Programas de desarrollo en el cerebelo La transición célula progenitora a neurona madura, implica una serie de cambios morfológicos y moleculares altamente regulada espacial y temporalmente. | Dos ejemplos en los que el principio de exclusión relaciona la materia con la ocupación del espacio son las estrellas enanas blancas y las estrellas de neutrones, que se analizan más adelante. | 0.1902337223291397 |
| Bolsa inmobiliaria online en Distrito Federal df, inmuebles en venta y renta, casas, departamentos, locales, terrenos, inmobiliarias, desarrollos, anunciar inmuebles. | Otros prefieren hablar de "régimen" o "sistema feudal", para diferenciarlo sutilmente del feudalismo estricto, o de síntesis feudal, para marcar el hecho de que sobreviven en ella rasgos de la antigüedad clásica mezclados con contribuciones germánicas, implicando tanto a instituciones como a elementos productivos, y significó la especificidad del feudalismo europeo occidental como formación económico social frente a otras también feudales, con consecuencias trascendentales en el futuro devenir histórico. | 0.21721388399600983 |

- Loss: CosineSimilarityLoss with these parameters: { "loss_fct": "torch.nn.modules.loss.MSELoss" }
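CosineSimilarityLoss with an MSELoss `loss_fct` regresses the cosine similarity of the two sentence embeddings against the gold label. A NumPy sketch of that objective (toy embeddings, not the library's internal code):

```python
import numpy as np

def cosine_similarity_loss(emb1, emb2, labels):
    # Cosine similarity per pair, then mean squared error vs. the labels,
    # mirroring loss_fct = torch.nn.MSELoss in the config above.
    cos = (emb1 * emb2).sum(axis=1) / (
        np.linalg.norm(emb1, axis=1) * np.linalg.norm(emb2, axis=1)
    )
    return np.mean((cos - labels) ** 2)

emb1 = np.array([[1.0, 0.0], [0.0, 1.0]])
emb2 = np.array([[1.0, 0.0], [1.0, 0.0]])
labels = np.array([1.0, 0.0])  # identical pair -> 1, orthogonal pair -> 0
print(cosine_similarity_loss(emb1, emb2, labels))  # -> 0.0 (perfect predictions)
```

Note that because cosine similarity is bounded to [-1, 1], labels outside that range (the dataset minimum above is -0.75, which is inside it) would make the loss impossible to drive to zero.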
Training Hyperparameters
Non-Default Hyperparameters
- eval_strategy: steps
- max_grad_norm: 2.0
- num_train_epochs: 10
- multi_dataset_batch_sampler: round_robin
All Hyperparameters
- overwrite_output_dir: False
- do_predict: False
- eval_strategy: steps
- prediction_loss_only: True
- per_device_train_batch_size: 8
- per_device_eval_batch_size: 8
- per_gpu_train_batch_size: None
- per_gpu_eval_batch_size: None
- gradient_accumulation_steps: 1
- eval_accumulation_steps: None
- torch_empty_cache_steps: None
- learning_rate: 5e-05
- weight_decay: 0.0
- adam_beta1: 0.9
- adam_beta2: 0.999
- adam_epsilon: 1e-08
- max_grad_norm: 2.0
- num_train_epochs: 10
- max_steps: -1
- lr_scheduler_type: linear
- lr_scheduler_kwargs: None
- warmup_ratio: 0.0
- warmup_steps: 0
- log_level: passive
- log_level_replica: warning
- log_on_each_node: True
- logging_nan_inf_filter: True
- save_safetensors: True
- save_on_each_node: False
- save_only_model: False
- restore_callback_states_from_checkpoint: False
- no_cuda: False
- use_cpu: False
- use_mps_device: False
- seed: 42
- data_seed: None
- jit_mode_eval: False
- bf16: False
- fp16: False
- fp16_opt_level: O1
- half_precision_backend: auto
- bf16_full_eval: False
- fp16_full_eval: False
- tf32: None
- local_rank: 0
- ddp_backend: None
- tpu_num_cores: None
- tpu_metrics_debug: False
- debug: []
- dataloader_drop_last: False
- dataloader_num_workers: 0
- dataloader_prefetch_factor: None
- past_index: -1
- disable_tqdm: False
- remove_unused_columns: True
- label_names: None
- load_best_model_at_end: False
- ignore_data_skip: False
- fsdp: []
- fsdp_min_num_params: 0
- fsdp_config: {'min_num_params': 0, 'xla': False, 'xla_fsdp_v2': False, 'xla_fsdp_grad_ckpt': False}
- fsdp_transformer_layer_cls_to_wrap: None
- accelerator_config: {'split_batches': False, 'dispatch_batches': None, 'even_batches': True, 'use_seedable_sampler': True, 'non_blocking': False, 'gradient_accumulation_kwargs': None}
- parallelism_config: None
- deepspeed: None
- label_smoothing_factor: 0.0
- optim: adamw_torch
- optim_args: None
- adafactor: False
- group_by_length: False
- length_column_name: length
- project: huggingface
- trackio_space_id: trackio
- ddp_find_unused_parameters: None
- ddp_bucket_cap_mb: None
- ddp_broadcast_buffers: False
- dataloader_pin_memory: True
- dataloader_persistent_workers: False
- skip_memory_metrics: True
- use_legacy_prediction_loop: False
- push_to_hub: False
- resume_from_checkpoint: None
- hub_model_id: None
- hub_strategy: every_save
- hub_private_repo: None
- hub_always_push: False
- hub_revision: None
- gradient_checkpointing: False
- gradient_checkpointing_kwargs: None
- include_inputs_for_metrics: False
- include_for_metrics: []
- eval_do_concat_batches: True
- fp16_backend: auto
- push_to_hub_model_id: None
- push_to_hub_organization: None
- mp_parameters:
- auto_find_batch_size: False
- full_determinism: False
- torchdynamo: None
- ray_scope: last
- ddp_timeout: 1800
- torch_compile: False
- torch_compile_backend: None
- torch_compile_mode: None
- include_tokens_per_second: False
- include_num_input_tokens_seen: no
- neftune_noise_alpha: None
- optim_target_modules: None
- batch_eval_metrics: False
- eval_on_start: False
- use_liger_kernel: False
- liger_kernel_config: None
- eval_use_gather_object: False
- average_tokens_across_devices: True
- prompts: None
- batch_sampler: batch_sampler
- multi_dataset_batch_sampler: round_robin
- router_mapping: {}
- learning_rate_mapping: {}
Training Logs
| Epoch | Step | Training Loss | sts_eval_spearman_cosine |
|---|---|---|---|
| 3.9714 | 583500 | 0.0253 | 0.2725 |
| 3.9748 | 584000 | 0.0274 | 0.2733 |
| 3.9782 | 584500 | 0.0279 | 0.2711 |
| 3.9816 | 585000 | 0.0248 | 0.2708 |
| 3.9850 | 585500 | 0.0264 | 0.2676 |
| 3.9884 | 586000 | 0.0267 | 0.2713 |
| 3.9918 | 586500 | 0.0276 | 0.2703 |
| 3.9952 | 587000 | 0.0273 | 0.2674 |
| 3.9986 | 587500 | 0.0278 | 0.2688 |
| 4.0 | 587704 | - | 0.2672 |
| 4.0020 | 588000 | 0.0259 | 0.2675 |
| 4.0054 | 588500 | 0.0257 | 0.2697 |
| 4.0088 | 589000 | 0.0268 | 0.2694 |
| 4.0122 | 589500 | 0.0256 | 0.2706 |
| 4.0156 | 590000 | 0.0254 | 0.2706 |
| 4.0190 | 590500 | 0.0263 | 0.2695 |
| 4.0224 | 591000 | 0.0274 | 0.2691 |
| 4.0258 | 591500 | 0.0255 | 0.2712 |
| 4.0292 | 592000 | 0.0253 | 0.2696 |
| 4.0326 | 592500 | 0.025 | 0.2692 |
| 4.0360 | 593000 | 0.0263 | 0.2679 |
| 4.0394 | 593500 | 0.028 | 0.2689 |
| 4.0429 | 594000 | 0.0275 | 0.2696 |
| 4.0463 | 594500 | 0.0268 | 0.2699 |
| 4.0497 | 595000 | 0.025 | 0.2686 |
| 4.0531 | 595500 | 0.0277 | 0.2683 |
| 4.0565 | 596000 | 0.0276 | 0.2690 |
| 4.0599 | 596500 | 0.0242 | 0.2686 |
| 4.0633 | 597000 | 0.0264 | 0.2691 |
| 4.0667 | 597500 | 0.0273 | 0.2681 |
| 4.0701 | 598000 | 0.0269 | 0.2693 |
| 4.0735 | 598500 | 0.0274 | 0.2698 |
| 4.0769 | 599000 | 0.0252 | 0.2704 |
| 4.0803 | 599500 | 0.0268 | 0.2708 |
| 4.0837 | 600000 | 0.0259 | 0.2696 |
| 4.0871 | 600500 | 0.0277 | 0.2689 |
| 4.0905 | 601000 | 0.0262 | 0.2663 |
| 4.0939 | 601500 | 0.0266 | 0.2697 |
| 4.0973 | 602000 | 0.0269 | 0.2700 |
| 4.1007 | 602500 | 0.0253 | 0.2673 |
| 4.1041 | 603000 | 0.0281 | 0.2684 |
| 4.1075 | 603500 | 0.0263 | 0.2687 |
| 4.1109 | 604000 | 0.028 | 0.2677 |
| 4.1143 | 604500 | 0.0277 | 0.2701 |
| 4.1177 | 605000 | 0.0273 | 0.2686 |
| 4.1211 | 605500 | 0.0253 | 0.2681 |
| 4.1245 | 606000 | 0.0264 | 0.2694 |
| 4.1279 | 606500 | 0.0281 | 0.2706 |
| 4.1313 | 607000 | 0.0262 | 0.2714 |
| 4.1347 | 607500 | 0.0265 | 0.2673 |
| 4.1381 | 608000 | 0.0254 | 0.2685 |
| 4.1415 | 608500 | 0.0279 | 0.2674 |
| 4.1449 | 609000 | 0.0284 | 0.2692 |
| 4.1483 | 609500 | 0.0283 | 0.2680 |
| 4.1517 | 610000 | 0.0277 | 0.2673 |
| 4.1552 | 610500 | 0.0264 | 0.2692 |
| 4.1586 | 611000 | 0.0261 | 0.2687 |
| 4.1620 | 611500 | 0.0273 | 0.2697 |
| 4.1654 | 612000 | 0.027 | 0.2697 |
| 4.1688 | 612500 | 0.0274 | 0.2696 |
| 4.1722 | 613000 | 0.0273 | 0.2698 |
| 4.1756 | 613500 | 0.0255 | 0.2659 |
| 4.1790 | 614000 | 0.0274 | 0.2660 |
| 4.1824 | 614500 | 0.0284 | 0.2666 |
| 4.1858 | 615000 | 0.0268 | 0.2680 |
| 4.1892 | 615500 | 0.0278 | 0.2674 |
| 4.1926 | 616000 | 0.0276 | 0.2684 |
| 4.1960 | 616500 | 0.026 | 0.2700 |
| 4.1994 | 617000 | 0.0266 | 0.2686 |
| 4.2028 | 617500 | 0.0266 | 0.2680 |
| 4.2062 | 618000 | 0.0277 | 0.2678 |
| 4.2096 | 618500 | 0.0291 | 0.2649 |
| 4.2130 | 619000 | 0.0281 | 0.2635 |
| 4.2164 | 619500 | 0.0291 | 0.2659 |
| 4.2198 | 620000 | 0.0281 | 0.2672 |
| 4.2232 | 620500 | 0.0282 | 0.2655 |
| 4.2266 | 621000 | 0.0287 | 0.2648 |
| 4.2300 | 621500 | 0.0285 | 0.2640 |
| 4.2334 | 622000 | 0.0282 | 0.2645 |
| 4.2368 | 622500 | 0.027 | 0.2674 |
| 4.2402 | 623000 | 0.0268 | 0.2669 |
| 4.2436 | 623500 | 0.0291 | 0.2663 |
| 4.2470 | 624000 | 0.0291 | 0.2645 |
| 4.2504 | 624500 | 0.0277 | 0.2677 |
| 4.2538 | 625000 | 0.0273 | 0.2631 |
| 4.2572 | 625500 | 0.0265 | 0.2653 |
| 4.2606 | 626000 | 0.0276 | 0.2665 |
| 4.2641 | 626500 | 0.027 | 0.2654 |
| 4.2675 | 627000 | 0.0271 | 0.2659 |
| 4.2709 | 627500 | 0.0279 | 0.2659 |
| 4.2743 | 628000 | 0.0274 | 0.2648 |
| 4.2777 | 628500 | 0.0263 | 0.2659 |
| 4.2811 | 629000 | 0.0279 | 0.2665 |
| 4.2845 | 629500 | 0.028 | 0.2677 |
| 4.2879 | 630000 | 0.0299 | 0.2701 |
| 4.2913 | 630500 | 0.0284 | 0.2688 |
| 4.2947 | 631000 | 0.0269 | 0.2683 |
| 4.2981 | 631500 | 0.0271 | 0.2689 |
| 4.3015 | 632000 | 0.0288 | 0.2680 |
| 4.3049 | 632500 | 0.0274 | 0.2674 |
| 4.3083 | 633000 | 0.0277 | 0.2675 |
| 4.3117 | 633500 | 0.0282 | 0.2671 |
| 4.3151 | 634000 | 0.0266 | 0.2658 |
| 4.3185 | 634500 | 0.0284 | 0.2648 |
| 4.3219 | 635000 | 0.0283 | 0.2637 |
| 4.3253 | 635500 | 0.0283 | 0.2647 |
| 4.3287 | 636000 | 0.0281 | 0.2641 |
| 4.3321 | 636500 | 0.0275 | 0.2620 |
| 4.3355 | 637000 | 0.0272 | 0.2630 |
| 4.3389 | 637500 | 0.0282 | 0.2642 |
| 4.3423 | 638000 | 0.0294 | 0.2664 |
| 4.3457 | 638500 | 0.0283 | 0.2639 |
| 4.3491 | 639000 | 0.0262 | 0.2663 |
| 4.3525 | 639500 | 0.0275 | 0.2671 |
| 4.3559 | 640000 | 0.0298 | 0.2669 |
| 4.3593 | 640500 | 0.0292 | 0.2693 |
| 4.3627 | 641000 | 0.0283 | 0.2673 |
| 4.3661 | 641500 | 0.027 | 0.2687 |
| 4.3695 | 642000 | 0.0278 | 0.2663 |
| 4.3729 | 642500 | 0.0301 | 0.2652 |
| 4.3764 | 643000 | 0.0275 | 0.2676 |
| 4.3798 | 643500 | 0.0292 | 0.2680 |
| 4.3832 | 644000 | 0.0266 | 0.2680 |
| 4.3866 | 644500 | 0.0283 | 0.2668 |
| 4.3900 | 645000 | 0.0303 | 0.2677 |
| 4.3934 | 645500 | 0.0299 | 0.2701 |
| 4.3968 | 646000 | 0.0284 | 0.2680 |
| 4.4002 | 646500 | 0.0272 | 0.2664 |
| 4.4036 | 647000 | 0.0297 | 0.2662 |
| 4.4070 | 647500 | 0.029 | 0.2661 |
| 4.4104 | 648000 | 0.0281 | 0.2678 |
| 4.4138 | 648500 | 0.0282 | 0.2683 |
| 4.4172 | 649000 | 0.0278 | 0.2699 |
| 4.4206 | 649500 | 0.0309 | 0.2684 |
| 4.4240 | 650000 | 0.0288 | 0.2693 |
| 4.4274 | 650500 | 0.0307 | 0.2697 |
| 4.4308 | 651000 | 0.0272 | 0.2722 |
| 4.4342 | 651500 | 0.0289 | 0.2726 |
| 4.4376 | 652000 | 0.0288 | 0.2716 |
| 4.4410 | 652500 | 0.0289 | 0.2729 |
| 4.4444 | 653000 | 0.0297 | 0.2699 |
| 4.4478 | 653500 | 0.0286 | 0.2724 |
| 4.4512 | 654000 | 0.0298 | 0.2702 |
| 4.4546 | 654500 | 0.0302 | 0.2738 |
| 4.4580 | 655000 | 0.0292 | 0.2713 |
| 4.4614 | 655500 | 0.0297 | 0.2712 |
| 4.4648 | 656000 | 0.0286 | 0.2705 |
| 4.4682 | 656500 | 0.0285 | 0.2735 |
| 4.4716 | 657000 | 0.0294 | 0.2733 |
| 4.4750 | 657500 | 0.0291 | 0.2722 |
| 4.4784 | 658000 | 0.0283 | 0.2708 |
| 4.4818 | 658500 | 0.028 | 0.2714 |
| 4.4853 | 659000 | 0.0298 | 0.2716 |
| 4.4887 | 659500 | 0.0275 | 0.2721 |
| 4.4921 | 660000 | 0.0314 | 0.2731 |
| 4.4955 | 660500 | 0.0292 | 0.2730 |
| 4.4989 | 661000 | 0.029 | 0.2749 |
Framework Versions
- Python: 3.9.25
- Sentence Transformers: 5.1.2
- Transformers: 4.57.6
- PyTorch: 2.6.0+cu118
- Accelerate: 1.10.1
- Datasets: 4.5.0
- Tokenizers: 0.22.2
Citation
BibTeX
Sentence Transformers
```bibtex
@inproceedings{reimers-2019-sentence-bert,
    title = "Sentence-BERT: Sentence Embeddings using Siamese BERT-Networks",
    author = "Reimers, Nils and Gurevych, Iryna",
    booktitle = "Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing",
    month = "11",
    year = "2019",
    publisher = "Association for Computational Linguistics",
    url = "https://arxiv.org/abs/1908.10084",
}
```