Sentence-BERT: Sentence Embeddings using Siamese BERT-Networks
Paper: arXiv:1908.10084
This is a sentence-transformers model fine-tuned from sentence-transformers/all-MiniLM-L6-v2 on the json dataset. It maps sentences and paragraphs to a 384-dimensional dense vector space and can be used for semantic textual similarity, semantic search, paraphrase mining, text classification, clustering, and more.
SentenceTransformer(
(0): Transformer({'max_seq_length': 256, 'do_lower_case': False}) with Transformer model: BertModel
(1): Pooling({'word_embedding_dimension': 384, 'pooling_mode_cls_token': False, 'pooling_mode_mean_tokens': True, 'pooling_mode_max_tokens': False, 'pooling_mode_mean_sqrt_len_tokens': False, 'pooling_mode_weightedmean_tokens': False, 'pooling_mode_lasttoken': False, 'include_prompt': True})
(2): Normalize()
)
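The pipeline above runs the transformer, mean-pools the token embeddings (`pooling_mode_mean_tokens`), then L2-normalizes the result. A minimal NumPy sketch of the pooling and normalization steps, using randomly generated stand-in token embeddings rather than real BERT outputs:

```python
import numpy as np

# Hypothetical batch: 2 sentences, 4 token positions, 384-dim token embeddings
rng = np.random.default_rng(0)
token_embeddings = rng.normal(size=(2, 4, 384))
# 1 = real token, 0 = padding
attention_mask = np.array([[1, 1, 1, 0],
                           [1, 1, 0, 0]])

# Mean pooling: average token embeddings, counting only non-padding positions
mask = attention_mask[:, :, None]               # (2, 4, 1)
summed = (token_embeddings * mask).sum(axis=1)  # (2, 384)
counts = mask.sum(axis=1)                       # (2, 1)
sentence_embeddings = summed / counts

# Normalize(): scale each sentence embedding to unit L2 norm
sentence_embeddings /= np.linalg.norm(sentence_embeddings, axis=1, keepdims=True)

print(sentence_embeddings.shape)  # (2, 384)
```

The unit-norm output is what makes cosine similarity and dot-product similarity coincide for this model.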
First install the Sentence Transformers library:
pip install -U sentence-transformers
Then you can load this model and run inference.
from sentence_transformers import SentenceTransformer
# Download from the 🤗 Hub
model = SentenceTransformer("GbrlOl/finetune-embedding-all-MiniLM-L6-v2-geotechnical-test-v4")
# Run inference
sentences = [
'¿Cuál es el factor de seguridad mínimo para el corto plazo en caso de falla superficial estática en el botadero Sur?',
'Plan de Cierre - Faena Minera Salares Norte | 95 \n \nTabla 8-13: Criterios para el Análisis de Estabilidad del Botadero Sur \nCondición FS Mínimo \nCorto Plazo \n(operacional) \nFalla Superficial Estático 1,0 \nSísmico (1) \nFalla Profunda Estático 1,5 \nSísmico 1,2 \nLargo Plazo \n(post-cierre) \nFalla Superficial Estático 1,1 \nSísmico (1) \nFalla Profunda Estático 1,5 \nSísmico 1,1 \n(1): El material es depositado mediante volteo de camiones y queda con su ángulo de reposo. Las fallas \nsuperficiales pueden ocurrir, pero las bermas de seguridad evitarán mayores deslizamientos de material. \nPara los análisis que involucren al depósito de relaves filtrados, ya sea por si solo o junto al botadero Sur, el factor \nde seguridad mínimo para el corto plazo es de 1,5 para casos estáticos y 1,2 para la condición sísmica. Para el largo \nplazo, en tanto, el factor de seguridad mínimo para la condición sísmica es de 1,1. \nLos factores de seguridad obtenidos de los análisis de estabilidad son presentados en la Tabla 8-14 y en la Tabla 8-15. \nTodos los análisis indican que; tanto el diseño del botadero Sur, como el diseño del depósito de relaves filtrados, por \nsí solos como en conjunto, cumplen con los diseños de criterios de los factores de seguridad. \nLos análisis de fallas profundas han incorporado la determinación del factor de seguridad mínimo para fallas que \nimplican la totalidad del depósito, así como fallas que involucran 2 o 3 bancos, que pueden ser más críticos que \naquellos que involucran la totalidad del depósito.',
'Sin perjuicio de ello, en este \nplan de cierre temporal se ha hecho un análisis a nivel de juicio experto respecto de los riesgos \nque se indican en la siguiente tabla. \nTabla 3-3: Riesgos evaluados Instalaciones Complementarias y Auxiliares. \nInstalación Riesgos evaluados \nInstalaciones \nComplementarias \ny Auxiliares \nIA.1) Caída de Personas o animales a desnivel \nIA.2) Caída de objetos o materiales sobre personas o animales \nIA.3) Afectación a la salud de las personas por estructuras, \nmateriales y/o suelos contaminados \nFuente: Elaborado por MYMA, 2019 \n3.1 Evaluación de Riesgos \na) Evaluación de Riesgos previo a la definición de las medidas de cierre \nUna vez establecida la probabilidad de ocurrencia de los eventos y la severidad de las \nconsecuencias para las personas y el medio ambiente, se debe catalogar el límite de aceptabilidad \ndel riesgo.',
]
embeddings = model.encode(sentences)
print(embeddings.shape)
# [3, 384]
# Get the similarity scores for the embeddings
similarities = model.similarity(embeddings, embeddings)
print(similarities.shape)
# [3, 3]
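Because the model ends in a `Normalize()` module, its embeddings are unit-length, so the default cosine similarity reduces to a plain matrix product. A self-contained NumPy sketch with hypothetical unit vectors standing in for real embeddings:

```python
import numpy as np

# Hypothetical pre-normalized embeddings (3 sentences, 384 dims)
rng = np.random.default_rng(42)
emb = rng.normal(size=(3, 384))
emb /= np.linalg.norm(emb, axis=1, keepdims=True)

# Cosine similarity of unit vectors is just the dot product
similarities = emb @ emb.T

print(similarities.shape)  # (3, 3); diagonal entries are 1.0 (self-similarity)
```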
Semantic Similarity (evaluator: `EmbeddingSimilarityEvaluator` on `sts_dev`)

| Metric | Value |
|---|---|
| pearson_cosine | 0.5694 |
| spearman_cosine | 0.5456 |
| pearson_euclidean | 0.574 |
| spearman_euclidean | 0.5456 |
| pearson_manhattan | 0.5797 |
| spearman_manhattan | 0.5534 |
| pearson_dot | 0.5694 |
| spearman_dot | 0.5456 |
| pearson_max | 0.5797 |
| spearman_max | 0.5534 |
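The `spearman_*` metrics above are Spearman rank correlations between the model's similarity scores and the gold similarity labels. A toy sketch of that computation with hypothetical scores (using the rank-of-ranks definition, valid when there are no ties):

```python
import numpy as np

def spearman(a, b):
    """Spearman correlation = Pearson correlation of the ranks (no-ties case)."""
    ra = np.argsort(np.argsort(a)).astype(float)
    rb = np.argsort(np.argsort(b)).astype(float)
    return float(np.corrcoef(ra, rb)[0, 1])

# Hypothetical model cosine similarities vs. gold STS labels
predicted = np.array([0.91, 0.12, 0.55, 0.78, 0.30])
gold      = np.array([0.95, 0.05, 0.60, 0.70, 0.40])

print(spearman(predicted, gold))  # 1.0: identical ranking despite different values
```

Spearman only cares about ordering, which is why it is the headline metric for retrieval-style use even when raw scores differ from the labels.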
Binary Classification (evaluator: `BinaryClassificationEvaluator` on `quora_duplicates_dev`)

| Metric | Value |
|---|---|
| cosine_accuracy | 0.7938 |
| cosine_accuracy_threshold | 0.5779 |
| cosine_f1 | 0.696 |
| cosine_f1_threshold | 0.5187 |
| cosine_precision | 0.7016 |
| cosine_recall | 0.6905 |
| cosine_ap | 0.807 |
| euclidean_accuracy | 0.6154 |
| euclidean_accuracy_threshold | -1.2038 |
| euclidean_f1 | 0.5556 |
| euclidean_f1_threshold | -0.5825 |
| euclidean_precision | 0.3858 |
| euclidean_recall | 0.9921 |
| euclidean_ap | 0.2644 |
| manhattan_accuracy | 0.6154 |
| manhattan_accuracy_threshold | -18.6887 |
| manhattan_f1 | 0.5556 |
| manhattan_f1_threshold | -9.1288 |
| manhattan_precision | 0.3858 |
| manhattan_recall | 0.9921 |
| manhattan_ap | 0.2632 |
| dot_accuracy | 0.7938 |
| dot_accuracy_threshold | 0.5779 |
| dot_f1 | 0.696 |
| dot_f1_threshold | 0.5187 |
| dot_precision | 0.7016 |
| dot_recall | 0.6905 |
| dot_ap | 0.807 |
| max_accuracy | 0.7938 |
| max_accuracy_threshold | 0.5779 |
| max_f1 | 0.696 |
| max_f1_threshold | 0.5187 |
| max_precision | 0.7016 |
| max_recall | 0.9921 |
| max_ap | 0.807 |
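The `*_accuracy_threshold` values above are the similarity cut-offs that maximize accuracy for the duplicate/non-duplicate decision. A toy illustration of that threshold sweep with hypothetical scores and labels (not the evaluator's exact implementation):

```python
import numpy as np

# Hypothetical cosine similarities and gold duplicate labels (1 = duplicate)
scores = np.array([0.82, 0.61, 0.58, 0.40, 0.75, 0.30])
labels = np.array([1,    1,    0,    0,    1,    0])

def accuracy_at(threshold):
    """Accuracy when everything at or above `threshold` is called a duplicate."""
    return float(((scores >= threshold) == labels).mean())

# Sweep the observed scores as candidate thresholds and keep the best one
best = max(scores, key=accuracy_at)
print(best, accuracy_at(best))  # threshold 0.61 separates this toy set perfectly
```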
Training dataset: `json`, with columns `query`, `sentence`, and `label`.

| | query | sentence | label |
|---|---|---|---|
| type | string | string | int |

Example samples:

| query | sentence | label |
|---|---|---|
| Indica si se utiliza Proctor Modificado, o Normal o Estándar para compactar el relave filtrado, y cuál es el nivel de compactación | PLAN DE CIERRE TEMPORAL – FAENA MINERA EL TOQUI | 0 |
| ¿Cuál es la ubicación del Pozo Monitoreos? | 64 | 1 |
| se especifican antecedentes geofísicos? | Hay numerosas comunidades edáficas, una | 0 |
Loss: `CoSENTLoss` with these parameters:

{
"scale": 20.0,
"similarity_fct": "pairwise_cos_sim"
}
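CoSENT penalizes every pair of in-batch examples whose cosine similarities are ordered inconsistently with their labels, scaled by the factor 20.0 above. A minimal NumPy sketch of the loss formula from the CoSENT write-up cited below (hypothetical scores and labels, not the library implementation):

```python
import numpy as np

def cosent_loss(cos_sims, labels, scale=20.0):
    """log(1 + sum over pairs (i, j) with label_i < label_j of exp(scale * (s_i - s_j)))."""
    terms = [
        np.exp(scale * (cos_sims[i] - cos_sims[j]))
        for i in range(len(labels))
        for j in range(len(labels))
        if labels[i] < labels[j]
    ]
    return float(np.log1p(np.sum(terms)))

# Hypothetical batch where the negative pair scores higher than both positives:
cos_sims = np.array([0.3, 0.8, 0.6])
labels   = np.array([1,   0,   1])
print(cosent_loss(cos_sims, labels))  # large loss: ranking contradicts the labels
```

When every positive pair already outscores every negative pair, all the exponent arguments are strongly negative and the loss approaches zero.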
Non-default hyperparameters:

- per_device_train_batch_size: 16
- per_device_eval_batch_size: 16
- learning_rate: 2e-05
- num_train_epochs: 100
- warmup_ratio: 0.1
- fp16: True
- batch_sampler: no_duplicates

All hyperparameters:

- overwrite_output_dir: False
- do_predict: False
- eval_strategy: no
- prediction_loss_only: True
- per_device_train_batch_size: 16
- per_device_eval_batch_size: 16
- per_gpu_train_batch_size: None
- per_gpu_eval_batch_size: None
- gradient_accumulation_steps: 1
- eval_accumulation_steps: None
- torch_empty_cache_steps: None
- learning_rate: 2e-05
- weight_decay: 0.0
- adam_beta1: 0.9
- adam_beta2: 0.999
- adam_epsilon: 1e-08
- max_grad_norm: 1.0
- num_train_epochs: 100
- max_steps: -1
- lr_scheduler_type: linear
- lr_scheduler_kwargs: {}
- warmup_ratio: 0.1
- warmup_steps: 0
- log_level: passive
- log_level_replica: warning
- log_on_each_node: True
- logging_nan_inf_filter: True
- save_safetensors: True
- save_on_each_node: False
- save_only_model: False
- restore_callback_states_from_checkpoint: False
- no_cuda: False
- use_cpu: False
- use_mps_device: False
- seed: 42
- data_seed: None
- jit_mode_eval: False
- use_ipex: False
- bf16: False
- fp16: True
- fp16_opt_level: O1
- half_precision_backend: auto
- bf16_full_eval: False
- fp16_full_eval: False
- tf32: None
- local_rank: 0
- ddp_backend: None
- tpu_num_cores: None
- tpu_metrics_debug: False
- debug: []
- dataloader_drop_last: False
- dataloader_num_workers: 0
- dataloader_prefetch_factor: None
- past_index: -1
- disable_tqdm: False
- remove_unused_columns: True
- label_names: None
- load_best_model_at_end: False
- ignore_data_skip: False
- fsdp: []
- fsdp_min_num_params: 0
- fsdp_config: {'min_num_params': 0, 'xla': False, 'xla_fsdp_v2': False, 'xla_fsdp_grad_ckpt': False}
- fsdp_transformer_layer_cls_to_wrap: None
- accelerator_config: {'split_batches': False, 'dispatch_batches': None, 'even_batches': True, 'use_seedable_sampler': True, 'non_blocking': False, 'gradient_accumulation_kwargs': None}
- deepspeed: None
- label_smoothing_factor: 0.0
- optim: adamw_torch
- optim_args: None
- adafactor: False
- group_by_length: False
- length_column_name: length
- ddp_find_unused_parameters: None
- ddp_bucket_cap_mb: None
- ddp_broadcast_buffers: False
- dataloader_pin_memory: True
- dataloader_persistent_workers: False
- skip_memory_metrics: True
- use_legacy_prediction_loop: False
- push_to_hub: False
- resume_from_checkpoint: None
- hub_model_id: None
- hub_strategy: every_save
- hub_private_repo: None
- hub_always_push: False
- gradient_checkpointing: False
- gradient_checkpointing_kwargs: None
- include_inputs_for_metrics: False
- include_for_metrics: []
- eval_do_concat_batches: True
- fp16_backend: auto
- push_to_hub_model_id: None
- push_to_hub_organization: None
- mp_parameters:
- auto_find_batch_size: False
- full_determinism: False
- torchdynamo: None
- ray_scope: last
- ddp_timeout: 1800
- torch_compile: False
- torch_compile_backend: None
- torch_compile_mode: None
- dispatch_batches: None
- split_batches: None
- include_tokens_per_second: False
- include_num_input_tokens_seen: False
- neftune_noise_alpha: None
- optim_target_modules: None
- batch_eval_metrics: False
- eval_on_start: False
- use_liger_kernel: False
- eval_use_gather_object: False
- average_tokens_across_devices: False
- prompts: None
- batch_sampler: no_duplicates
- multi_dataset_batch_sampler: proportional

Training logs:

| Epoch | Step | Training Loss | sts_dev_spearman_max | quora_duplicates_dev_max_ap |
|---|---|---|---|---|
| 0 | 0 | - | 0.5534 | 0.8070 |
| 2.3902 | 100 | 4.6587 | - | - |
| 4.7805 | 200 | 2.3234 | - | - |
| 7.1463 | 300 | 0.869 | - | - |
| 9.5366 | 400 | 0.2738 | - | - |
| 11.9268 | 500 | 0.328 | - | - |
| 14.2927 | 600 | 0.1296 | - | - |
| 16.6829 | 700 | 0.1233 | - | - |
| 19.0488 | 800 | 0.1024 | - | - |
| 21.4390 | 900 | 0.0337 | - | - |
| 23.8293 | 1000 | 0.0033 | - | - |
| 26.1951 | 1100 | 0.0508 | - | - |
| 28.5854 | 1200 | 0.0221 | - | - |
| 30.9756 | 1300 | 0.0167 | - | - |
| 33.3415 | 1400 | 0.0003 | - | - |
| 35.7317 | 1500 | 0.0 | - | - |
| 38.0976 | 1600 | 0.0 | - | - |
| 40.4878 | 1700 | 0.0 | - | - |
| 42.8780 | 1800 | 0.0 | - | - |
| 45.2439 | 1900 | 0.0 | - | - |
| 47.6341 | 2000 | 0.0 | - | - |
| 50.0244 | 2100 | 0.0 | - | - |
| 52.3902 | 2200 | 0.0 | - | - |
| 54.7805 | 2300 | 0.0 | - | - |
| 57.1463 | 2400 | 0.0 | - | - |
| 59.5366 | 2500 | 0.0 | - | - |
| 61.9268 | 2600 | 0.0 | - | - |
| 64.2927 | 2700 | 0.0 | - | - |
| 66.6829 | 2800 | 0.0 | - | - |
| 69.0488 | 2900 | 0.0 | - | - |
| 71.4390 | 3000 | 0.0 | - | - |
| 73.8293 | 3100 | 0.0 | - | - |
| 76.1951 | 3200 | 0.0 | - | - |
| 78.5854 | 3300 | 0.0 | - | - |
| 80.9756 | 3400 | 0.0 | - | - |
| 83.3415 | 3500 | 0.0 | - | - |
| 85.7317 | 3600 | 0.0 | - | - |
| 88.0976 | 3700 | 0.0 | - | - |
| 90.4878 | 3800 | 0.0 | - | - |
| 92.8780 | 3900 | 0.0 | - | - |
| 95.2439 | 4000 | 0.0 | - | - |
| 97.6341 | 4100 | 0.0 | - | - |
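The `linear` scheduler with `warmup_ratio: 0.1` ramps the learning rate from 0 up to 2e-05 over the first 10% of steps, then decays it linearly back to 0. A small sketch of that schedule; the total step count here is illustrative (the actual count depends on dataset size and batch size):

```python
def lr_at(step, total_steps, base_lr=2e-05, warmup_ratio=0.1):
    """Linear warmup then linear decay, as in the `linear` scheduler above."""
    warmup_steps = int(total_steps * warmup_ratio)
    if step < warmup_steps:
        return base_lr * step / warmup_steps
    # Linear decay from base_lr at the end of warmup down to 0 at total_steps
    return base_lr * (total_steps - step) / (total_steps - warmup_steps)

total = 4200  # illustrative
print(lr_at(0, total), lr_at(420, total), lr_at(4200, total))
# 0 at the start, the peak 2e-05 at step 420 (end of warmup), 0 at the final step
```

Note that the training loss hitting 0.0 from around epoch 36 onward suggests the model has memorized the training pairs; with 100 epochs on a small dataset, earlier checkpoints may generalize better.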
@inproceedings{reimers-2019-sentence-bert,
title = "Sentence-BERT: Sentence Embeddings using Siamese BERT-Networks",
author = "Reimers, Nils and Gurevych, Iryna",
booktitle = "Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing",
month = "11",
year = "2019",
publisher = "Association for Computational Linguistics",
url = "https://arxiv.org/abs/1908.10084",
}
@online{kexuefm-8847,
title={CoSENT: A more efficient sentence vector scheme than Sentence-BERT},
author={Su Jianlin},
year={2022},
month={Jan},
url={https://kexue.fm/archives/8847},
}
Base model: sentence-transformers/all-MiniLM-L6-v2