This is a trained sentence-transformers model. It maps sentences and paragraphs to a 384-dimensional dense vector space and can be used for semantic textual similarity, semantic search, paraphrase mining, text classification, clustering, and more.
SentenceTransformer(
(0): Transformer({'max_seq_length': 256, 'do_lower_case': False}) with Transformer model: BertModel
(1): Pooling({'word_embedding_dimension': 384, 'pooling_mode_cls_token': False, 'pooling_mode_mean_tokens': True, 'pooling_mode_max_tokens': False, 'pooling_mode_mean_sqrt_len_tokens': False, 'pooling_mode_weightedmean_tokens': False, 'pooling_mode_lasttoken': False, 'include_prompt': True})
)
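The Pooling module above has only `pooling_mode_mean_tokens` enabled, so the sentence vector is an average of the token vectors. A minimal sketch of masked mean pooling follows, written in plain Python for illustration. The function name `mean_pool` and the toy inputs are assumptions, not part of the library.

```python
from typing import List


def mean_pool(token_embeddings: List[List[float]], attention_mask: List[int]) -> List[float]:
    """Average the token vectors whose mask entry is 1; padding tokens are ignored."""
    dim = len(token_embeddings[0])
    total = [0.0] * dim
    count = 0
    for vec, keep in zip(token_embeddings, attention_mask):
        if keep:
            count += 1
            for i, value in enumerate(vec):
                total[i] += value
    if count == 0:
        raise ValueError("attention_mask selects no tokens")
    return [t / count for t in total]


# Toy example: three tokens of dimension 2; the last one is padding.
tokens = [[1.0, 2.0], [3.0, 4.0], [100.0, 100.0]]
mask = [1, 1, 0]
pooled = mean_pool(tokens, mask)
print(pooled)  # [2.0, 3.0]
```

In the real model the token vectors have 384 dimensions and at most 256 tokens are kept per input, as the module configuration shows.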
First install the Sentence Transformers library:
pip install -U sentence-transformers
Then you can load this model and run inference.
from sentence_transformers import SentenceTransformer
# Download from the 🤗 Hub
model = SentenceTransformer("pankajrajdeo/BioForge-bioformer-16L-umls-integration")
# Run inference
sentences = [
    'Congenital fibrinogen abnormality',
    'Congenital disease',
    'An application of magnetic resonance imaging that uses spin refocusing and spin echo generation, resulting in shorter repetition times and faster imaging.',
]
embeddings = model.encode(sentences)
print(embeddings.shape)
# [3, 384]
# Get the similarity scores for the embeddings
similarities = model.similarity(embeddings, embeddings)
print(similarities.shape)
# [3, 3]
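For semantic search, the similarity scores are typically used to rank candidate texts against a query. The sketch below assumes cosine similarity (which the `cosine_` metric names in the evaluation suggest) and uses small hand-written vectors in place of `model.encode` output so that it runs without downloading the model. The helper names `cos_sim` and `rank_by_similarity` are illustrative.

```python
import math
from typing import List, Sequence


def cos_sim(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity between two vectors of equal length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def rank_by_similarity(query: Sequence[float], corpus: List[Sequence[float]]) -> List[int]:
    """Return corpus indices sorted from most to least similar to the query."""
    scores = [cos_sim(query, vec) for vec in corpus]
    return sorted(range(len(corpus)), key=lambda i: scores[i], reverse=True)


# Toy vectors standing in for embeddings; real ones would have 384 dimensions.
query_vec = [1.0, 0.0]
corpus_vecs = [[0.0, 1.0], [2.0, 0.1], [1.0, 1.0]]
order = rank_by_similarity(query_vec, corpus_vecs)
print(order)  # [1, 2, 0]
```

With real embeddings, `query_vec` and `corpus_vecs` would be rows of the array returned by `model.encode`.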
Information retrieval metrics on the umls_sota_eval dataset, evaluated with InformationRetrievalEvaluator:

| Metric | Value |
|---|---|
| cosine_accuracy@1 | 0.9579 |
| cosine_accuracy@3 | 0.9792 |
| cosine_accuracy@5 | 0.9829 |
| cosine_accuracy@10 | 0.9886 |
| cosine_precision@1 | 0.9579 |
| cosine_precision@3 | 0.5356 |
| cosine_precision@5 | 0.3658 |
| cosine_precision@10 | 0.206 |
| cosine_recall@1 | 0.6671 |
| cosine_recall@3 | 0.8833 |
| cosine_recall@5 | 0.9236 |
| cosine_recall@10 | 0.9553 |
| cosine_ndcg@10 | 0.9525 |
| cosine_mrr@10 | 0.9692 |
| cosine_map@100 | 0.9383 |
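To make the metric names concrete, the sketch below computes the per-query quantities behind accuracy@k, precision@k, recall@k, and MRR@k as they are commonly defined; the reported values would then be averages over all queries. The function name `ir_metrics_at_k` and the toy document IDs are assumptions for illustration, and the evaluator's own implementation may differ in details.

```python
from typing import Dict, List, Set


def ir_metrics_at_k(ranked: List[str], relevant: Set[str], k: int) -> Dict[str, float]:
    """Per-query retrieval metrics for a ranked list of document IDs."""
    top_k = ranked[:k]
    hits = [doc for doc in top_k if doc in relevant]
    # Reciprocal rank of the first relevant document within the top k, else 0.
    mrr = 0.0
    for position, doc in enumerate(top_k, start=1):
        if doc in relevant:
            mrr = 1.0 / position
            break
    return {
        "accuracy": 1.0 if hits else 0.0,      # at least one relevant doc retrieved
        "precision": len(hits) / k,            # share of the top k that is relevant
        "recall": len(hits) / len(relevant),   # share of relevant docs retrieved
        "mrr": mrr,
    }


# Toy query with three relevant documents, one of which is never retrieved.
ranked_docs = ["d3", "d1", "d7", "d2"]
relevant_docs = {"d1", "d2", "d9"}
metrics = ir_metrics_at_k(ranked_docs, relevant_docs, k=3)
print(metrics)
```

Because precision@k divides by k, it cannot exceed the number of relevant documents divided by k. This is one plausible reason precision falls as k grows in the table while recall rises.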
Columns: anchor and positive

| | anchor | positive |
|---|---|---|
| type | string | string |
| details | | |

Samples:

| anchor | positive |
|---|---|
| Cranial nerve structure | Cranial neuropathy due to petrous infection |
| Phenylalanine racemase (ATP-hydrolysing) | Phenylalanine racemase (adenosine triphosphate-hydrolysing) (substance) |
| Denibulin Hydrochloride | The hydrochloride salt of denibulin, a small molecular vascular disrupting agent, with potential antimitotic and antineoplastic activities. Denibulin selectively targets and reversibly binds to the colchicine-binding site on tubulin and inhibits microtubule assembly. This results in the disruption of the cytoskeleton of tumor endothelial cells, ultimately leading to cell cycle arrest, blockage of cell division and apoptosis. This causes inadequate blood flow to the tumor and eventually leads to a decrease in tumor cell proliferation., a small molecule vascular disrupting agent (VDA), with potential antimitotic and antineoplastic activity. Denibulin selectively targets and reversibly binds to the colchicine-binding site on tubulin and inhibits microtubule assembly. This results in the disruption of the cytoskeleton of tumor endothelial cells (EC), ultimately leading to cell cycle arrest, blockage of cell division and apoptosis. This causes inadequate blood flow to the tumor and eventual... |
Loss: main.MultipleNegativesSymmetricMarginLoss with these parameters:
{
"scale": 20.0,
"similarity_fct": "cos_sim"
}
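The `main.` prefix indicates a custom loss class whose definition is not included here, so its margin term is unknown. As a rough illustration of the general idea only, the sketch below implements a symmetric in-batch-negatives objective in the style of the library's MultipleNegativesSymmetricRankingLoss, using the `scale` of 20.0 and cosine similarity listed above. It has no margin term and should not be read as the loss actually used; all function names are hypothetical.

```python
import math
from typing import List, Sequence


def cos_sim(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity between two vectors of equal length."""
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b)))


def log_softmax_at(row: List[float], index: int) -> float:
    """Numerically stable log-softmax of `row`, evaluated at `index`."""
    peak = max(row)
    log_sum_exp = peak + math.log(sum(math.exp(x - peak) for x in row))
    return row[index] - log_sum_exp


def symmetric_in_batch_loss(anchors, positives, scale: float = 20.0) -> float:
    """Cross-entropy over scaled similarities, averaged over both directions.

    Pair i is the correct match; every other item in the batch is a negative.
    Assumption: no margin term, since the custom class is not shown.
    """
    n = len(anchors)
    scores = [[scale * cos_sim(a, p) for p in positives] for a in anchors]
    anchor_to_positive = -sum(log_softmax_at(scores[i], i) for i in range(n)) / n
    columns = [[scores[j][i] for j in range(n)] for i in range(n)]
    positive_to_anchor = -sum(log_softmax_at(columns[i], i) for i in range(n)) / n
    return (anchor_to_positive + positive_to_anchor) / 2.0


# Toy batch of two pairs: aligned pairs give a low loss, swapped pairs a high one.
anchor_vecs = [[1.0, 0.0], [0.0, 1.0]]
positive_vecs = [[1.0, 0.1], [0.1, 1.0]]
aligned = symmetric_in_batch_loss(anchor_vecs, positive_vecs)
swapped = symmetric_in_batch_loss(anchor_vecs, positive_vecs[::-1])
print(aligned < swapped)  # True
```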
Non-default hyperparameters:

- eval_strategy: steps
- per_device_train_batch_size: 512
- gradient_accumulation_steps: 4
- learning_rate: 1.5e-05
- num_train_epochs: 4
- lr_scheduler_type: cosine
- warmup_ratio: 0.05
- bf16: True
- dataloader_num_workers: 16
- load_best_model_at_end: True
- gradient_checkpointing: True

All hyperparameters:

- overwrite_output_dir: False
- do_predict: False
- eval_strategy: steps
- prediction_loss_only: True
- per_device_train_batch_size: 512
- per_device_eval_batch_size: 8
- per_gpu_train_batch_size: None
- per_gpu_eval_batch_size: None
- gradient_accumulation_steps: 4
- eval_accumulation_steps: None
- torch_empty_cache_steps: None
- learning_rate: 1.5e-05
- weight_decay: 0.0
- adam_beta1: 0.9
- adam_beta2: 0.999
- adam_epsilon: 1e-08
- max_grad_norm: 1.0
- num_train_epochs: 4
- max_steps: -1
- lr_scheduler_type: cosine
- lr_scheduler_kwargs: {}
- warmup_ratio: 0.05
- warmup_steps: 0
- log_level: passive
- log_level_replica: warning
- log_on_each_node: True
- logging_nan_inf_filter: True
- save_safetensors: True
- save_on_each_node: False
- save_only_model: False
- restore_callback_states_from_checkpoint: False
- no_cuda: False
- use_cpu: False
- use_mps_device: False
- seed: 42
- data_seed: None
- jit_mode_eval: False
- use_ipex: False
- bf16: True
- fp16: False
- fp16_opt_level: O1
- half_precision_backend: auto
- bf16_full_eval: False
- fp16_full_eval: False
- tf32: None
- local_rank: 0
- ddp_backend: None
- tpu_num_cores: None
- tpu_metrics_debug: False
- debug: []
- dataloader_drop_last: False
- dataloader_num_workers: 16
- dataloader_prefetch_factor: None
- past_index: -1
- disable_tqdm: False
- remove_unused_columns: True
- label_names: None
- load_best_model_at_end: True
- ignore_data_skip: False
- fsdp: []
- fsdp_min_num_params: 0
- fsdp_config: {'min_num_params': 0, 'xla': False, 'xla_fsdp_v2': False, 'xla_fsdp_grad_ckpt': False}
- fsdp_transformer_layer_cls_to_wrap: None
- accelerator_config: {'split_batches': False, 'dispatch_batches': None, 'even_batches': True, 'use_seedable_sampler': True, 'non_blocking': False, 'gradient_accumulation_kwargs': None}
- deepspeed: None
- label_smoothing_factor: 0.0
- optim: adamw_torch
- optim_args: None
- adafactor: False
- group_by_length: False
- length_column_name: length
- ddp_find_unused_parameters: None
- ddp_bucket_cap_mb: None
- ddp_broadcast_buffers: False
- dataloader_pin_memory: True
- dataloader_persistent_workers: False
- skip_memory_metrics: True
- use_legacy_prediction_loop: False
- push_to_hub: False
- resume_from_checkpoint: None
- hub_model_id: None
- hub_strategy: every_save
- hub_private_repo: None
- hub_always_push: False
- hub_revision: None
- gradient_checkpointing: True
- gradient_checkpointing_kwargs: None
- include_inputs_for_metrics: False
- include_for_metrics: []
- eval_do_concat_batches: True
- fp16_backend: auto
- push_to_hub_model_id: None
- push_to_hub_organization: None
- mp_parameters:
- auto_find_batch_size: False
- full_determinism: False
- torchdynamo: None
- ray_scope: last
- ddp_timeout: 1800
- torch_compile: False
- torch_compile_backend: None
- torch_compile_mode: None
- include_tokens_per_second: False
- include_num_input_tokens_seen: False
- neftune_noise_alpha: None
- optim_target_modules: None
- batch_eval_metrics: False
- eval_on_start: False
- use_liger_kernel: False
- liger_kernel_config: None
- eval_use_gather_object: False
- average_tokens_across_devices: False
- prompts: None
- batch_sampler: batch_sampler
- multi_dataset_batch_sampler: proportional

Training logs:

| Epoch | Step | Training Loss | umls_sota_eval_cosine_ndcg@10 |
|---|---|---|---|
| 0.0695 | 100 | 0.8266 | - |
| 0.1390 | 200 | 0.5384 | - |
| 0.2086 | 300 | 0.4742 | - |
| 0.2781 | 400 | 0.4355 | - |
| 0.3295 | 474 | - | 0.9295 |
| 0.3476 | 500 | 0.4137 | - |
| 0.4171 | 600 | 0.3961 | - |
| 0.4866 | 700 | 0.3817 | - |
| 0.5561 | 800 | 0.3739 | - |
| 0.6257 | 900 | 0.3564 | - |
| 0.6590 | 948 | - | 0.9384 |
| 0.6952 | 1000 | 0.3587 | - |
| 0.7647 | 1100 | 0.3525 | - |
| 0.8342 | 1200 | 0.3463 | - |
| 0.9037 | 1300 | 0.3395 | - |
| 0.9732 | 1400 | 0.3329 | - |
| 0.9885 | 1422 | - | 0.9434 |
| 1.0424 | 1500 | 0.3228 | - |
| 1.1119 | 1600 | 0.318 | - |
| 1.1814 | 1700 | 0.3141 | - |
| 1.2510 | 1800 | 0.3101 | - |
| 1.3177 | 1896 | - | 0.9463 |
| 1.3205 | 1900 | 0.3134 | - |
| 1.3900 | 2000 | 0.3097 | - |
| 1.4595 | 2100 | 0.3006 | - |
| 1.5290 | 2200 | 0.303 | - |
| 1.5985 | 2300 | 0.3003 | - |
| 1.6472 | 2370 | - | 0.9484 |
| 1.6681 | 2400 | 0.2949 | - |
| 1.7376 | 2500 | 0.2951 | - |
| 1.8071 | 2600 | 0.2939 | - |
| 1.8766 | 2700 | 0.2908 | - |
| 1.9461 | 2800 | 0.2912 | - |
| 1.9767 | 2844 | - | 0.9502 |
| 2.0153 | 2900 | 0.2869 | - |
| 2.0848 | 3000 | 0.2807 | - |
| 2.1543 | 3100 | 0.2771 | - |
| 2.2238 | 3200 | 0.2795 | - |
| 2.2934 | 3300 | 0.2756 | - |
| 2.3059 | 3318 | - | 0.9510 |
| 2.3629 | 3400 | 0.2758 | - |
| 2.4324 | 3500 | 0.2765 | - |
| 2.5019 | 3600 | 0.2752 | - |
| 2.5714 | 3700 | 0.2745 | - |
| 2.6354 | 3792 | - | 0.9515 |
| 2.6409 | 3800 | 0.2714 | - |
| 2.7105 | 3900 | 0.2732 | - |
| 2.7800 | 4000 | 0.2735 | - |
| 2.8495 | 4100 | 0.2722 | - |
| 2.9190 | 4200 | 0.2713 | - |
| 2.9649 | 4266 | - | 0.9520 |
| 2.9885 | 4300 | 0.2721 | - |
| 3.0577 | 4400 | 0.2662 | - |
| 3.1272 | 4500 | 0.2654 | - |
| 3.1967 | 4600 | 0.2683 | - |
| 3.2662 | 4700 | 0.2687 | - |
| 3.2941 | 4740 | - | 0.9523 |
| 3.3358 | 4800 | 0.2665 | - |
| 3.4053 | 4900 | 0.2686 | - |
| 3.4748 | 5000 | 0.2612 | - |
| 3.5443 | 5100 | 0.263 | - |
| 3.6138 | 5200 | 0.264 | - |
| 3.6236 | 5214 | - | 0.9523 |
| 3.6834 | 5300 | 0.2672 | - |
| 3.7529 | 5400 | 0.2674 | - |
| 3.8224 | 5500 | 0.2631 | - |
| 3.8919 | 5600 | 0.2631 | - |
| 3.9531 | 5688 | - | 0.9525 |
| 3.9614 | 5700 | 0.2642 | - |
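The training logs and hyperparameters together allow a rough reconstruction of the schedule. The sketch below estimates the number of optimizer steps per epoch from the last logged row, then evaluates a cosine learning-rate schedule with linear warmup using the listed `learning_rate`, `warmup_ratio`, and `num_train_epochs`. The schedule formula is the commonly used half-cosine decay and is an assumption about what the trainer applied; the number of devices is not stated, so batch sizes are per device.

```python
import math

# Values taken from the hyperparameters and the last logged row.
BASE_LR = 1.5e-05
WARMUP_RATIO = 0.05
NUM_EPOCHS = 4
PER_DEVICE_BATCH = 512
GRAD_ACCUM_STEPS = 4

# Step 5700 corresponds to epoch 3.9614, which gives the steps per epoch.
steps_per_epoch = round(5700 / 3.9614)
total_steps = steps_per_epoch * NUM_EPOCHS
warmup_steps = math.ceil(total_steps * WARMUP_RATIO)

# Examples contributing to each optimizer step on a single device.
effective_batch_per_device = PER_DEVICE_BATCH * GRAD_ACCUM_STEPS


def lr_at(step: int) -> float:
    """Linear warmup to BASE_LR, then half-cosine decay to zero (assumed schedule)."""
    if step < warmup_steps:
        return BASE_LR * step / max(1, warmup_steps)
    progress = (step - warmup_steps) / max(1, total_steps - warmup_steps)
    return BASE_LR * 0.5 * (1.0 + math.cos(math.pi * progress))


print(steps_per_epoch, total_steps, warmup_steps)  # 1439 5756 288
print(effective_batch_per_device)  # 2048
for step in (0, warmup_steps, total_steps // 2, total_steps):
    print(step, lr_at(step))
```

Under these assumptions the learning rate peaks at about step 288 and is close to zero by the final logged steps, which is consistent with the training loss and NDCG@10 flattening in the last epoch.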
@inproceedings{reimers-2019-sentence-bert,
title = "Sentence-BERT: Sentence Embeddings using Siamese BERT-Networks",
author = "Reimers, Nils and Gurevych, Iryna",
booktitle = "Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing",
month = "11",
year = "2019",
publisher = "Association for Computational Linguistics",
url = "https://arxiv.org/abs/1908.10084",
}