This is a sentence-transformers model finetuned from Alibaba-NLP/gte-Qwen2-1.5B-instruct. It maps sentences & paragraphs to a 1536-dimensional dense vector space and can be used for semantic textual similarity, semantic search, paraphrase mining, text classification, clustering, and more.
SentenceTransformer(
(0): Transformer({'max_seq_length': 512, 'do_lower_case': False, 'architecture': 'Qwen2Model'})
(1): Pooling({'word_embedding_dimension': 1536, 'pooling_mode_cls_token': True, 'pooling_mode_mean_tokens': False, 'pooling_mode_max_tokens': False, 'pooling_mode_mean_sqrt_len_tokens': False, 'pooling_mode_weightedmean_tokens': False, 'pooling_mode_lasttoken': False, 'include_prompt': True})
)
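The Pooling module above has `pooling_mode_cls_token` set to True and every other mode set to False, which suggests the sentence embedding is taken from the first token's output vector rather than from an average over tokens. The following is a minimal sketch of that difference on toy data; the function names and the 2-dimensional vectors are illustrative assumptions, not the library's implementation.

```python
def cls_pooling(token_embeddings):
    """Return the first token's vector as the sentence embedding."""
    return list(token_embeddings[0])


def mean_pooling(token_embeddings, attention_mask):
    """Average the vectors of non-padding tokens (shown only for contrast)."""
    kept = [vec for vec, keep in zip(token_embeddings, attention_mask) if keep]
    dim = len(token_embeddings[0])
    return [sum(vec[i] for vec in kept) / len(kept) for i in range(dim)]


# Toy sequence (hypothetical values): 3 real tokens plus 1 padding token.
tokens = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0], [0.0, 0.0]]
mask = [1, 1, 1, 0]

cls_vec = cls_pooling(tokens)          # uses only the first token
mean_vec = mean_pooling(tokens, mask)  # uses all unmasked tokens
print(cls_vec)
print(mean_vec)
```

In the real model each token vector has 1536 dimensions, matching `word_embedding_dimension`.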
First install the Sentence Transformers library:
pip install -U sentence-transformers
Then you can load this model and run inference.
from sentence_transformers import SentenceTransformer
# Download from the 🤗 Hub
model = SentenceTransformer("sentence_transformers_model_id")
# Run inference
sentences = [
'General supervision means that the physician need not be physically present at the patient\'s place of residence when the service is performed; however, the service must be performed under his or her overall supervision and control The physician orders the service(s) to be performed, and contact is maintained between the nurse or other employee and the physician, e.g., the employee contacts the physician directly if additional instructions are needed, and the physician must retain professional responsibility for the service All other "incident to" requirements must be met (see §§60-60.4). 3 The services are included in the physician\'s/clinic\'s bill, and the physician or clinic has incurred an expense for them (see §60.2). 4 The services of the paramedical are required for the patient\'s care; that is, they are reasonable and necessary as defined in the Medicare Benefit Policy Manual, Chapter 16, "General Exclusions from Coverage," §20. 5 When the service can be furnished by an HHA in the local area, it cannot be covered when furnished by a physician/clinic to a homebound patient under this provision, except as described in §60.4.C.',
'General supervision means that the physician need not be physically present at the patient\'s place of residence when the service is performed; however, the service must be performed under his or her overall supervision and control The physician orders the service(s) to be performed, and contact is maintained between the nurse or other employee and the physician, e.g., the employee contacts the physician directly if additional instructions are needed, and the physician must retain professional responsibility for the service All other "incident to" requirements must be met (see §§60-60.4). 3 The services are included in the physician\'s/clinic\'s bill, and the physician or clinic has incurred an expense for them (see §60.2). 4 The services of the paramedical are required for the patient\'s care; that is, they are reasonable and necessary as defined in the Medicare Benefit Policy Manual, Chapter 16, "General Exclusions from Coverage," §20. 5 When the service can be furnished by an HHA in the local area, it cannot be covered when furnished by a physician/clinic to a homebound patient under this provision, except as described in §60.4.C.',
'Implementation: 11-01-24) Transmit the CMS-2591 to CO via PC or terminal Use instructions in the CROWD User Guide available via the CMS Enterprise Portal The report is due as soon as possible after the end of the reporting month but no later than the 15th of the month following the end of the reporting month.',
]
embeddings = model.encode(sentences)
print(embeddings.shape)
# [3, 1536]
# Get the similarity scores for the embeddings
similarities = model.similarity(embeddings, embeddings)
print(similarities)
# tensor([[ 1.0000, 1.0000, -0.1049],
# [ 1.0000, 1.0000, -0.1049],
# [-0.1049, -0.1049, 1.0000]])
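The similarity matrix above shows 1.0000 between the first two inputs because they are the same text, and a value near zero against the unrelated third input. Assuming `model.similarity` is using cosine similarity (the library default unless a model is configured otherwise), the computation can be sketched in plain Python as follows. The toy vectors are hypothetical stand-ins for real 1536-dimensional embeddings.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def similarity_matrix(vectors):
    """Pairwise cosine similarities, one row per input vector."""
    return [[cosine_similarity(u, v) for v in vectors] for u in vectors]


# Hypothetical embeddings: the first two are identical, the third is orthogonal.
toy = [[1.0, 0.0], [1.0, 0.0], [0.0, 1.0]]
sims = similarity_matrix(toy)
for row in sims:
    print(row)
```

Identical inputs always score 1.0 under cosine similarity, so the duplicated sentence in the example says nothing about how well the model separates distinct but related passages.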
Columns: sentence_0 and sentence_1

| | sentence_0 | sentence_1 |
|---|---|---|
| type | string | string |
| details | | |
| sentence_0 | sentence_1 |
|---|---|
| The preadmission screening in the patient's IRF medical record serves as the primary documentation by the IRF clinical staff of the patient's status prior to admission and of the specific reasons that led the IRF clinical staff to conclude that the IRF admission would be reasonable and necessary As such, IRFs must make this documentation detailed and comprehensive In accordance with 42 CFR § 412.622(a)(4)(i)(B) the preadmission screening documentation must indicate the patient's prior level of function (prior to the event or condition that led to the patient's need for intensive rehabilitation therapy), expected level of improvement, and the expected length of time necessary to achieve that level of improvement | The preadmission screening in the patient's IRF medical record serves as the primary documentation by the IRF clinical staff of the patient's status prior to admission and of the specific reasons that led the IRF clinical staff to conclude that the IRF admission would be reasonable and necessary As such, IRFs must make this documentation detailed and comprehensive In accordance with 42 CFR § 412.622(a)(4)(i)(B) the preadmission screening documentation must indicate the patient's prior level of function (prior to the event or condition that led to the patient's need for intensive rehabilitation therapy), expected level of improvement, and the expected length of time necessary to achieve that level of improvement |
| and (C) An attestation that the component organization will prominently post notification on its Web site and publish in any promotional materials for dissemination to providers, a summary of the information that is required by paragraph (c)(4)(i)(A) of this section. (ii) Comply with the following requirements during its period of listing: (A) The component organization may not share staff with its parent organization(s). (B) The component organization may enter into a written agreement pursuant to paragraph (c)(3) but such agreements are limited to units or individuals of the parent organization(s) whose responsibilities do not involve the activities specified in the restrictions in paragraph (a)(2)(ii) of this section | and (C) An attestation that the component organization will prominently post notification on its Web site and publish in any promotional materials for dissemination to providers, a summary of the information that is required by paragraph (c)(4)(i)(A) of this section. (ii) Comply with the following requirements during its period of listing: (A) The component organization may not share staff with its parent organization(s). (B) The component organization may enter into a written agreement pursuant to paragraph (c)(3) but such agreements are limited to units or individuals of the parent organization(s) whose responsibilities do not involve the activities specified in the restrictions in paragraph (a)(2)(ii) of this section |
| Review of the person-centered active treatment plan. (d) The CMHC interdisciplinary treatment team must review, revise, and document the individualized active treatment plan as frequently as the client's condition requires, but no less frequently than every 30-calendar day A revised active treatment plan must include information from the client's initial evaluation and comprehensive assessments, the client's progress toward outcomes and goals specified in the active treatment plan, and changes in the client's goals The CMHC must also meet partial hospitalization program requirements specified under § 424.24(e) of this chapter or intensive outpatient service requirements as specified under § 424.24(d) of this chapter, as applicable, if such services are included in the active treatment plan | Review of the person-centered active treatment plan. (d) The CMHC interdisciplinary treatment team must review, revise, and document the individualized active treatment plan as frequently as the client's condition requires, but no less frequently than every 30-calendar day A revised active treatment plan must include information from the client's initial evaluation and comprehensive assessments, the client's progress toward outcomes and goals specified in the active treatment plan, and changes in the client's goals The CMHC must also meet partial hospitalization program requirements specified under § 424.24(e) of this chapter or intensive outpatient service requirements as specified under § 424.24(d) of this chapter, as applicable, if such services are included in the active treatment plan |
MultipleNegativesRankingLoss with these parameters:
{
"scale": 20.0,
"similarity_fct": "cos_sim",
"gather_across_devices": false
}
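To make the parameters above concrete, here is a minimal pure-Python sketch of an in-batch-negatives ranking loss of the kind MultipleNegativesRankingLoss is built on: each anchor is scored against every positive in the batch with cosine similarity, the scores are multiplied by `scale` (20.0 here), and cross-entropy is taken with the matching positive as the target. The function names and toy vectors are assumptions for illustration; this is not the library's code and it ignores `gather_across_devices`.

```python
import math


def cos_sim(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b)))


def mnr_loss(anchors, positives, scale=20.0):
    """Mean cross-entropy where positives[i] is the correct match for anchors[i]."""
    losses = []
    for i, anchor in enumerate(anchors):
        # Every other positive in the batch acts as a negative for this anchor.
        logits = [scale * cos_sim(anchor, p) for p in positives]
        max_logit = max(logits)  # subtract the max for numerical stability
        log_sum_exp = max_logit + math.log(sum(math.exp(v - max_logit) for v in logits))
        losses.append(log_sum_exp - logits[i])
    return sum(losses) / len(losses)


# Hypothetical 2-dimensional embeddings for a batch of two pairs.
anchors = [[1.0, 0.0], [0.0, 1.0]]
positives = [[1.0, 0.0], [0.0, 1.0]]

aligned = mnr_loss(anchors, positives)         # each anchor matches its own positive
shuffled = mnr_loss(anchors, positives[::-1])  # positives swapped, so every match is wrong
print(aligned, shuffled)
```

A larger `scale` sharpens the softmax over in-batch candidates, so a correctly ranked pair contributes almost no loss while a wrongly ranked one is penalized heavily.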
Non-default hyperparameters:

- per_device_train_batch_size: 16
- per_device_eval_batch_size: 16
- multi_dataset_batch_sampler: round_robin

All hyperparameters:

- overwrite_output_dir: False
- do_predict: False
- eval_strategy: no
- prediction_loss_only: True
- per_device_train_batch_size: 16
- per_device_eval_batch_size: 16
- per_gpu_train_batch_size: None
- per_gpu_eval_batch_size: None
- gradient_accumulation_steps: 1
- eval_accumulation_steps: None
- torch_empty_cache_steps: None
- learning_rate: 5e-05
- weight_decay: 0.0
- adam_beta1: 0.9
- adam_beta2: 0.999
- adam_epsilon: 1e-08
- max_grad_norm: 1
- num_train_epochs: 3
- max_steps: -1
- lr_scheduler_type: linear
- lr_scheduler_kwargs: {}
- warmup_ratio: 0.0
- warmup_steps: 0
- log_level: passive
- log_level_replica: warning
- log_on_each_node: True
- logging_nan_inf_filter: True
- save_safetensors: True
- save_on_each_node: False
- save_only_model: False
- restore_callback_states_from_checkpoint: False
- no_cuda: False
- use_cpu: False
- use_mps_device: False
- seed: 42
- data_seed: None
- jit_mode_eval: False
- use_ipex: False
- bf16: False
- fp16: False
- fp16_opt_level: O1
- half_precision_backend: auto
- bf16_full_eval: False
- fp16_full_eval: False
- tf32: None
- local_rank: 0
- ddp_backend: None
- tpu_num_cores: None
- tpu_metrics_debug: False
- debug: []
- dataloader_drop_last: False
- dataloader_num_workers: 0
- dataloader_prefetch_factor: None
- past_index: -1
- disable_tqdm: False
- remove_unused_columns: True
- label_names: None
- load_best_model_at_end: False
- ignore_data_skip: False
- fsdp: []
- fsdp_min_num_params: 0
- fsdp_config: {'min_num_params': 0, 'xla': False, 'xla_fsdp_v2': False, 'xla_fsdp_grad_ckpt': False}
- fsdp_transformer_layer_cls_to_wrap: None
- accelerator_config: {'split_batches': False, 'dispatch_batches': None, 'even_batches': True, 'use_seedable_sampler': True, 'non_blocking': False, 'gradient_accumulation_kwargs': None}
- parallelism_config: None
- deepspeed: None
- label_smoothing_factor: 0.0
- optim: adamw_torch_fused
- optim_args: None
- adafactor: False
- group_by_length: False
- length_column_name: length
- ddp_find_unused_parameters: None
- ddp_bucket_cap_mb: None
- ddp_broadcast_buffers: False
- dataloader_pin_memory: True
- dataloader_persistent_workers: False
- skip_memory_metrics: True
- use_legacy_prediction_loop: False
- push_to_hub: False
- resume_from_checkpoint: None
- hub_model_id: None
- hub_strategy: every_save
- hub_private_repo: None
- hub_always_push: False
- hub_revision: None
- gradient_checkpointing: False
- gradient_checkpointing_kwargs: None
- include_inputs_for_metrics: False
- include_for_metrics: []
- eval_do_concat_batches: True
- fp16_backend: auto
- push_to_hub_model_id: None
- push_to_hub_organization: None
- mp_parameters:
- auto_find_batch_size: False
- full_determinism: False
- torchdynamo: None
- ray_scope: last
- ddp_timeout: 1800
- torch_compile: False
- torch_compile_backend: None
- torch_compile_mode: None
- include_tokens_per_second: False
- include_num_input_tokens_seen: False
- neftune_noise_alpha: None
- optim_target_modules: None
- batch_eval_metrics: False
- eval_on_start: False
- use_liger_kernel: False
- liger_kernel_config: None
- eval_use_gather_object: False
- average_tokens_across_devices: False
- prompts: None
- batch_sampler: batch_sampler
- multi_dataset_batch_sampler: round_robin
- router_mapping: {}
- learning_rate_mapping: {}

| Epoch | Step | Training Loss |
|---|---|---|
| 0.1313 | 500 | 0.173 |
| 0.2627 | 1000 | 0.1505 |
| 0.3940 | 1500 | 0.1613 |
| 0.5253 | 2000 | 0.1568 |
| 0.6567 | 2500 | 0.1677 |
| 0.7880 | 3000 | 0.1611 |
| 0.9194 | 3500 | 0.1571 |
| 1.0507 | 4000 | 0.1688 |
| 1.1820 | 4500 | 0.1682 |
| 1.3134 | 5000 | 0.1609 |
| 1.4447 | 5500 | 0.1621 |
| 1.5760 | 6000 | 0.1528 |
| 1.7074 | 6500 | 0.1576 |
| 1.8387 | 7000 | 0.1581 |
| 1.9701 | 7500 | 0.1591 |
| 2.1014 | 8000 | 0.1479 |
| 2.2327 | 8500 | 0.1623 |
| 2.3641 | 9000 | 0.1572 |
| 2.4954 | 9500 | 0.1577 |
| 2.6267 | 10000 | 0.158 |
| 2.7581 | 10500 | 0.16 |
| 2.8894 | 11000 | 0.1693 |
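The Epoch and Step columns above can be used to back out the approximate number of optimizer steps per epoch, and from that a rough count of training pairs. The sketch below does this arithmetic under stated assumptions: a single training device, `per_device_train_batch_size` of 16 and `gradient_accumulation_steps` of 1 as listed in the hyperparameters. The resulting figures are estimates derived from rounded log values, not numbers reported by the model card.

```python
# (epoch, step) pairs copied from three rows of the training log.
log_rows = [(0.1313, 500), (1.0507, 4000), (2.8894, 11000)]

# Assumption: one device, so the effective batch size equals the per-device value.
batch_size = 16

# Each row gives an independent estimate of steps per epoch.
estimates = [step / epoch for epoch, step in log_rows]
steps_per_epoch = round(sum(estimates) / len(estimates))

# With gradient_accumulation_steps = 1, one step consumes one batch of pairs.
approx_pairs = steps_per_epoch * batch_size

print(f"steps per epoch ~ {steps_per_epoch}")
print(f"training pairs  ~ {approx_pairs}")
```

Under these assumptions the run covers roughly 3,800 steps per epoch. The logged loss stays between about 0.15 and 0.17 across all three epochs, with no clear downward trend.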
@inproceedings{reimers-2019-sentence-bert,
title = "Sentence-BERT: Sentence Embeddings using Siamese BERT-Networks",
author = "Reimers, Nils and Gurevych, Iryna",
booktitle = "Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing",
month = "11",
year = "2019",
publisher = "Association for Computational Linguistics",
url = "https://arxiv.org/abs/1908.10084",
}
@misc{henderson2017efficient,
title={Efficient Natural Language Response Suggestion for Smart Reply},
author={Matthew Henderson and Rami Al-Rfou and Brian Strope and Yun-hsuan Sung and Laszlo Lukacs and Ruiqi Guo and Sanjiv Kumar and Balint Miklos and Ray Kurzweil},
year={2017},
eprint={1705.00652},
archivePrefix={arXiv},
primaryClass={cs.CL}
}