SentenceTransformer based on BAAI/bge-small-en-v1.5

This is a sentence-transformers model fine-tuned from BAAI/bge-small-en-v1.5. It maps sentences and paragraphs to a 384-dimensional dense vector space and can be used for semantic textual similarity, semantic search, paraphrase mining, text classification, clustering, and more.

Model Details

Model Description

  • Model Type: Sentence Transformer
  • Base model: BAAI/bge-small-en-v1.5
  • Maximum Sequence Length: 512 tokens
  • Output Dimensionality: 384 dimensions
  • Similarity Function: Cosine Similarity

Full Model Architecture

SentenceTransformer(
  (0): Transformer({'max_seq_length': 512, 'do_lower_case': False, 'architecture': 'BertModel'})
  (1): Pooling({'word_embedding_dimension': 384, 'pooling_mode_cls_token': False, 'pooling_mode_mean_tokens': True, 'pooling_mode_max_tokens': False, 'pooling_mode_mean_sqrt_len_tokens': False, 'pooling_mode_weightedmean_tokens': False, 'pooling_mode_lasttoken': False, 'include_prompt': True})
)
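
The Pooling module above uses mean pooling (pooling_mode_mean_tokens): token embeddings from the transformer are averaged, with padding tokens masked out, to produce one 384-dimensional vector per input. A minimal NumPy sketch of that operation (illustrative shapes only, not the library's internal code):

```python
import numpy as np

# Toy batch: 2 sequences, max length 4, hidden size 384 (as in this model).
token_embeddings = np.random.rand(2, 4, 384)
# Attention mask: 1 for real tokens, 0 for padding.
attention_mask = np.array([[1, 1, 1, 0],
                           [1, 1, 0, 0]])

# Mean pooling: sum the real-token embeddings, divide by the real-token count.
mask = attention_mask[:, :, None]             # (2, 4, 1), broadcasts over hidden dim
summed = (token_embeddings * mask).sum(axis=1)
counts = mask.sum(axis=1)                     # (2, 1)
sentence_embeddings = summed / counts         # (2, 384)

print(sentence_embeddings.shape)  # (2, 384)
```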

Usage

Direct Usage (Sentence Transformers)

First install the Sentence Transformers library:

pip install -U sentence-transformers

Then you can load this model and run inference.

from sentence_transformers import SentenceTransformer

# Download from the 🤗 Hub
model = SentenceTransformer("algaebrown/bge-small-en-v1.5-biotech")
# Run inference
sentences = [
    'Lithium, a treatment for bipolar disorders, might be a key to Alzheimer’s disease',
    'A study in mice finds that lithium depletion in the brain may be a trigger of Alzheimer’s and that replenishment could be a treatment.',
    'Meeting Completion Marks Major Step Toward Phase 3 AD04 Trial Launch',
]
embeddings = model.encode(sentences)
print(embeddings.shape)
# [3, 384]

# Get the similarity scores for the embeddings
similarities = model.similarity(embeddings, embeddings)
print(similarities)
# tensor([[ 1.0000, -0.8780,  0.1236],
#         [-0.8780,  1.0000, -0.0759],
#         [ 0.1236, -0.0759,  1.0000]])
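
Since the card lists Cosine Similarity as the similarity function, `model.similarity(embeddings, embeddings)` is equivalent to L2-normalizing each embedding and taking pairwise dot products. A minimal NumPy sketch with toy 2-dimensional vectors (the real embeddings are 384-dimensional):

```python
import numpy as np

def cosine_similarity_matrix(embeddings: np.ndarray) -> np.ndarray:
    """Pairwise cosine similarities between the rows of `embeddings`."""
    norms = np.linalg.norm(embeddings, axis=1, keepdims=True)
    normalized = embeddings / norms          # each row now has unit length
    return normalized @ normalized.T         # dot products of unit vectors

emb = np.array([[1.0, 0.0],   # orthogonal to the second vector
                [0.0, 1.0],
                [1.0, 1.0]])  # 45 degrees from both
sim = cosine_similarity_matrix(emb)
print(np.round(sim, 4))
# The diagonal is 1.0: every vector has cosine similarity 1 with itself.
```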

Training Details

Training Dataset

Unnamed Dataset

  • Size: 8,696 training samples
  • Columns: sentence_0, sentence_1, and sentence_2
  • Approximate statistics based on the first 1000 samples:

                 sentence_0   sentence_1   sentence_2
    type         string       string       string
    min tokens   6            2            2
    mean tokens  23.05        72.89        74.29
    max tokens   264          410          512
  • Samples:
    • Sample 1:
      • sentence_0 (translated from Thai): KFSHRC saves a 7-year-old Saudi child through a cross-border heart transplant with an organ from a donor in the United Arab Emirates
      • sentence_1 (translated from Thai): RIYADH, Saudi Arabia, Aug. 22, 2025 (GLOBE NEWSWIRE) -- King Faisal Specialist Hospital and Research Centre (KFSHRC) in Riyadh successfully performed a life-saving heart transplant on a 7-year-old Saudi child, using an organ donated by a brain-dead donor in Abu Dhabi, United Arab Emirates
      • sentence_2: First Brazil flagship location slated to open in 2026 DALLAS, Aug. 27, 2025 /PRNewswire/ -- Gold's Gym, the iconic global fitness brand, has signed a master franchise agreement to open 60 locations in Brazil within the next 10 years. A Gold's Gym Brazil flagship location is slated to open...
    • Sample 2:
      • sentence_0: Orion Corporation: Managers’ transactions – Satu Ahomäki
      • sentence_1: ORION CORPORATION MANAGERS’ TRANSACTIONS 4 August 2025 at 16:10 EEST
      • sentence_2: The presentation will include data from over 90 patients in both the plaque brachytherapy and enucleation-eligible cohorts Darovasertib has received U.S. FDA Breakthrough Therapy Designation for use in neoadjuvant uveal melanoma for subjects requiring enucleation Initiated a multi-site,...
    • Sample 3:
      • sentence_0 (translated from French): DBV Technologies to participate in the 27th annual H.C. Wainwright Global Investment Conference Châtillon, France, September 3 (22:30 CEST) 2025
      • sentence_2 (translated from Spanish): With the annual rate of NTM cases rising, patients and providers are urged to recognize the symptoms and risks MIAMI, August 4, 2025 /PRNewswire-HISPANIC PR WIRE/ -- As the world prepares to mark World NTM Day on August 4, 2025, NTM Info...
  • Loss: TripletLoss with these parameters:
    {
        "distance_metric": "TripletDistanceMetric.EUCLIDEAN",
        "triplet_margin": 5
    }
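
TripletLoss with the Euclidean metric and a margin of 5 penalizes an anchor whenever its positive is not at least 5 units closer than its negative: loss = max(0, d(a, p) − d(a, n) + margin). An illustrative NumPy computation with toy vectors (a sketch of the loss formula, not the library's training code):

```python
import numpy as np

def triplet_loss(anchor, positive, negative, margin=5.0):
    """max(0, d(a, p) - d(a, n) + margin) with Euclidean distance,
    matching TripletDistanceMetric.EUCLIDEAN and triplet_margin=5."""
    d_pos = np.linalg.norm(anchor - positive)
    d_neg = np.linalg.norm(anchor - negative)
    return max(0.0, d_pos - d_neg + margin)

a = np.array([0.0, 0.0])
p = np.array([1.0, 0.0])   # 1.0 away from the anchor
n = np.array([4.0, 0.0])   # 4.0 away from the anchor
print(triplet_loss(a, p, n))  # 1 - 4 + 5 = 2.0: margin still violated, loss > 0
```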
    

Training Hyperparameters

Non-Default Hyperparameters

  • per_device_train_batch_size: 16
  • per_device_eval_batch_size: 16
  • num_train_epochs: 10
  • multi_dataset_batch_sampler: round_robin

All Hyperparameters

  • overwrite_output_dir: False
  • do_predict: False
  • eval_strategy: no
  • prediction_loss_only: True
  • per_device_train_batch_size: 16
  • per_device_eval_batch_size: 16
  • per_gpu_train_batch_size: None
  • per_gpu_eval_batch_size: None
  • gradient_accumulation_steps: 1
  • eval_accumulation_steps: None
  • torch_empty_cache_steps: None
  • learning_rate: 5e-05
  • weight_decay: 0.0
  • adam_beta1: 0.9
  • adam_beta2: 0.999
  • adam_epsilon: 1e-08
  • max_grad_norm: 1
  • num_train_epochs: 10
  • max_steps: -1
  • lr_scheduler_type: linear
  • lr_scheduler_kwargs: {}
  • warmup_ratio: 0.0
  • warmup_steps: 0
  • log_level: passive
  • log_level_replica: warning
  • log_on_each_node: True
  • logging_nan_inf_filter: True
  • save_safetensors: True
  • save_on_each_node: False
  • save_only_model: False
  • restore_callback_states_from_checkpoint: False
  • no_cuda: False
  • use_cpu: False
  • use_mps_device: False
  • seed: 42
  • data_seed: None
  • jit_mode_eval: False
  • use_ipex: False
  • bf16: False
  • fp16: False
  • fp16_opt_level: O1
  • half_precision_backend: auto
  • bf16_full_eval: False
  • fp16_full_eval: False
  • tf32: None
  • local_rank: 0
  • ddp_backend: None
  • tpu_num_cores: None
  • tpu_metrics_debug: False
  • debug: []
  • dataloader_drop_last: False
  • dataloader_num_workers: 0
  • dataloader_prefetch_factor: None
  • past_index: -1
  • disable_tqdm: False
  • remove_unused_columns: True
  • label_names: None
  • load_best_model_at_end: False
  • ignore_data_skip: False
  • fsdp: []
  • fsdp_min_num_params: 0
  • fsdp_config: {'min_num_params': 0, 'xla': False, 'xla_fsdp_v2': False, 'xla_fsdp_grad_ckpt': False}
  • fsdp_transformer_layer_cls_to_wrap: None
  • accelerator_config: {'split_batches': False, 'dispatch_batches': None, 'even_batches': True, 'use_seedable_sampler': True, 'non_blocking': False, 'gradient_accumulation_kwargs': None}
  • parallelism_config: None
  • deepspeed: None
  • label_smoothing_factor: 0.0
  • optim: adamw_torch_fused
  • optim_args: None
  • adafactor: False
  • group_by_length: False
  • length_column_name: length
  • ddp_find_unused_parameters: None
  • ddp_bucket_cap_mb: None
  • ddp_broadcast_buffers: False
  • dataloader_pin_memory: True
  • dataloader_persistent_workers: False
  • skip_memory_metrics: True
  • use_legacy_prediction_loop: False
  • push_to_hub: False
  • resume_from_checkpoint: None
  • hub_model_id: None
  • hub_strategy: every_save
  • hub_private_repo: None
  • hub_always_push: False
  • hub_revision: None
  • gradient_checkpointing: False
  • gradient_checkpointing_kwargs: None
  • include_inputs_for_metrics: False
  • include_for_metrics: []
  • eval_do_concat_batches: True
  • fp16_backend: auto
  • push_to_hub_model_id: None
  • push_to_hub_organization: None
  • mp_parameters:
  • auto_find_batch_size: False
  • full_determinism: False
  • torchdynamo: None
  • ray_scope: last
  • ddp_timeout: 1800
  • torch_compile: False
  • torch_compile_backend: None
  • torch_compile_mode: None
  • include_tokens_per_second: False
  • include_num_input_tokens_seen: False
  • neftune_noise_alpha: None
  • optim_target_modules: None
  • batch_eval_metrics: False
  • eval_on_start: False
  • use_liger_kernel: False
  • liger_kernel_config: None
  • eval_use_gather_object: False
  • average_tokens_across_devices: False
  • prompts: None
  • batch_sampler: batch_sampler
  • multi_dataset_batch_sampler: round_robin
  • router_mapping: {}
  • learning_rate_mapping: {}

Training Logs

Epoch Step Training Loss
0.9191 500 5.5887
1.8382 1000 4.1943
2.7574 1500 3.7014
3.6765 2000 3.1059
4.5956 2500 2.7839
5.5147 3000 2.5128
6.4338 3500 2.2777
7.3529 4000 2.1225
8.2721 4500 2.0097
9.1912 5000 1.8822
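
The fractional epochs in the log are consistent with the dataset size and batch size above: 8,696 samples at a batch size of 16 give 544 steps per epoch, so step 500 lands at epoch 500/544 ≈ 0.9191. A quick arithmetic check:

```python
import math

samples = 8696
batch_size = 16
steps_per_epoch = math.ceil(samples / batch_size)

print(steps_per_epoch)                   # 544
print(round(500 / steps_per_epoch, 4))   # 0.9191, matching the first log row
print(round(1000 / steps_per_epoch, 4))  # 1.8382, matching the second log row
```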

Framework Versions

  • Python: 3.12.11
  • Sentence Transformers: 5.1.0
  • Transformers: 4.56.0
  • PyTorch: 2.8.0+cu126
  • Accelerate: 1.10.1
  • Datasets: 4.0.0
  • Tokenizers: 0.22.0

Citation

BibTeX

Sentence Transformers

@inproceedings{reimers-2019-sentence-bert,
    title = "Sentence-BERT: Sentence Embeddings using Siamese BERT-Networks",
    author = "Reimers, Nils and Gurevych, Iryna",
    booktitle = "Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing",
    month = "11",
    year = "2019",
    publisher = "Association for Computational Linguistics",
    url = "https://arxiv.org/abs/1908.10084",
}

TripletLoss

@misc{hermans2017defense,
    title={In Defense of the Triplet Loss for Person Re-Identification},
    author={Alexander Hermans and Lucas Beyer and Bastian Leibe},
    year={2017},
    eprint={1703.07737},
    archivePrefix={arXiv},
    primaryClass={cs.CV}
}
Model size: 33.4M parameters (F32, stored as Safetensors)