SentenceTransformer based on axiomepic/maux-gte-persian-v3-finetuned-bce

This is a sentence-transformers model finetuned from axiomepic/maux-gte-persian-v3-finetuned-bce. It maps sentences & paragraphs to a 768-dimensional dense vector space and can be used for semantic textual similarity, semantic search, paraphrase mining, text classification, clustering, and more.

Model Details

Model Description

Model Sources

Full Model Architecture

SentenceTransformer(
  (0): Transformer({'max_seq_length': 8192, 'do_lower_case': False, 'architecture': 'NewModel'})
  (1): Pooling({'word_embedding_dimension': 768, 'pooling_mode_cls_token': True, 'pooling_mode_mean_tokens': False, 'pooling_mode_max_tokens': False, 'pooling_mode_mean_sqrt_len_tokens': False, 'pooling_mode_weightedmean_tokens': False, 'pooling_mode_lasttoken': False, 'include_prompt': True})
)
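The Pooling module is configured with `pooling_mode_cls_token: True`, so the sentence embedding is simply the transformer's hidden state for the first ([CLS]) token rather than a mean over all tokens. A minimal plain-Python sketch of that operation (the `token_embeddings` list is a made-up stand-in for the transformer's per-token output):

```python
# Sketch of CLS-token pooling: given the transformer's per-token hidden
# states (seq_len x dim), the sentence embedding is the first token's vector.
def cls_pooling(token_embeddings):
    return token_embeddings[0]

tokens = [[0.1, 0.2, 0.3], [0.4, 0.5, 0.6], [0.7, 0.8, 0.9]]
print(cls_pooling(tokens))  # [0.1, 0.2, 0.3]
```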

Usage

Direct Usage (Sentence Transformers)

First install the Sentence Transformers library:

pip install -U sentence-transformers

Then you can load this model and run inference.

from sentence_transformers import SentenceTransformer

# Download from the 🤗 Hub
model = SentenceTransformer("axiomepic/maux-gte-persian-v3-finetuned-bce")
# Run inference
sentences = [
    'توالت فرنگی کوچک',
    'نشستن روی توالت فرنگی',
    'leg with dumbbells',
]
embeddings = model.encode(sentences)
print(embeddings.shape)
# [3, 768]

# Get the similarity scores for the embeddings
similarities = model.similarity(embeddings, embeddings)
print(similarities)
# tensor([[0.7279, 0.7355, 0.2055],
#         [0.7355, 0.7768, 0.2210],
#         [0.2055, 0.2210, 2.5745]])
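Note that the scores above are unnormalized dot products rather than cosine similarities (the diagonal entries are not 1), consistent with the dot-product loss this model was trained with. A stdlib sketch of the same pairwise computation, using made-up 2-dimensional embeddings:

```python
# Pairwise dot-product similarity matrix. With unnormalized embeddings the
# diagonal holds each vector's squared norm, so it can exceed 1.
def dot_similarity_matrix(embs):
    return [[sum(a * b for a, b in zip(u, v)) for v in embs] for u in embs]

embs = [[1.0, 2.0], [2.0, 0.5]]
print(dot_similarity_matrix(embs))  # [[5.0, 3.0], [3.0, 4.25]]
```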

Evaluation

Metrics

Binary Classification

Metric Value
dot_accuracy 0.6364
dot_accuracy_threshold 1.2505
dot_f1 0.5554
dot_f1_threshold -0.1131
dot_precision 0.3883
dot_recall 0.9748
dot_ap 0.439
dot_mcc 0.1287
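The `dot_*` metrics above come from thresholding the dot product of the two embeddings: a pair is predicted "similar" when the score exceeds the reported threshold (1.2505 for the accuracy-optimal cut, -0.1131 for the F1-optimal cut). A hedged sketch of that decision rule, with made-up embeddings:

```python
# A pair is classified as similar when the embeddings' dot product exceeds
# the tuned threshold (here the reported dot_accuracy_threshold, 1.2505).
def predict_similar(u, v, threshold=1.2505):
    score = sum(a * b for a, b in zip(u, v))
    return score > threshold

print(predict_similar([1.0, 1.0], [1.0, 1.0]))  # dot = 2.0 -> True
print(predict_similar([1.0, 0.0], [0.0, 1.0]))  # dot = 0.0 -> False
```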

Training Details

Training Dataset

Unnamed Dataset

  • Size: 52,893 training samples
  • Columns: sentence1, sentence2, and label
  • Approximate statistics based on the first 1000 samples:
    sentence1: string, min 3 / mean 6.59 / max 18 tokens
    sentence2: string, min 3 / mean 6.27 / max 17 tokens
    label: float, min 0.0 / mean 0.38 / max 1.0
  • Samples:
    sentence1 | sentence2 | label
    جک سقفی | کامیونت جک | 1.0
    سفارت فرانسه در تهران | dutch embassy paris | 0.0
    بهترین آبمیوه گیری | فواید آب | 0.0
  • Loss: main.DotProductBCELoss
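`main.DotProductBCELoss` is a custom loss defined in the training script, and its exact implementation is not shown in this card. Judging from the name, it applies binary cross-entropy to the sigmoid of the dot product of the two sentence embeddings against the 0/1 label; a hedged per-pair sketch under that assumption:

```python
import math

# Assumed behavior of DotProductBCELoss for one pair: the dot product of
# the embeddings is treated as a logit and scored with binary cross-entropy.
def dot_product_bce(u, v, label):
    logit = sum(a * b for a, b in zip(u, v))
    p = 1.0 / (1.0 + math.exp(-logit))
    return -(label * math.log(p) + (1 - label) * math.log(1 - p))

# Orthogonal embeddings give logit 0, p = 0.5, so loss = ln 2 for either label.
print(round(dot_product_bce([1.0, 0.0], [0.0, 1.0], 1.0), 4))  # 0.6931
```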

Evaluation Dataset

Unnamed Dataset

  • Size: 5,877 evaluation samples
  • Columns: sentence1, sentence2, and label
  • Approximate statistics based on the first 1000 samples:
    sentence1: string, min 3 / mean 6.55 / max 15 tokens
    sentence2: string, min 3 / mean 6.24 / max 18 tokens
    label: float, min 0.0 / mean 0.38 / max 1.0
  • Samples:
    sentence1 | sentence2 | label
    ظروف آلومینیومی یکبار مصرف | ظروف پلاستیکی | 1.0
    اخبار مهاجرین | current events immigration | 1.0
    معامله آسان | buy sale trade | 1.0
  • Loss: main.DotProductBCELoss

Training Hyperparameters

Non-Default Hyperparameters

  • eval_strategy: epoch
  • per_device_train_batch_size: 16
  • per_device_eval_batch_size: 64
  • learning_rate: 2e-05
  • num_train_epochs: 4
  • warmup_ratio: 0.1
  • load_best_model_at_end: True
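With `warmup_ratio: 0.1` and `warmup_steps: 0`, the linear scheduler derives the warmup length from the total optimizer step count. For 52,893 training samples at batch size 16 over 4 epochs, that works out as follows (3306 steps per epoch, matching the epoch boundary visible in the training logs):

```python
import math

# Derive the effective warmup length implied by warmup_ratio = 0.1.
steps_per_epoch = math.ceil(52893 / 16)
total_steps = steps_per_epoch * 4
warmup_steps = int(0.1 * total_steps)
print(steps_per_epoch, total_steps, warmup_steps)  # 3306 13224 1322
```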

All Hyperparameters

  • overwrite_output_dir: False
  • do_predict: False
  • eval_strategy: epoch
  • prediction_loss_only: True
  • per_device_train_batch_size: 16
  • per_device_eval_batch_size: 64
  • per_gpu_train_batch_size: None
  • per_gpu_eval_batch_size: None
  • gradient_accumulation_steps: 1
  • eval_accumulation_steps: None
  • torch_empty_cache_steps: None
  • learning_rate: 2e-05
  • weight_decay: 0.0
  • adam_beta1: 0.9
  • adam_beta2: 0.999
  • adam_epsilon: 1e-08
  • max_grad_norm: 1.0
  • num_train_epochs: 4
  • max_steps: -1
  • lr_scheduler_type: linear
  • lr_scheduler_kwargs: {}
  • warmup_ratio: 0.1
  • warmup_steps: 0
  • log_level: passive
  • log_level_replica: warning
  • log_on_each_node: True
  • logging_nan_inf_filter: True
  • save_safetensors: True
  • save_on_each_node: False
  • save_only_model: False
  • restore_callback_states_from_checkpoint: False
  • no_cuda: False
  • use_cpu: False
  • use_mps_device: False
  • seed: 42
  • data_seed: None
  • jit_mode_eval: False
  • bf16: False
  • fp16: False
  • fp16_opt_level: O1
  • half_precision_backend: auto
  • bf16_full_eval: False
  • fp16_full_eval: False
  • tf32: None
  • local_rank: 0
  • ddp_backend: None
  • tpu_num_cores: None
  • tpu_metrics_debug: False
  • debug: []
  • dataloader_drop_last: False
  • dataloader_num_workers: 0
  • dataloader_prefetch_factor: None
  • past_index: -1
  • disable_tqdm: False
  • remove_unused_columns: True
  • label_names: None
  • load_best_model_at_end: True
  • ignore_data_skip: False
  • fsdp: []
  • fsdp_min_num_params: 0
  • fsdp_config: {'min_num_params': 0, 'xla': False, 'xla_fsdp_v2': False, 'xla_fsdp_grad_ckpt': False}
  • fsdp_transformer_layer_cls_to_wrap: None
  • accelerator_config: {'split_batches': False, 'dispatch_batches': None, 'even_batches': True, 'use_seedable_sampler': True, 'non_blocking': False, 'gradient_accumulation_kwargs': None}
  • parallelism_config: None
  • deepspeed: None
  • label_smoothing_factor: 0.0
  • optim: adamw_torch_fused
  • optim_args: None
  • adafactor: False
  • group_by_length: False
  • length_column_name: length
  • project: huggingface
  • trackio_space_id: trackio
  • ddp_find_unused_parameters: None
  • ddp_bucket_cap_mb: None
  • ddp_broadcast_buffers: False
  • dataloader_pin_memory: True
  • dataloader_persistent_workers: False
  • skip_memory_metrics: True
  • use_legacy_prediction_loop: False
  • push_to_hub: False
  • resume_from_checkpoint: None
  • hub_model_id: None
  • hub_strategy: every_save
  • hub_private_repo: None
  • hub_always_push: False
  • hub_revision: None
  • gradient_checkpointing: False
  • gradient_checkpointing_kwargs: None
  • include_inputs_for_metrics: False
  • include_for_metrics: []
  • eval_do_concat_batches: True
  • fp16_backend: auto
  • push_to_hub_model_id: None
  • push_to_hub_organization: None
  • mp_parameters:
  • auto_find_batch_size: False
  • full_determinism: False
  • torchdynamo: None
  • ray_scope: last
  • ddp_timeout: 1800
  • torch_compile: False
  • torch_compile_backend: None
  • torch_compile_mode: None
  • include_tokens_per_second: False
  • include_num_input_tokens_seen: no
  • neftune_noise_alpha: None
  • optim_target_modules: None
  • batch_eval_metrics: False
  • eval_on_start: False
  • use_liger_kernel: False
  • liger_kernel_config: None
  • eval_use_gather_object: False
  • average_tokens_across_devices: True
  • prompts: None
  • batch_sampler: batch_sampler
  • multi_dataset_batch_sampler: proportional
  • router_mapping: {}
  • learning_rate_mapping: {}

Training Logs

Epoch Step Training Loss Validation Loss seo-bce-eval_dot_ap
0.0030 10 114.2302 - -
0.0060 20 113.3976 - -
0.0091 30 115.9475 - -
0.0121 40 122.7224 - -
0.0151 50 112.9675 - -
0.0181 60 114.5466 - -
0.0212 70 116.0311 - -
0.0242 80 99.6088 - -
0.0272 90 84.4752 - -
0.0302 100 82.4359 - -
0.0333 110 77.9756 - -
0.0363 120 70.7338 - -
0.0393 130 61.25 - -
0.0423 140 37.6046 - -
0.0454 150 34.1171 - -
0.0484 160 30.6499 - -
0.0514 170 22.5837 - -
0.0544 180 17.1006 - -
0.0575 190 12.3175 - -
0.0605 200 8.4437 - -
0.0635 210 5.3229 - -
0.0665 220 3.8265 - -
0.0696 230 3.1104 - -
0.0726 240 2.2156 - -
0.0756 250 1.8534 - -
0.0786 260 1.4878 - -
0.0817 270 1.5639 - -
0.0847 280 1.375 - -
0.0877 290 1.2778 - -
0.0907 300 1.3444 - -
0.0938 310 1.4949 - -
0.0968 320 1.3558 - -
0.0998 330 1.0515 - -
0.1028 340 1.1467 - -
0.1059 350 1.0924 - -
0.1089 360 1.2178 - -
0.1119 370 1.2049 - -
0.1149 380 1.1147 - -
0.1180 390 0.9555 - -
0.1210 400 1.2939 - -
0.1240 410 1.1083 - -
0.1270 420 1.0736 - -
0.1301 430 1.3217 - -
0.1331 440 1.138 - -
0.1361 450 1.1887 - -
0.1391 460 1.0804 - -
0.1422 470 1.0579 - -
0.1452 480 0.9958 - -
0.1482 490 1.0312 - -
0.1512 500 1.0218 - -
0.1543 510 1.0107 - -
0.1573 520 1.0576 - -
0.1603 530 1.0445 - -
0.1633 540 0.9355 - -
0.1664 550 0.946 - -
0.1694 560 0.9423 - -
0.1724 570 1.0467 - -
0.1754 580 0.9928 - -
0.1785 590 0.9504 - -
0.1815 600 0.9529 - -
0.1845 610 1.0225 - -
0.1875 620 0.9391 - -
0.1906 630 0.9984 - -
0.1936 640 0.965 - -
0.1966 650 0.93 - -
0.1996 660 0.9429 - -
0.2027 670 0.9713 - -
0.2057 680 1.0207 - -
0.2087 690 0.8401 - -
0.2117 700 0.9345 - -
0.2148 710 0.9107 - -
0.2178 720 0.9833 - -
0.2208 730 0.927 - -
0.2238 740 0.9652 - -
0.2269 750 0.9116 - -
0.2299 760 0.8711 - -
0.2329 770 0.9002 - -
0.2359 780 0.9852 - -
0.2390 790 1.0248 - -
0.2420 800 0.779 - -
0.2450 810 0.8977 - -
0.2480 820 0.9285 - -
0.2511 830 0.9813 - -
0.2541 840 0.9001 - -
0.2571 850 0.9476 - -
0.2601 860 0.934 - -
0.2632 870 0.8883 - -
0.2662 880 1.0017 - -
0.2692 890 0.9306 - -
0.2722 900 0.904 - -
0.2753 910 0.8299 - -
0.2783 920 0.8661 - -
0.2813 930 0.9472 - -
0.2843 940 0.8801 - -
0.2874 950 1.0444 - -
0.2904 960 0.9591 - -
0.2934 970 0.864 - -
0.2964 980 0.9015 - -
0.2995 990 0.849 - -
0.3025 1000 0.8284 - -
0.3055 1010 0.9114 - -
0.3085 1020 0.9288 - -
0.3116 1030 0.8625 - -
0.3146 1040 0.9262 - -
0.3176 1050 0.8887 - -
0.3206 1060 0.9522 - -
0.3237 1070 0.8593 - -
0.3267 1080 0.8576 - -
0.3297 1090 0.8723 - -
0.3327 1100 0.9361 - -
0.3358 1110 0.8741 - -
0.3388 1120 0.8795 - -
0.3418 1130 0.915 - -
0.3448 1140 0.9236 - -
0.3479 1150 0.8555 - -
0.3509 1160 0.9535 - -
0.3539 1170 0.8852 - -
0.3569 1180 0.9201 - -
0.3600 1190 0.8791 - -
0.3630 1200 0.8594 - -
0.3660 1210 0.8514 - -
0.3690 1220 0.9417 - -
0.3721 1230 0.8887 - -
0.3751 1240 0.9052 - -
0.3781 1250 0.8686 - -
0.3811 1260 0.8952 - -
0.3842 1270 0.8843 - -
0.3872 1280 0.8415 - -
0.3902 1290 0.8904 - -
0.3932 1300 0.9342 - -
0.3962 1310 0.9093 - -
0.3993 1320 0.8211 - -
0.4023 1330 0.9117 - -
0.4053 1340 0.832 - -
0.4083 1350 0.8222 - -
0.4114 1360 0.8366 - -
0.4144 1370 0.871 - -
0.4174 1380 0.8787 - -
0.4204 1390 0.8797 - -
0.4235 1400 0.8222 - -
0.4265 1410 0.8187 - -
0.4295 1420 0.9012 - -
0.4325 1430 0.8047 - -
0.4356 1440 0.8916 - -
0.4386 1450 0.9724 - -
0.4416 1460 0.8306 - -
0.4446 1470 0.8336 - -
0.4477 1480 0.8542 - -
0.4507 1490 0.9075 - -
0.4537 1500 0.7568 - -
0.4567 1510 0.9213 - -
0.4598 1520 0.9079 - -
0.4628 1530 0.8843 - -
0.4658 1540 0.8893 - -
0.4688 1550 0.8085 - -
0.4719 1560 0.9153 - -
0.4749 1570 0.851 - -
0.4779 1580 0.8272 - -
0.4809 1590 0.8105 - -
0.4840 1600 0.8512 - -
0.4870 1610 0.8795 - -
0.4900 1620 0.7917 - -
0.4930 1630 0.8111 - -
0.4961 1640 0.8039 - -
0.4991 1650 0.8209 - -
0.5021 1660 0.9045 - -
0.5051 1670 0.8906 - -
0.5082 1680 0.8735 - -
0.5112 1690 0.8643 - -
0.5142 1700 0.9011 - -
0.5172 1710 0.9391 - -
0.5203 1720 0.8082 - -
0.5233 1730 0.8096 - -
0.5263 1740 0.883 - -
0.5293 1750 0.8514 - -
0.5324 1760 0.8291 - -
0.5354 1770 0.7463 - -
0.5384 1780 0.8582 - -
0.5414 1790 0.9219 - -
0.5445 1800 0.7607 - -
0.5475 1810 0.8536 - -
0.5505 1820 0.7858 - -
0.5535 1830 0.8204 - -
0.5566 1840 0.8731 - -
0.5596 1850 0.8658 - -
0.5626 1860 0.8901 - -
0.5656 1870 0.8024 - -
0.5687 1880 0.8523 - -
0.5717 1890 0.9049 - -
0.5747 1900 0.8477 - -
0.5777 1910 0.7412 - -
0.5808 1920 0.8318 - -
0.5838 1930 0.7609 - -
0.5868 1940 0.7897 - -
0.5898 1950 0.7879 - -
0.5929 1960 0.8383 - -
0.5959 1970 0.8622 - -
0.5989 1980 0.8009 - -
0.6019 1990 0.8361 - -
0.6050 2000 0.8168 - -
0.6080 2010 0.8514 - -
0.6110 2020 0.7768 - -
0.6140 2030 0.8155 - -
0.6171 2040 0.761 - -
0.6201 2050 0.8684 - -
0.6231 2060 0.7832 - -
0.6261 2070 0.8675 - -
0.6292 2080 0.8899 - -
0.6322 2090 0.8539 - -
0.6352 2100 0.8412 - -
0.6382 2110 0.8548 - -
0.6413 2120 0.8051 - -
0.6443 2130 0.8137 - -
0.6473 2140 0.8693 - -
0.6503 2150 0.8512 - -
0.6534 2160 0.7665 - -
0.6564 2170 0.7902 - -
0.6594 2180 0.8232 - -
0.6624 2190 0.8493 - -
0.6655 2200 0.8412 - -
0.6685 2210 0.8504 - -
0.6715 2220 0.8341 - -
0.6745 2230 0.797 - -
0.6776 2240 0.7935 - -
0.6806 2250 0.8604 - -
0.6836 2260 0.8726 - -
0.6866 2270 0.8141 - -
0.6897 2280 0.8169 - -
0.6927 2290 0.8585 - -
0.6957 2300 0.8637 - -
0.6987 2310 0.8091 - -
0.7018 2320 0.8252 - -
0.7048 2330 0.8887 - -
0.7078 2340 0.7881 - -
0.7108 2350 0.9142 - -
0.7139 2360 0.8151 - -
0.7169 2370 0.8422 - -
0.7199 2380 0.8303 - -
0.7229 2390 0.8372 - -
0.7260 2400 0.8334 - -
0.7290 2410 0.7451 - -
0.7320 2420 0.8585 - -
0.7350 2430 0.8679 - -
0.7381 2440 0.8344 - -
0.7411 2450 0.8634 - -
0.7441 2460 0.7852 - -
0.7471 2470 0.8399 - -
0.7502 2480 0.8177 - -
0.7532 2490 0.9176 - -
0.7562 2500 0.7577 - -
0.7592 2510 0.6894 - -
0.7623 2520 0.8084 - -
0.7653 2530 0.898 - -
0.7683 2540 0.8209 - -
0.7713 2550 0.8621 - -
0.7743 2560 0.7687 - -
0.7774 2570 0.8408 - -
0.7804 2580 0.8467 - -
0.7834 2590 0.8426 - -
0.7864 2600 0.8499 - -
0.7895 2610 0.8973 - -
0.7925 2620 0.8532 - -
0.7955 2630 0.8833 - -
0.7985 2640 0.8137 - -
0.8016 2650 0.8812 - -
0.8046 2660 0.8146 - -
0.8076 2670 0.8285 - -
0.8106 2680 0.8989 - -
0.8137 2690 0.8399 - -
0.8167 2700 0.7851 - -
0.8197 2710 0.7952 - -
0.8227 2720 0.7762 - -
0.8258 2730 0.8184 - -
0.8288 2740 0.8423 - -
0.8318 2750 0.8314 - -
0.8348 2760 0.8078 - -
0.8379 2770 0.837 - -
0.8409 2780 0.7494 - -
0.8439 2790 0.8687 - -
0.8469 2800 0.8844 - -
0.8500 2810 0.772 - -
0.8530 2820 0.8961 - -
0.8560 2830 0.8599 - -
0.8590 2840 0.7936 - -
0.8621 2850 0.8054 - -
0.8651 2860 0.7812 - -
0.8681 2870 0.8175 - -
0.8711 2880 0.8121 - -
0.8742 2890 0.8192 - -
0.8772 2900 0.8704 - -
0.8802 2910 0.8535 - -
0.8832 2920 0.8187 - -
0.8863 2930 0.8356 - -
0.8893 2940 0.835 - -
0.8923 2950 0.8279 - -
0.8953 2960 0.8496 - -
0.8984 2970 0.7985 - -
0.9014 2980 0.8032 - -
0.9044 2990 0.8687 - -
0.9074 3000 0.7948 - -
0.9105 3010 0.863 - -
0.9135 3020 0.8589 - -
0.9165 3030 0.7393 - -
0.9195 3040 0.7791 - -
0.9226 3050 0.8215 - -
0.9256 3060 0.8034 - -
0.9286 3070 0.8889 - -
0.9316 3080 0.7151 - -
0.9347 3090 0.8857 - -
0.9377 3100 0.8059 - -
0.9407 3110 0.8435 - -
0.9437 3120 0.7731 - -
0.9468 3130 0.8757 - -
0.9498 3140 0.8846 - -
0.9528 3150 0.8533 - -
0.9558 3160 0.8337 - -
0.9589 3170 0.8618 - -
0.9619 3180 0.7963 - -
0.9649 3190 0.7544 - -
0.9679 3200 0.787 - -
0.9710 3210 0.7714 - -
0.9740 3220 0.8506 - -
0.9770 3230 0.8075 - -
0.9800 3240 0.9149 - -
0.9831 3250 0.7584 - -
0.9861 3260 0.8148 - -
0.9891 3270 0.7667 - -
0.9921 3280 0.7781 - -
0.9952 3290 0.705 - -
0.9982 3300 0.7874 - -
1.0 3306 - 0.8084 0.4390

Framework Versions

  • Python: 3.12.12
  • Sentence Transformers: 5.1.2
  • Transformers: 4.57.3
  • PyTorch: 2.9.0+cu126
  • Accelerate: 1.12.0
  • Datasets: 4.0.0
  • Tokenizers: 0.22.1

Citation

BibTeX

Sentence Transformers

@inproceedings{reimers-2019-sentence-bert,
    title = "Sentence-BERT: Sentence Embeddings using Siamese BERT-Networks",
    author = "Reimers, Nils and Gurevych, Iryna",
    booktitle = "Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing",
    month = "11",
    year = "2019",
    publisher = "Association for Computational Linguistics",
    url = "https://arxiv.org/abs/1908.10084",
}
Model size: 0.3B parameters (F32, safetensors)

