	---
	tags:
	- sentence-transformers
	- sentence-similarity
	- feature-extraction
	- generated_from_trainer
	- dataset_size:967831
	- loss:MultipleNegativesRankingLoss
	base_model: sentence-transformers/paraphrase-multilingual-MiniLM-L12-v2
	widget:
	- source_sentence: Gaji pekerja berdasarkan jenis pekerjaan dan umur, 2016
	  sentences:
	  - Rata-rata Upah/Gaji Bersih Sebulan Buruh/Karyawan/Pegawai Menurut Kelompok Umur
	    dan Jenis Pekerjaan (Rupiah), 2016
	  - '[Seri 2010] PDRB Triwulanan Atas Dasar Harga Berlaku Menurut Lapangan Usaha di
	    Provinsi Seluruh Indonesia (Miliar Rupiah), 2010-2024'
	  - Rata-rata Pendapatan Bersih Pekerja Bebas Menurut Provinsi dan Pendidikan Tertinggi
	    yang Ditamatkan, 2019
	- source_sentence: Ke negara mana saja ekspor tanaman obat Indonesia tahun 2018?
	  sentences:
	  - Jumlah Rumah Tangga Perikanan Tangkap Menurut Provinsi dan Jenis Penangkapan,
	    2000-2016
	  - Perolehan Suara dan Kursi Dewan Perwakilan Rakyat (DPR) Menurut Partai Politik
	    Hasil Pemilu Tahun 2009 dan 2014
	  - Ekspor Tanaman Obat, Aromatik, dan Rempah-Rempah menurut Negara Tujuan Utama,
	    2012-2023
	- source_sentence: Negara asal impor soybean 2023
	  sentences:
	  - Ringkasan Neraca Arus Dana, Triwulan III, 2010, (Miliar Rupiah)
	  - Rata-rata Pendapatan Bersih Berusaha Sendiri Menurut Provinsi dan Kelompok Umur
	    (ribu rupiah), 2018
	  - Impor Kedelai menurut Negara Asal Utama, 2017-2023
	- source_sentence: Cek penghasilan bersih rata-rata yang didapat wiraswasta di Indonesia
	    tahun 2021, bedakan per provinsi dan ijazah terakhir
	  sentences:
	  - Rata-rata Pendapatan bersih Berusaha Sendiri menurut Provinsi dan Pendidikan yang
	    Ditamatkan, 2021
	  - Rata-rata Konsumsi dan Pengeluaran Perkapita Seminggu Menurut Komoditi Makanan
	    dan Golongan Pengeluaran per Kapita Seminggu di Provinsi Sumatera Selatan, 2018-2023
	  - Impor Daging Sejenis Lembu menurut Negara Asal Utama, 2018-2023
	- source_sentence: Status pernikahan penduduk (10+) tiap provinsi, data 2012
	  sentences:
	  - Ringkasan Neraca Arus Dana, Triwulan I, 2013*), (Miliar Rupiah)
	  - Ekspor Batu Bara Menurut Negara Tujuan Utama, 2012-2023
	  - Persentase Penduduk Berumur 10 Tahun ke Atas menurut Provinsi, Jenis Kelamin,
	    dan Status Perkawinan, 2009-2018
	datasets:
	- yahyaabd/statictable-triplets-all
	pipeline_tag: sentence-similarity
	library_name: sentence-transformers
	metrics:
	- cosine_accuracy@1
	- cosine_accuracy@3
	- cosine_accuracy@5
	- cosine_accuracy@10
	- cosine_precision@1
	- cosine_precision@3
	- cosine_precision@5
	- cosine_precision@10
	- cosine_recall@1
	- cosine_recall@3
	- cosine_recall@5
	- cosine_recall@10
	- cosine_ndcg@10
	- cosine_mrr@10
	- cosine_map@100
	model-index:
	- name: SentenceTransformer based on sentence-transformers/paraphrase-multilingual-MiniLM-L12-v2
	  results:
	  - task:
	      type: information-retrieval
	      name: Information Retrieval
	    dataset:
	      name: bps statictable ir
	      type: bps-statictable-ir
	    metrics:
	    - type: cosine_accuracy@1
	      value: 0.8990228013029316
	      name: Cosine Accuracy@1
	    - type: cosine_accuracy@3
	      value: 0.9739413680781759
	      name: Cosine Accuracy@3
	    - type: cosine_accuracy@5
	      value: 0.9804560260586319
	      name: Cosine Accuracy@5
	    - type: cosine_accuracy@10
	      value: 0.9869706840390879
	      name: Cosine Accuracy@10
	    - type: cosine_precision@1
	      value: 0.8990228013029316
	      name: Cosine Precision@1
	    - type: cosine_precision@3
	      value: 0.3517915309446254
	      name: Cosine Precision@3
	    - type: cosine_precision@5
	      value: 0.2299674267100977
	      name: Cosine Precision@5
	    - type: cosine_precision@10
	      value: 0.13420195439739416
	      name: Cosine Precision@10
	    - type: cosine_recall@1
	      value: 0.7037534704802675
	      name: Cosine Recall@1
	    - type: cosine_recall@3
	      value: 0.777408879373005
	      name: Cosine Recall@3
	    - type: cosine_recall@5
	      value: 0.7896378239472596
	      name: Cosine Recall@5
	    - type: cosine_recall@10
	      value: 0.8147874661605627
	      name: Cosine Recall@10
	    - type: cosine_ndcg@10
	      value: 0.8242104501990923
	      name: Cosine Ndcg@10
	    - type: cosine_mrr@10
	      value: 0.9361834961997827
	      name: Cosine Mrr@10
	    - type: cosine_map@100
	      value: 0.7641191235697605
	      name: Cosine Map@100
	---

	# SentenceTransformer based on sentence-transformers/paraphrase-multilingual-MiniLM-L12-v2

	This is a [sentence-transformers](https://www.SBERT.net) model finetuned from [sentence-transformers/paraphrase-multilingual-MiniLM-L12-v2](https://huggingface.co/sentence-transformers/paraphrase-multilingual-MiniLM-L12-v2) on the [statictable-triplets-all](https://huggingface.co/datasets/yahyaabd/statictable-triplets-all) dataset. It maps sentences & paragraphs to a 384-dimensional dense vector space and can be used for semantic textual similarity, semantic search, paraphrase mining, text classification, clustering, and more.

	## Model Details

	### Model Description
	- Model Type: Sentence Transformer
	- Base model: [sentence-transformers/paraphrase-multilingual-MiniLM-L12-v2](https://huggingface.co/sentence-transformers/paraphrase-multilingual-MiniLM-L12-v2) <!-- at revision 86741b4e3f5cb7765a600d3a3d55a0f6a6cb443d -->
	- Maximum Sequence Length: 128 tokens
	- Output Dimensionality: 384 dimensions
	- Similarity Function: Cosine Similarity
	- Training Dataset:
	    - [statictable-triplets-all](https://huggingface.co/datasets/yahyaabd/statictable-triplets-all)
	<!-- - Language: Unknown -->
	<!-- - License: Unknown -->

	### Model Sources

	- Documentation: [Sentence Transformers Documentation](https://sbert.net)
	- Repository: [Sentence Transformers on GitHub](https://github.com/UKPLab/sentence-transformers)
	- Hugging Face: [Sentence Transformers on Hugging Face](https://huggingface.co/models?library=sentence-transformers)

	### Full Model Architecture

	```
	SentenceTransformer(
	  (0): Transformer({'max_seq_length': 128, 'do_lower_case': False}) with Transformer model: BertModel
	  (1): Pooling({'word_embedding_dimension': 384, 'pooling_mode_cls_token': False, 'pooling_mode_mean_tokens': True, 'pooling_mode_max_tokens': False, 'pooling_mode_mean_sqrt_len_tokens': False, 'pooling_mode_weightedmean_tokens': False, 'pooling_mode_lasttoken': False, 'include_prompt': True})
	)
	```
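
The `Pooling` module listed in the architecture has only `pooling_mode_mean_tokens` enabled, which means a sentence vector is the average of its token vectors with padding positions excluded. The following is a minimal NumPy sketch of that idea, not the library's implementation; the function name `mean_pool` and the toy arrays are hypothetical, and the toy dimension is 4 rather than the model's 384.

```python
import numpy as np


def mean_pool(token_embeddings: np.ndarray, attention_mask: np.ndarray) -> np.ndarray:
    """Average token vectors per sentence, ignoring padding positions.

    token_embeddings: (batch, seq_len, dim) outputs of the transformer
    attention_mask:   (batch, seq_len), 1 for real tokens and 0 for padding
    """
    mask = attention_mask[..., None].astype(token_embeddings.dtype)
    summed = (token_embeddings * mask).sum(axis=1)
    # Guard against division by zero for rows that contain only padding.
    counts = np.clip(mask.sum(axis=1), 1e-9, None)
    return summed / counts


# Toy batch: 2 sentences, 3 token slots, 4 dimensions.
tokens = np.arange(24, dtype=np.float32).reshape(2, 3, 4)
mask = np.array([[1, 1, 0], [1, 1, 1]])  # first sentence has one padding slot
pooled = mean_pool(tokens, mask)
print(pooled.shape)
```

Because the mask removes padded slots from both the sum and the count, sentences of different lengths in the same batch produce comparable vectors.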

	## Usage

	### Direct Usage (Sentence Transformers)

	First install the Sentence Transformers library:

	```bash
	pip install -U sentence-transformers
	```

	Then you can load this model and run inference.
	```python
	from sentence_transformers import SentenceTransformer

	# Download from the 🤗 Hub
	model = SentenceTransformer("yahyaabd/allstats-search-mini-v1-2")
	# Run inference
	sentences = [
	    # Query: marital status of the population (10+) by province, 2012 data
	    'Status pernikahan penduduk (10+) tiap provinsi, data 2012',
	    # Table title: percentage of population aged 10 and over by province, sex, and marital status
	    'Persentase Penduduk Berumur 10 Tahun ke Atas menurut Provinsi, Jenis Kelamin, dan Status Perkawinan, 2009-2018',
	    # Table title: coal exports by main destination country
	    'Ekspor Batu Bara Menurut Negara Tujuan Utama, 2012-2023',
	]
	embeddings = model.encode(sentences)
	print(embeddings.shape)
	# (3, 384)

	# Get the similarity scores for the embeddings
	similarities = model.similarity(embeddings, embeddings)
	print(similarities.shape)
	# torch.Size([3, 3])
	```
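
For retrieval, the usual pattern is to embed one query and many table titles, then sort the titles by cosine similarity to the query. The sketch below illustrates only the ranking step with NumPy; the helper name `rank_by_cosine` is hypothetical, and the small hand-written vectors stand in for what `model.encode(...)` would return.

```python
import numpy as np


def rank_by_cosine(query_vec, doc_vecs, top_k=3):
    """Return (index, score) pairs for the top_k documents by cosine similarity."""
    q = query_vec / np.linalg.norm(query_vec)
    d = doc_vecs / np.linalg.norm(doc_vecs, axis=1, keepdims=True)
    scores = d @ q  # cosine similarity of each document to the query
    order = np.argsort(-scores)[:top_k]
    return [(int(i), float(scores[i])) for i in order]


# Stand-in vectors; in practice these would be 384-dimensional embeddings.
query = np.array([1.0, 0.0, 1.0])
docs = np.array([
    [0.9, 0.1, 1.1],    # points in nearly the same direction as the query
    [0.0, 1.0, 0.0],    # orthogonal to the query
    [-1.0, 0.0, -1.0],  # opposite direction
])
ranking = rank_by_cosine(query, docs)
print(ranking[0][0])  # index of the best-matching document
```

Normalizing both sides first means the dot product equals the cosine similarity, which matches the similarity function listed for this model.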

	## Evaluation

	### Metrics

	#### Information Retrieval

	* Dataset: `bps-statictable-ir`
	* Evaluated with [<code>InformationRetrievalEvaluator</code>](https://sbert.net/docs/package_reference/sentence_transformer/evaluation.html#sentence_transformers.evaluation.InformationRetrievalEvaluator)

	| Metric              | Value  |
	|:--------------------|:-------|
	| cosine_accuracy@1   | 0.899  |
	| cosine_accuracy@3   | 0.9739 |
	| cosine_accuracy@5   | 0.9805 |
	| cosine_accuracy@10  | 0.987  |
	| cosine_precision@1  | 0.899  |
	| cosine_precision@3  | 0.3518 |
	| cosine_precision@5  | 0.23   |
	| cosine_precision@10 | 0.1342 |
	| cosine_recall@1     | 0.7038 |
	| cosine_recall@3     | 0.7774 |
	| cosine_recall@5     | 0.7896 |
	| cosine_recall@10    | 0.8148 |
	| cosine_ndcg@10      | 0.8242 |
	| cosine_mrr@10       | 0.9362 |
	| cosine_map@100      | 0.7641 |
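
The gap between accuracy@k and precision@k in the table follows from how the metrics are defined per query: accuracy@k only asks whether any relevant table appears in the top k, while precision@k divides the number of relevant hits by k and recall@k divides it by the number of relevant tables. The sketch below is a minimal pure-Python illustration of these common definitions, not the evaluator's source; the function name `ir_metrics_at_k` and the table IDs are hypothetical.

```python
def ir_metrics_at_k(ranked_ids, relevant_ids, k):
    """Compute accuracy@k, precision@k, recall@k and MRR@k for one query."""
    top_k = ranked_ids[:k]
    hits = [doc_id for doc_id in top_k if doc_id in relevant_ids]
    accuracy = 1.0 if hits else 0.0           # at least one relevant hit
    precision = len(hits) / k                 # share of the top k that is relevant
    recall = len(hits) / len(relevant_ids)    # share of relevant docs retrieved
    # Reciprocal rank of the first relevant document within the top k.
    mrr = 0.0
    for rank, doc_id in enumerate(top_k, start=1):
        if doc_id in relevant_ids:
            mrr = 1.0 / rank
            break
    return {"accuracy": accuracy, "precision": precision, "recall": recall, "mrr": mrr}


# Hypothetical query with two relevant tables, one of which is ranked second.
ranked = ["t7", "t2", "t9", "t4", "t1"]
relevant = {"t2", "t5"}
m = ir_metrics_at_k(ranked, relevant, k=3)
print(m)
```

Reported scores are averages of these per-query values. A query with a single relevant table can reach a precision@3 of at most 1/3, so low precision at larger k does not by itself indicate poor ranking.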

	## Training Details

	### Training Dataset

	#### statictable-triplets-all

	* Dataset: [statictable-triplets-all](https://huggingface.co/datasets/yahyaabd/statictable-triplets-all) at [24979b4](https://huggingface.co/datasets/yahyaabd/statictable-triplets-all/tree/24979b4f0d8269377aca975e20d52e69c3b5a030)
	* Size: 967,831 training samples
	* Columns: <code>query</code>, <code>pos</code>, and <code>neg</code>
	* Approximate statistics based on the first 1000 samples:
	  |         | query | pos | neg |
	  |:--------|:------|:----|:----|
	  | type    | string | string | string |
	  | details | <ul><li>min: 5 tokens</li><li>mean: 18.35 tokens</li><li>max: 37 tokens</li></ul> | <ul><li>min: 4 tokens</li><li>mean: 25.22 tokens</li><li>max: 58 tokens</li></ul> | <ul><li>min: 4 tokens</li><li>mean: 25.78 tokens</li><li>max: 58 tokens</li></ul> |
	* Samples:
	  | query | pos | neg |
	  |:------|:----|:----|
	  | <code>Jumlah bank dan kantor bank di Indonesia, 2010-2017</code> | <code>Bank dan Kantor Bank, 2010-2017</code> | <code>Rata-Rata Pengeluaran per Kapita Sebulan Menurut Kelompok Barang (rupiah), 1998-2012</code> |
	  | <code>Konsumsi makanan mingguan per orang di Sulteng: beda tingkat pengeluaran (2021)</code> | <code>Rata-rata Konsumsi dan Pengeluaran Perkapita Seminggu Menurut Komoditi Makanan dan Golongan Pengeluaran per Kapita Seminggu di Provinsi Sulawesi Selatan, 2018-2023</code> | <code>IHK, Upah Nominal, Indeks Upah Nominal dan Riil Buruh Industri Berstatus di bawah Mandor Menurut Wilayah, 2008-2014 (2007=100)</code> |
	  | <code>Impor semen Indonesia, negara asal utama, 2021</code> | <code>Impor Semen Menurut Negara Asal Utama, 2017-2023</code> | <code>Penerimaan dari Wisatawan Mancanegara Menurut Negara Tempat Tinggal (juta US$), 2000-2014</code> |
	* Loss: [<code>MultipleNegativesRankingLoss</code>](https://sbert.net/docs/package_reference/sentence_transformer/losses.html#multiplenegativesrankingloss) with these parameters:
	```json
	{
	"scale": 20.0,
	"similarity_fct": "cos_sim"
	}
	```
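
With these parameters, the loss treats each query's own positive as the correct class in a softmax over every candidate in the batch, using cosine similarities multiplied by `scale` as logits. The NumPy sketch below illustrates that computation under the assumption that explicit `neg` texts are simply appended to the candidate pool; `mnrl_loss` and the toy vectors are hypothetical and this is not the library's code.

```python
import numpy as np


def mnrl_loss(anchors, positives, negatives=None, scale=20.0):
    """Cross-entropy over scaled cosine similarities with in-batch negatives."""
    candidates = positives if negatives is None else np.concatenate([positives, negatives])
    a = anchors / np.linalg.norm(anchors, axis=1, keepdims=True)
    c = candidates / np.linalg.norm(candidates, axis=1, keepdims=True)
    logits = scale * (a @ c.T)                       # (batch, n_candidates)
    logits = logits - logits.max(axis=1, keepdims=True)  # numerical stability
    log_probs = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))
    # The correct candidate for anchor i is positive i, stored at column i.
    idx = np.arange(len(anchors))
    return float(-log_probs[idx, idx].mean())


queries = np.eye(3)
aligned = mnrl_loss(queries, np.eye(3))                        # positives match their queries
shuffled = mnrl_loss(queries, np.roll(np.eye(3), 1, axis=0))   # positives are mismatched
with_hard_negs = mnrl_loss(queries, np.eye(3), negatives=np.eye(3))
print(aligned, shuffled, with_hard_negs)
```

Since every other row in the batch acts as a negative, a duplicated text within a batch would be penalized as if it were wrong, which is presumably why training used the `no_duplicates` batch sampler.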

	### Evaluation Dataset

	#### statictable-triplets-all

	* Dataset: [statictable-triplets-all](https://huggingface.co/datasets/yahyaabd/statictable-triplets-all) at [24979b4](https://huggingface.co/datasets/yahyaabd/statictable-triplets-all/tree/24979b4f0d8269377aca975e20d52e69c3b5a030)
	* Size: 967,831 evaluation samples
	* Columns: <code>query</code>, <code>pos</code>, and <code>neg</code>
	* Approximate statistics based on the first 1000 samples:
	  |         | query | pos | neg |
	  |:--------|:------|:----|:----|
	  | type    | string | string | string |
	  | details | <ul><li>min: 5 tokens</li><li>mean: 18.39 tokens</li><li>max: 37 tokens</li></ul> | <ul><li>min: 4 tokens</li><li>mean: 25.22 tokens</li><li>max: 50 tokens</li></ul> | <ul><li>min: 4 tokens</li><li>mean: 25.33 tokens</li><li>max: 58 tokens</li></ul> |
	* Samples:
	  | query | pos | neg |
	  |:------|:----|:----|
	  | <code>Bagaimana hubungan antara bidang pekerjaan utama dan pendidikan pekerja 15+ di minggu lalu (tahun 2016)?</code> | <code>Penduduk Berumur 15 Tahun Ke Atas yang Bekerja Selama Seminggu yang Lalu Menurut Lapangan Pekerjaan Utama dan Pendidikan Tertinggi yang Ditamatkan, 2008 - 2024</code> | <code>Bank dan Kantor Bank, 2010-2017</code> |
	  | <code>Tren indikator kondisi perumahan, 2001</code> | <code>Indikator Perumahan 1993-2023</code> | <code>Banyaknya Desa/Kelurahan Menurut Keberadaan Kelompok Pertokoan, Pasar, dan Kios Sarana Produksi Pertanian (Saprotan), 2014 & 2018</code> |
	  | <code>Gaji bersih rata-rata: Per pendidikan & lapangan kerja utama, Indonesia, 2021</code> | <code>Rata-rata Upah/Gaji Bersih sebulan Buruh/Karyawan Pegawai Menurut Pendidikan Tertinggi dan Lapangan Pekerjaan Utama, 2021</code> | <code>[Seri 2000] Laju Pertumbuhan Kumulatif PDB Menurut Lapangan Usaha (Persen), 2001-2014</code> |
	* Loss: [<code>MultipleNegativesRankingLoss</code>](https://sbert.net/docs/package_reference/sentence_transformer/losses.html#multiplenegativesrankingloss) with these parameters:
	```json
	{
	"scale": 20.0,
	"similarity_fct": "cos_sim"
	}
	```

	### Training Hyperparameters
	#### Non-Default Hyperparameters

	- `eval_strategy`: steps
	- `per_device_train_batch_size`: 16
	- `per_device_eval_batch_size`: 16
	- `num_train_epochs`: 1
	- `warmup_ratio`: 0.1
	- `fp16`: True
	- `load_best_model_at_end`: True
	- `eval_on_start`: True
	- `batch_sampler`: no_duplicates

	#### All Hyperparameters
	<details><summary>Click to expand</summary>

	- `overwrite_output_dir`: False
	- `do_predict`: False
	- `eval_strategy`: steps
	- `prediction_loss_only`: True
	- `per_device_train_batch_size`: 16
	- `per_device_eval_batch_size`: 16
	- `per_gpu_train_batch_size`: None
	- `per_gpu_eval_batch_size`: None
	- `gradient_accumulation_steps`: 1
	- `eval_accumulation_steps`: None
	- `torch_empty_cache_steps`: None
	- `learning_rate`: 5e-05
	- `weight_decay`: 0.0
	- `adam_beta1`: 0.9
	- `adam_beta2`: 0.999
	- `adam_epsilon`: 1e-08
	- `max_grad_norm`: 1.0
	- `num_train_epochs`: 1
	- `max_steps`: -1
	- `lr_scheduler_type`: linear
	- `lr_scheduler_kwargs`: {}
	- `warmup_ratio`: 0.1
	- `warmup_steps`: 0
	- `log_level`: passive
	- `log_level_replica`: warning
	- `log_on_each_node`: True
	- `logging_nan_inf_filter`: True
	- `save_safetensors`: True
	- `save_on_each_node`: False
	- `save_only_model`: False
	- `restore_callback_states_from_checkpoint`: False
	- `no_cuda`: False
	- `use_cpu`: False
	- `use_mps_device`: False
	- `seed`: 42
	- `data_seed`: None
	- `jit_mode_eval`: False
	- `use_ipex`: False
	- `bf16`: False
	- `fp16`: True
	- `fp16_opt_level`: O1
	- `half_precision_backend`: auto
	- `bf16_full_eval`: False
	- `fp16_full_eval`: False
	- `tf32`: None
	- `local_rank`: 0
	- `ddp_backend`: None
	- `tpu_num_cores`: None
	- `tpu_metrics_debug`: False
	- `debug`: []
	- `dataloader_drop_last`: False
	- `dataloader_num_workers`: 0
	- `dataloader_prefetch_factor`: None
	- `past_index`: -1
	- `disable_tqdm`: False
	- `remove_unused_columns`: True
	- `label_names`: None
	- `load_best_model_at_end`: True
	- `ignore_data_skip`: False
	- `fsdp`: []
	- `fsdp_min_num_params`: 0
	- `fsdp_config`: {'min_num_params': 0, 'xla': False, 'xla_fsdp_v2': False, 'xla_fsdp_grad_ckpt': False}
	- `fsdp_transformer_layer_cls_to_wrap`: None
	- `accelerator_config`: {'split_batches': False, 'dispatch_batches': None, 'even_batches': True, 'use_seedable_sampler': True, 'non_blocking': False, 'gradient_accumulation_kwargs': None}
	- `deepspeed`: None
	- `label_smoothing_factor`: 0.0
	- `optim`: adamw_torch
	- `optim_args`: None
	- `adafactor`: False
	- `group_by_length`: False
	- `length_column_name`: length
	- `ddp_find_unused_parameters`: None
	- `ddp_bucket_cap_mb`: None
	- `ddp_broadcast_buffers`: False
	- `dataloader_pin_memory`: True
	- `dataloader_persistent_workers`: False
	- `skip_memory_metrics`: True
	- `use_legacy_prediction_loop`: False
	- `push_to_hub`: False
	- `resume_from_checkpoint`: None
	- `hub_model_id`: None
	- `hub_strategy`: every_save
	- `hub_private_repo`: None
	- `hub_always_push`: False
	- `gradient_checkpointing`: False
	- `gradient_checkpointing_kwargs`: None
	- `include_inputs_for_metrics`: False
	- `include_for_metrics`: []
	- `eval_do_concat_batches`: True
	- `fp16_backend`: auto
	- `push_to_hub_model_id`: None
	- `push_to_hub_organization`: None
	- `mp_parameters`:
	- `auto_find_batch_size`: False
	- `full_determinism`: False
	- `torchdynamo`: None
	- `ray_scope`: last
	- `ddp_timeout`: 1800
	- `torch_compile`: False
	- `torch_compile_backend`: None
	- `torch_compile_mode`: None
	- `dispatch_batches`: None
	- `split_batches`: None
	- `include_tokens_per_second`: False
	- `include_num_input_tokens_seen`: False
	- `neftune_noise_alpha`: None
	- `optim_target_modules`: None
	- `batch_eval_metrics`: False
	- `eval_on_start`: True
	- `use_liger_kernel`: False
	- `eval_use_gather_object`: False
	- `average_tokens_across_devices`: False
	- `prompts`: None
	- `batch_sampler`: no_duplicates
	- `multi_dataset_batch_sampler`: proportional

	</details>
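
The combination of `learning_rate: 5e-05`, `lr_scheduler_type: linear`, and `warmup_ratio: 0.1` describes a learning rate that ramps up over the first 10% of steps and then decays linearly to zero. The sketch below re-derives that shape in plain Python as an illustration; it is not the trainer's scheduler code, the name `linear_warmup_lr` is hypothetical, and `total_steps = 1000` is an arbitrary value chosen for readability rather than the run's actual step count.

```python
import math


def linear_warmup_lr(step, total_steps, base_lr=5e-5, warmup_ratio=0.1):
    """Learning rate at a given step for linear warmup followed by linear decay."""
    warmup_steps = math.ceil(total_steps * warmup_ratio)
    if step < warmup_steps:
        # Ramp from 0 up to base_lr during warmup.
        return base_lr * step / max(1, warmup_steps)
    # Decay from base_lr down to 0 over the remaining steps.
    return base_lr * max(0.0, (total_steps - step) / max(1, total_steps - warmup_steps))


total_steps = 1000  # hypothetical; not taken from the training logs
schedule = {s: linear_warmup_lr(s, total_steps) for s in (0, 50, 100, 550, 1000)}
for step, lr in schedule.items():
    print(step, lr)
```

The peak is reached exactly at the end of warmup, so the largest updates happen early in the single training epoch.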

	### Training Logs
	<details><summary>Click to expand</summary>

	| Epoch | Step | Training Loss | Validation Loss | bps-statictable-ir_cosine_ndcg@10 |
	|:----------:|:--------:|:-------------:|:---------------:|:---------------------------------:|
	| 0 | 0 | - | 1.1084 | 0.4644 |
	| 0.0070 | 20 | 1.0801 | 0.8303 | 0.5117 |
	| 0.0139 | 40 | 0.6994 | 0.4459 | 0.6310 |
	| 0.0209 | 60 | 0.3674 | 0.2510 | 0.7155 |
	| 0.0278 | 80 | 0.2814 | 0.1829 | 0.7521 |
	| 0.0348 | 100 | 0.1746 | 0.1303 | 0.7751 |
	| 0.0418 | 120 | 0.1867 | 0.1001 | 0.7772 |
	| 0.0487 | 140 | 0.1047 | 0.0819 | 0.7857 |
	| 0.0557 | 160 | 0.1032 | 0.0739 | 0.7960 |
	| 0.0626 | 180 | 0.0783 | 0.0645 | 0.7861 |
	| 0.0696 | 200 | 0.0575 | 0.0567 | 0.7849 |
	| 0.0765 | 220 | 0.0969 | 0.0454 | 0.7945 |
	| 0.0835 | 240 | 0.0769 | 0.0433 | 0.7890 |
	| 0.0905 | 260 | 0.0864 | 0.0507 | 0.7848 |
	| 0.0974 | 280 | 0.0495 | 0.0347 | 0.8052 |
	| 0.1044 | 300 | 0.0429 | 0.0398 | 0.7955 |
	| 0.1113 | 320 | 0.0432 | 0.0343 | 0.7915 |
	| 0.1183 | 340 | 0.0392 | 0.0295 | 0.8177 |
	| 0.1253 | 360 | 0.0211 | 0.0298 | 0.8052 |
	| 0.1322 | 380 | 0.043 | 0.0339 | 0.8052 |
	| 0.1392 | 400 | 0.0453 | 0.0322 | 0.8050 |
	| 0.1461 | 420 | 0.0309 | 0.0286 | 0.8120 |
	| 0.1531 | 440 | 0.0147 | 0.0321 | 0.8181 |
	| 0.1601 | 460 | 0.0491 | 0.0273 | 0.8178 |
	| 0.1670 | 480 | 0.0229 | 0.0232 | 0.8176 |
	| 0.1740 | 500 | 0.0317 | 0.0210 | 0.8198 |
	| 0.1809 | 520 | 0.0193 | 0.0207 | 0.8159 |
	| 0.1879 | 540 | 0.034 | 0.0175 | 0.8191 |
	| 0.1949 | 560 | 0.0292 | 0.0168 | 0.8166 |
	| 0.2018 | 580 | 0.0431 | 0.0184 | 0.8228 |
	| 0.2088 | 600 | 0.0306 | 0.0183 | 0.7963 |
	| 0.2157 | 620 | 0.0134 | 0.0147 | 0.8216 |
	| 0.2227 | 640 | 0.0155 | 0.0161 | 0.8166 |
	| 0.2296 | 660 | 0.0201 | 0.0187 | 0.8170 |
	| 0.2366 | 680 | 0.0301 | 0.0133 | 0.8272 |
	| 0.2436 | 700 | 0.0164 | 0.0119 | 0.8274 |
	| 0.2505 | 720 | 0.0254 | 0.0119 | 0.8223 |
	| 0.2575 | 740 | 0.0129 | 0.0146 | 0.8165 |
	| 0.2644 | 760 | 0.0208 | 0.0136 | 0.8162 |
	| 0.2714 | 780 | 0.0157 | 0.0138 | 0.8120 |
	| 0.2784 | 800 | 0.0169 | 0.0143 | 0.8248 |
	| 0.2853 | 820 | 0.0158 | 0.0119 | 0.8166 |
	| 0.2923 | 840 | 0.0227 | 0.0115 | 0.8153 |
	| 0.2992 | 860 | 0.0196 | 0.0117 | 0.8163 |
	| 0.3062 | 880 | 0.0137 | 0.0112 | 0.8225 |
	| 0.3132 | 900 | 0.0299 | 0.0090 | 0.8155 |
	| 0.3201 | 920 | 0.0073 | 0.0106 | 0.8157 |
	| 0.3271 | 940 | 0.0248 | 0.0088 | 0.8174 |
	| 0.3340 | 960 | 0.0179 | 0.0087 | 0.8215 |
	| 0.3410 | 980 | 0.0171 | 0.0077 | 0.8285 |
	| 0.3479 | 1000 | 0.0123 | 0.0096 | 0.8175 |
	| 0.3549 | 1020 | 0.0081 | 0.0098 | 0.8152 |
	| 0.3619 | 1040 | 0.0097 | 0.0094 | 0.8139 |
	| 0.3688 | 1060 | 0.0379 | 0.0107 | 0.8236 |
	| 0.3758 | 1080 | 0.0104 | 0.0078 | 0.8208 |
	| 0.3827 | 1100 | 0.0067 | 0.0065 | 0.8189 |
	| 0.3897 | 1120 | 0.0128 | 0.0080 | 0.8221 |
	| 0.3967 | 1140 | 0.0049 | 0.0078 | 0.8181 |
	| 0.4036 | 1160 | 0.0084 | 0.0092 | 0.8218 |
	| 0.4106 | 1180 | 0.0173 | 0.0081 | 0.8248 |
	| 0.4175 | 1200 | 0.0144 | 0.0080 | 0.8272 |
	| 0.4245 | 1220 | 0.0025 | 0.0077 | 0.8260 |
	| 0.4315 | 1240 | 0.0086 | 0.0072 | 0.8312 |
	| 0.4384 | 1260 | 0.0114 | 0.0073 | 0.8242 |
	| 0.4454 | 1280 | 0.0065 | 0.0067 | 0.8245 |
	| 0.4523 | 1300 | 0.0132 | 0.0069 | 0.8248 |
	| 0.4593 | 1320 | 0.003 | 0.0066 | 0.8233 |
	| 0.4662 | 1340 | 0.0125 | 0.0066 | 0.8245 |
	| 0.4732 | 1360 | 0.0016 | 0.0070 | 0.8281 |
	| 0.4802 | 1380 | 0.0041 | 0.0066 | 0.8418 |
	| 0.4871 | 1400 | 0.0117 | 0.0073 | 0.8361 |
	| 0.4941 | 1420 | 0.0095 | 0.0073 | 0.8337 |
	| 0.5010 | 1440 | 0.0184 | 0.0071 | 0.8282 |
	| 0.5080 | 1460 | 0.0042 | 0.0069 | 0.8259 |
	| 0.5150 | 1480 | 0.0077 | 0.0065 | 0.8235 |
	| 0.5219 | 1500 | 0.0213 | 0.0059 | 0.8209 |
	| 0.5289 | 1520 | 0.0037 | 0.0059 | 0.8277 |
	| 0.5358 | 1540 | 0.0053 | 0.0053 | 0.8186 |
	| 0.5428 | 1560 | 0.0045 | 0.0071 | 0.8238 |
	| 0.5498 | 1580 | 0.0013 | 0.0101 | 0.8257 |
	| 0.5567 | 1600 | 0.017 | 0.0051 | 0.8292 |
	| 0.5637 | 1620 | 0.0053 | 0.0045 | 0.8234 |
	| 0.5706 | 1640 | 0.0077 | 0.0044 | 0.8235 |
	| 0.5776 | 1660 | 0.0135 | 0.0046 | 0.8200 |
	| 0.5846 | 1680 | 0.0013 | 0.0045 | 0.8242 |
	| 0.5915 | 1700 | 0.0067 | 0.0048 | 0.8266 |
	| 0.5985 | 1720 | 0.0154 | 0.0049 | 0.8232 |
	| 0.6054 | 1740 | 0.0037 | 0.0048 | 0.8222 |
	| 0.6124 | 1760 | 0.0012 | 0.0049 | 0.8232 |
	| 0.6193 | 1780 | 0.0112 | 0.0051 | 0.8212 |
	| 0.6263 | 1800 | 0.0173 | 0.0056 | 0.8228 |
	| 0.6333 | 1820 | 0.0044 | 0.0059 | 0.8177 |
	| 0.6402 | 1840 | 0.0193 | 0.0059 | 0.8197 |
	| 0.6472 | 1860 | 0.0028 | 0.0060 | 0.8203 |
	| 0.6541 | 1880 | 0.005 | 0.0054 | 0.8278 |
	| 0.6611 | 1900 | 0.0077 | 0.0049 | 0.8227 |
	| 0.6681 | 1920 | 0.0126 | 0.0040 | 0.8267 |
	| 0.6750 | 1940 | 0.008 | 0.0039 | 0.8258 |
	| 0.6820 | 1960 | 0.0131 | 0.0039 | 0.8251 |
	| 0.6889 | 1980 | 0.0114 | 0.0042 | 0.8310 |
	| 0.6959 | 2000 | 0.0083 | 0.0041 | 0.8314 |
	| 0.7029 | 2020 | 0.006 | 0.0037 | 0.8303 |
	| 0.7098 | 2040 | 0.0048 | 0.0036 | 0.8269 |
	| 0.7168 | 2060 | 0.0165 | 0.0040 | 0.8262 |
	| 0.7237 | 2080 | 0.0093 | 0.0035 | 0.8158 |
	| 0.7307 | 2100 | 0.007 | 0.0031 | 0.8167 |
	| 0.7376 | 2120 | 0.0065 | 0.0030 | 0.8248 |
	| 0.7446 | 2140 | 0.0042 | 0.0029 | 0.8274 |
	| 0.7516 | 2160 | 0.0111 | 0.0026 | 0.8258 |
	| 0.7585 | 2180 | 0.0066 | 0.0028 | 0.8249 |
	| 0.7655 | 2200 | 0.0034 | 0.0034 | 0.8244 |
	| 0.7724 | 2220 | 0.0013 | 0.0033 | 0.8238 |
	| 0.7794 | 2240 | 0.0025 | 0.0034 | 0.8253 |
	| 0.7864 | 2260 | 0.0065 | 0.0034 | 0.8240 |
	| 0.7933 | 2280 | 0.0049 | 0.0035 | 0.8258 |
	| 0.8003 | 2300 | 0.0007 | 0.0035 | 0.8277 |
	| 0.8072 | 2320 | 0.004 | 0.0034 | 0.8298 |
	| 0.8142 | 2340 | 0.0013 | 0.0033 | 0.8293 |
	| 0.8212 | 2360 | 0.0122 | 0.0032 | 0.8300 |
	| 0.8281 | 2380 | 0.0008 | 0.0033 | 0.8285 |
	| 0.8351 | 2400 | 0.0019 | 0.0032 | 0.8266 |
	| 0.8420 | 2420 | 0.0033 | 0.0032 | 0.8266 |
	| 0.8490 | 2440 | 0.0078 | 0.0024 | 0.8284 |
	| 0.8559 | 2460 | 0.0087 | 0.0022 | 0.8272 |
	| 0.8629 | 2480 | 0.003 | 0.0021 | 0.8255 |
	| 0.8699 | 2500 | 0.0039 | 0.0021 | 0.8232 |
	| 0.8768 | 2520 | 0.0054 | 0.0021 | 0.8225 |
	| 0.8838 | 2540 | 0.0015 | 0.0021 | 0.8236 |
	| 0.8907 | 2560 | 0.0043 | 0.0021 | 0.8245 |
	| 0.8977 | 2580 | 0.0083 | 0.0022 | 0.8237 |
	| 0.9047 | 2600 | 0.0029 | 0.0024 | 0.8233 |
	| 0.9116 | 2620 | 0.0095 | 0.0025 | 0.8257 |
	| 0.9186 | 2640 | 0.0013 | 0.0025 | 0.8263 |
	| 0.9255 | 2660 | 0.0025 | 0.0025 | 0.8268 |
	| 0.9325 | 2680 | 0.006 | 0.0025 | 0.8264 |
	| 0.9395 | 2700 | 0.0078 | 0.0026 | 0.8247 |
	| 0.9464 | 2720 | 0.0061 | 0.0025 | 0.8248 |
	| 0.9534 | 2740 | 0.001 | 0.0025 | 0.8238 |
	| 0.9603 | 2760 | 0.0041 | 0.0025 | 0.8233 |
	| 0.9673 | 2780 | 0.0157 | 0.0024 | 0.8249 |
	| 0.9743 | 2800 | 0.0039 | 0.0024 | 0.8248 |
	| 0.9812 | 2820 | 0.0047 | 0.0024 | 0.8242 |
	| 0.9882 | 2840 | 0.0058 | 0.0024 | 0.8243 |
	| 0.9951 | 2860 | 0.0018 | 0.0024 | 0.8242 |

	* The bold row denotes the saved checkpoint.
	</details>

	### Framework Versions
	- Python: 3.10.12
	- Sentence Transformers: 3.4.0
	- Transformers: 4.48.1
	- PyTorch: 2.5.1+cu124
	- Accelerate: 1.3.0
	- Datasets: 3.2.0
	- Tokenizers: 0.21.0

	## Citation

	### BibTeX

	#### Sentence Transformers
	```bibtex
	@inproceedings{reimers-2019-sentence-bert,
	title = "Sentence-BERT: Sentence Embeddings using Siamese BERT-Networks",
	author = "Reimers, Nils and Gurevych, Iryna",
	booktitle = "Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing",
	month = "11",
	year = "2019",
	publisher = "Association for Computational Linguistics",
	url = "https://arxiv.org/abs/1908.10084",
	}
	```

	#### MultipleNegativesRankingLoss
	```bibtex
	@misc{henderson2017efficient,
	title={Efficient Natural Language Response Suggestion for Smart Reply},
	author={Matthew Henderson and Rami Al-Rfou and Brian Strope and Yun-hsuan Sung and Laszlo Lukacs and Ruiqi Guo and Sanjiv Kumar and Balint Miklos and Ray Kurzweil},
	year={2017},
	eprint={1705.00652},
	archivePrefix={arXiv},
	primaryClass={cs.CL}
	}
	```
